9 Comments
Oct 13, 2023 · Liked by Maarten Grootendorst

The article is so good and helpful. Very informative.

Thanks for the article, Maarten. It really helps me dive into AI.

I'm trying to analyse 6.6K reviews of retail companies that I scraped from the web. The idea is to extract keywords from each review with KeyLLM. (I have two Tesla P40 GPUs in my home PC, so I don't expect it to take long.) As a next step, I want to visualize the results with UMAP, as you did in a previous article. But at the very first step I got an error saying that the maximum context length (512) was exceeded.

Could you please tell me whether this limit can be bypassed? Some of the reviews I'm analysing are close to 4K tokens.

Thanks.

Is it possible to use it to anonymize documents?

Suppose I want to anonymize some documents for a list of topics, and assume I also have a small dictionary for these topics (to be complete).

How would you deal with this use case?

Thanks for this nice article. A few questions on the part where you mention "We assume that documents that are highly similar will have the same keywords, so there would be no need to extract keywords for all documents":

1. How do you choose which of the semantically close documents KeyLLM will use for keyword extraction?
2. What is the similarity threshold for grouping similar documents, and is it something we can configure?
