ilya_lev:

Thanks for the article, Maarten. It really helps with diving into AI.

I'm trying to analyse 6.6K reviews about retail companies that I scraped from the web. The idea is to extract keywords from each review with KeyLLM (I have two Tesla P40s in my home PC, so it shouldn't take too long). As a next step I want to visualize the results with UMAP, as you did in a previous article. But on the first step I got a message that the maximum context length (512) was exceeded.

Could you please tell me whether this limit can be bypassed? Some of the reviews I'm analysing are close to 4K tokens.

Maarten Grootendorst:

You would have to either chunk the reviews so that you stay within the token limit, or increase the context length. You can find the relevant ctransformers parameters here: https://github.com/marella/ctransformers#config
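
For example, here is a minimal sketch of the second option, assuming you load a GGUF model through ctransformers and wrap it for KeyLLM (the model name, file, and parameter values below are only placeholders, not a recommendation): the `context_length` config parameter is what raises the context window.

```python
from ctransformers import AutoModelForCausalLM
from transformers import AutoTokenizer, pipeline
from keybert.llm import TextGeneration
from keybert import KeyLLM

# Load a quantized model with a larger context window.
# `context_length` is the ctransformers config parameter that controls it.
model = AutoModelForCausalLM.from_pretrained(
    "TheBloke/Mistral-7B-Instruct-v0.1-GGUF",           # placeholder model repo
    model_file="mistral-7b-instruct-v0.1.Q4_K_M.gguf",  # placeholder quantized file
    model_type="mistral",
    gpu_layers=50,        # offload layers to the GPUs
    context_length=4096,  # large enough for reviews of ~4K tokens
    hf=True,              # expose a Hugging Face-compatible model
)

tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.1")
generator = pipeline(
    task="text-generation",
    model=model,
    tokenizer=tokenizer,
    max_new_tokens=50,
    repetition_penalty=1.1,
)

# Wrap the generator for KeyLLM and extract keywords per review.
kw_model = KeyLLM(TextGeneration(generator))
keywords = kw_model.extract_keywords(reviews)  # `reviews` = your list of scraped texts
```

If you prefer the chunking route instead, splitting each long review into pieces of a few hundred tokens, extracting keywords per piece, and merging the keyword sets afterwards would also keep you under the limit.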
