A shorter query, such as a single sentence or phrase, will concentrate on specifics and may be better suited for matching against sentence-level embeddings. A longer query that spans more than one sentence or a paragraph is likely looking for broader context or themes, and so may align better with embeddings at the paragraph or document level.
For instance, sentence-transformer models work well on individual sentences, while a model like text-embedding-ada-002 performs better on chunks of 256 or 512 tokens.
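As a toy illustration of this granularity point, the sketch below uses bag-of-words count vectors as a stand-in for real embeddings (a real system would call an embedding model instead; all text and names here are made up):

```python
import math
from collections import Counter

def embed(text):
    # Toy "embedding": a bag-of-words count vector. A real system would
    # call a sentence-transformer or text-embedding-ada-002 instead.
    return Counter(text.lower().replace(".", "").split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(x * x for x in a.values()))
    nb = math.sqrt(sum(x * x for x in b.values()))
    return dot / (na * nb) if na and nb else 0.0

sentences = [
    "The cat sat on the mat.",
    "Solar panels convert sunlight into electricity.",
]
paragraph = " ".join(sentences)

# A short, specific query scores highest against a single sentence-level vector...
short_query = "cat on a mat"
best = max(sentences, key=lambda s: cosine(embed(short_query), embed(s)))

# ...while a broad query spanning both topics matches the paragraph-level
# vector better than either individual sentence.
broad_query = "cat mat solar electricity"
para_score = cosine(embed(broad_query), embed(paragraph))
```

The same effect holds with real embeddings: the granularity of what you index should roughly match the granularity of what users ask.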
Here’s the start of an example for performing fixed-size chunking with LangChain:

```python
text = "…"  # your text
```

While the resulting chunks aren’t going to be exactly the same size, they’ll still “aspire” to be of a similar size.
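As a dependency-free sketch of the same idea, the function below slides a fixed window with an overlap over the text, mirroring the `chunk_size` and `chunk_overlap` parameters that LangChain's `CharacterTextSplitter` accepts (the function name and sample text are made up for illustration):

```python
def fixed_size_chunks(text, chunk_size=256, chunk_overlap=20):
    # Slide a fixed-size window over the text, stepping back by
    # chunk_overlap characters each time so adjacent chunks share context.
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

sample = "".join(str(i % 10) for i in range(600))
chunks = fixed_size_chunks(sample, chunk_size=256, chunk_overlap=20)
# Three chunks: two of 256 characters plus a shorter tail, each sharing
# its last 20 characters with the start of the next chunk.
```

LangChain's splitters add separator-aware logic on top of this basic windowing, which is why their chunks only approximate the requested size.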
To support both granularities, you can either use multiple indices or a single index with multiple namespaces.
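As a toy sketch of the single-index-with-namespaces option, plain dictionaries stand in below for a real vector index (the class, namespace names, and records are all invented for illustration; a real index would store embeddings and run similarity search):

```python
# Toy stand-in for a vector index partitioned by namespace; queries only
# touch records in the namespace they target.
class NamespacedIndex:
    def __init__(self):
        self._namespaces = {}

    def upsert(self, namespace, record_id, value):
        self._namespaces.setdefault(namespace, {})[record_id] = value

    def query(self, namespace, predicate):
        # Only records in the requested namespace are considered.
        return [v for v in self._namespaces.get(namespace, {}).values()
                if predicate(v)]

index = NamespacedIndex()
index.upsert("sentence-chunks", "s1", "short sentence chunk")
index.upsert("paragraph-chunks", "p1", "longer paragraph chunk")

hits = index.query("sentence-chunks", lambda v: "chunk" in v)
```

The multiple-indices option gives the same isolation at the cost of managing more infrastructure; namespaces keep everything in one index while still separating the two chunk granularities at query time.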
