Chunkers

An exploration of JavaScript text splitters

Using llm-chunk

The maximum number of characters in a chunk

The minimum number of characters in a chunk

The number of overlap characters

Split the text by paragraphs or sentences
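To make the four settings above concrete, here is a minimal sketch of how a maximum length, a minimum length, an overlap and a paragraph-or-sentence mode might interact. It is not llm-chunk's implementation: the names `sketchChunk`, `splitUnits` and `SketchOptions` are hypothetical, and the merging rule is an assumption chosen for illustration.

```typescript
type SplitMode = "paragraph" | "sentence";

// Hypothetical option names for this sketch only.
interface SketchOptions {
  maxLength: number; // maximum number of characters in a chunk
  minLength: number; // minimum number of characters in a chunk
  overlap: number; // characters repeated from the end of the previous chunk
  splitter: SplitMode;
}

// Break text into units: paragraphs on blank lines, sentences after ., ! or ?
function splitUnits(text: string, mode: SplitMode): string[] {
  const raw: string[] =
    mode === "paragraph"
      ? text.split(/\n\s*\n/)
      : text.match(/[^.!?]+[.!?]*/g) ?? [];
  return raw.map((u) => u.trim()).filter((u) => u.length > 0);
}

// Greedily merge units into chunks. A single unit longer than maxLength is
// kept whole, and a chunk shorter than minLength keeps growing, so maxLength
// is a target here rather than a hard guarantee.
function sketchChunk(text: string, opts: SketchOptions): string[] {
  const chunks: string[] = [];
  let current = "";
  for (const unit of splitUnits(text, opts.splitter)) {
    const candidate = current ? `${current} ${unit}` : unit;
    const longEnough = current.length > 0 && current.length >= opts.minLength;
    if (longEnough && candidate.length > opts.maxLength) {
      chunks.push(current);
      // Seed the next chunk with the tail of the one just closed.
      const tail = opts.overlap > 0 ? current.slice(-opts.overlap) : "";
      current = tail ? `${tail} ${unit}` : unit;
    } else {
      current = candidate;
    }
  }
  if (current) chunks.push(current);
  return chunks;
}
```

Raising the minimum length in this sketch produces fewer, longer chunks, while a non-zero overlap repeats the closing characters of each chunk at the start of the next.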

Using @langchain/textsplitters

The LangChain text splitter to use

The number of characters per chunk

The number of characters in the overlap between chunks
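The relationship between chunk size and overlap can be sketched as a sliding window: each new chunk starts `chunkSize - chunkOverlap` characters after the previous one. The function below is a hypothetical, dependency-free illustration (`splitByCharacters` is not part of @langchain/textsplitters), and it cuts at fixed offsets, whereas a real splitter may prefer to break on separators such as newlines.

```typescript
// Fixed-size character windows with overlap. Illustrative only.
function splitByCharacters(
  text: string,
  chunkSize: number,
  chunkOverlap: number
): string[] {
  if (chunkSize <= 0) {
    throw new Error("chunkSize must be positive");
  }
  if (chunkOverlap < 0 || chunkOverlap >= chunkSize) {
    throw new Error("chunkOverlap must be at least 0 and less than chunkSize");
  }
  const step = chunkSize - chunkOverlap; // how far the window advances
  const chunks: string[] = [];
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    // Stop once the window has reached the end of the text.
    if (start + chunkSize >= text.length) break;
  }
  return chunks;
}

// Each chunk repeats the last character of the previous one.
console.log(splitByCharacters("abcdefghij", 4, 1)); // [ 'abcd', 'defg', 'ghij' ]
```

An overlap equal to or larger than the chunk size is rejected because the window would never advance.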

Using LlamaIndex

The LlamaIndex splitter to use

The number of tokens per chunk

The number of tokens in the overlap between chunks
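Unlike the character-based settings, these two values are counted in tokens. The sketch below shows the same windowing idea applied to a token sequence. It assumes a toy whitespace tokenizer (`toyTokenize`, hypothetical) as a stand-in for a real model tokenizer, so the counts will differ from what LlamaIndex reports, and `splitByTokens` is not a LlamaIndex function.

```typescript
// Stand-in tokenizer: one token per whitespace-separated word.
// A real tokenizer would usually produce subword tokens instead.
const toyTokenize = (text: string): string[] =>
  text.split(/\s+/).filter((t) => t.length > 0);

// Windows of chunkSize tokens, each sharing chunkOverlap tokens with the last.
function splitByTokens(
  text: string,
  chunkSize: number,
  chunkOverlap: number
): string[] {
  if (chunkSize <= 0) {
    throw new Error("chunkSize must be positive");
  }
  if (chunkOverlap < 0 || chunkOverlap >= chunkSize) {
    throw new Error("chunkOverlap must be at least 0 and less than chunkSize");
  }
  const tokens = toyTokenize(text);
  const step = chunkSize - chunkOverlap;
  const chunks: string[] = [];
  for (let start = 0; start < tokens.length; start += step) {
    chunks.push(tokens.slice(start, start + chunkSize).join(" "));
    if (start + chunkSize >= tokens.length) break;
  }
  return chunks;
}

console.log(splitByTokens("a b c d e", 3, 1)); // [ 'a b c', 'c d e' ]
```

Because tokens and characters do not map one to one, the same numeric setting yields very different chunk lengths in a token-based splitter than in a character-based one.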

Using semantic-chunking

The maximum number of tokens in a chunk

The minimum cosine similarity required for two sentences to be included in the same chunk
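As a rough illustration of the similarity threshold, the sketch below computes cosine similarity between sentence embeddings and starts a new chunk whenever adjacent sentences fall below the threshold or the token budget would be exceeded. Several things here are assumptions rather than the semantic-chunking library's behavior: the embeddings are supplied by the caller, only adjacent sentences are compared, tokens are counted by whitespace, and the names `cosineSimilarity`, `groupBySimilarity` and `EmbeddedSentence` are hypothetical.

```typescript
interface EmbeddedSentence {
  text: string;
  embedding: number[]; // assumed to come from some embedding model
}

// cos(a, b) = (a . b) / (|a| |b|); returns 0 for a zero-length vector.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Toy token count: whitespace-separated words.
const countTokens = (text: string): number =>
  text.split(/\s+/).filter((t) => t.length > 0).length;

function groupBySimilarity(
  sentences: EmbeddedSentence[],
  threshold: number,
  maxTokens: number
): string[] {
  const chunks: string[] = [];
  let current: string[] = [];
  let tokenCount = 0;
  for (let i = 0; i < sentences.length; i++) {
    const sentence = sentences[i];
    const size = countTokens(sentence.text);
    const similar =
      i === 0 ||
      cosineSimilarity(sentences[i - 1].embedding, sentence.embedding) >= threshold;
    // Close the chunk on a topic shift or when the token budget would overflow.
    if (current.length > 0 && (!similar || tokenCount + size > maxTokens)) {
      chunks.push(current.join(" "));
      current = [];
      tokenCount = 0;
    }
    current.push(sentence.text);
    tokenCount += size;
  }
  if (current.length > 0) chunks.push(current.join(" "));
  return chunks;
}
```

In this sketch, a higher threshold splits more eagerly and yields smaller, more tightly focused chunks, while a lower threshold lets chunks grow until the token limit cuts them off.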