Enhancing Efficiency in Large Language Model Applications through Chunking Strategies

A M Nandeesh
3 min read · Jul 8, 2024


In the realm of Natural Language Processing (NLP), optimizing the processing of large data files is critical for improving the efficiency and effectiveness of applications leveraging Large Language Models (LLMs) and Generative AI (GenAI) techniques. One fundamental technique that plays a pivotal role in this optimization is chunking. Chunking involves breaking down extensive text or data into smaller, more manageable segments, allowing LLMs to handle and process information more effectively. This is particularly essential for Retrieval-Augmented Generation (RAG) applications, where efficient data handling directly impacts the quality and speed of information retrieval and content generation.

Introduction to Chunking Strategies

The goal of chunking strategies is to provide LLMs with precisely the information they need for a specific task, ensuring that processing is efficient and aligned with the application’s requirements. Let’s explore different levels of chunking strategies that serve as foundational approaches for developing Retrieval-Augmented Generation (RAG) based LLM applications.

Level 1: Fixed Size Chunking

Fixed Size Chunking is the most basic method: text is segmented into chunks of a predetermined number of characters, irrespective of content or structure. Its simplicity makes it a sensible starting point for initial data processing, though it can cut across sentences or ideas mid-stream.

Implementation in Frameworks:

  • The LangChain and LlamaIndex frameworks offer classes such as CharacterTextSplitter (LangChain) and SentenceSplitter (LlamaIndex). A minimal sketch follows the list below.

Key Concepts:

  • Text Splitting: The text is split on a single separator character before pieces are merged into chunks.
  • Chunk Size: The maximum number of characters per chunk.
  • Chunk Overlap: The number of characters shared between consecutive chunks.
  • Separator: Optional character(s) used for splitting (LangChain’s CharacterTextSplitter defaults to "\n\n").
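
Below is a minimal sketch of Fixed Size Chunking with LangChain’s CharacterTextSplitter. The sample text and size parameters are illustrative, and import paths vary across LangChain versions.

```python
from langchain_text_splitters import CharacterTextSplitter

long_text = (
    "First paragraph of the document.\n\n"
    "Second paragraph with more detail.\n\n"
    "Third paragraph wrapping things up."
)

# Split on the separator, then pack pieces into chunks of at most
# 80 characters, with a 10-character overlap between neighbours.
splitter = CharacterTextSplitter(
    separator="\n\n",
    chunk_size=80,
    chunk_overlap=10,
)
chunks = splitter.split_text(long_text)
print(chunks)
```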

Level 2: Recursive Chunking

Building on Fixed Size Chunking, Recursive Chunking takes into account the hierarchical structure of text and iteratively splits it using predefined separators. This method enhances segmentation by considering the natural breaks in text structure.

Implementation in Frameworks:

  • The LangChain framework provides the RecursiveCharacterTextSplitter class.

Default Separators:

  • ["\n\n", "\n", " ", ""], covering paragraph breaks, line breaks, spaces, and individual characters, in that order.
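
A minimal sketch of Recursive Chunking with LangChain’s RecursiveCharacterTextSplitter; the separators mirror the documented defaults, and the size parameters are illustrative.

```python
from langchain_text_splitters import RecursiveCharacterTextSplitter

text = (
    "A short heading\n\n"
    "A paragraph with several sentences. Each one adds detail.\n"
    "A final line of text."
)

# Separators are tried in order: paragraph breaks first, then line
# breaks, then spaces, then individual characters, so each chunk ends
# at the most natural boundary that still fits within chunk_size.
splitter = RecursiveCharacterTextSplitter(
    separators=["\n\n", "\n", " ", ""],
    chunk_size=60,
    chunk_overlap=10,
)
chunks = splitter.split_text(text)
print(chunks)
```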

Level 3: Document Based Chunking

Document Based Chunking is tailored for structured documents, effectively segmenting based on inherent document structures like Markdown formatting, code blocks in programming languages, or tabular data.

Implementation Guidelines:

  • MarkdownTextSplitter: For Markdown documents.
  • PythonCodeTextSplitter: For code-based documents.
  • HTML or CSV formats: To preserve tabular relationships.
  • Multi-Modal Integration: Utilising models like GPT-4 Vision for processing images and generating summaries or embeddings.
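
A sketch of Document Based Chunking for Markdown and Python sources using LangChain’s format-aware splitters; the sample documents and chunk sizes are assumptions for illustration.

```python
from langchain_text_splitters import MarkdownTextSplitter, PythonCodeTextSplitter

markdown_doc = "# Title\n\nIntro paragraph.\n\n## Section\n\nSection body text."
python_source = (
    "def greet(name):\n"
    "    return f'Hello, {name}'\n\n"
    "class Greeter:\n"
    "    pass\n"
)

# Markdown-aware splitting prefers breaks at heading boundaries.
md_chunks = MarkdownTextSplitter(chunk_size=60, chunk_overlap=0).split_text(markdown_doc)

# Python-aware splitting prefers breaks at class and function definitions.
code_chunks = PythonCodeTextSplitter(chunk_size=60, chunk_overlap=0).split_text(python_source)
```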

Level 4: Semantic Chunking

Semantic Chunking goes beyond structural breaks and focuses on extracting meaning from text through embeddings. It groups segments that share semantic similarities, enhancing the contextual relevance of processed data.

Implementation Features:

  • LlamaIndex’s SemanticSplitterNodeParser class.

Key Concepts:

  • Buffer Size: The number of sentences grouped together when evaluating semantic similarity.
  • Breakpoint Percentile Threshold: The percentile of embedding distance above which a split is made.
  • Embedding Model: The model used to derive semantic information.
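
A minimal sketch using LlamaIndex’s SemanticSplitterNodeParser. It assumes an OpenAI API key is configured for the embedding model, and the parameter values shown are common defaults rather than recommendations.

```python
from llama_index.core import Document
from llama_index.core.node_parser import SemanticSplitterNodeParser
from llama_index.embeddings.openai import OpenAIEmbedding

text = (
    "Cats are small domesticated mammals. They often hunt at night. "
    "Quantum computers operate on qubits. They promise speedups on certain problems."
)

parser = SemanticSplitterNodeParser(
    buffer_size=1,                       # sentences grouped per embedding comparison
    breakpoint_percentile_threshold=95,  # split where the distance between adjacent
                                         # groups falls in the top 5%
    embed_model=OpenAIEmbedding(),       # model used to derive semantic information
)
nodes = parser.get_nodes_from_documents([Document(text=text)])
for node in nodes:
    print(node.get_content())
```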

Level 5: Agentic Chunking

The most advanced strategy, Agentic Chunking, integrates LLM capabilities to dynamically determine chunk content and size based on contextual cues and semantic propositions within the text.

Implementation Approach:

  • Propositional Retrieval: Initial chunk creation based on standalone statements.
  • LangChain’s propositional-retrieval template: For structuring propositions.
  • LLM-Based Decision Making: Determines chunk inclusion or creation based on contextual relevance.
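
The sketch below illustrates the idea in simplified form. assign_to_chunk is a hypothetical helper, not LangChain’s actual template, and it assumes propositions have already been extracted from the source text.

```python
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)  # any chat model would do

def assign_to_chunk(proposition: str, chunks: dict[str, list[str]]) -> None:
    """Hypothetical helper: let the LLM route a proposition to the
    best-matching existing chunk, or decide a new chunk is needed."""
    summaries = "\n".join(f"{cid}: {' '.join(props)[:150]}" for cid, props in chunks.items())
    prompt = (
        "You are grouping standalone propositions into topical chunks.\n"
        f"Existing chunks:\n{summaries or '(none yet)'}\n\n"
        f"New proposition: {proposition}\n"
        "Answer with only the id of the best-matching chunk, or NEW if none fits."
    )
    answer = llm.invoke(prompt).content.strip()
    if answer in chunks:
        chunks[answer].append(proposition)
    else:
        chunks[f"chunk-{len(chunks)}"] = [proposition]

chunks: dict[str, list[str]] = {}
for prop in ["The Eiffel Tower is in Paris.", "Paris is the capital of France."]:
    assign_to_chunk(prop, chunks)
print(chunks)
```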

Conclusion

Implementing effective chunking strategies is crucial for optimising the efficiency and performance of LLM applications. Each level of chunking strategy offers unique advantages, from basic structural segmentation to advanced semantic processing and dynamic content creation. By choosing the right chunking approach based on application requirements, developers can significantly enhance the capabilities of their LLM systems, ensuring they efficiently retrieve and generate relevant information.

What chunking strategies have you employed in your NLP/RAG/GenAI projects? Share your experiences and insights!
