RAG Chunking Strategies

Chunking is the step that turns parsed documents into searchable units. A RAG system does not usually retrieve a whole document. It retrieves chunks. That means chunking directly affects what the retriever can find, what the reranker can compare, and what the LLM can use as context.

Short Answer

There is no single best chunking strategy.

A good chunking strategy depends on:

document structure
document length
question type
retrieval method
embedding model
metadata quality
answer format
context window budget

The practical goal is not to find the perfect chunk size. The goal is to make each chunk focused enough to be retrieved and complete enough to support an answer.

Too Small

The chunk may match the query, but it may not contain enough surrounding context to answer correctly.

Too Large

The chunk may contain the answer, but the embedding may become noisy because too many unrelated ideas are mixed together.

Why Chunking Matters

Chunking controls the unit of retrieval.

If the correct answer is split across multiple chunks, retrieval may only return half of the evidence. If unrelated rules are mixed into the same chunk, the LLM may read conflicting context.

Chunking is not only a storage problem. It affects the whole RAG pipeline.

Stage	How Chunking Affects It
Embedding	The vector represents the chunk, not the whole document
Retrieval	Search returns chunks based on chunk-level similarity
Reranking	Candidate chunks are compared against the user question
LLM Context	The model only sees the selected chunks
Debugging	Engineers inspect chunk-level evidence

A weak chunking strategy can make a good retriever look bad. It can also make the LLM look unreliable when the real problem is that the answer was cut away from its context.

Strategy 1: Fixed-Size Chunking

Fixed-size chunking splits text by a fixed number of characters, words, or tokens.

Example:

Setting	Meaning
Chunk size	500 tokens
Overlap	50 tokens
Split rule	Keep cutting until the document ends

This is the simplest strategy. It is easy to implement and easy to test.

When It Works

Fixed-size chunking works reasonably well when documents are plain text and do not have strong structure.

Use it for:

notes
transcripts
loose articles
raw text dumps
early baseline experiments

Main Weakness

The splitter does not understand meaning.

It may cut through a paragraph, a table, a list, or a section. That can separate a rule from its condition.

For example, it may keep this in one chunk:

Customers can request a refund within 14 days after purchase.

But the next chunk may contain the condition:

This only applies if they have completed less than 20% of the course content.

Each chunk is now incomplete.
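
The split above can be reproduced with a minimal sketch. It assumes whitespace-separated words stand in for tokens, and `fixed_size_chunks` is a hypothetical helper name, not a library function.

```python
def fixed_size_chunks(tokens, chunk_size):
    """Cut a token list into consecutive blocks of at most chunk_size tokens."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [tokens[i:i + chunk_size] for i in range(0, len(tokens), chunk_size)]


# Assumption: whitespace-separated words stand in for real tokenizer output.
text = (
    "Customers can request a refund within 14 days after purchase. "
    "This only applies if they have completed less than 20% of the course content."
)
chunks = fixed_size_chunks(text.split(), chunk_size=10)
for chunk in chunks:
    print(" ".join(chunk))
```

With a chunk size of 10 words, the first chunk ends right after "purchase." and the condition starts the second chunk, which is the failure described above.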

Strategy 2: Fixed-Size With Overlap

Fixed-size with overlap is an improvement over pure fixed-size chunking.

Instead of cutting the text into isolated blocks, each chunk shares some text with the previous chunk.

Example:

Chunk	Content Range
Chunk 1	Tokens 1-500
Chunk 2	Tokens 451-950
Chunk 3	Tokens 901-1400

The overlap reduces the risk of cutting important context at the boundary.
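
The ranges in the table follow from a stride of chunk size minus overlap. A small sketch, assuming 1-based inclusive token positions and a hypothetical helper name `overlapping_ranges`:

```python
def overlapping_ranges(total_tokens, chunk_size=500, overlap=50):
    """Return 1-based inclusive (start, end) token ranges for each chunk."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    stride = chunk_size - overlap  # how far each new chunk advances
    ranges = []
    start = 0
    while start < total_tokens:
        end = min(start + chunk_size, total_tokens)
        ranges.append((start + 1, end))
        if end == total_tokens:
            break  # the last chunk already reaches the end of the document
        start += stride
    return ranges


print(overlapping_ranges(1400))
# -> [(1, 500), (451, 950), (901, 1400)]
```

Each chunk repeats the last 50 tokens of the previous one, so a sentence that straddles token 500 is still whole in chunk 2.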

When It Works

Use fixed-size with overlap when:

you need a fast baseline
the source text is not well structured
the documents are long
answers may appear near chunk boundaries
you want predictable chunk length

This is often the first practical strategy because it is simple, stable, and easy to compare against other strategies.

Main Weakness

Overlap increases storage and retrieval noise.

The same sentence may appear in multiple chunks. This can cause duplicate retrieval results and waste context window budget.
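
One possible mitigation is to merge overlapping hits from the same document before building the prompt. This is a sketch under assumptions: each hit is a hypothetical `(doc_id, start, end)` tuple with inclusive token positions, and the retriever is assumed to expose those positions.

```python
def merge_overlapping_hits(hits):
    """Merge retrieved chunks from the same document whose token ranges overlap."""
    merged = []
    for doc_id, start, end in sorted(hits):
        if merged and merged[-1][0] == doc_id and start <= merged[-1][2]:
            # Extend the previous span instead of sending the shared tokens twice.
            merged[-1] = (doc_id, merged[-1][1], max(merged[-1][2], end))
        else:
            merged.append((doc_id, start, end))
    return merged


hits = [("policy", 451, 950), ("policy", 1, 500), ("faq", 1, 500)]
print(merge_overlapping_hits(hits))
```

The two "policy" chunks collapse into one span covering tokens 1 to 950, so the 50 shared tokens reach the LLM only once.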

Strategy 3: Paragraph-Based Chunking

Paragraph-based chunking splits text by natural paragraph boundaries.

This strategy respects the author's original writing structure better than fixed-size splitting.

When It Works

Use paragraph-based chunking for:

blog posts
essays
documentation pages
policy text
explanation-heavy content

Paragraphs usually contain one local idea. That makes them good candidates for embedding and retrieval.

Main Weakness

Paragraph length is not consistent.

Some paragraphs are too short to be useful. Some are too long and contain multiple ideas. A production system often needs extra rules, such as merging short paragraphs or splitting very long paragraphs.
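
The extra rules can be sketched as follows. The function name and the `min_words` and `max_words` thresholds are illustrative assumptions, and word counts stand in for token counts.

```python
import re


def paragraph_chunks(text, min_words=20, max_words=120):
    """Split on blank lines, merge short paragraphs forward, cut long ones."""
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    chunks = []
    buffer = []  # words from short paragraphs waiting to be merged
    for paragraph in paragraphs:
        buffer.extend(paragraph.split())
        if len(buffer) < min_words:
            continue  # too short on its own, so merge with the next paragraph
        # Emit the buffer in pieces of at most max_words words.
        for i in range(0, len(buffer), max_words):
            chunks.append(" ".join(buffer[i:i + max_words]))
        buffer = []
    if buffer:
        chunks.append(" ".join(buffer))  # trailing short paragraph
    return chunks
```

This version cuts long paragraphs by word count, which can leave a short tail. A production splitter would more likely cut at sentence boundaries.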

Strategy 4: Heading-Aware Chunking

Heading-aware chunking uses document titles and section headings to guide the split.

Instead of treating text as one flat stream, it keeps the section structure.

Example chunk format:

Document: Refund and Cancellation Policy
Section: Subscription Cancellation

Monthly subscriptions can be cancelled at any time. The cancellation will stop the next billing cycle, but it does not refund the current active month.

This is usually stronger than plain paragraph chunking because the heading is embedded and retrieved together with the body text.
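
A minimal sketch of this chunk format, assuming the source is Markdown with `#` style headings. The function name and the returned dict shape are hypothetical.

```python
import re

HEADING = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")


def heading_aware_chunks(markdown, document_title):
    """Build one chunk per heading-delimited section, with the heading attached."""
    chunks = []
    section = None
    body = []

    def flush():
        text = "\n".join(body).strip()
        if text:
            header = f"Document: {document_title}\nSection: {section or '(no heading)'}"
            chunks.append({"section": section, "text": f"{header}\n\n{text}"})

    for line in markdown.splitlines():
        match = HEADING.match(line)
        if match:
            flush()  # close the previous section before starting a new one
            section = match.group(2)
            body = []
        else:
            body.append(line)
    flush()
    return chunks
```

The sketch treats every line that starts with `#` as a heading, so a comment line inside a fenced code block would be misread. A real parser needs to track code fences.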

When It Works

Use heading-aware chunking for:

API documentation
product manuals
policy documents
technical guides
knowledge base articles
structured Markdown or HTML pages

This strategy works well when the document already has meaningful headings.

Main Weakness

It depends on good parsing.

If the parser fails to detect headings, the chunker will build bad chunks. Heading-aware chunking is only as good as the parsed structure it receives.

Strategy 5: Recursive Chunking

Recursive chunking tries to split text using a hierarchy of separators.

A common order looks like this:

section
paragraph
sentence
token limit

The splitter first tries to preserve larger meaningful units. If a unit is too large, it recursively splits it into smaller units.
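
The recursion can be sketched like this. It is a simplified illustration, not any library's implementation: the separator list starts at paragraphs rather than sections, word counts stand in for the token limit, and small pieces are not merged back together.

```python
import re

# Assumed hierarchy: paragraph break, line break, sentence end.
SEPARATORS = (r"\n\s*\n", r"\n", r"(?<=[.!?])\s+")


def recursive_split(text, max_words, separators=SEPARATORS):
    """Split with the coarsest separator first, recursing only where needed."""
    text = text.strip()
    if not text:
        return []
    if len(text.split()) <= max_words:
        return [text]  # small enough, keep this unit whole
    if not separators:
        # Last resort: hard cut by word count.
        words = text.split()
        return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]
    pieces = []
    for part in re.split(separators[0], text):
        pieces.extend(recursive_split(part, max_words, separators[1:]))
    return pieces
```

A short paragraph survives intact, while a long one is broken into sentences, and only an oversized sentence is cut by raw word count.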

When It Works

Use recursive chunking when you want a strong general-purpose default.

It works well for many document types because it tries to respect structure while still enforcing a size limit.

Use it for:

mixed Markdown documents
documentation pages
support articles
semi-structured text
early production RAG systems

Main Weakness

It is still rule-based.

It does not truly understand meaning. It only follows split rules. If the document structure is messy, recursive chunking may still produce weak chunks.

Strategy 6: Semantic Chunking

Semantic chunking splits text based on meaning instead of only size or separators.

The idea is to group sentences or paragraphs that are semantically close, then start a new chunk when the topic shifts.
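
The grouping logic can be illustrated with a toy sketch. The `embed` function below is only a stand-in that counts words. A real system would call an embedding model there, and the `threshold` value is an arbitrary assumption that needs tuning.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def semantic_chunks(sentences, threshold=0.2):
    """Start a new chunk when a sentence is dissimilar to the current chunk."""
    chunks = []
    current = []
    for sentence in sentences:
        if current and cosine(embed(" ".join(current)), embed(sentence)) < threshold:
            chunks.append(current)  # topic shift detected
            current = []
        current.append(sentence)
    if current:
        chunks.append(current)
    return chunks


sentences = [
    "A refund is available within 14 days of purchase.",
    "The refund is available only below 20% course completion.",
    "Enterprise contracts follow the signed contract terms.",
    "Enterprise customers should check the signed contract.",
]
for chunk in semantic_chunks(sentences):
    print(chunk)
```

The two refund sentences end up in one chunk and the two enterprise sentences in another, because similarity drops below the threshold at the third sentence.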

When It Works

Use semantic chunking when:

documents contain long sections with multiple topics
paragraph boundaries are weak
the same section mixes several different concepts
retrieval quality is poor with rule-based splitting

Semantic chunking can produce chunks that embed more cleanly, because each chunk stays on one topic.

Main Weakness

It is more expensive and less predictable.

It may require embeddings or model calls during indexing. It can also be harder to debug because chunk boundaries come from similarity scores and thresholds, not from simple rules.

Strategy 7: Parent-Child Chunking

Parent-child chunking separates the retrieval unit from the context unit.

The child chunk is small and searchable. The parent chunk is larger and used for final context.

Example:

Unit	Purpose
Child chunk	Used for embedding and retrieval
Parent chunk	Returned to the LLM after child match

This solves a common problem: small chunks retrieve well, but large chunks answer better.

When It Works

Use parent-child chunking when:

small chunks improve search accuracy
answers need surrounding context
sections are too large for direct embedding
the LLM needs a full policy rule, not one isolated sentence

For example, the retriever may match the child sentence about "no refund for current active month", but the system can return the full "Subscription Cancellation" section as parent context.
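
A minimal sketch of that lookup, under stated assumptions: children are single sentences, the section ids are hypothetical, and keyword overlap stands in for vector search.

```python
import re


def words(text):
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def build_children(sections):
    """Split each parent section into sentence-level children that know their parent."""
    children = []
    for parent_id, text in sections.items():
        for n, sentence in enumerate(re.split(r"(?<=[.!?])\s+", text.strip())):
            children.append(
                {"child_id": f"{parent_id}__s{n}", "parent_id": parent_id, "text": sentence}
            )
    return children


def retrieve_parent(query, children, sections):
    """Match on small child chunks, then return the larger parent text."""
    query_words = words(query)
    # Assumption: keyword overlap stands in for embedding similarity.
    best = max(children, key=lambda child: len(query_words & words(child["text"])))
    return best["parent_id"], sections[best["parent_id"]]


sections = {
    "subscription_cancellation": (
        "Monthly subscriptions can be cancelled at any time. The cancellation will stop "
        "the next billing cycle, but it does not refund the current active month."
    ),
    "enterprise_customers": (
        "Enterprise customers with custom contracts should contact the account manager. "
        "Their refund terms follow the signed contract instead of the standard policy."
    ),
}
children = build_children(sections)
parent_id, context = retrieve_parent(
    "Is the current active month refunded after cancellation?", children, sections
)
print(parent_id)
```

The match happens on one sentence, but the LLM receives the whole section, including the sentence that says cancellation is allowed at any time.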

Main Weakness

It needs more metadata and careful linking.

Every child chunk must know its parent. If the relationship is wrong, retrieval may find the right sentence but return the wrong surrounding context.

Strategy 8: Table-Aware Chunking

Table-aware chunking preserves table rows, columns, and labels.

This matters because flattening a table into plain text can destroy the meaning.

Bad table chunk:

Single Course 14 days Less than 20% completed Monthly Subscription Before next billing cycle No refund for current active month

Better table chunk:

Refund Summary
Purchase Type: Single Course
Refund Window: 14 days
Important Condition: Less than 20% completed

When It Works

Use table-aware chunking for:

pricing tables
policy summaries
comparison tables
product spec sheets
financial reports
configuration matrices

Main Weakness

Table chunking needs custom logic.

Some tables should be chunked by row. Some should be chunked as a full table. Some need both: one chunk for each row and one chunk for the whole table summary.
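
Row-level chunking can be sketched as below, assuming a simple pipe-separated table. Both helper names are hypothetical, and real tables with merged cells or multi-line headers need more logic.

```python
def parse_pipe_table(lines):
    """Parse simple 'a | b | c' lines into a header and a list of rows."""
    cells = [[cell.strip() for cell in line.split("|")] for line in lines if line.strip()]
    return cells[0], cells[1:]


def table_row_chunks(table_title, header, rows):
    """Turn each row into a self-describing text chunk with its column labels."""
    chunks = []
    for row in rows:
        fields = ". ".join(f"{label}: {value}" for label, value in zip(header, row))
        chunks.append(f"{table_title}. {fields}.")
    return chunks


lines = [
    "Purchase Type | Refund Window | Important Condition",
    "Single Course | 14 days | Less than 20% completed",
    "Monthly Subscription | Before next billing cycle | No refund for current active month",
]
header, rows = parse_pipe_table(lines)
for chunk in table_row_chunks("Refund Summary", header, rows):
    print(chunk)
```

Every value stays attached to its column label, so a row chunk still makes sense when retrieved without the rest of the table.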

How to Select the Right Strategy

The correct chunking strategy depends on the shape of the source document and the type of question you expect.

Situation	Better Strategy	Reason
Plain long text	Fixed-size with overlap	Fast baseline and predictable size
Markdown or HTML docs	Heading-aware or recursive	Preserves document structure
Policy documents	Heading-aware with parent-child	Rules need section context
Tables and specs	Table-aware	Preserves row-column meaning
Mixed-topic long sections	Semantic chunking	Splits by topic shift
FAQ pages	Question-answer pair chunking	Each pair is already a retrieval unit
Code documentation	Heading-aware with code block preservation	Avoids separating code from explanation
Large knowledge base	Recursive plus metadata filtering	Balances structure and scale

A good selection process usually starts simple, then becomes more specific after evaluation.

Start With a Baseline

Use recursive or fixed-size with overlap first. Measure whether the correct answer appears in top-k retrieval.
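
That measurement can be a very small function. The sketch below assumes a hand-labeled set that maps each test question to the chunk id containing its answer. The names and data shapes are illustrative.

```python
def hit_rate_at_k(results, expected, k=5):
    """Fraction of questions whose expected chunk id appears in the top k results.

    results:  question -> ranked list of retrieved chunk ids
    expected: question -> chunk id that contains the correct answer
    """
    if not expected:
        return 0.0
    hits = sum(
        1 for question, chunk_id in expected.items()
        if chunk_id in results.get(question, [])[:k]
    )
    return hits / len(expected)


results = {"q1": ["a", "b", "c"], "q2": ["x", "y", "z"]}
expected = {"q1": "b", "q2": "z"}
print(hit_rate_at_k(results, expected, k=2))  # -> 0.5
print(hit_rate_at_k(results, expected, k=3))  # -> 1.0
```

Running the same question set against each chunking strategy gives a like-for-like comparison before any LLM is involved.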

Inspect Failures

If the correct content exists but is not retrieved, inspect whether the chunk is too large, too small, or missing context.

Use Structure When Available

If the source has headings, tables, or sections, preserve them instead of flattening everything.

Add Complexity Only When Needed

Semantic or parent-child chunking is useful, but it adds indexing cost and debugging complexity.

Why There Is No Best Strategy

There is no best chunking strategy because chunking is a trade-off.

Each strategy optimizes for a different kind of retrieval behavior.

Trade-Off	Small Chunk	Large Chunk
Retrieval precision	Usually higher	Usually lower
Context completeness	Usually lower	Usually higher
Embedding noise	Lower	Higher
Risk of missing condition	Higher	Lower
Context window cost	Lower	Higher

Small chunks are easier to match but may lose the surrounding condition. Large chunks preserve context but may dilute the embedding.

The best strategy for one document type can be bad for another. A support FAQ, an API reference, a legal policy, and a product table should not be chunked the same way.

The practical question is not "Which chunking strategy is best?" The better question is "Which failure mode am I trying to reduce?"

Reusable Example: Chunking the Policy Document

The previous log used this document:

Document Title: Refund and Cancellation Policy
Product: LearnPro Online Course Platform
Version: 2026.04
Owner: Billing Team

1. General Refund Rule
Customers can request a refund within 14 days after purchase if they have completed less than 20% of the course content.

2. Digital Course Activation
Once a customer downloads course materials or receives a completion certificate, the purchase is no longer refundable.

3. Subscription Cancellation
Monthly subscriptions can be cancelled at any time. The cancellation will stop the next billing cycle, but it does not refund the current active month.

4. Enterprise Customers
Enterprise customers with custom contracts should contact the account manager. Their refund terms follow the signed contract instead of the standard policy.

5. Support Contact
For billing issues, customers should contact billing-support@learnpro.example.

Refund Summary:
Purchase Type | Refund Window | Important Condition
Single Course | 14 days | Less than 20% completed
Monthly Subscription | Before next billing cycle | No refund for current active month
Enterprise Contract | Based on contract | Contact account manager

For this document, a good first strategy is heading-aware chunking with table-aware handling.

Example chunk:

{
  "chunk_id": "doc_refund_policy_learnpro_2026_04__subscription_cancellation",
  "document_id": "doc_refund_policy_learnpro_2026_04",
  "title": "Refund and Cancellation Policy",
  "section": "Subscription Cancellation",
  "text": "Monthly subscriptions can be cancelled at any time. The cancellation will stop the next billing cycle, but it does not refund the current active month.",
  "metadata": {
    "product": "LearnPro Online Course Platform",
    "domain": "billing",
    "document_type": "policy",
    "version": "2026.04"
  }
}

For the table, row-level chunks can make retrieval more precise:

{
  "chunk_id": "doc_refund_policy_learnpro_2026_04__refund_summary__monthly_subscription",
  "document_id": "doc_refund_policy_learnpro_2026_04",
  "title": "Refund and Cancellation Policy",
  "section": "Refund Summary",
  "text": "Refund Summary. Purchase Type: Monthly Subscription. Refund Window: Before next billing cycle. Important Condition: No refund for current active month.",
  "metadata": {
    "product": "LearnPro Online Course Platform",
    "domain": "billing",
    "document_type": "policy",
    "version": "2026.04",
    "block_type": "table_row"
  }
}

This structure keeps the chunk focused, traceable, and useful for retrieval.
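
The chunk ids in both examples follow one pattern: document id, section slug, and an optional row slug, joined by double underscores. A sketch of a builder that reproduces them, with hypothetical helper names `slugify` and `make_chunk`:

```python
import re


def slugify(text):
    """Lowercase the text and replace runs of non-alphanumerics with underscores."""
    return re.sub(r"[^a-z0-9]+", "_", text.lower()).strip("_")


def make_chunk(document_id, title, section, text, metadata, row_label=None):
    """Assemble a chunk record with a traceable, deterministic chunk_id."""
    parts = [document_id, slugify(section)]
    if row_label:
        parts.append(slugify(row_label))  # row-level chunks get a third id segment
    return {
        "chunk_id": "__".join(parts),
        "document_id": document_id,
        "title": title,
        "section": section,
        "text": text,
        "metadata": dict(metadata),
    }


chunk = make_chunk(
    "doc_refund_policy_learnpro_2026_04",
    "Refund and Cancellation Policy",
    "Refund Summary",
    "Refund Summary. Purchase Type: Monthly Subscription.",
    {"domain": "billing", "block_type": "table_row"},
    row_label="Monthly Subscription",
)
print(chunk["chunk_id"])
```

Deterministic ids make re-indexing idempotent: the same section always maps to the same id, so an updated chunk replaces the old one instead of duplicating it.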

What Is Commonly Used Now

In many practical RAG systems, the most common starting point is still recursive chunking or fixed-size chunking with overlap.

The reason is not that these strategies are always the best. They are easy to implement, easy to compare, and good enough for many early systems.

A more mature setup often moves toward:

recursive chunking for general documents
heading-aware chunking for structured documentation
table-aware chunking for tabular content
parent-child chunking when small retrieval units need larger answer context

So the common path is: start with a simple baseline, evaluate retrieval failures, then add structure-aware chunking where the data clearly needs it.

The Main Principle

Chunking is not about cutting text into equal pieces. It is about choosing the right retrieval unit.

There is no universal best strategy because different documents fail in different ways. Some need smaller chunks for precision. Some need larger chunks for context. Some need headings. Some need table preservation. Some need parent-child relationships.

The practical rule is simple: choose the chunking strategy based on the failure you want to reduce, then prove it with retrieval evaluation.
