vrashad committed on
Commit 4e13265 · verified · 1 Parent(s): ca0e25f

Update README.md

  1. README.md +19 -19
README.md CHANGED
@@ -74,6 +74,25 @@ Stride: 255 tokens (50% overlap)
74
  - **Fallback mechanisms**: Intelligent splitting when no semantic boundaries found
75
  - **Combined limits**: Supports both token AND character limits simultaneously
76
 
77
  ## Quick Start
78
 
79
  ### Installation
@@ -963,25 +982,6 @@ Average tokens per chunk: 236.9
963
  - Semantic boundaries preserved
964
  - No text loss or duplication
965
 
966
-
967
-
968
- # Use Cases
969
-
970
- ## Perfect for RAG Systems
971
- - **Vector Databases**: Ensure chunks fit embedding model limits
972
- - **Search Applications**: Optimal chunk sizes for retrieval
973
- - **Question Answering**: Maintain semantic coherence
974
-
975
- ## Document Processing
976
- - **Academic Papers**: Respect section and paragraph boundaries
977
- - **Legal Documents**: Maintain clause integrity
978
- - **News Articles**: Preserve story flow and context
979
-
980
- ## Content Management
981
- - **CMS Integration**: Automatic content segmentation
982
- - **API Limits**: Respect external service constraints
983
- - **Storage Optimization**: Consistent chunk sizes for databases
984
-
985
  ---
986
 
987
  # Chunking Strategies
 
74
  - **Fallback mechanisms**: Intelligent splitting when no semantic boundaries found
75
  - **Combined limits**: Supports both token AND character limits simultaneously
76
 
77
+
78
+ # Use Cases
79
+
80
+ ## Perfect for RAG Systems
81
+ - **Vector Databases**: Ensure chunks fit embedding model limits
82
+ - **Search Applications**: Optimal chunk sizes for retrieval
83
+ - **Question Answering**: Maintain semantic coherence
84
+
85
+ ## Document Processing
86
+ - **Academic Papers**: Respect section and paragraph boundaries
87
+ - **Legal Documents**: Maintain clause integrity
88
+ - **News Articles**: Preserve story flow and context
89
+
90
+ ## Content Management
91
+ - **CMS Integration**: Automatic content segmentation
92
+ - **API Limits**: Respect external service constraints
93
+ - **Storage Optimization**: Consistent chunk sizes for databases
94
+
95
+
96
  ## Quick Start
97
 
98
  ### Installation
 
982
  - Semantic boundaries preserved
983
  - No text loss or duplication
984
 
985
  ---
986
 
987
  # Chunking Strategies