Commit: 146ac84
Author: Asher
Parent(s): c864579

doc: minor fix.
README.md
CHANGED
@@ -294,7 +294,7 @@ You can build and run vLLM from source after merging this pull request into your
 
 ### Model Context Length Support
 
-The Hunyuan A13B model supports a maximum context length of **256K tokens (262,144
+The Hunyuan A13B model supports a maximum context length of **256K tokens (262,144 tokens)**. However, due to GPU memory constraints on most hardware setups, the default configuration in `config.json` limits the context length to **32K tokens** to prevent out-of-memory (OOM) errors.
 
 #### Extending Context Length to 256K
 
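The added paragraph says the shipped `config.json` caps the context at 32K tokens to avoid OOM. As a minimal sketch of raising that cap programmatically (assuming the limit lives in the standard Hugging Face `max_position_embeddings` field, which should be verified against this model's actual `config.json`):

```python
import json  # for loading a real config.json from disk


def extend_context(cfg: dict, target: int = 262144) -> dict:
    """Return a copy of a model config with the context cap raised.

    Assumes the cap is the standard HF field "max_position_embeddings";
    verify the field name against the model's actual config.json.
    """
    out = dict(cfg)
    out["max_position_embeddings"] = target
    return out


# Stand-in for the shipped default (32K tokens); for a real checkout you
# would use: cfg = json.load(open("config.json"))
cfg = {"max_position_embeddings": 32768}
print(extend_context(cfg)["max_position_embeddings"])  # 262144
```

Raising the cap only helps if the hardware can actually hold the longer KV cache; per the paragraph above, the 32K default exists precisely because most setups cannot.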