---
library_name: transformers
tags:
- pruna-ai
---
# Model Card for PrunaAI/test-tiny-random-llama4-smashed
This model was created using the pruna library. Pruna is a model optimization framework built for developers, enabling you to deliver more efficient models with minimal implementation overhead.
## Usage
First things first, you need to install the pruna library:

```bash
pip install pruna
```
You can load the model either with the transformers library or with the pruna library, as in the following example:

```python
from pruna import PrunaModel

loaded_model = PrunaModel.from_hub(
    "PrunaAI/test-tiny-random-llama4-smashed"
)
```
After loading the model, you can use the inference methods of the original model. Take a look at the documentation for more usage information.
## Smash Configuration
The compression configuration of the model is stored in the `smash_config.json` file, which describes the optimization methods that were applied to the model.
```json
{
  "batcher": null,
  "cacher": null,
  "compiler": null,
  "pruner": null,
  "quantizer": null,
  "batch_size": 1,
  "device": "cpu",
  "save_fns": [],
  "load_fns": [
    "transformers"
  ],
  "reapply_after_load": {
    "pruner": null,
    "quantizer": null,
    "cacher": null,
    "compiler": null,
    "batcher": null
  }
}
```
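Each top-level key names an optimization stage (pruning, quantization, compilation, caching, batching); a `null` value means that stage was not applied to this model. As a minimal sketch, here is how one might inspect such a configuration with Python's standard `json` module (the config string is copied from above; reading it from a `smash_config.json` file on disk would work the same way):

```python
import json

# The smash configuration as shown above, embedded as a string for illustration.
SMASH_CONFIG = """
{
  "batcher": null,
  "cacher": null,
  "compiler": null,
  "pruner": null,
  "quantizer": null,
  "batch_size": 1,
  "device": "cpu",
  "save_fns": [],
  "load_fns": ["transformers"],
  "reapply_after_load": {
    "pruner": null,
    "quantizer": null,
    "cacher": null,
    "compiler": null,
    "batcher": null
  }
}
"""

config = json.loads(SMASH_CONFIG)

# Collect the optimization stages that were actually applied (non-null entries).
stages = ["batcher", "cacher", "compiler", "pruner", "quantizer"]
applied = [s for s in stages if config[s] is not None]

print(applied)            # → [] (no optimizations recorded for this test model)
print(config["device"])   # → cpu
```

For this particular test model every stage is `null`, so the list of applied optimizations comes out empty; on a genuinely smashed model you would see the method names recorded under the corresponding keys.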