---
library_name: transformers
tags:
- pruna-ai
---
# Model Card for PrunaAI/test-tiny-random-llama4-smashed
This model was created using the pruna library. Pruna is a model optimization framework built for developers, enabling you to deliver more efficient models with minimal implementation overhead.
## Usage
First things first, you need to install the pruna library:

```bash
pip install pruna
```
You can load the model either with the transformers library or with the pruna library, as in the following example:

```python
from pruna import PrunaModel

loaded_model = PrunaModel.from_hub(
    "PrunaAI/test-tiny-random-llama4-smashed"
)
```
After loading the model, you can use the inference methods of the original model. Take a look at the documentation for more usage information.
## Smash Configuration
The compression configuration of the model is stored in the `smash_config.json` file, which describes the optimization methods that were applied to the model.
```json
{
  "batcher": null,
  "cacher": null,
  "compiler": null,
  "pruner": null,
  "quantizer": null,
  "batch_size": 1,
  "device": "cpu",
  "save_fns": [],
  "load_fns": [
    "transformers"
  ],
  "reapply_after_load": {
    "pruner": null,
    "quantizer": null,
    "cacher": null,
    "compiler": null,
    "batcher": null
  }
}
```
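Each top-level key names an optimization stage (pruning, quantization, compilation, caching, batching); a `null` value means that stage was not applied to this model. As a minimal sketch, here is how one might inspect such a configuration with Python's standard `json` module (the config string is copied from above; reading it from a `smash_config.json` file on disk would work the same way):

```python
import json

# The smash configuration as shown above, embedded as a string for illustration.
SMASH_CONFIG = """
{
  "batcher": null,
  "cacher": null,
  "compiler": null,
  "pruner": null,
  "quantizer": null,
  "batch_size": 1,
  "device": "cpu",
  "save_fns": [],
  "load_fns": ["transformers"],
  "reapply_after_load": {
    "pruner": null,
    "quantizer": null,
    "cacher": null,
    "compiler": null,
    "batcher": null
  }
}
"""

config = json.loads(SMASH_CONFIG)

# Collect the optimization stages that were actually applied (non-null entries).
stages = ["batcher", "cacher", "compiler", "pruner", "quantizer"]
applied = [s for s in stages if config[s] is not None]

print(applied)            # → [] (no optimizations recorded for this test model)
print(config["device"])   # → cpu
```

For this particular test model every stage is `null`, so the list of applied optimizations comes out empty; on a genuinely smashed model you would see the method names recorded under the corresponding keys.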