keras/qwen2.5_3b_en

	---
	library_name: keras-hub
	pipeline_tag: text-generation
	---
	This is a [`Qwen` model](https://keras.io/api/keras_hub/models/qwen) uploaded using the KerasHub library. It can be used with the JAX, TensorFlow, and PyTorch backends.
	This model is intended for the `CausalLM` task.
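
As a rough illustration of using the model for the `CausalLM` task, the sketch below loads it through KerasHub and generates text. It assumes that the `keras-hub` package is installed and that the preset handle is `hf://keras/qwen2.5_3b_en` (derived from the repository id, not stated in this card). The helper name `generate_text` and its defaults are hypothetical.

```python
import os

# Assumption: the preset handle is built from the Hugging Face repository id.
PRESET = "hf://keras/qwen2.5_3b_en"


def generate_text(prompt: str, backend: str = "jax", max_length: int = 64) -> str:
    """Load the preset as a CausalLM task and return a generated completion."""
    # The backend must be selected before Keras is imported for the first time;
    # setting it after Keras has already been imported has no effect.
    os.environ["KERAS_BACKEND"] = backend  # "jax", "tensorflow", or "torch"

    # Imported here so that the backend choice above is picked up.
    import keras_hub

    # Downloads the weights on first use, which takes several gigabytes.
    causal_lm = keras_hub.models.CausalLM.from_preset(PRESET)
    return causal_lm.generate(prompt, max_length=max_length)
```

Calling `generate_text("Keras is a")` would download the weights and return the prompt followed by the generated continuation. The generated text depends on the sampler and the weights, so no output is shown here.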

	Model config:
	* name: qwen_backbone
	* trainable: True
	* vocabulary_size: 151936
	* num_layers: 36
	* num_query_heads: 16
	* hidden_dim: 2048
	* intermediate_dim: 11008
	* rope_max_wavelength: 1000000.0
	* rope_scaling_factor: 1.0
	* num_key_value_heads: 2
	* layer_norm_epsilon: 1e-06
	* dropout: 0
	* tie_word_embeddings: True
	* use_sliding_window_attention: [False]
	* sliding_window_size: 32768
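
A few quantities follow from the config values above by simple arithmetic. The sketch below assumes that the per-head dimension is `hidden_dim / num_query_heads` and that cached keys and values are stored in a 2-byte dtype such as bfloat16. Neither assumption is stated in this card, and the function names are hypothetical.

```python
# Values copied from the model config listed above.
config = {
    "num_layers": 36,
    "num_query_heads": 16,
    "num_key_value_heads": 2,
    "hidden_dim": 2048,
}


def head_dim(cfg: dict) -> int:
    # Assumption: hidden_dim is split evenly across the query heads.
    return cfg["hidden_dim"] // cfg["num_query_heads"]


def queries_per_kv_head(cfg: dict) -> int:
    # Grouped-query attention: several query heads share one key/value head.
    return cfg["num_query_heads"] // cfg["num_key_value_heads"]


def kv_cache_bytes(cfg: dict, seq_len: int, bytes_per_element: int = 2) -> int:
    # One key and one value vector per layer, per key/value head, per token.
    per_token = 2 * cfg["num_layers"] * cfg["num_key_value_heads"] * head_dim(cfg)
    return per_token * seq_len * bytes_per_element


print(head_dim(config))              # 128
print(queries_per_kv_head(config))   # 8
print(kv_cache_bytes(config, 4096))  # 150994944 bytes, i.e. 144 MiB
```

With only 2 key/value heads for 16 query heads, the cache under these assumptions is one eighth of the size it would be if every query head had its own key/value head.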

	This model card has been generated automatically and should be completed by the model author. See [Model Cards documentation](https://huggingface.co/docs/hub/model-cards) for more information.