---
base_model: meta-llama/Llama-3.1-8B-Instruct
language:
- en
license: apache-2.0
tags:
- text-generation-inference
- transformers
- llama
- trl
- sft
datasets:
- rasa/command-generation-calm-v2
pipeline_tag: text-generation
---
# Model Card for Command Generator

<!-- Provide a quick summary of what the model is/does. -->

This is a Dialogue Understanding (DU) model developed by Rasa.
It can be used to power assistants built with the [Conversational AI with Language Models (CALM) approach](https://rasa.com/docs/rasa-pro/calm) developed by Rasa.
## Model Details

### Model Description

<!-- Provide a longer summary of what this model is. -->

This model takes as input a transcript of an ongoing conversation between an AI assistant and a user,
as well as structured information about the assistant's business logic.
As output, it produces a short sequence of [commands](https://rasa.com/docs/rasa-pro/concepts/dialogue-understanding#command-reference)
(typically 1-3) from the following list:

* `start flow flow_name`
* `set slot slot_name slot_value`
* `cancel flow`
* `disambiguate flows flow_name1 flow_name2 ... flow_name_n`
* `provide info`
* `offtopic reply`
* `hand over`

Note that this model can only produce commands to be interpreted by Rasa.
It **cannot** be used to generate arbitrary text.

The Command Generator translates user messages into this internal grammar, allowing CALM to progress the conversation.

Examples:

> I want to transfer money

`start flow transfer_money`

> I want to transfer $55 to John

```
start flow transfer_money
set slot recipient John
set slot amount 55
```
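
At runtime, Rasa builds the prompt for this model from the conversation transcript and the assistant's flow definitions. The snippet below is therefore only a minimal sketch of loading the model with `transformers` and generating commands from a hand-written, illustrative prompt; the repository id and the prompt layout are assumptions, not the exact format Rasa uses.

```python
# Minimal inference sketch. The repo id and prompt layout below are illustrative
# assumptions; in production, Rasa constructs the prompt from the live conversation
# and the assistant's flows.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "rasa/command-generator"  # hypothetical id; replace with this model's repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

prompt = (
    "Available flows:\n"
    "- transfer_money: send money to a contact (slots: recipient, amount)\n\n"
    "Conversation:\n"
    "USER: I want to transfer $55 to John\n\n"
    "Commands:\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=64, do_sample=False)
# Print only the newly generated command sequence.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```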

- **Developed by:** Rasa Technologies
- **Model type:** Text Generation
- **Language(s) (NLP):** English
- **License:** Apache 2.0
- **Finetuned from model:** [meta-llama/Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct)
## Uses

<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->

The Command Generator is used as part of an AI assistant developed with Rasa's CALM paradigm.
Typical use cases include customer-facing chatbots, voice assistants, IVR systems, and internal chatbots in large organizations.

### Direct Use

This model can be used directly as part of the command generator component if the flows of your CALM assistant are similar to the flows used in the [rasa-calm-demo assistant](https://github.com/RasaHQ/rasa-calm-demo).

### Downstream Use

The model can also be used as a base model and fine-tuned further on your own assistant's data using the [fine-tuning recipe feature](https://rasa.com/docs/rasa-pro/building-assistants/fine-tuning-recipe#step-2-prepare-the-fine-tuning-dataset) available in Rasa Pro.
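
Once the recipe has produced a fine-tuning dataset, the training itself is a standard supervised fine-tuning (SFT) run. The sketch below shows the general shape of such a run with `trl`; the data path, column names, and hyperparameters are placeholders, and the linked Rasa documentation and the notebook referenced under Training Procedure remain the authoritative reference.

```python
# Generic SFT sketch with trl. The data path, column names ("prompt", "completion"),
# and hyperparameters are placeholders, not the official recipe.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

dataset = load_dataset("json", data_files="train.jsonl", split="train")

# Collapse prompt/completion pairs into the single text field expected for SFT.
dataset = dataset.map(lambda ex: {"text": ex["prompt"] + ex["completion"]})

trainer = SFTTrainer(
    model="meta-llama/Llama-3.1-8B-Instruct",
    train_dataset=dataset,
    args=SFTConfig(
        output_dir="command-generator-ft",
        per_device_train_batch_size=2,
        gradient_accumulation_steps=8,
        num_train_epochs=1,
        learning_rate=2e-5,
        bf16=True,
    ),
)
trainer.train()
```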

### Out-of-Scope Use

Since the model has been explicitly fine-tuned to output the command grammar above, it should not be used to generate any other free-form content.

## Bias, Risks, and Limitations

<!-- This section is meant to convey both technical and sociotechnical limitations. -->

The Command Generator interprets conversations and translates user messages into commands.
These commands are processed by Rasa to advance the conversation.
This model does not generate text to be sent to an end user and is incapable of generating problematic
or harmful text.

However, as with any pre-trained model, its predictions are susceptible to bias.
For example, the accuracy of the model varies with the language used. The authors have tested its performance on English but have not tried the model in any other language.

## Training Details

### Training Data

<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->

Trained on the `train` split of [rasa/command-generation-calm-v2](https://huggingface.co/datasets/rasa/command-generation-calm-v2).
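
The split can be inspected directly with the `datasets` library, for example:

```python
# Load and inspect the training split used for fine-tuning.
from datasets import load_dataset

train_ds = load_dataset("rasa/command-generation-calm-v2", split="train")
print(train_ds)      # row count and column names
print(train_ds[0])   # a single training example
```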

### Training Procedure

Trained using the notebook available [here](https://github.com/RasaHQ/notebooks/blob/main/cmd_gen_finetuning.ipynb) on a single A100 GPU with 40 GB of VRAM.

## Evaluation

<!-- This section describes the evaluation protocols and provides the results. -->

### Testing Data, Factors & Metrics

#### Testing Data

Evaluated on the `test` split of [rasa/command-generation-calm-v2](https://huggingface.co/datasets/rasa/command-generation-calm-v2).

#### Metrics

F1 score per command type (StartFlow, SetSlot, etc.) is the main metric used to evaluate the model on the test split.
It shows which commands the model has learned well and which ones need more training data.
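
As an illustration of the metric (not the exact evaluation script), per-command-type F1 can be computed by comparing predicted and reference command sequences, assuming both are newline-separated command strings as in the examples above:

```python
# Illustrative per-command-type F1: compare sets of commands between predictions
# and references, then aggregate precision/recall per command type.
from collections import Counter

COMMAND_PREFIXES = (
    "start flow", "set slot", "cancel flow",
    "disambiguate flows", "provide info", "offtopic reply", "hand over",
)

def command_type(cmd: str) -> str:
    """Map a command string to its command type via its leading keywords."""
    for prefix in COMMAND_PREFIXES:
        if cmd.startswith(prefix):
            return prefix
    return "unknown"

def per_type_f1(predictions, references):
    """Per-command-type F1 over pairs of newline-separated command strings."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for pred, ref in zip(predictions, references):
        pred_cmds = {c.strip() for c in pred.splitlines() if c.strip()}
        ref_cmds = {c.strip() for c in ref.splitlines() if c.strip()}
        for c in pred_cmds & ref_cmds:
            tp[command_type(c)] += 1
        for c in pred_cmds - ref_cmds:
            fp[command_type(c)] += 1
        for c in ref_cmds - pred_cmds:
            fn[command_type(c)] += 1
    scores = {}
    for t in set(tp) | set(fp) | set(fn):
        precision = tp[t] / (tp[t] + fp[t]) if (tp[t] + fp[t]) else 0.0
        recall = tp[t] / (tp[t] + fn[t]) if (tp[t] + fn[t]) else 0.0
        scores[t] = (2 * precision * recall / (precision + recall)
                     if (precision + recall) else 0.0)
    return scores

print(per_type_f1(
    ["start flow transfer_money\nset slot amount 55"],
    ["start flow transfer_money\nset slot amount 55\nset slot recipient John"],
))
# start flow -> 1.0, set slot -> 0.67 (recipient slot was missed)
```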

### Results

To be added.

## Model Card Contact

If you have questions about the model, please reach out to us on the [Rasa forum](https://forum.rasa.com/c/rasa-pro-calm/36).