---
license: mit
datasets:
- ibleducation/ibl-khanacademy-transcripts
language:
- en
library_name: transformers
---

# ibleducation/ibl-neural-edu-content-7B

ibleducation/ibl-neural-edu-content-7B is a model finetuned on top of [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1).

The model is finetuned to generate subtitles in WebVTT format that can be used with video generators to create tutorial videos. Each generated response includes cue timestamps along with the caption text, and together the cues form a complete tutorial on the topic passed as input.

## Example Conversations

1. Question: Information Theory \
   Answer:
```code
WEBVTT
Kind: captions
Language: en

00:00:00.320 --> 00:00:01.153
- [Instructor] In this video,

00:00:01.153 --> 00:00:03.270
we're going to talk about information.

00:00:03.270 --> 00:00:05.100
And I know what you're thinking.

00:00:05.100 --> 00:00:07.030
You're thinking, I know what information is.

00:00:07.030 --> 00:00:08.860
I read the newspaper every day.

00:00:08.860 --> 00:00:10.860
I watch TV shows.

00:00:10.860 --> 00:00:12.690
I talk to my friends.

00:00:12.690 --> 00:00:14.520
I know what information is.

00:00:14.520 --> 00:00:16.450
But what we're going to
talk about in this video

00:00:16.450 --> 00:00:18.280
is a very specific definition

00:00:18.280 --> 00:00:20.150
of what information is.

00:00:20.150 --> 00:00:22.150
And it's a very mathematical definition.

00:00:22.150 --> 00:00:24.150
And it's a very specific definition
[.... content shortened for brevity ...]
```

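Since the output follows the standard WebVTT layout shown above, it can be turned into structured cues with a short parser. The sketch below is a minimal illustration, not part of this repository; the `Cue` dataclass and `parse_vtt` helper are names introduced here, and the code assumes the generation is well-formed WebVTT.

```python
import re
from dataclasses import dataclass

# Matches a WebVTT cue timing line, e.g. "00:00:00.320 --> 00:00:01.153"
TIMING = re.compile(r"^(\d{2}:\d{2}:\d{2}\.\d{3}) --> (\d{2}:\d{2}:\d{2}\.\d{3})")

@dataclass
class Cue:
    start: str  # cue start time, e.g. "00:00:00.320"
    end: str    # cue end time
    text: str   # caption text (may span multiple lines)

def parse_vtt(vtt: str) -> list[Cue]:
    """Parse generated WebVTT into a list of cues (illustrative helper)."""
    cues = []
    lines = vtt.splitlines()
    i = 0
    while i < len(lines):
        match = TIMING.match(lines[i])
        if match:
            # Collect every text line until the blank line that ends the cue.
            i += 1
            text_lines = []
            while i < len(lines) and lines[i].strip():
                text_lines.append(lines[i])
                i += 1
            cues.append(Cue(match.group(1), match.group(2), "\n".join(text_lines)))
        else:
            i += 1  # skip the header block and blank separator lines
    return cues

# Example usage, assuming `generated_vtt` holds the model's WebVTT output:
# for cue in parse_vtt(generated_vtt):
#     print(cue.start, "->", cue.end, "|", cue.text)
```
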
## Model Details

- **Developed by:** [IBL Education](https://ibl.ai)
- **Model type:** [Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1)
- **Base Model:** [Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1)
- **Language:** English
- **Finetuned from weights:** [Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1)
- **Finetuned on data:**
  - [ibleducation/ibl-khanacademy-transcripts](https://huggingface.co/datasets/ibleducation/ibl-khanacademy-transcripts)
- **Model License:** MIT

## How to Get Started with the Model

### Install the necessary packages

Requires: [transformers](https://pypi.org/project/transformers/) > 4.35.0

```shell
pip install transformers
pip install accelerate
```

### You can then try the following example code

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import transformers
import torch

model_id = "ibleducation/ibl-neural-edu-content-7B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    torch_dtype=torch.bfloat16,  # assumption: bf16-capable GPU; use torch.float16 or omit otherwise
)
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
)
prompt = "<s>[INST]Information Theory[/INST] "

# The pipeline returns a list with one dict per prompt, so index into it.
response = pipeline(prompt, max_new_tokens=2048)  # assumption: transcripts are long, so allow many new tokens
print(response[0]["generated_text"])
```

**Important** - Use the prompt template below:

```
<s>[INST]{prompt}[/INST]
```
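
Putting the template and the pipeline together, here is a hypothetical end-to-end sketch that reuses the `pipeline` object created in the example above. The `generate_transcript` name, the `max_new_tokens` value, and the use of `return_full_text=False` (which drops the prompt from the returned text) are illustrative assumptions, not part of this model's API.

```python
def generate_transcript(topic: str, out_path: str) -> str:
    """Hypothetical helper: generate a WebVTT transcript for a topic and save it."""
    # Wrap the topic in the prompt template shown above.
    prompt = f"<s>[INST]{topic}[/INST] "
    # return_full_text=False keeps only the newly generated text, not the prompt.
    response = pipeline(prompt, max_new_tokens=2048, return_full_text=False)
    transcript = response[0]["generated_text"].strip()
    with open(out_path, "w", encoding="utf-8") as f:
        f.write(transcript)
    return transcript

generate_transcript("Information Theory", "information_theory.vtt")
```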