---
license: mit
datasets:
- ibleducation/ibl-khanacademy-transcripts
language:
- en
library_name: transformers
---

# ibleducation/ibl-neural-edu-content-7B
ibleducation/ibl-neural-edu-content-7B is a model fine-tuned from [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1).


The model is fine-tuned to generate subtitles that can be fed to video generators to create tutorial videos. Its output is a WebVTT caption track: each cue carries full timestamps along with its text, and together the cues form a complete tutorial on the topic passed as input.


## Example Conversations
1. Question: Information Theory \
   Answer:
   ```text
    WEBVTT
    Kind: captions
    Language: en
    
    00:00:00.320 --> 00:00:01.153
    - [Instructor] In this video,
    
    00:00:01.153 --> 00:00:03.270
    we're going to talk about information.
    
    00:00:03.270 --> 00:00:05.100
    And I know what you're thinking.
    
    00:00:05.100 --> 00:00:07.030
    You're thinking, I know what information is.
    
    00:00:07.030 --> 00:00:08.860
    I read the newspaper every day.
    
    00:00:08.860 --> 00:00:10.860
    I watch TV shows.
    
    00:00:10.860 --> 00:00:12.690
    I talk to my friends.
    
    00:00:12.690 --> 00:00:14.520
    I know what information is.
    
    00:00:14.520 --> 00:00:16.450
    But what we're going to
    talk about in this video
    
    00:00:16.450 --> 00:00:18.280
    is a very specific definition
    
    00:00:18.280 --> 00:00:20.150
    of what information is.
    
    00:00:20.150 --> 00:00:22.150
    And it's a very mathematical definition.
    
    00:00:22.150 --> 00:00:24.150
    And it's a very specific definition
   [.... content shortened for brevity ...]
   ```
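
Downstream, the generated WebVTT can be split into timed cues before it is handed to a video generator. The sketch below (a hypothetical helper, not part of this repo) parses the model's output into `(start, end, text)` tuples:

```python
import re

# Matches one WebVTT cue: a timestamp line followed by its caption text,
# terminated by a blank line or the end of the transcript.
CUE_RE = re.compile(
    r"(\d{2}:\d{2}:\d{2}\.\d{3}) --> (\d{2}:\d{2}:\d{2}\.\d{3})\s*\n(.*?)(?=\n\s*\n|\Z)",
    re.DOTALL,
)

def parse_webvtt(vtt: str) -> list[tuple[str, str, str]]:
    """Hypothetical post-processing helper: split a generated WebVTT
    transcript into (start, end, text) cues. Assumes well-formed output."""
    return [
        (start, end, " ".join(text.split()))  # collapse multi-line cue text
        for start, end, text in CUE_RE.findall(vtt)
    ]

# With the example output above:
# parse_webvtt(generated)[0]
# -> ("00:00:00.320", "00:00:01.153", "- [Instructor] In this video,")
```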



## Model Details

- **Developed by:** [IBL Education](https://ibl.ai)
- **Model type:** [Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1)
- **Base Model:** [Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1)
- **Language:** English
- **Finetuned from weights:** [Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1)
- **Finetuned on data:**
  - [ibleducation/ibl-khanacademy-transcripts](https://huggingface.co/datasets/ibleducation/ibl-khanacademy-transcripts)
- **Model License:** MIT

## How to Get Started with the Model

### Install the necessary packages

Requires: [transformers](https://pypi.org/project/transformers/) > 4.35.0
```shell
pip install "transformers>4.35.0" accelerate
```
### You can then try the following example code

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import transformers
import torch

model_id = "ibleducation/ibl-neural-edu-content-7B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision so the 7B model fits on one GPU
    device_map="auto",          # requires accelerate
)
pipe = transformers.pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
)
prompt = "<s>[INST]Information Theory[/INST] "

# The pipeline returns a list with one dict per generated sequence.
response = pipe(prompt)
print(response[0]["generated_text"])
```

**Important** - Use the prompt template below:
```
<s>[INST]{prompt}[/INST] 
```
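
For example, a minimal wrapper that applies this template (`build_prompt` is a hypothetical helper, not shipped with the model; note the trailing space after `[/INST]`):

```python
def build_prompt(topic: str) -> str:
    # Hypothetical helper: wrap a topic in the [INST] template the model expects.
    return f"<s>[INST]{topic}[/INST] "

print(build_prompt("Information Theory"))
# <s>[INST]Information Theory[/INST] 
```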