ibleducation/ibl-neural-edu-content-7B

ibleducation/ibl-neural-edu-content-7B is a model fine-tuned on top of mistralai/Mistral-7B-v0.1.

The model is fine-tuned to generate subtitles that can then be used with video generators to create tutorial videos. The generated response is a complete WEBVTT subtitle track, containing timestamps as well as the spoken content, and provides a full tutorial on the topic passed as input.

Example Conversations

  1. Question: Information Theory
    Answer:
     WEBVTT
     Kind: captions
     Language: en
     
     00:00:00.320 --> 00:00:01.153
     - [Instructor] In this video,
     
     00:00:01.153 --> 00:00:03.270
     we're going to talk about information.
     
     00:00:03.270 --> 00:00:05.100
     And I know what you're thinking.
     
     00:00:05.100 --> 00:00:07.030
     You're thinking, I know what information is.
     
     00:00:07.030 --> 00:00:08.860
     I read the newspaper every day.
     
     00:00:08.860 --> 00:00:10.860
     I watch TV shows.
     
     00:00:10.860 --> 00:00:12.690
     I talk to my friends.
     
     00:00:12.690 --> 00:00:14.520
     I know what information is.
     
     00:00:14.520 --> 00:00:16.450
     But what we're going to
     talk about in this video
     
     00:00:16.450 --> 00:00:18.280
     is a very specific definition
     
     00:00:18.280 --> 00:00:20.150
     of what information is.
     
     00:00:20.150 --> 00:00:22.150
     And it's a very mathematical definition.
     
     00:00:22.150 --> 00:00:24.150
     And it's a very specific definition
    [.... content shortened for brevity ...]
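The generated track is standard WEBVTT, so it can be split into timed cues before being handed to a video generator. Below is a minimal parsing sketch in plain Python (no subtitle library assumed; the cue structure is illustrative):

import re

# Matches a WEBVTT timestamp line, e.g. "00:00:00.320 --> 00:00:01.153".
TIMESTAMP = re.compile(
    r"(\d{2}:\d{2}:\d{2}\.\d{3}) --> (\d{2}:\d{2}:\d{2}\.\d{3})"
)

def parse_vtt(vtt_text):
    """Split a WEBVTT string into cues with start, end, and caption text."""
    cues = []
    for block in vtt_text.strip().split("\n\n"):
        lines = block.strip().splitlines()
        if not lines:
            continue
        match = TIMESTAMP.match(lines[0])
        if match:  # header blocks ("WEBVTT", "Kind:", "Language:") are skipped
            cues.append({
                "start": match.group(1),
                "end": match.group(2),
                "text": " ".join(lines[1:]),
            })
    return cues

Each cue then carries its start and end timestamps plus the caption text, which maps directly onto subtitle-driven video pipelines.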
    

Model Details

Model size: 7.24B params
Tensor type: FP16 (Safetensors)
Base model: mistralai/Mistral-7B-v0.1

How to Get Started with the Model

Install the necessary packages

Requires: transformers > 4.35.0

pip install "transformers>4.35.0"
pip install accelerate

You can then try the following example code:

from transformers import AutoModelForCausalLM, AutoTokenizer
import transformers
import torch

model_id = "ibleducation/ibl-neural-edu-content-7B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    torch_dtype=torch.float16,  # the released weights are FP16
)
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
)
prompt = "<s>[INST]Information Theory[/INST] "

# Subtitle tracks are long, so allow generous room for new tokens.
response = pipeline(prompt, max_new_tokens=2048)
# The pipeline returns a list with one dict per generated sequence.
print(response[0]["generated_text"])
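By default the text-generation pipeline returns the prompt together with the completion, so the WEBVTT body can be recovered by splitting on the closing [/INST] tag before saving it for a video generator (the filename below is illustrative):

# Drop the echoed prompt; everything after [/INST] is the subtitle track.
vtt_text = response[0]["generated_text"].split("[/INST]", 1)[-1].strip()
with open("information_theory.vtt", "w", encoding="utf-8") as f:
    f.write(vtt_text)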

Important - Use the prompt template below:

<s>[INST]{prompt}[/INST] 
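
For programmatic use, the same template can be applied with a small helper (a hypothetical convenience function, not part of the released code):

def build_prompt(topic):
    # Wrap the topic in the Mistral-style [INST] template the model
    # was fine-tuned with; note the trailing space after [/INST].
    return f"<s>[INST]{topic}[/INST] "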