Update README.md
Browse files
README.md
CHANGED
@@ -149,27 +149,29 @@ datasets:
|
|
149 |
|
150 |
# How to use
|
151 |
|
152 |
-
|
153 |
-
|
154 |
-
|
155 |
-
|
156 |
-
|
157 |
-
|
158 |
-
|
159 |
-
|
160 |
-
|
161 |
-
|
162 |
-
|
163 |
-
|
164 |
-
|
165 |
-
|
166 |
-
|
167 |
-
|
168 |
-
|
169 |
-
|
170 |
-
|
171 |
-
|
172 |
-
|
|
|
|
|
173 |
|
174 |
# Uploaded model
|
175 |
|
|
|
# How to use

You can use the model with the following script:

```python
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="CineAI/Llama32-3B-CoT",
    max_seq_length=max_length,
    dtype=dtype,
    load_in_4bit=load_in_4bit,
)

FastLanguageModel.for_inference(model)

inputs = tokenizer.apply_chat_template(
    message,
    tokenize=True,
    add_generation_prompt=True,  # Must add for generation
    return_tensors="pt",
).to(device)

text_streamer = TextStreamer(tokenizer, skip_prompt=True)
_ = model.generate(
    input_ids=inputs,
    streamer=text_streamer,
    max_new_tokens=max_new_tokens,
    use_cache=True,
    temperature=temperature,
    min_p=min_p,
)
```

# Uploaded model