wraps committed on
Commit bd6b7af · verified · 1 Parent(s): da69e6b

Update README.md

Files changed (1)
  1. README.md +48 -1
README.md CHANGED
@@ -3,4 +3,51 @@ license: apache-2.0
  datasets:
  - wraps/flux1_dev-small
  base_model: vikhyatk/moondream2
- ---
+ ---
+ # Moondream-Caption: Custom Small Vision Model based on Moondream2
+
+ Moondream-Caption is a custom small vision model based on [moondream2](https://huggingface.co/vikhyatk/moondream2) by vikhyatk. It has been fine-tuned on a specific dataset to enhance its image description capabilities.
+
+ ### Key Features:
+
+ - Based on the moondream2 architecture
+ - Fine-tuned for image caption generation
+ - Trained on a high-quality custom dataset
+
+ ## Dataset
+
+ The dataset used to train Moondream-Caption is designed specifically for image captioning. It has the following characteristics:
+
+ - Images generated with flux1_dev
+ - Highly accurate and verified descriptive captions
+ - Wide variety of visual content
+
+ ## Usage
+
+ You can use Moondream-Caption for image captioning through the Hugging Face Transformers library. Here's a quick example of how to generate a caption for an image:
+
+ ```python
+ from transformers import AutoTokenizer, AutoModelForCausalLM
+ from PIL import Image
+
+ # trust_remote_code=True lets Transformers run the custom moondream2 model code
+ moondream = AutoModelForCausalLM.from_pretrained(
+     "wraps/moondream-caption", trust_remote_code=True
+ )
+ tokenizer = AutoTokenizer.from_pretrained("wraps/moondream-caption")
+
+ image = Image.open("path/to/your/image.jpg")
+ # Encode the image once, then ask the model for a caption
+ enc_image = moondream.encode_image(image)
+ caption = moondream.answer_question(enc_image, "Write a long caption for this image", tokenizer)
+
+ print(caption)
+ ```
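
To caption more than one image, the single-image call above can be wrapped in a small loop. The following is a minimal sketch and not part of the model's repository: `caption_directory`, `IMAGE_SUFFIXES`, and the `caption_fn` parameter are hypothetical names introduced for illustration, and the helper assumes `caption_fn` takes a file path and returns a caption string.

```python
from pathlib import Path

# Hypothetical set of file extensions treated as images; adjust as needed.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".webp"}


def caption_directory(image_dir, caption_fn):
    """Return {file name: caption} for every image file in image_dir.

    caption_fn is any callable that maps an image path to a caption string,
    for example a wrapper around the model call shown in the Usage section.
    """
    captions = {}
    for path in sorted(Path(image_dir).iterdir()):
        # Skip subdirectories and files that do not look like images.
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            captions[path.name] = caption_fn(path).strip()
    return captions
```

With `moondream` and `tokenizer` loaded as in the Usage example, `caption_fn` could be something like `lambda p: moondream.answer_question(moondream.encode_image(Image.open(p)), "Write a long caption for this image", tokenizer)`. Keeping the model call behind a plain callable means the loop can be tried with a stub function before any weights are downloaded.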
+
+ ## Example
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/643fd05fdc984afcbbbb47d0/0o8Ev_eB69A-2uCqT3QV2.png)
+
+ **Output Caption**: A close-up portrait of a green alien with a large oval head, enormous black almond-shaped eyes, small nostrils, and a tiny mouth. The alien has a long, thin neck and is wearing a black t-shirt with white text that reads 'humans scare me'. The background shows a pale blue sky with soft, wispy clouds.
+
+ ## Limitations
+
+ While Moondream-Caption is designed to generate accurate and relevant image captions, it may perform poorly on images that differ significantly from the training dataset, including complex or abstract images. Please open an issue on the model's repository if you encounter any problems.