---
license: apache-2.0
language:
- en
base_model: openai-community/gpt2
pipeline_tag: text-generation
tags:
- art
---

## Model Card: GPT2_Shakespeare

### Model Description
This model is a fine-tuned version of the GPT-2 base model, trained on works by William Shakespeare to generate text in his tone and style. It is designed to produce coherent, contextually relevant text that mimics the distinctive style and phrasing of the source material.

### Model Details
- **Model Type:** GPT-2 ([Base](https://huggingface.co/openai-community/gpt2))
- **Training Dataset:** Works by William Shakespeare ([GitHub](https://gist.githubusercontent.com/blakesanie/dde3a2b7e698f52f389532b4b52bc254/raw/76fe1b5e9efcf0d2afdfd78b0bfaa737ad0a67d3/shakespeare.txt)); a sketch for fetching it follows this list
- **Intended Use Cases:**
  - Creative writing assistance
  - Educational purposes for studying literary styles
  - Text generation in the style of William Shakespeare
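
The dataset link above points to a single plain-text file. As an illustration only (this is not the card author's training script, and `requests` is an assumed dependency), one way to fetch it:

```python
import requests

# URL of the plain-text Shakespeare corpus linked above
DATASET_URL = (
    "https://gist.githubusercontent.com/blakesanie/"
    "dde3a2b7e698f52f389532b4b52bc254/raw/"
    "76fe1b5e9efcf0d2afdfd78b0bfaa737ad0a67d3/shakespeare.txt"
)

# Download the corpus; illustrative only, not the original training code
text = requests.get(DATASET_URL, timeout=30).text
print(f"Corpus size: {len(text):,} characters")
```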

### Usage
You can use this model to generate text in Python with the Hugging Face `transformers` library.

#### Installation
Ensure you have the `transformers` library installed, along with a backend such as PyTorch:

```bash
pip install transformers torch
```

#### Inference

```python
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Load the fine-tuned model and tokenizer
model_name = "sartajbhuvaji/gpt2_B_Shakespeare"
model = GPT2LMHeadModel.from_pretrained(model_name)
tokenizer = GPT2Tokenizer.from_pretrained(model_name)

# Prepare input text
input_text = "To be, or not to be, that is the question:"
input_ids = tokenizer.encode(input_text, return_tensors="pt")

# Generate text with sampling
output = model.generate(
    input_ids,
    max_length=200,                       # total length cap, prompt included
    num_return_sequences=1,
    no_repeat_ngram_size=2,               # block repeated 2-grams
    do_sample=True,
    top_k=50,                             # sample from the 50 most likely tokens
    top_p=0.95,                           # nucleus sampling threshold
    pad_token_id=tokenizer.eos_token_id,  # GPT-2 has no pad token; avoids a warning
)

# Decode the generated text
generated_text = tokenizer.decode(output[0], skip_special_tokens=True)
print(generated_text)
```
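
You can also use the high-level `pipeline` API. This is a minimal sketch; the sampling parameters mirror the example above and are illustrative, not values prescribed by this card:

```python
from transformers import pipeline

# Build a text-generation pipeline backed by the fine-tuned model
generator = pipeline("text-generation", model="sartajbhuvaji/gpt2_B_Shakespeare")

# Sample one continuation (parameters are illustrative)
result = generator(
    "To be, or not to be, that is the question:",
    max_length=200,
    do_sample=True,
    top_k=50,
    top_p=0.95,
)
print(result[0]["generated_text"])
```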

#### Limitations and Biases
This model has been trained on a specific dataset, and its responses will reflect the content and style of that dataset. The model may generate text that reflects the biases present in the original data. It is not suitable for generating factual information or for use cases requiring highly accurate and unbiased outputs.

#### Ethical Considerations
Use this model responsibly. Text generated by the model should not be used for misleading or harmful purposes. Note that the model may reflect historical biases inherent in the original text sources.

#### Acknowledgments
This model is based on the GPT-2 architecture by OpenAI and was fine-tuned using the Hugging Face `transformers` library.