---
license: apache-2.0
datasets:
- gbharti/finance-alpaca
language:
- en
library_name: transformers
tags:
- finance
widget:
- text: >-
    user: Hypothetical, can taxes ever cause a net loss on otherwise-profitable stocks?

    bot:
  example_title: Hypothetical
- text: >-
    user: What are some signs that the stock market might crash?

    bot:
  example_title: Question 2
- text: >-
    user: Where should I be investing my money?

    bot:
  example_title: Question
- text: >-
    user: Is this headline positive or negative? Headline: Australian Tycoon
    Forrest Shuts Nickel Mines After Prices Crash.

    bot:
  example_title: Sentiment analysis
- text: >-
    user: Aluminum price per KG is 50$. Forecast max: +1$ min:+0.3$. What should
    be the current price of aluminum?

    bot:
  example_title: Forecast
---

# Fin-RWKV: Attention-Free Financial Expert (WIP)
Fin-RWKV is an attention-free model designed specifically for financial analysis and prediction. Developed as part of a MindsDB hackathon, it leverages the simplicity and efficiency of the RWKV architecture to process financial data and produce insights and forecasts at a fraction of the cost of attention-based models. Fin-RWKV is aimed at professionals and enthusiasts in the finance sector who want to integrate deep learning techniques into their financial analyses.

## Use Cases
- Sentiment analysis
- Forecast
- Product Pricing
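
All three use cases above are driven with the same plain-text `user:` / `bot:` prompt format used in the widget examples in the front matter. A minimal, illustrative sketch of one prompt per use case (the product-pricing prompt is a hypothetical example, not taken from this card):

```py
# Illustrative prompts only; the model is prompted with free-form
# "user: ...\nbot:" text rather than a fixed template.
prompts = {
    # Taken from the widget examples in the front matter
    "sentiment_analysis": (
        "user: Is this headline positive or negative? Headline: Australian Tycoon "
        "Forrest Shuts Nickel Mines After Prices Crash.\nbot:"
    ),
    "forecast": (
        "user: Aluminum price per KG is 50$. Forecast max: +1$ min:+0.3$. "
        "What should be the current price of aluminum?\nbot:"
    ),
    # Hypothetical product-pricing style question (not from the original card)
    "product_pricing": (
        "user: Our raw aluminum cost rose 5% this quarter. Should we raise the "
        "price of our aluminum widgets?\nbot:"
    ),
}
```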

## Features
- Attention-Free Architecture: Built on RWKV (Receptance Weighted Key Value), which avoids the quadratic cost of attention mechanisms while maintaining high performance.
- Lower Costs: 10x to 100x+ lower inference cost and 2x to 10x lower training cost.
- Tinyyyy: Lightweight enough to run on CPUs in real time, bypassing the GPU entirely, and able to run on your laptop today.
- Finance-Specific Training: Fine-tuned on the gbharti/finance-alpaca dataset, so the model is tuned for financial data analysis.
- Transformers Library Integration: Built on the popular `transformers` library for easy integration with existing ML pipelines and applications.

## How to use
```py
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and model from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("umuthopeyildirim/fin-rwkv-1b5")
model = AutoModelForCausalLM.from_pretrained("umuthopeyildirim/fin-rwkv-1b5")

prompt = "user: Is this headline positive or negative? Headline: Australian Tycoon Forrest Shuts Nickel Mines After Prices Crash\nbot:"

# Tokenize the input
input_ids = tokenizer.encode(prompt, return_tensors="pt")

# Generate a response (the reply follows the "bot:" marker in the prompt)
output = model.generate(input_ids, max_length=333, num_return_sequences=1)

# Decode the output
generated_text = tokenizer.decode(output[0], skip_special_tokens=True)

print(generated_text)
```
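
For interactive use, tokens can also be streamed as they are generated by running `generate` in a background thread with `transformers`' `TextIteratorStreamer`. A minimal sketch (this streaming variant is an illustration, not part of the original card):

```py
from threading import Thread
from transformers import AutoTokenizer, AutoModelForCausalLM, TextIteratorStreamer

tokenizer = AutoTokenizer.from_pretrained("umuthopeyildirim/fin-rwkv-1b5")
model = AutoModelForCausalLM.from_pretrained("umuthopeyildirim/fin-rwkv-1b5")

prompt = "user: What are some signs that the stock market might crash?\nbot:"
inputs = tokenizer(prompt, return_tensors="pt")

# Run generate() in a background thread and print tokens as they arrive
streamer = TextIteratorStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
thread = Thread(
    target=model.generate,
    kwargs=dict(**inputs, max_new_tokens=256, streamer=streamer),
)
thread.start()

for chunk in streamer:
    print(chunk, end="", flush=True)
thread.join()
```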

## Competing Against
| Name | Param Count | Training Cost | Inference Cost |
|---------------|-------------|---------------|----------------|
| Fin-RWKV | 1B5 | $3 | Free on HuggingFace 🤗 & Low-End CPU |
| [BloombergGPT](https://www.bloomberg.com/company/press/bloomberggpt-50-billion-parameter-llm-tuned-finance/) | 50 Billion | $1.3 million | Enterprise GPUs |
| [FinGPT](https://huggingface.co/FinGPT) | 7 Billion | $302.4 | Consumer GPUs |

| Architecture | Status | Compute Efficiency | Largest Model | Trained Tokens | Link |
|--------------|--------|--------------------|---------------|----------------|------|
| (Fin)RWKV | In Production | O(N) | 14B | 500B++ (The Pile+) | [Paper](https://arxiv.org/abs/2305.13048) |
| RetNet (Microsoft) | Research | O(N) | 6.7B | 100B (mixed) | [Paper](https://arxiv.org/abs/2307.08621) |
| State Space (Stanford) | Prototype | O(log N) | 355M | 15B (The Pile, subset) | [Paper](https://arxiv.org/abs/2302.10866) |
| Liquid (MIT) | Research | - | <1M | - | [Paper](https://arxiv.org/abs/2302.10866) |
| Transformer architecture (included for contrasting reference) | In Production | O(N^2) | 800B (est.) | 13T++ (est.) | - |

<img src="https://cdn-uploads.huggingface.co/production/uploads/631ea4247beada30465fa606/7vAOYsXH1vhTyh22o6jYB.png" width="500" alt="Inference computational cost vs. Number of tokens">

## Stats for nerds
### Training Config
- n_epoch: 100
- epoch_save_frequency: 10
- batch_size: 5
- ctx_len: 2000
- T_MAX: 384
- RWKV_FLOAT_MODE: fp16
- RWKV_DEEPSPEED: 0
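
Because the checkpoint was trained with `RWKV_FLOAT_MODE: fp16`, it can also be loaded in half precision for inference. A minimal sketch, assuming a CUDA-capable GPU (falls back to fp32 on CPU); this is not from the original card:

```py
import torch
from transformers import AutoModelForCausalLM

# Load in fp16 on GPU when available, otherwise fp32 on CPU
device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.float16 if device == "cuda" else torch.float32
model = AutoModelForCausalLM.from_pretrained(
    "umuthopeyildirim/fin-rwkv-1b5", torch_dtype=dtype
).to(device)
```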

### Loss
<img src="https://cdn-uploads.huggingface.co/production/uploads/631ea4247beada30465fa606/NvPKCBlbVhiVeeMpUAv2C.png" width="500" alt="Loss">

_Note: Needs more data and training; for testing purposes only. Not recommended for production-level deployment._

[Presentation](https://docs.google.com/presentation/d/1vNQ8Y5wwR0WXlO60fsXjkru5R9I0ZgykTmgag0B3Ato/edit?usp=sharing)