---
language:
- en
library_name: transformers
license: cc-by-4.0
tags:
- kl3m
- kl3m-002
- patent
- all the patents
- slm
date: '2024-03-12T00:00:00.000Z'
pipeline_tag: text-generation
widget:
- text: "# Title\n"
- temperature: 0.3
- do_sample: True
---
# All the Patents 170m Model
`kl3m-002-170m-patent` is a (very) small language model (SLM) fine-tuned from `kl3m-002-170m` to
generate "realistic" patent text. For more information about the base model,
please see [its model page](https://huggingface.co/alea-institute/kl3m-002-170m).
# All the Patents
## Why?
#### If a GPT2-sized model can generate a valid set of claims, should anyone be able to monopolize the invention?
At their heart, patents are a temporary, sanctioned monopoly on an invention through a license to sue. This monopoly
is justified by the public good created by encouraging innovation and the long-term impact of that innovation being
shared in the public domain.
Unfortunately, this worthy policy goal has been lost in the chaos and misuse of the patent system.
One of the most common sources of frustration is the granting of "obvious" patents. While some inventions are clearly novel
and non-obvious, many are not - but still slip through the examination process. These obvious but granted patents then
loom large over the market, creating a "thicket" that discourages use or subsequent invention in the area of the granted
patent. "Undoing" the grant of a patent is a costly and time-consuming process with possible negative consequences, and
so many of these patents simply sit as prior art on the books, even if the patent holder knows they could never enforce them.
Congress and various stakeholders have discussed and proposed changes over time, including most recently the
America Invents Act (AIA), but the problem of obvious patents persists.
But what if someone were to generate all the obvious inventions and make them public?
What if we shared the means of producing these obvious inventions so that everyone could help generate them on a normal CPU or consumer GPU?
And what if we could then make those obvious inventions easily searchable for anyone, including PTO examiners themselves, to use?
## How it Works
We start with a small, GPT2-sized language model - [kl3m-170](https://273ventures.com/kl3m-the-first-legal-large-language-model/) - which was trained on a clean, copyright-free dataset.
This helps ensure that generations do not include copyrighted text, which would otherwise allow third parties to interfere with the project
via DMCA takedown requests.
Next, we fine-tune this model on two simultaneous tasks:
1. **Top-down drafting**: We start from the most abstract parts of the patent - the title and abstract - and then generate the detailed claims. This is a traditional next-token prediction order.
```text
# Patent
## Title
{title}
## Abstract
{abstract}
## Claims
1. {claim 1}
2. {claim 2}
...
```
2. **Bottom-up drafting**: We start from the most detailed part of the patent - the claims - and then generate the abstract and title. This reversed order resembles traditional extractive/abstractive summarization tasks.
```text
# Patent
## Claims
1. {claim 1}
2. {claim 2}
...
## Abstract
{abstract}
## Title
{title}
```
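As a minimal sketch of the two layouts above, the following helpers build one training string per ordering from a title, an abstract, and a list of claims. The function names are hypothetical, and the blank-line placement between sections is an assumption inferred from the prompts and sample outputs in this card; the actual training pipeline is not published here.

```python
from typing import List


def format_claims(claims: List[str]) -> str:
    """Number the claims as '1. ...', '2. ...', one per line."""
    return "\n".join(f"{i}. {claim}" for i, claim in enumerate(claims, start=1))


def top_down_example(title: str, abstract: str, claims: List[str]) -> str:
    """Title -> abstract -> claims (hypothetical helper, assumed spacing)."""
    return (
        "# Patent\n\n"
        f"## Title\n{title}\n\n"
        f"## Abstract\n{abstract}\n\n"
        f"## Claims\n{format_claims(claims)}\n"
    )


def bottom_up_example(title: str, abstract: str, claims: List[str]) -> str:
    """Claims -> abstract -> title (hypothetical helper, assumed spacing)."""
    return (
        "# Patent\n\n"
        f"## Claims\n{format_claims(claims)}\n\n"
        f"## Abstract\n{abstract}\n\n"
        f"## Title\n{title}\n"
    )


if __name__ == "__main__":
    claims = ["A widget comprising a frame.", "The widget of claim 1, wherein the frame is steel."]
    print(top_down_example("Widget", "A widget with a frame.", claims))
    print(bottom_up_example("Widget", "A widget with a frame.", claims))
```

Emitting both orderings for the same patent is one plausible way to train on the two tasks simultaneously.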
Once this fine-tuning is complete, we can then generate new patents using either technique by prompting the model as follows:
1. **Top-down prompt**: `"# Patent\n\n## Title"`
2. **Bottom-up prompt**: `"# Patent\n\n## Claims"`
It's critical that generation occurs with sufficient randomness and diversity to ensure that the generated patents are not
simply reproductions of the training data. This is a key area of ongoing research and development.
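One simple way to check for reproduction, offered here only as an illustrative sketch and not as the project's actual evaluation method, is to measure what fraction of a generated text's word n-grams also appear in a reference corpus. The function names, the whitespace tokenization, and the default `n=8` are all assumptions.

```python
from typing import Iterable, Set, Tuple


def ngrams(text: str, n: int = 8) -> Set[Tuple[str, ...]]:
    """Return the set of lowercase, whitespace-tokenized word n-grams in text."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def overlap_ratio(generated: str, corpus: Iterable[str], n: int = 8) -> float:
    """Fraction of the generated text's n-grams that also occur in the corpus.

    Values near 1.0 suggest near-verbatim reproduction; values near 0.0
    suggest novel wording. Returns 0.0 if the text is shorter than n tokens.
    """
    generated_ngrams = ngrams(generated, n)
    if not generated_ngrams:
        return 0.0
    corpus_ngrams: Set[Tuple[str, ...]] = set()
    for document in corpus:
        corpus_ngrams |= ngrams(document, n)
    return len(generated_ngrams & corpus_ngrams) / len(generated_ngrams)


if __name__ == "__main__":
    training_docs = ["an electronic device includes a display panel and a housing"]
    sample = "an electronic device includes a display panel and a battery"
    print(overlap_ratio(sample, training_docs, n=4))
```

For a corpus of real size, the in-memory set would need to be replaced by something more compact, such as hashed n-grams or a Bloom filter.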
**Much like the real process of invention, most of the "ideas" generated by this process will be either nonsense or
unpatentable otherwise. Our goal is to estimate the "hit rate" of the model and continue to improve the efficiency and
accessibility of the generation process so that the "cost per obvious invention" is as low as possible.**
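To make the two quantities concrete, here is a small sketch of the arithmetic, assuming "hit rate" means the fraction of generations judged to be plausible obvious inventions and "cost per obvious invention" means total generation cost divided by the number of hits. These definitions, the function names, and the figures in the usage example are illustrative assumptions, not measured results.

```python
def hit_rate(num_hits: int, num_generated: int) -> float:
    """Fraction of generated documents judged to be hits."""
    if num_generated <= 0:
        raise ValueError("num_generated must be positive")
    return num_hits / num_generated


def cost_per_hit(total_cost: float, num_hits: int) -> float:
    """Total generation cost divided by the number of hits.

    Equivalent to (cost per generation) / (hit rate). Returns infinity
    when there are no hits.
    """
    if num_hits == 0:
        return float("inf")
    return total_cost / num_hits


if __name__ == "__main__":
    # Hypothetical numbers for illustration only.
    generated, hits, cost = 1000, 20, 5.00
    print(f"hit rate: {hit_rate(hits, generated):.3f}")
    print(f"cost per hit: {cost_per_hit(cost, hits):.2f}")
```

Under this framing, the cost per hit falls either by raising the hit rate or by making each generation cheaper, which is why both model quality and inference efficiency matter.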
## Current Status
This project is still in its infancy. We're doing R&D to develop prototype tools to demonstrate the possibility and
cost of generating and sharing these obvious inventions. This R&D is currently focused on data collection,
data curation, model training, and model evaluation.
## Generation
You can generate your own examples as follows. For a "complete" patent, increase `max_new_tokens` to the largest value that fits in your available memory (RAM on CPU, VRAM on GPU).
```python
import json
from transformers import pipeline

# Load the model and tokenizer on CPU
p = pipeline('text-generation', 'alea-institute/kl3m-002-170m-patent', device='cpu')

# Example usage on CPU
text = "# Patent\n\n## Title"
print(
    json.dumps(
        [
            r.get("generated_text")
            for r in p(text, do_sample=True, temperature=0.5, num_return_sequences=3, max_new_tokens=32)
        ],
        indent=2
    )
)
```
```json
[
  "# Patent\n\n## Title\nMethod for manufacturing a temperature-controllable polyurethane composition and method",
  "# Patent\n\n## Title\nElectronic device\n\n## Abstract\nAn electronic device includes a display panel and a",
  "# Patent\n\n## Title\nMethods and devices for tissue repair using a neural network\n\n## Abstract"
]
```
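Because the output is plain text with `## ` section headings, it can be split back into its parts with a few lines of Python. The helper below is a minimal sketch with a hypothetical name; it assumes every section begins with a line starting with `## ` and that no claim or abstract line begins that way.

```python
from typing import Dict, List


def split_sections(generated_text: str) -> Dict[str, str]:
    """Map lowercase section names ('title', 'abstract', 'claims') to their text.

    Lines before the first '## ' heading (such as '# Patent') are ignored.
    A heading with no body yet, e.g. from a truncated generation, maps to ''.
    """
    sections: Dict[str, List[str]] = {}
    current = None
    for line in generated_text.splitlines():
        if line.startswith("## "):
            current = line[3:].strip().lower()
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return {name: "\n".join(lines).strip() for name, lines in sections.items()}


if __name__ == "__main__":
    sample = (
        "# Patent\n\n## Title\nElectronic device\n\n"
        "## Abstract\nAn electronic device includes a display panel and a"
    )
    print(split_sections(sample))
```

The same function works for bottom-up generations, since it does not depend on the order of the sections.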
### Related Material
* https://www.federalregister.gov/documents/2024/02/27/2024-03967/updated-guidance-for-making-a-proper-determination-of-obviousness
## License
This model was originally developed by 273 Ventures and has been donated to the ALEA Institute.
The model weights are released under the CC-BY 4.0 License.
## Contact
The KL3M model family is now maintained by the [ALEA Institute](https://aleainstitute.ai). For technical support, collaboration opportunities, or general inquiries:
- GitHub: https://github.com/alea-institute/kl3m-model-research
- Email: [email protected]
- Website: https://aleainstitute.ai
## Acknowledgments
Special thanks to 273 Ventures for developing and donating this model to the open-source community through the ALEA Institute.
## Citation
Tokenizer, dataset, and model publications are pending.
## Contact
For any questions, please contact [ALEA Institute](https://aleainstitute.ai) at [[email protected]](mailto:[email protected]) or
create an issue on this repository or [GitHub](https://github.com/alea-institute/kl3m-model-research).
![https://aleainstitute.ai](https://aleainstitute.ai/images/alea-logo-ascii-1x1.png)