hyunjongkimmath
committed on
Commit
·
5f072b7
1
Parent(s):
8c08a3b
Add model
- README.md +15 -124
- model.pkl +2 -2
- pyproject.toml +1 -1
README.md
CHANGED
@@ -1,141 +1,32 @@
|
|
1 |
---
|
2 |
-
license: gpl-2.0
|
3 |
tags:
|
4 |
- fastai
|
5 |
-
- multi-label-classification
|
6 |
-
- mathematics
|
7 |
-
- text-classification
|
8 |
-
- math
|
9 |
---
|
10 |
|
|
|
11 |
|
12 |
-
|
13 |
|
14 |
-
|
|
|
15 |
|
16 |
-
|
17 |
|
18 |
-
|
19 |
|
20 |
-
|
21 |
|
22 |
-
The model classifies whether a mathematical text is or contains the following common types of mathematical text: definition, notation, concept (i.e. theorems, propositions, corollaries, lemmas, etc.), proof, narrative (e.g. the text one encounters in the beginning of a chapter or section in a book or in between theorems), exercise, remark, example.
|
23 |
|
24 |
-
|
25 |
-
|
26 |
-
## Intended uses & limitations
|
27 |
-
|
28 |
-
This model is intended to take as input mathematical text that one might encounter in an undergraduate/graduate/research setting and output some tags concerning what kind of text the input is. The input is intended to be at most a few tens of thousands of characters long (roughly several pages of most undergraduate or graduate textbooks), but in practice, the author has experienced better results with shorter text.
|
29 |
-
|
30 |
-
This model was trained on a corpus mostly of algebra, algebraic geometry, arithmetic geometry, and number theory, which are the author's primary mathematical interests. The corpus also contains some text of topology and analysis.
|
31 |
-
|
32 |
-
This model was also trained on inputs that end with a line generally of the form `[^1]: <Last names of authors of the text>, <Some identification of the text>`. As such, the model may perform worse on text without such a line (or at least without the beginning `[^1]:`). The author initially constructed the training pipeline to include such lines in the training data in hopes that the text identification would aid training when the training data contained only a few hundred to a few thousand samples of text.
|
33 |
-
|
34 |
-
## How to use
|
35 |
-
|
36 |
-
The author pieced together the following code block based on his readings from [this blog post](https://huggingface.co/blog/fastai).
|
37 |
-
|
38 |
-
First load the model:
|
39 |
-
|
40 |
-
```python
|
41 |
-
from huggingface_hub import from_pretrained_fastai
|
42 |
-
repo_id = 'hyunjongkimmath/math_text_tag_categorization'
|
43 |
-
model = from_pretrained_fastai(repo_id)
|
44 |
-
```
|
45 |
-
|
46 |
-
### Example usage
|
47 |
-
|
48 |
-
The following are predictions that the model makes on small pieces of text that the author came up with on the spot.
|
49 |
-
|
50 |
-
```python
|
51 |
-
model.predict(r"""
|
52 |
-
EXERCISE 1. Prove that a ring homomorphism $k \to k'$ of fields is injective.
|
53 |
-
"""
|
54 |
-
)
|
55 |
-
```
|
56 |
-
|
57 |
-
```python
|
58 |
-
((#1) ['#_meta/exercise'],
|
59 |
-
tensor([False, False, False, False, False, False, False, False, True, False,
|
60 |
-
False, False, False, False]),
|
61 |
-
tensor([1.6156e-03, 2.4018e-04, 8.5640e-03, 1.0265e-01, 1.3623e-05, 1.1528e-06,
|
62 |
-
2.0139e-02, 4.6183e-04, 9.9342e-01, 5.3418e-03, 2.1265e-04, 1.0663e-03,
|
63 |
-
2.8052e-03, 1.5590e-02]))
|
64 |
-
```
|
65 |
-
|
66 |
-
```python
|
67 |
-
model.predict(r"""
|
68 |
-
A scheme is a locally ringed space $(X, \mathscr{O}_X)$ such that $X$ has a cover $X = \bigcup_{i \in I} U_i$ by open subsets for which $(U_i, \mathscr{O}_X|_{U_i}$ is an affine scheme for every $i \in I$.
|
69 |
-
"""
|
70 |
-
)
|
71 |
-
```
|
72 |
-
|
73 |
-
```python
|
74 |
-
((#1) ['#_meta/definition'],
|
75 |
-
tensor([False, False, False, False, False, False, True, False, False, False,
|
76 |
-
False, False, False, False]),
|
77 |
-
tensor([1.5086e-04, 2.4751e-03, 3.2685e-03, 1.0054e-02, 2.2898e-09, 2.7758e-08,
|
78 |
-
9.8818e-01, 3.4790e-06, 2.0679e-04, 2.6567e-04, 8.2683e-03, 2.9197e-04,
|
79 |
-
3.8374e-05, 2.4789e-04]))
|
80 |
-
```
|
81 |
-
|
82 |
-
```python
|
83 |
-
model.predict(r"""
|
84 |
-
Theorem. $\mathbb{C}$ is algebraically closed.
|
85 |
-
"""
|
86 |
-
)
|
87 |
-
```
|
88 |
-
|
89 |
-
```python
|
90 |
-
((#1) ['#_meta/concept'],
|
91 |
-
tensor([False, False, False, True, False, False, False, False, False, False,
|
92 |
-
False, False, False, False]),
|
93 |
-
tensor([3.7847e-03, 2.3521e-03, 1.8541e-02, 9.3016e-01, 7.3878e-06, 1.2939e-04,
|
94 |
-
7.3363e-02, 4.7909e-04, 2.8213e-05, 7.7005e-03, 1.4716e-02, 1.4401e-02,
|
95 |
-
1.2660e-04, 3.5716e-02]))
|
96 |
-
```
|
97 |
-
|
98 |
-
```python
|
99 |
-
model.predict(r"""
|
100 |
-
Theorem. $\mathbb{C}$ is algebraically closed.
|
101 |
-
Proof. Do so algebraic topology stuff.
|
102 |
-
"""
|
103 |
-
)
|
104 |
-
```
|
105 |
-
|
106 |
-
```python
|
107 |
-
((#2) ['#_meta/concept','#_meta/proof'],
|
108 |
-
tensor([False, False, False, True, False, False, False, False, False, False,
|
109 |
-
False, True, False, False]),
|
110 |
-
tensor([4.8982e-03, 2.5899e-03, 4.1548e-02, 9.9772e-01, 1.4636e-05, 4.6300e-04,
|
111 |
-
6.7154e-02, 7.7806e-04, 4.3370e-05, 3.9953e-03, 4.8603e-03, 8.3507e-01,
|
112 |
-
1.3487e-03, 1.0530e-02]))
|
113 |
-
```
|
114 |
-
|
115 |
-
```python
|
116 |
-
model.predict(r"""
|
117 |
-
We write "\alpha+1" to denote the successor ordinal of "\alpha".
|
118 |
-
""")
|
119 |
-
```
|
120 |
-
|
121 |
-
Notice that the model correctly identifies this text as containing a notation, but incorrectly identifies it as containing a concept.
|
122 |
-
|
123 |
-
```python
|
124 |
-
((#2) ['#_meta/concept','#_meta/notation'],
|
125 |
-
tensor([False, False, False, True, False, False, False, False, False, False,
|
126 |
-
True, False, False, False]),
|
127 |
-
tensor([3.1503e-02, 9.0384e-05, 3.4131e-02, 6.2074e-01, 3.1992e-03, 1.8337e-04,
|
128 |
-
6.9371e-02, 1.1152e-03, 1.4070e-02, 6.4971e-02, 6.8179e-01, 2.8263e-02,
|
129 |
-
7.2798e-04, 8.7313e-02]))
|
130 |
-
```
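The boolean tensor in each output above appears to be the probability tensor thresholded at 0.5, and the label list names the positions that pass. The following is a minimal sketch of that decoding in plain Python, not the model's own code. The 0.5 threshold is an assumption, and the partial index-to-label map `KNOWN_LABELS` is only inferred from the five example outputs above; the other nine indices are not documented here.

```python
# Probabilities copied from the last prediction above (the successor ordinal text).
probs = [3.1503e-02, 9.0384e-05, 3.4131e-02, 6.2074e-01, 3.1992e-03, 1.8337e-04,
         6.9371e-02, 1.1152e-03, 1.4070e-02, 6.4971e-02, 6.8179e-01, 2.8263e-02,
         7.2798e-04, 8.7313e-02]

# Assumption: a label counts as predicted when its probability exceeds 0.5.
THRESHOLD = 0.5

# Partial map inferred from the example outputs above (hypothetical helper,
# not part of the model); the remaining indices are unknown.
KNOWN_LABELS = {
    3: "#_meta/concept",
    6: "#_meta/definition",
    8: "#_meta/exercise",
    10: "#_meta/notation",
    11: "#_meta/proof",
}


def decode(probs, threshold=THRESHOLD):
    """Return the boolean mask and the labels whose probability passes the threshold."""
    mask = [p > threshold for p in probs]
    labels = [KNOWN_LABELS.get(i, f"<unknown label {i}>")
              for i, hit in enumerate(mask) if hit]
    return mask, labels


mask, labels = decode(probs)
print(labels)  # ['#_meta/concept', '#_meta/notation']
```

In this one example, a stricter threshold such as `decode(probs, 0.65)` keeps only `#_meta/notation`, since the concept probability is about 0.62. Whether a stricter threshold helps in general is not something these examples can show.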
|
131 |
|
132 |
|
133 |
-
|
134 |
-
During training, the model achieved over 95% accuracy on its validation dataset, which was chosen randomly from its entire dataset, according to fastai's [accuracy_multi](https://docs.fast.ai/metrics.html) metric.
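To make the accuracy figure above concrete, here is a rough plain-Python sketch of how a thresholded multi-label accuracy of this kind is computed: each (sample, label) pair is scored separately and the results are averaged. It assumes the inputs are already probabilities and uses a 0.5 threshold. The function name `accuracy_multi_sketch` and the toy data are hypothetical and are not taken from the model's training run.

```python
def accuracy_multi_sketch(probs_batch, targets_batch, thresh=0.5):
    """Fraction of (sample, label) pairs where the thresholded prediction matches the target."""
    correct = 0
    total = 0
    for probs, targets in zip(probs_batch, targets_batch):
        for p, t in zip(probs, targets):
            correct += (p > thresh) == bool(t)
            total += 1
    return correct / total


# Hypothetical toy batch: 2 samples, 4 labels each.
probs_batch = [[0.9, 0.2, 0.6, 0.1],
               [0.3, 0.8, 0.4, 0.7]]
targets_batch = [[1, 0, 0, 0],
                 [0, 1, 0, 1]]

score = accuracy_multi_sketch(probs_batch, targets_batch)
print(score)  # 0.875
```

The first toy sample has one wrong label out of four, yet it still contributes three correct pairs. The score is therefore a per-label average, not the share of texts whose full tag set is right.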
|
135 |
|
136 |
-
##
|
137 |
-
|
138 |
|
139 |
-
|
|
|
140 |
|
141 |
-
|
|
|
|
1 |
---
|
|
|
2 |
tags:
|
3 |
- fastai
|
|
|
|
|
|
|
|
|
4 |
---
|
5 |
|
6 |
+
# Amazing!
|
7 |
|
8 |
+
🥳 Congratulations on hosting your fastai model on the Hugging Face Hub!
|
9 |
|
10 |
+
# Some next steps
|
11 |
+
1. Fill out this model card with more information (see the template below and the [documentation here](https://huggingface.co/docs/hub/model-repos))!
|
12 |
|
13 |
+
2. Create a demo in Gradio or Streamlit using 🤗 Spaces ([documentation here](https://huggingface.co/docs/hub/spaces)).
|
14 |
|
15 |
+
3. Join the fastai community on the [Fastai Discord](https://discord.com/invite/YKrxeNn)!
|
16 |
|
17 |
+
Greetings fellow fastlearner 🤝! Don't forget to delete this content from your model card.
|
18 |
|
|
|
19 |
|
20 |
+
---
|
|
|
21 |
|
22 |
|
23 |
+
# Model card
|
|
|
24 |
|
25 |
+
## Model description
|
26 |
+
More information needed
|
27 |
|
28 |
+
## Intended uses & limitations
|
29 |
+
More information needed
|
30 |
|
31 |
+
## Training and evaluation data
|
32 |
+
More information needed
|
model.pkl
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
-
size
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:3d7917f8f3625f0023d0ecbe095fcb52ab354e7430bc262fece0d6cb877743ca
|
3 |
+
size 166126233
|
pyproject.toml
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
[build-system]
|
2 |
-
requires = ["setuptools>=40.8.0", "wheel", "python=3.10.6", "fastai=2.7.
|
3 |
build-backend = "setuptools.build_meta:__legacy__"
|
|
|
1 |
[build-system]
|
2 |
+
requires = ["setuptools>=40.8.0", "wheel", "python=3.10.6", "fastai=2.7.10", "fastcore=1.5.27"]
|
3 |
build-backend = "setuptools.build_meta:__legacy__"
|