m7n committed · Commit d3f548e · verified · 1 Parent(s): 132d5b7

Add new SentenceTransformer model

1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "word_embedding_dimension": 1024,
+   "pooling_mode_cls_token": false,
+   "pooling_mode_mean_tokens": true,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false,
+   "pooling_mode_weightedmean_tokens": false,
+   "pooling_mode_lasttoken": false,
+   "include_prompt": true
+ }
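
The pooling settings above enable only `pooling_mode_mean_tokens`, so a sentence embedding is the average of the token embeddings that the attention mask marks as real tokens. The following is a minimal pure-Python sketch of that idea, not the library's implementation. The helper name `mean_pool` and the toy 2-dimensional vectors are assumptions made for illustration; the model itself uses 1024 dimensions.

```python
from typing import List


def mean_pool(token_embeddings: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average the token vectors whose mask entry is 1 (hypothetical helper)."""
    if len(token_embeddings) != len(attention_mask):
        raise ValueError("one mask entry is required per token")
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                summed[i] += value
    # Guard against an all-padding input to avoid dividing by zero.
    count = max(count, 1)
    return [value / count for value in summed]


# Toy example: three tokens, the last one is padding and is ignored.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)
```

Because padding positions are masked out, the padded token in the toy input does not shift the average.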
README.md ADDED
@@ -0,0 +1,744 @@
+ ---
+ tags:
+ - sentence-transformers
+ - sentence-similarity
+ - feature-extraction
+ - generated_from_trainer
+ - dataset_size:7828
+ - loss:TripletLoss
+ base_model: answerdotai/ModernBERT-large
+ widget:
+ - source_sentence: Hypersensitivity to Chironomidae (non-biting midges) has been a
+     problem in Northern Sudan since about and is probably due to the working of dams
+     which have produced lake-like conditions on parts of the Blue and main Niles where
+     breeding has evidently increased. Studies were undertaken to determine whether
+     this hypersensitivity is mediated by IgE. Sixteen Sudanese, with bronchial asthma
+     associated with exposure to the chironomid, Cladotanytarsus lewisi ('green nimitti'),
+     were investigated. All patients gave a positive immediate-type skin reaction to
+     an extract of the midge and the majority had markedly elevated concentrations
+     of circulating IgE. Serum from all patients passively sensitized human lung fragments
+     in vitro for the release of histamine and slow-reacting substance of anaphylaxis
+     by the 'nimitti' antigen. This tissue-sensitizing activity could be removed by
+     immunoabsorption with an anti-IgE. These results indicate that this widespread
+     and important hypersensitivity in the Sudan is IgE-mediated and thus may potentially
+     be treated by desensitization.
+   sentences:
+   - 'Fifty-eight poultry farmers were questioned for possible respiratory reactions
+     suggesting a hypersensitivity reaction to chicken antigens. In addition, their
+     sera were tested for the presence of antibodies against chicken antigens by several
+     immunologic techniques. Of the poultry farmers examined, % had hemagglutinating
+     antibodies in titers over : against chicken antigens in their sera while none
+     of ten controls had antibodies at a level of : . Furthermore, % of the poultry
+     farmers had precipitins against chicken antigens in their sera, but no precipitating
+     antibodies could be demonstrated in the sera from ten controls. No poultry farmer,
+     however, had symptoms characteristic of a hypersensitivity pneumonitis following
+     contact with chickens.'
+   - 'Variations in the time course of flowering and in its intensity were recorded
+     for regions in Israel along with variable climatic conditions. Such variations
+     had an effect on pollen release into the air and on pollen counts. Two approaches
+     for forecasting allergenic pollen dispersion were compared: ( ) a survey of floral
+     development and flowering intensity and ( ) a survey of airborne pollen. There
+     was a positive correlation between the survey of flowering and airborne pollen
+     counts, although pollen was occasionally found after the flowering season. Further,
+     it predicted fairly accurately the beginning of the flowering time of each species
+     in each regio, the expected flowering duration, and intensity. Conducting a survey
+     of flowering is a simple, easy and inexpensive operation. It is highly recommended
+     for regions with variable climates.'
+   - Deficiency of surfactant in alveoli leads to increased resistance to breathing.
+     Histamine is a mediator in allergic respiratory diseases. Though the bronchoconstrictor
+     effect of histamine is well recognised, histamine may have additional actions
+     that contribute to pathogenesis in these diseases. The present study aimed to
+     observe the effect of histamine on lecithin, a major component of alveolar surfactant.
+     Lecithin content in broncho-alveolar lavage (BAL) fluid of healthy adult male
+     rats was estimated by enzymatic method using Boehringer-Mannheim kits. Lecithin
+     content in these control animals was compared with that in three groups of healthy
+     adult male rats following subcutaneous administration of mg of histamine diphosphate
+     at minutes, minutes and minutes intervals, respectively. A significant reduction
+     in lecithin levels in BAL fluid was observed up to one hour after administration
+     of histamine. The results indicate a possible additional action of histamine in
+     the pathogenesis of allergic respiratory diseases.
+ - source_sentence: Educational training of a select group of construction workers
+     on a nuclear power plant demonstrates the value of both training and experience.
+     Training classes were given to three groups of workers involved in the installation
+     of surfacemounted plates. These workers were then given an exam that covered the
+     topics presented in the training sessions. Each of the three groups differed in
+     some respect, notably the time delay between the training sessions and the examinations
+     and the amount of experience the workers had in installing the surfacemounted
+     plates. Scores are highest for workers who took the examination shortly after
+     the training session was given. Among the experienced workers, the workers with
+     more work experience have higher scores. It is noted that experienced workers
+     differed from the inexperienced workers in the answers given for different types
+     of questions. Furthermore, the experienced workers perform better on those exam
+     questions that related to issues encountered in the workplace on a regular basis.
+     Results also show that workers understand their work better when they are regularly
+     informed about the reasons for work items being rejected.
+   sentences:
+   - The Basal Triassic Sandstone is the basal member of the Lower Triassic Kockatea
+     Shale. It is widely distributed adjacent to, and on the Precambrian Greenough
+     Block in the northern Perth Basin, Western Australia. This member is lowermost
+     Lower Triassic in age in the subsurface of the Don-gara gas field. In outcrop
+     on the Greenough Block, the member is represented by a thin basal conglomerate,
+     conformably overlain by upper Lower Triassic Kockatea Shale. In this stratigraphic
+     study, detailed well-to-well correlations and lithologic studies were integrated
+     to reconstruct the depositional history of the interval encompassing the Basal
+     Triassic Sandstone. This sandstone is a composite of near-shore marine, and strand
+     line accumulations deposited around the flanks and on the Greenough Block during
+     a Lower Triassic marine transgression. The sandstone bodies were deposited on
+     a drowned, topography of low relief, on progressively truncated Permian formations
+     and Precambrian basement. The topography was formed following uplift and tilting
+     of the Greenough Block and the overlying Lower Permian formations during mild
+     Upper Permian tectonism.
+   - Few people would contest the statement that 'training is an investment'. It is
+     an investment in the future, one which it is essential for industry to undertake
+     in order to meet the continually growing demand for skilled personnel to handle
+     the ever increasing pace of change. Rightly, then, training as an investment is
+     emphasised on basic training officer courses, it is the subject of many a Government
+     exhortation and prominent businessmen go out of their way to emphasise their belief
+     in it. It is widely publicised by the Industrial Training Boards. The slogan 'train
+     and gain' was not coined by an Industrial Training Board simply to advertise the
+     fact that they had been throwing money around as though it were going out of fashion,
+     it was also intended to emphasise the long term gains to be made by investing
+     in training. Looking even further back to the discussions which preceded the passing
+     of the Industrial Training Act, one of the objectives much stressed when the levy
+     and grant system was established was to spread the costs of this investment evenly
+     across industry. The firms who undertook training were to be subsidised by those
+     who benefited from it but did not undertake it. Talking and listening to training
+     practitioners it seems axiomatic that training is an investment and equally axiomatic
+     that, due to the inexplicable failure of management to recognise this, it is one
+     which is in danger of being seriously neglected. To doubt what is so evidently
+     the case is something which, in the training field, amounts to nothing less than
+     heresy.
+   - Out-of-sequence (OOS) construction is regarded as one of the most significant
+     factors contributing to construction inefficiencies and loss of labor productivity.
+     Nevertheless, no prior research efforts have attempted to study the impacts of
+     OOS work on productivity, cost, or even schedule. The goal of this paper is to
+     identify and study the causes and early warning signs of OOS work. Using an expert-based
+     analytic approach, the authors formed an industry panel of construction professionals
+     to gather the basic information related to OOS work and consequently surveyed
+     other industry respondents. The panel as well as the survey enabled the authors
+     to ( ) recognize the extent of OOS work, ( ) identify causes of OOS work and quantify
+     their characteristics, ( ) identify early warning signs of OOS work and investigate
+     their relationship with the occurrence of OOS events, ( ) classify the causes
+     and early warning signs of OOS into different categories, and ( ) investigate
+     the impacts of OOS work on project performance in terms of productivity, schedule,
+     cost, quality, and safety. Among many important and interesting results, it was
+     clear that late design deliverables represented the most important factor causing
+     OOS work, and late start of precommissioning activities was the most highly ranked
+     early warning sign for OOS work. This research adds to the body of knowledge by
+     providing unprecedented knowledge on why OOS work occurs and by providing a suite
+     of practices that contribute most to OOS work in construction projects. This paper
+     fills a significant gap in the literature and contributes in assisting project
+     participants in better understanding the overall industry perception on the different
+     causes, early warning signs, and impacts of OOS work.
+ - source_sentence: 'Asian-Pacific Economic LiteratureVolume , Issue p. - Book Review
+     Mineral Economics and Policy , J E Tilton and J I Guzman RFF Press, New York City,
+     Pp. . ISBN: - - - - Marian Radetzki, Marian Radetzki Lulea University of TechnologySearch
+     for more papers by this author Marian Radetzki, Marian Radetzki Lulea University
+     of TechnologySearch for more papers by this author First published: November the
+     full textAboutPDF ToolsRequest permissionExport citationAdd to favoritesTrack
+     citation ShareShare Give accessShare full text accessShare full-text accessPlease
+     review our Terms and Conditions of Use and check box below to share full-text
+     version of article.I have read and accept the Wiley Online Library Terms and Conditions
+     of UseShareable LinkUse the link below to share a full-text version of this article
+     with your friends and colleagues. Learn more.Copy URL Share a linkShare onFacebookTwitterLinkedInRedditWechat
+     No abstract is available for this article. Volume00, Issue0November 0000Pages
+     - RelatedInformation'
+   sentences:
+   - 'The urgent need for very large increases in investment in infrastructure in most
+     developing countries in Asia is very clear. This paper surveys the challenges
+     facing policymakers in the region. Nearly all of the main concerns for policymakers
+     in Asia in addressing the global infrastructure imbalance are on the supply side.
+     In particular, there are seven related supplyside issues that are of high priority
+     for policymakers: selection and preparation of appropriate projects, finance,
+     pricing, access, governance and management, policy and regulatory policies, and
+     climate change. Governments and utilities need to improve their policies and performance
+     to build confidence among stakeholders. Access to infrastructure services needs
+     to be improved so that consumers will support realistic pricing policies, and
+     investors will be encouraged to provide finance for infrastructure sectors.'
+   - 'ADVERTISEMENT RETURN TO ISSUEPREVArticleThe Electromotive Determination of Small
+     Amounts of Ferric IronJ. F. KingJ. F. KingMore by J. F. King and R. N. WashburneR.
+     N. WashburneMore by R. N. WashburneCite this: J. Phys. Chem. , , , 00000000Publication
+     Date (Print):December , 0000Publication History Published online0 May 0000Published
+     inissue December 0000https://doi.org/ /j000000a000Request reuse permissions Article
+     Views00Altmetric-Citations0LEARN ABOUT THESE METRICSArticle Views are the COUNTER-compliant
+     sum of full text article downloads since November (both PDF and HTML) across all
+     institutions and individuals. These metrics are regularly updated to reflect usage
+     leading up to the last few days.Citations are the number of other articles citing
+     this article, calculated by Crossref and updated daily. Find more information
+     about Crossref citation counts.The Altmetric Attention Score is a quantitative
+     measure of the attention that a research article has received online. Clicking
+     on the donut icon will load a page at altmetric.com with additional details about
+     the score and the social media presence for the given article. Find more information
+     on the Altmetric Attention Score and how the score is calculated. Share Add toView
+     InAdd Full Text with ReferenceAdd Description ExportRISCitationCitation and abstractCitation
+     and referencesMore Options Share onFacebookTwitterWechatLinked InReddit PDF (
+     KB) Get e-Alerts'
+   - 'Research Article| June Malayan Iron Ore for Japan Virginia Thompson Virginia
+     Thompson Search for other works by this author on: This Site PubMed Google Scholar
+     Far Eastern Survey ( ) ( ): . Views Icon Views Article contents Figures & tables
+     Video Audio Supplementary Data Peer Review Share Icon Share Facebook Twitter LinkedIn
+     Email Tools Icon Tools Get Permissions Cite Icon Cite Search Site Citation Virginia
+     Thompson; Malayan Iron Ore for Japan. Far Eastern Survey June ; ( ): . doi: Download
+     citation file: Ris (Zotero) Reference Manager EasyBib Bookends Mendeley Papers
+     EndNote RefWorks BibTex toolbar search Search Dropdown Menu toolbar search search
+     input Search input auto suggest filter your search All ContentAsian Survey Search
+     This content is only available via PDF. Article PDF first page preview Close Modal
+     You do not currently have access to this content.'
+ - source_sentence: Elevated circulating endotoxin levels in the plasma of patients
+     with advanced hepatosplenic schistosomiasis caused by Schistosoma mansoni have
+     been reported, possibly caused by parasite egg-induced intestinal mucosal breaches
+     facilitating bacterial access to the bloodstream. Neither endotoxin levels in
+     people with S. mansoni but without hepatosplenic disease nor the impact of treatment
+     on endotoxin levels have been described. We used a methodically optimized Limulus
+     amebocyte lysate assay to measure plasma endotoxin in community-dwelling women
+     from an S. mansoni-endemic area without clinical hepatosplenic disease. We found
+     no difference in baseline mean plasma endotoxin levels between those with (n =
+     ) and without (n = ) infection ( versus EU/mL, P = ). Endotoxin levels did not
+     change in schistosome-infected women after successful treatment ( versus EU/mL,
+     P = ) and were not correlated with circulating anodic antigen or stool egg burden.
+     Our findings do not support the hypothesis that translocating eggs in S. mansoni
+     infection introduce bacterial sources of endotoxin to the circulation.
+   sentences:
+   - In January , a survey of intestinal parasitic infections was performed in a primary
+     school, central Thailand. Of stool samples, Blastocystis was identified with a
+     prevalence of %. Genetic characterization of Blastocystis showed subtype ( %)
+     and subtype ( %). Study of the water supply in this school was performed to find
+     the possible sources of Blastocystis. Blastocystis from one water sample was identified
+     as subtype , which had a nucleotide sequence of small subunit (SSU) ribosomal
+     RNA (rRNA) gene that was % identical to that of Blastocystis infected in schoolchildren.
+     Our information supports the evidence of water-borne transmission in this population.
+   - 'Background: The acute phase proteins (APPs) are plasma proteins whose concentration
+     rise or reduce in reaction to infection, inflammation or trauma (Baumann and Gauldie,
+     ). The circulating concentration of these proteins are related to the severity
+     of the underlying condition,thus quantification of their concentrations provide
+     a ready means of giving valuable clinical information and extent of the disease
+     processes (Thompson et. al., ). Materials and Methods: Serum levels of three acute
+     phase proteins (transferrin, a0-macroglobulin and haptoglobin) were determined
+     using single radial immuno-diffusion technique in one hundred and eight Nigerian
+     women aged between and years. Theywere made up of thirty pregnant women with urinary
+     schistosomiasis (P+USS), thirty-six pregnant women without USS (P-USS), eighteen
+     non-pregnant women with USS (NP+USS), and twenty-four healthy non-pregnant women
+     without USS (NP-USS) as controls.Results: The result shows that transferrin was
+     least in P-USS group and highest in NP+USS. The highest mean value of alpha- macroglobulin
+     was found in P+USS group and the least in NP+USS. Haptoglobin was significantly
+     reduced in P+USS compared with other groups.Conclusions: The finding of this study
+     suggests an independent effect of USS and pregnancy on serum levels of APPs, therefore
+     APPs could be used to distinguish P+USS from P-USS'
+   - Aging is associated with a loss of skeletal muscle mass and function that negatively
+     impacts the independence and quality of life of older individuals. Females demonstrate
+     a distinct pattern of muscle aging compared to males, potentially due to menopause,
+     when the production of endogenous sex hormones declines. This systematic review
+     aims to investigate the current knowledge about the role of estrogen in female
+     skeletal muscle aging. A systematic search of MEDLINE Complete, Global Health,
+     Embase, PubMed, SPORTDiscus, and CINHAL was conducted. Studies were considered
+     eligible if they compared a state of estrogen deficiency (e.g. postmenopausal
+     females) or supplementation (e.g. estrogen therapy) to normal estrogen conditions
+     (e.g. premenopausal females or no supplementation). Outcome variables of interest
+     included measures of skeletal muscle mass, function, damage/repair, and energy
+     metabolism. Quality assessment was completed with the relevant Johanna Briggs
+     critical appraisal tool, and data were synthesized in a narrative manner. Thirty-two
+     studies were included in the review. Compared to premenopausal women, postmenopausal
+     women had reduced muscle mass and strength, but the effect of menopause on markers
+     of muscle damage and expression of the genes involved in metabolic signaling pathways
+     remains unclear. Some studies suggest a beneficial effect of estrogen therapy
+     on muscle size and strength, but evidence is largely conflicting and inconclusive,
+     potentially due to large variations in the reporting and status of exposure and
+     outcomes. The findings from this review point toward a potential negative effect
+     of estrogen deficiency on aging skeletal muscle, but further mechanistic evidence
+     is needed to clarify its role.
+ - source_sentence: 'normal subjects, residents of the Ural region, were examined by
+     a dichromatic bone densitometer of "GE/Lunar" firm (USA). After that they were
+     divided according to their somatotype: normosthenics, hypersthenics and asthenics.
+     Age-related groups in girls were formed from the age of years, in youths from
+     years, up to years every other year, after years every years up to the age of
+     years. The somatotype has been revealed to influence the mineral density (MD)
+     of skeleton, the mass of muscular, connective and fatty tissues: MD in girls has
+     been formed at the age of years, in youths at that of years. In normosthenics
+     and asthenics MD at the same age was % and %, respectively. At the age of years
+     MD in women with hypersthenia was % less than peak bone mass, in those with normosthenia
+     it was % less and in women with asthenia % less. In men these measurements were
+     and %, respectively.'
+   sentences:
+   - The C control chart is mostly used to monitor the number of non-conformities per
+     inspection unit of constant size. It is known that the classic C-chart control
+     limits often experience a high false alarm rate and thus lead to the increase
+     of unnecessary costs of inspection. Among many works performed to improve C-charts,
+     this paper presents the optimised control limits approach to the C-chart. In addition,
+     different C-charts are evaluated through factors related to the design of the
+     control chart. Optimally selecting design parameters depends on several process
+     parameters from statistical and/or economic aspects in the literature. This study
+     presents multi-objective economic-statistical design of different C control charts
+     under single assignable cause. An algorithm using the data envelopment analysis
+     (DEA) is employed to solve the models. A numerical example is used to illustrate
+     the algorithm procedure and to evaluate the performances of different designs.
+   - Early adult changes in the facial profile were studied longitudinally from to
+     years of age in a Swedish Caucasian sample of female and male dental students.
+     Lateral cephalometric radiographs were analysed by the conventional point-based
+     method and by the structure-based method of superimposing serial films, adapted
+     for computerized numerical analysis. Skeletal and soft tissue changes were described
+     by linear and angular variables. The magnitude of linear dimensional changes was
+     similar in the two sexes. The largest changes were found in the vertical dimensions.
+     Total anterior facial height increased by about mm in the -year period, suggesting
+     that the major part of the increase in vertical facial dimensions during the third
+     decade of life takes place in the first half of this decade. Sagittal jaw relationship
+     increased by about in both sexes. Soft tissue changes reflected those of the vertical
+     skeletal dimensions.
+   - x-ray images of patients with posttraumatic defects of forearm bones have been
+     analyzed using DiaMorph computer-assisted complex. Mean optical density of regenerated
+     bone shadows has been evaluated for the purpose of studying the dynamics of osteogenesis
+     and mineralization of newly formed bone tissue during osteosynthesis. By planimetry
+     of distraction regenerated bones it was established that osteogenesis developed
+     by normoplastic type. Typical distraction regenerated bones were formed while
+     filling defect-diastases; the regenerated bones lost their zonal structure at
+     the end of fixation period. During formation of wedge-shaped regenerated bones
+     clear zonal structure of newly formed tissue was not traced, the area of interlayer
+     occupied significantly less part than it was in case of filling the defects of
+     forearm bones by fragment lengthening and formation of typical distraction regenerated
+     bone.
+ pipeline_tag: sentence-similarity
+ library_name: sentence-transformers
+ metrics:
+ - cosine_accuracy
+ model-index:
+ - name: SentenceTransformer based on answerdotai/ModernBERT-large
+   results:
+   - task:
+       type: triplet
+       name: Triplet
+     dataset:
+       name: modernBERT
+       type: modernBERT
+     metrics:
+     - type: cosine_accuracy
+       value: 0.6726342710997443
+       name: Cosine Accuracy
+   - task:
+       type: triplet
+       name: Triplet
+     dataset:
+       name: modernBERT disciplines
+       type: modernBERT_disciplines
+     metrics:
+     - type: cosine_accuracy
+       value: 0.6756066411238825
+       name: Cosine Accuracy
+ ---
+
+ # SentenceTransformer based on answerdotai/ModernBERT-large
+
+ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [answerdotai/ModernBERT-large](https://huggingface.co/answerdotai/ModernBERT-large). It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
+
+ ## Model Details
+
+ ### Model Description
+ - **Model Type:** Sentence Transformer
+ - **Base model:** [answerdotai/ModernBERT-large](https://huggingface.co/answerdotai/ModernBERT-large) <!-- at revision e829787a68677321312ff287fda2f8ef1a36e02a -->
+ - **Maximum Sequence Length:** 8192 tokens
+ - **Output Dimensionality:** 1024 dimensions
+ - **Similarity Function:** Cosine Similarity
+ <!-- - **Training Dataset:** Unknown -->
+ <!-- - **Language:** Unknown -->
+ <!-- - **License:** Unknown -->
+
+ ### Model Sources
+
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
+ - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
+
+ ### Full Model Architecture
+
+ ```
+ SentenceTransformer(
+   (0): Transformer({'max_seq_length': 8192, 'do_lower_case': False}) with Transformer model: ModernBertModel
+   (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+ )
+ ```
+
+ ## Usage
+
+ ### Direct Usage (Sentence Transformers)
+
+ First install the Sentence Transformers library:
+
+ ```bash
+ pip install -U sentence-transformers
+ ```
+
+ Then you can load this model and run inference.
+ ```python
+ from sentence_transformers import SentenceTransformer
+
+ # Download from the 🤗 Hub
+ model = SentenceTransformer("m7n/discipline-bert-modern-large_01")
+ # Run inference
+ sentences = [
+ 'normal subjects, residents of the Ural region, were examined by a dichromatic bone densitometer of "GE/Lunar" firm (USA). After that they were divided according to their somatotype: normosthenics, hypersthenics and asthenics. Age-related groups in girls were formed from the age of years, in youths from years, up to years every other year, after years every years up to the age of years. The somatotype has been revealed to influence the mineral density (MD) of skeleton, the mass of muscular, connective and fatty tissues: MD in girls has been formed at the age of years, in youths at that of years. In normosthenics and asthenics MD at the same age was % and %, respectively. At the age of years MD in women with hypersthenia was % less than peak bone mass, in those with normosthenia it was % less and in women with asthenia % less. In men these measurements were and %, respectively.',
+ 'x-ray images of patients with posttraumatic defects of forearm bones have been analyzed using DiaMorph computer-assisted complex. Mean optical density of regenerated bone shadows has been evaluated for the purpose of studying the dynamics of osteogenesis and mineralization of newly formed bone tissue during osteosynthesis. By planimetry of distraction regenerated bones it was established that osteogenesis developed by normoplastic type. Typical distraction regenerated bones were formed while filling defect-diastases; the regenerated bones lost their zonal structure at the end of fixation period. During formation of wedge-shaped regenerated bones clear zonal structure of newly formed tissue was not traced, the area of interlayer occupied significantly less part than it was in case of filling the defects of forearm bones by fragment lengthening and formation of typical distraction regenerated bone.',
+ 'Early adult changes in the facial profile were studied longitudinally from to years of age in a Swedish Caucasian sample of female and male dental students. Lateral cephalometric radiographs were analysed by the conventional point-based method and by the structure-based method of superimposing serial films, adapted for computerized numerical analysis. Skeletal and soft tissue changes were described by linear and angular variables. The magnitude of linear dimensional changes was similar in the two sexes. The largest changes were found in the vertical dimensions. Total anterior facial height increased by about mm in the -year period, suggesting that the major part of the increase in vertical facial dimensions during the third decade of life takes place in the first half of this decade. Sagittal jaw relationship increased by about in both sexes. Soft tissue changes reflected those of the vertical skeletal dimensions.',
379
+ ]
380
+ embeddings = model.encode(sentences)
381
+ print(embeddings.shape)
382
+ # [3, 1024]
383
+
384
+ # Get the similarity scores for the embeddings
385
+ similarities = model.similarity(embeddings, embeddings)
386
+ print(similarities.shape)
387
+ # [3, 3]
388
+ ```
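The `[3, 3]` matrix printed above holds one similarity score per pair of input texts. As a minimal sketch, assuming the model's similarity function is cosine similarity (the Sentence Transformers default; this excerpt does not confirm it), the same computation can be written in plain Python. The 4-dimensional vectors and the helper name `cosine_similarity_matrix` are hypothetical stand-ins for the real 1024-dimensional embeddings and for `model.similarity`.

```python
import math


def cosine_similarity_matrix(vectors):
    """Return the pairwise cosine similarity matrix for equal-length vectors."""
    norms = [math.sqrt(sum(x * x for x in v)) for v in vectors]
    return [
        [
            sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)
            for v, norm_v in zip(vectors, norms)
        ]
        for u, norm_u in zip(vectors, norms)
    ]


# Hypothetical toy vectors standing in for three 1024-dimensional embeddings.
toy_embeddings = [
    [1.0, 0.0, 2.0, 1.0],
    [0.5, 0.1, 1.8, 0.9],
    [-1.0, 2.0, 0.0, 0.3],
]

similarities = cosine_similarity_matrix(toy_embeddings)
print(len(similarities), len(similarities[0]))
```

Each diagonal entry compares a text with itself, and the matrix is symmetric, so only the entries above the diagonal carry distinct information.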
389
+
390
+ <!--
391
+ ### Direct Usage (Transformers)
392
+
393
+ <details><summary>Click to see the direct usage in Transformers</summary>
394
+
395
+ </details>
396
+ -->
397
+
398
+ <!--
399
+ ### Downstream Usage (Sentence Transformers)
400
+
401
+ You can finetune this model on your own dataset.
402
+
403
+ <details><summary>Click to expand</summary>
404
+
405
+ </details>
406
+ -->
407
+
408
+ <!--
409
+ ### Out-of-Scope Use
410
+
411
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
412
+ -->
413
+
414
+ ## Evaluation
415
+
416
+ ### Metrics
417
+
418
+ #### Triplet
419
+
420
+ * Datasets: `modernBERT` and `modernBERT_disciplines`
421
+ * Evaluated with [<code>TripletEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.TripletEvaluator)
422
+
423
+ | Metric | modernBERT | modernBERT_disciplines |
424
+ |:--------------------|:-----------|:-----------------------|
425
+ | **cosine_accuracy** | **0.6726** | **0.6756** |
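To make the `cosine_accuracy` figure concrete, the sketch below computes the fraction of triplets in which the anchor is more similar to the positive than to the negative. It is an illustration under assumptions, not the `TripletEvaluator` source: the helper names and the toy 2-dimensional triplets are hypothetical, and the evaluator's exact comparison rule may differ between library versions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def triplet_cosine_accuracy(triplets):
    """Fraction of (anchor, positive, negative) triplets with sim(a, p) > sim(a, n)."""
    correct = sum(1 for a, p, n in triplets if cosine(a, p) > cosine(a, n))
    return correct / len(triplets)


# Hypothetical toy triplets; real inputs would be embeddings of the three text columns.
toy_triplets = [
    ([1.0, 0.0], [0.9, 0.1], [0.0, 1.0]),   # positive is closer: correct
    ([1.0, 0.0], [0.0, 1.0], [1.0, 0.2]),   # negative is closer: wrong
    ([0.0, 1.0], [0.1, 1.0], [1.0, 0.0]),   # positive is closer: correct
    ([1.0, 1.0], [1.0, 0.9], [-1.0, 0.5]),  # positive is closer: correct
]

accuracy = triplet_cosine_accuracy(toy_triplets)
print(accuracy)
```

If the `modernBERT` evaluator was run on the 391 evaluation samples (an assumption), a score of 0.6726 would correspond to 263 correctly ordered triplets, since 263 / 391 ≈ 0.6726.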
426
+
427
+ <!--
428
+ ## Bias, Risks and Limitations
429
+
430
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
431
+ -->
432
+
433
+ <!--
434
+ ### Recommendations
435
+
436
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
437
+ -->
438
+
439
+ ## Training Details
440
+
441
+ ### Training Dataset
442
+
443
+ #### Unnamed Dataset
444
+
445
+
446
+ * Size: 7,828 training samples
447
+ * Columns: <code>anchor</code>, <code>positive</code>, and <code>negative</code>
448
+ * Approximate statistics based on the first 1000 samples:
449
+ | | anchor | positive | negative |
450
+ |:--------|:-------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------|
451
+ | type | string | string | string |
452
+ | details | <ul><li>min: 82 tokens</li><li>mean: 236.35 tokens</li><li>max: 620 tokens</li></ul> | <ul><li>min: 82 tokens</li><li>mean: 237.05 tokens</li><li>max: 663 tokens</li></ul> | <ul><li>min: 82 tokens</li><li>mean: 247.65 tokens</li><li>max: 653 tokens</li></ul> |
453
+ * Samples:
454
+ | anchor | positive | negative |
455
+ |:-------|:---------|:---------|
456
+ | <code>Implementing management systems in organisations of all types and sizes often raises the following question: "What benefits will this bring?" Initial resistance and criticism are common as potential challenges are identified during the implementation process. To address this, it is essential to highlight the advantages of these systems and engage stakeholders in supporting management efforts. While the planning, implementation, use, maintenance, auditing, and improvement of management systems are generally voluntary, certification is frequently driven by external factors, particularly customer demands. Employees also stand to gain significantly, with knowledge and information serving as valuable resources, especially for leveraging artificial intelligence. This article explores the management's readiness to adopt and fully utilise two management systems based on international standards: the ISO Knowledge management system (KMS) and the ISO/IEC Artificial intelligence management system ...</code> | <code>Machine learning is an established and frequently used technique in industry and academia, but a standard process model to improve success and efficiency of machine learning applications is still missing. Project organizations and machine learning practitioners face manifold challenges and risks when developing machine learning applications and have a need for guidance to meet business expectations. This paper therefore proposes a process model for the development of machine learning applications, covering six phases from defining the scope to maintaining the deployed machine learning application. Business and data understanding are executed simultaneously in the first phase, as both have considerable impact on the feasibility of the project. The next phases are comprised of data preparation, modeling, evaluation, and deployment. Special focus is applied to the last phase, as a model running in changing real-time environments requires close monitoring and maintenance to reduce the risk...</code> | <code>This study aims to obtain user satisfaction factors for a knowledge management system so that a questionnaire can be made for evaluation or measurement. The SECI method is used with the CISE sequence which consists of four knowledge creation steps, namely C-combination, I-nternalization, S-socialization, and ending E-externalization. The stage begins with literature studies and then modifications are made with the selection, addition, and incorporation of existing models. From understanding and analyzing several models, discussions or brainstorming with colleagues were then carried out so that a final model was obtained to compile a list of keywords and statements as a questionnaire based on indicators related to knowledge management and the satisfaction of knowledge management system users. The results obtained there are eight user satisfaction factors divided into technical aspects (knowledge quality, knowledge sharing, system quality, service quality) and social aspects (management ...</code> |
457
+ | <code>This study examines the effect of alloying elements of Ni and W on the repassivation properties of stainless steel (SS) as evaluated by a rapid scratching electrode technique and stress corrosion cracking (SCC) test. The SS specimens were grouped into two different grades according to Ni content (00Cr-0Ni duplex, Type 000LMN [UNS S00000] austenitic SS). Major considerations regarding alloy design were Ni content and the substitution of W for Mo. However, a similar pitting resistance equivalent number (PREN) of to was maintained for all specimens. The main factors for evaluation of repassivation properties are the peak current for the scratched surface and repassivation rate. In M magnesium chloride (MgCl0) and N sulfuric acid containing chloride ions (H0SO0 + % Cl) solution, repassivation test results showed that repassivation properties decreased as Ni content increased. However, W substitution was effective on the repassivation process and increased the resistance of SCC property for...</code> | <code>Abstract High-nitrogen (N) stainless steels (SS) are receiving increased attention because of their strength advantages over carbon (C)-alloyed materials, but they have been found susceptible to dichromium nitride (Cr0N) precipitation during thermal exposure between 000C and ,000C. Sensitization susceptibility of a high-N, low-C austenitic SS by Cr0N precipitation at 000C and 000C was determined using the single-loop electrochemical potentiokinetic reactivation (EPR) test. High-N SS was found susceptible to sensitization caused by grain boundary (GB) precipitation of Cr0N, with the degree of sensitization increasing systematically with aging time at 000C. Sensitization of high-N materials did not require the concomitant precipitation of chromium (Cr)-rich metal carbide (M00C0). Materials aged at 000C were not sensitized, although the rate of precipitation was greater than at 000C. This indicated the minimum Cr level in the Cr-depleted zone of the matrix associated with nitride precipit...</code> | <code>The anodic dissolution characteristics of nickel, molybdenum, and stainless steel have been examined in pure and eutectic melt. Molybdenum and nickel show Tafeltype dissolution kinetics in pure eutectic which permit estimates of longterm corrosion rates as a function of voltage. Nickel exhibits a sharp threshold potential for dissolution in melt, forming a nonpassivating layer. Comparative voltammetry and opencircuit potential measurements with iron in this melt suggest that care may be required in using nickel as an iron sulfide current collector. The anodic dissolution of stainless steel in melt appears to be rate limited by diffusion through a reaction layer, showing a dependence that may be applicable to longterm corrosion predictions. Dissolution is strongly inhibited by dissolved , apparently by formation of a protective anodic oxide layer. Molybdenum appears to owe its excellent anodic corrosion resistance in melt both to a chemically formed prepassive film and to a welldefined ...</code> |
458
+ | <code>FY-0E WindRAD (Fengyun-0E Wind Radar) is a dual-frequency rotating fan-beam scatterometer. Its data characteristics, NOC (NWP Ocean Calibration), and wind retrieval performance are investigated in this paper. The diversity of the radar view geometry varies across the swaths, with maximum diversity in the sweet swaths and limited diversity in the outer and nadir swaths. When NOC backscatter calibration coefficients are computed as a function of incidence angle only (NOCint), a smooth correction is found. However, when relative antenna azimuth angle is included (NOCant), it appears that the corrections as a function of relative azimuth angle vary harmonically and substantially for a specific incidence angle. NOCant corrections yield a better fit of the measurements to the GMF (Geophysical Model Function). Hence, NOCant is applied for the analysis of wind retrieval from the Ku-band and C-band. An extra engineering correction of dB and dB is applied on Ku-band and C-band backscatter values...</code> | <code>Spaceborne synthetic aperture radar (SAR) represents a powerful source of data for enhancing maritime domain awareness (MDA). Wakes generated by traveling vessels hold a crucial role in MDA since they can be exploited both for ship route and velocity estimation and as a marker of ship presence. Even if deep learning (DL) has led to an impressive performance boost on a variety of computer vision tasks, its usage for automatic target recognition (ATR) in SAR images to support MDA is still limited to the detection of ships rather than ship wakes. A dataset is presented in this paper and several state-of-the-art object detectors based on convolutional neural networks (CNNs) are tested with different backbones. The dataset, including more than wake chips, is realized by visually inspecting Sentinel- images over highly trafficked maritime sites. Extensive experiments are shown to characterize CNNs for the wake detection task. For the first time, a deep-learning approach is implemented to spe...</code> | <code>With the publication of Part Wind Actions of the South African Loading Code SANS : , several issues concerning adjustments from the reference standard Eurocode EN - - : could not be resolved due to lack of sufficient updated background information on South African conditions. The need for updating the map for the free field wind speed is related also to the improved representation of the mixed and complex strong wind climate of the country. Furthermore, strong wind probability models are used for the reliability assessment and calibration of wind design procedures. Updating of the reliability provisions for the revised wind loading process was a further need identified at the time. This paper provides a review of the historical development of the representation of the free field wind, used as input to design wind loading procedures for South Africa. The review considers: (i) the historical representations of the geographic distribution of free field wind, (ii) the climatic influences c...</code> |
459
+ * Loss: [<code>TripletLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#tripletloss) with these parameters:
460
+ ```json
461
+ {
462
+ "distance_metric": "TripletDistanceMetric.COSINE",
463
+ "triplet_margin": 0.05
464
+ }
465
+ ```
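The two loss parameters above can be read as follows: with a cosine distance metric, the loss for one triplet is `max(d(anchor, positive) - d(anchor, negative) + margin, 0)`, where `d = 1 - cosine similarity`. The sketch below is a plain-Python illustration under that reading, not the library implementation; the function names and toy vectors are hypothetical, and only the margin value 0.05 comes from the card.

```python
import math

TRIPLET_MARGIN = 0.05  # value listed in the loss parameters above


def cosine_distance(u, v):
    """Cosine distance, defined here as 1 minus cosine similarity."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def triplet_loss(anchor, positive, negative, margin=TRIPLET_MARGIN):
    """Hinge loss on the gap between anchor-positive and anchor-negative distance."""
    gap = cosine_distance(anchor, positive) - cosine_distance(anchor, negative)
    return max(gap + margin, 0.0)


# Hypothetical toy vectors for illustration only.
anchor = [1.0, 0.0]
close = [0.9, 0.1]
far = [0.0, 1.0]

easy_loss = triplet_loss(anchor, close, far)    # positive much closer: loss is zero
tied_loss = triplet_loss(anchor, close, close)  # equal distances: loss equals the margin
print(easy_loss, tied_loss)
```

A triplet stops contributing to the loss once the negative is at least 0.05 farther from the anchor than the positive, in cosine distance. When positive and negative are equally distant, the loss equals the margin.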
466
+
467
+ ### Evaluation Dataset
468
+
469
+ #### Unnamed Dataset
470
+
471
+
472
+ * Size: 391 evaluation samples
473
+ * Columns: <code>anchor</code>, <code>positive</code>, and <code>negative</code>
474
+ * Approximate statistics based on the first 391 samples:
475
+ | | anchor | positive | negative |
476
+ |:--------|:-------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------|
477
+ | type | string | string | string |
478
+ | details | <ul><li>min: 85 tokens</li><li>mean: 239.86 tokens</li><li>max: 592 tokens</li></ul> | <ul><li>min: 87 tokens</li><li>mean: 229.31 tokens</li><li>max: 542 tokens</li></ul> | <ul><li>min: 93 tokens</li><li>mean: 239.99 tokens</li><li>max: 592 tokens</li></ul> |
479
+ * Samples:
480
+ | anchor | positive | negative |
481
+ |:-------|:---------|:---------|
482
+ | <code>Industrial Relations: A Journal of Economy and SocietyVolume , Issue p. - Internet Resources Selected by the Institute for Research on Labor and Employment Library University of California, Berkeley TERENCE K. HUWE, TERENCE K. HUWE Director of Library & Information ResourcesSearch for more papers by this authorJANICE KIMBALL, JANICE KIMBALL Library AssistantSearch for more papers by this author TERENCE K. HUWE, TERENCE K. HUWE Director of Library & Information ResourcesSearch for more papers by this authorJANICE KIMBALL, JANICE KIMBALL Library AssistantSearch for more papers by this author First published: April the full textAboutPDF ToolsRequest permissionExport citationAdd to favoritesTrack citation ShareShare Give accessShare full text accessShare full-text accessPlease review our Terms and Conditions of Use and check box below to share full-text version of article.I have read and accept the Wiley Online Library Terms and Conditions of UseShareable LinkUse the link below to share a ...</code> | <code>Industrial Relations: A Journal of Economy and SocietyVolume , Issue p. - Recent Publications Selected by the Institute for Research on Labor and Employment Library University of California, Berkeley Terence K. Huwe, Terence K. Huwe Director of Library & Information ResourcesSearch for more papers by this authorJanice Kimball, Janice Kimball Library AssistantSearch for more papers by this author Terence K. Huwe, Terence K. Huwe Director of Library & Information ResourcesSearch for more papers by this authorJanice Kimball, Janice Kimball Library AssistantSearch for more papers by this author First published: September the full textAboutPDF ToolsRequest permissionExport citationAdd to favoritesTrack citation ShareShare Give accessShare full text accessShare full-text accessPlease review our Terms and Conditions of Use and check box below to share full-text version of article.I have read and accept the Wiley Online Library Terms and Conditions of UseShareable LinkUse the link below to sha...</code> | <code>This paper suggests that all the models of industrial relations, not just the more statist ones, have been characterized throughout their history by complex and sometimes troublesome relationships with the state. These models have always been conditioned, and in certain sense shaped, by the latter's more or less direct intervention at the moment of their formation and as they have expanded or declined. An intervention which is also influenced by the nature of economic problems that national political economies have to cope with. Such difficulties of relationship are to a large extent due to the fact that political regulation and regulation through industrial relations only partially overlap in their goals and contents. More frequently they compete with each other and have methods and logics that tend to diverge. Whereas decisions are taken by majority principle in the political sphere, in industrial relations they can only be taken unamimously - and especially so in collective bargaini...</code> |
483
+ | <code>Poor response rates to follow-up questionnaires can adversely affect the progress of a randomised controlled trial and the validity of its results. This embedded 'study within a trial' aimed to investigate the impact of including a pen with the postal -month questionnaire completed by the trial participants on the response rates to this questionnaire.This study was a two-armed randomised controlled trial nested in the Gentle Years Yoga (GYY) trial. Participants in the intervention group of the GYY trial were allocated : using simple randomisation to either receive a pen (intervention) or no pen with their -month questionnaire (control). The primary outcome was the proportion of participants sent a -month questionnaire who returned it. Secondary outcomes were time taken to return the questionnaire, proportion of participants sent a reminder to return the questionnaire, and completeness of the questionnaire. Binary outcomes were analysed using logistic regression, time to return by Cox P...</code> | <code>Background Poor response rates to follow-up questionnaires can adversely affect the progress of a randomised controlled trial and the validity of its results. This embedded 'study within a trial' aimed to investigate the impact of including a pen with the postal -month questionnaire completed by the trial participants on the response rates to this questionnaire. Methods This study was a two-armed randomised controlled trial nested in the Gentle Years Yoga (GYY) trial. Participants in the intervention group of the GYY trial were allocated : using simple randomisation to either receive a pen (intervention) or no pen with their -month questionnaire (control). The primary outcome was the proportion of participants sent a -month questionnaire who returned it. Secondary outcomes were time taken to return the questionnaire, proportion of participants sent a reminder to return the questionnaire, and completeness of the questionnaire. Binary outcomes were analysed using logistic regression, tim...</code> | <code>Patients' failure to adhere on tuberculosis (TB) treatment leads to drug resistance, relapse and death. Non-adherence to TB treatment is higher during continuation treatment phase. The study aimed to evaluate effectiveness of combined pill refilling and medication reminders on adherence to TB treatment.A two-arm randomised controlled trial on adult patients with TB was used during continuation treatment phase. In the first arm, in addition to usual care, participants will receive cellphone-based daily medication and weekly pill refilling reminders. In the control arm, participants will receive only usual care. The study will use a covariate adaptive randomisation technique to balance covariates during allocation. The primary outcome is patients' adherence to TB treatment and secondary outcomes are attendance to clinic and treatment outcomes. We apply intention to treat with generalised linear mixed model.Ethical approval was obtained from Institutional Review Board of University of Gon...</code> |
484
+ | <code>EthologyVolume , Issue p. i-i Front CoverFree Access A male Swainson's Spurfowl, Pternistis swainsonii, calling out a raucous 'krrrraaak-krrrraaak-krrrraaak' in the bushveld of Kruger National Park, South Africa. Photograph reproduced by permission of Emmanuel Do Linh San - First published: June ToolsRequest permissionExport citationAdd to favoritesTrack citation ShareShare Give accessShare full text accessShare full-text accessPlease review our Terms and Conditions of Use and check box below to share full-text version of article.I have read and accept the Wiley Online Library Terms and Conditions of UseShareable LinkUse the link below to share a full-text version of this article with your friends and colleagues. Learn more.Copy URL Share a linkShare onFacebookTwitterLinked InRedditWechat No abstract is available for this article. Volume000, Issue0July 0000Pages i-i RelatedInformation</code> | <code>EthologyVolume , Issue p. i-i Front CoverFree Access Breeding male Southern Masked-Weaver, Ploceus velatus, building a nest in Addo Elephant National Park, South Africa. Photograph reproduced by permission of Emmanuel Do Linh San First published: July ToolsRequest permissionExport citationAdd to favoritesTrack citation ShareShare Give accessShare full text accessShare full-text accessPlease review our Terms and Conditions of Use and check box below to share full-text version of article.I have read and accept the Wiley Online Library Terms and Conditions of UseShareable LinkUse the link below to share a full-text version of this article with your friends and colleagues. Learn more.Copy URL Share a linkShare onFacebookTwitterLinkedInRedditWechat No abstract is available for this article. Volume000, Issue0August 0000Pages i-i RelatedInformation</code> | <code>IbisVolume , Issue p. - Do male Chaffinches Fringilla coelebs copy song sequencing and bout length from their tutors? Katharina Riebel, Corresponding Author Katharina Riebel School of Environmental and Evolutionary Biology, University of St Andrews, File KY00 0TS, UK*Behavioural Biology, Institute of Evolutionary and Ecology Sciences, PO Box , RA Leiden, The Nederlands. Email: for more papers by this authorPeter J. B. Slater, Peter J. B. Slater School of Environmental and Evolutionary Biology, University of St Andrews, File KY00 0TS, UKSearch for more papers by this author Katharina Riebel, Corresponding Author Katharina Riebel School of Environmental and Evolutionary Biology, University of St Andrews, File KY00 0TS, UK*Behavioural Biology, Institute of Evolutionary and Ecology Sciences, PO Box , RA Leiden, The Nederlands. Email: for more papers by this authorPeter J. B. Slater, Peter J. B. Slater School of Environmental and Evolutionary Biology, University of St Andrews, File KY00 0TS...</code> |
485
+ * Loss: [<code>TripletLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#tripletloss) with these parameters:
486
+ ```json
487
+ {
488
+ "distance_metric": "TripletDistanceMetric.COSINE",
489
+ "triplet_margin": 0.05
490
+ }
491
+ ```
492
+
493
+ ### Training Hyperparameters
494
+ #### Non-Default Hyperparameters
495
+
496
+ - `eval_strategy`: steps
497
+ - `per_device_train_batch_size`: 4
498
+ - `per_device_eval_batch_size`: 4
499
+ - `learning_rate`: 1e-05
500
+ - `weight_decay`: 0.01
501
+ - `warmup_ratio`: 0.1
502
+ - `batch_sampler`: no_duplicates
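The batch size and warmup ratio above determine the step counts in the training logs. The sketch below works through that arithmetic, assuming a single device, no gradient accumulation, the 7,828 training samples and 3 epochs reported elsewhere in this card, and that warmup steps are the rounded-up product of total steps and `warmup_ratio`. The `no_duplicates` batch sampler could in principle change the number of batches, so treat the figures as an estimate. The function name `linear_schedule_lr` is hypothetical.

```python
import math

TRAIN_SAMPLES = 7828   # training set size reported in this card
BATCH_SIZE = 4         # per_device_train_batch_size (single device assumed)
NUM_EPOCHS = 3         # num_train_epochs from the full hyperparameter list
WARMUP_RATIO = 0.1
PEAK_LR = 1e-05

steps_per_epoch = math.ceil(TRAIN_SAMPLES / BATCH_SIZE)
total_steps = steps_per_epoch * NUM_EPOCHS
warmup_steps = math.ceil(total_steps * WARMUP_RATIO)


def linear_schedule_lr(step):
    """Learning rate under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        return PEAK_LR * step / warmup_steps
    remaining = (total_steps - step) / (total_steps - warmup_steps)
    return PEAK_LR * max(0.0, remaining)


print(steps_per_epoch, total_steps, warmup_steps)
print(round(100 / steps_per_epoch, 4))  # epoch fraction reached after 100 steps
```

Under these assumptions, step 100 falls at epoch 0.0511, which matches the first logged row in the training logs and supports the estimate of 1,957 steps per epoch.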
503
+
504
+ #### All Hyperparameters
505
+ <details><summary>Click to expand</summary>
506
+
507
+ - `overwrite_output_dir`: False
508
+ - `do_predict`: False
509
+ - `eval_strategy`: steps
510
+ - `prediction_loss_only`: True
511
+ - `per_device_train_batch_size`: 4
512
+ - `per_device_eval_batch_size`: 4
513
+ - `per_gpu_train_batch_size`: None
514
+ - `per_gpu_eval_batch_size`: None
515
+ - `gradient_accumulation_steps`: 1
516
+ - `eval_accumulation_steps`: None
517
+ - `torch_empty_cache_steps`: None
518
+ - `learning_rate`: 1e-05
519
+ - `weight_decay`: 0.01
520
+ - `adam_beta1`: 0.9
521
+ - `adam_beta2`: 0.999
522
+ - `adam_epsilon`: 1e-08
523
+ - `max_grad_norm`: 1.0
524
+ - `num_train_epochs`: 3
525
+ - `max_steps`: -1
526
+ - `lr_scheduler_type`: linear
527
+ - `lr_scheduler_kwargs`: {}
528
+ - `warmup_ratio`: 0.1
529
+ - `warmup_steps`: 0
530
+ - `log_level`: passive
531
+ - `log_level_replica`: warning
532
+ - `log_on_each_node`: True
533
+ - `logging_nan_inf_filter`: True
534
+ - `save_safetensors`: True
535
+ - `save_on_each_node`: False
536
+ - `save_only_model`: False
537
+ - `restore_callback_states_from_checkpoint`: False
538
+ - `no_cuda`: False
+ - `use_cpu`: False
+ - `use_mps_device`: False
+ - `seed`: 42
+ - `data_seed`: None
+ - `jit_mode_eval`: False
+ - `use_ipex`: False
+ - `bf16`: False
+ - `fp16`: False
+ - `fp16_opt_level`: O1
+ - `half_precision_backend`: auto
+ - `bf16_full_eval`: False
+ - `fp16_full_eval`: False
+ - `tf32`: None
+ - `local_rank`: 0
+ - `ddp_backend`: None
+ - `tpu_num_cores`: None
+ - `tpu_metrics_debug`: False
+ - `debug`: []
+ - `dataloader_drop_last`: False
+ - `dataloader_num_workers`: 0
+ - `dataloader_prefetch_factor`: None
+ - `past_index`: -1
+ - `disable_tqdm`: False
+ - `remove_unused_columns`: True
+ - `label_names`: None
+ - `load_best_model_at_end`: False
+ - `ignore_data_skip`: False
+ - `fsdp`: []
+ - `fsdp_min_num_params`: 0
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
+ - `fsdp_transformer_layer_cls_to_wrap`: None
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
+ - `deepspeed`: None
+ - `label_smoothing_factor`: 0.0
+ - `optim`: adamw_torch
+ - `optim_args`: None
+ - `adafactor`: False
+ - `group_by_length`: False
+ - `length_column_name`: length
+ - `ddp_find_unused_parameters`: None
+ - `ddp_bucket_cap_mb`: None
+ - `ddp_broadcast_buffers`: False
+ - `dataloader_pin_memory`: True
+ - `dataloader_persistent_workers`: False
+ - `skip_memory_metrics`: True
+ - `use_legacy_prediction_loop`: False
+ - `push_to_hub`: False
+ - `resume_from_checkpoint`: None
+ - `hub_model_id`: None
+ - `hub_strategy`: every_save
+ - `hub_private_repo`: None
+ - `hub_always_push`: False
+ - `gradient_checkpointing`: False
+ - `gradient_checkpointing_kwargs`: None
+ - `include_inputs_for_metrics`: False
+ - `include_for_metrics`: []
+ - `eval_do_concat_batches`: True
+ - `fp16_backend`: auto
+ - `push_to_hub_model_id`: None
+ - `push_to_hub_organization`: None
+ - `mp_parameters`:
+ - `auto_find_batch_size`: False
+ - `full_determinism`: False
+ - `torchdynamo`: None
+ - `ray_scope`: last
+ - `ddp_timeout`: 1800
+ - `torch_compile`: False
+ - `torch_compile_backend`: None
+ - `torch_compile_mode`: None
+ - `dispatch_batches`: None
+ - `split_batches`: None
+ - `include_tokens_per_second`: False
+ - `include_num_input_tokens_seen`: False
+ - `neftune_noise_alpha`: None
+ - `optim_target_modules`: None
+ - `batch_eval_metrics`: False
+ - `eval_on_start`: False
+ - `use_liger_kernel`: False
+ - `eval_use_gather_object`: False
+ - `average_tokens_across_devices`: False
+ - `prompts`: None
+ - `batch_sampler`: no_duplicates
+ - `multi_dataset_batch_sampler`: proportional
+
+ </details>
+
+ ### Training Logs
+ | Epoch | Step | Training Loss | Validation Loss | modernBERT_cosine_accuracy | modernBERT_disciplines_cosine_accuracy |
+ |:------:|:----:|:-------------:|:---------------:|:--------------------------:|:--------------------------------------:|
+ | 0 | 0 | - | - | 0.4783 | - |
+ | 0.0511 | 100 | 0.0534 | 0.0495 | 0.5090 | - |
+ | 0.1022 | 200 | 0.0502 | 0.0474 | 0.5243 | - |
+ | 0.1533 | 300 | 0.0486 | 0.0465 | 0.5499 | - |
+ | 0.2044 | 400 | 0.0465 | 0.0457 | 0.5831 | - |
+ | 0.2555 | 500 | 0.0468 | 0.0467 | 0.5754 | - |
+ | 0.3066 | 600 | 0.0465 | 0.0444 | 0.6113 | - |
+ | 0.3577 | 700 | 0.0426 | 0.0467 | 0.5831 | - |
+ | 0.4088 | 800 | 0.0445 | 0.0454 | 0.5857 | - |
+ | 0.4599 | 900 | 0.0441 | 0.0441 | 0.6215 | - |
+ | 0.5110 | 1000 | 0.0432 | 0.0423 | 0.6189 | - |
+ | 0.5621 | 1100 | 0.0433 | 0.0417 | 0.6189 | - |
+ | 0.6132 | 1200 | 0.0395 | 0.0416 | 0.6240 | - |
+ | 0.6643 | 1300 | 0.0408 | 0.0403 | 0.6419 | - |
+ | 0.7154 | 1400 | 0.0414 | 0.0414 | 0.6445 | - |
+ | 0.7665 | 1500 | 0.044 | 0.0423 | 0.6343 | - |
+ | 0.8176 | 1600 | 0.0436 | 0.0418 | 0.6292 | - |
+ | 0.8687 | 1700 | 0.0392 | 0.0402 | 0.6624 | - |
+ | 0.9198 | 1800 | 0.039 | 0.0434 | 0.6419 | - |
+ | 0.9709 | 1900 | 0.0413 | 0.0439 | 0.5959 | - |
+ | 1.0220 | 2000 | 0.0396 | 0.0437 | 0.6087 | - |
+ | 1.0731 | 2100 | 0.0402 | 0.0414 | 0.6266 | - |
+ | 1.1242 | 2200 | 0.0402 | 0.0411 | 0.6496 | - |
+ | 1.1753 | 2300 | 0.0362 | 0.0415 | 0.6419 | - |
+ | 1.2264 | 2400 | 0.0371 | 0.0393 | 0.6496 | - |
+ | 1.2775 | 2500 | 0.0353 | 0.0396 | 0.6445 | - |
+ | 1.3286 | 2600 | 0.0322 | 0.0418 | 0.6496 | - |
+ | 1.3797 | 2700 | 0.0329 | 0.0412 | 0.6394 | - |
+ | 1.4308 | 2800 | 0.0311 | 0.0400 | 0.6445 | - |
+ | 1.4819 | 2900 | 0.0318 | 0.0385 | 0.6573 | - |
+ | 1.5330 | 3000 | 0.0306 | 0.0387 | 0.6726 | - |
+ | 1.5841 | 3100 | 0.0273 | 0.0387 | 0.6803 | - |
+ | 1.6352 | 3200 | 0.0285 | 0.0384 | 0.6803 | - |
+ | 1.6863 | 3300 | 0.0299 | 0.0375 | 0.6675 | - |
+ | 1.7374 | 3400 | 0.0304 | 0.0378 | 0.6522 | - |
+ | 1.7885 | 3500 | 0.03 | 0.0388 | 0.6496 | - |
+ | 1.8396 | 3600 | 0.028 | 0.0383 | 0.6803 | - |
+ | 1.8906 | 3700 | 0.0264 | 0.0380 | 0.6957 | - |
+ | 1.9417 | 3800 | 0.0275 | 0.0388 | 0.6573 | - |
+ | 1.9928 | 3900 | 0.0314 | 0.0378 | 0.6803 | - |
+ | 2.0439 | 4000 | 0.03 | 0.0388 | 0.6777 | - |
+ | 2.0950 | 4100 | 0.0308 | 0.0380 | 0.6752 | - |
+ | 2.1461 | 4200 | 0.0263 | 0.0382 | 0.6598 | - |
+ | 2.1972 | 4300 | 0.0215 | 0.0391 | 0.6573 | - |
+ | 2.2483 | 4400 | 0.017 | 0.0413 | 0.6471 | - |
+ | 2.2994 | 4500 | 0.0173 | 0.0398 | 0.6726 | - |
+ | 2.3505 | 4600 | 0.0183 | 0.0393 | 0.6752 | - |
+ | 2.4016 | 4700 | 0.0189 | 0.0399 | 0.6957 | - |
+ | 2.4527 | 4800 | 0.0123 | 0.0407 | 0.6803 | - |
+ | 2.5038 | 4900 | 0.0155 | 0.0405 | 0.6803 | - |
+ | 2.5549 | 5000 | 0.0108 | 0.0413 | 0.6726 | - |
+ | 2.6060 | 5100 | 0.0112 | 0.0416 | 0.6650 | - |
+ | 2.6571 | 5200 | 0.0134 | 0.0414 | 0.6777 | - |
+ | 2.7082 | 5300 | 0.0133 | 0.0406 | 0.6624 | - |
+ | 2.7593 | 5400 | 0.0109 | 0.0408 | 0.6701 | - |
+ | 2.8104 | 5500 | 0.0121 | 0.0408 | 0.6726 | - |
+ | 2.8615 | 5600 | 0.0124 | 0.0408 | 0.6752 | - |
+ | 2.9126 | 5700 | 0.012 | 0.0407 | 0.6752 | - |
+ | 2.9637 | 5800 | 0.0127 | 0.0406 | 0.6726 | - |
+ | 3.0 | 5871 | - | - | - | 0.6756 |
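The Epoch column can be derived from the Step column: the final row pairs epoch 3.0 with step 5871, which implies 1957 optimizer steps per epoch. The following is a minimal sketch, not part of the training code, that reproduces the epoch fractions and picks the best checkpoints from a hand-copied subset of the rows above. The names `ROWS`, `epoch_at`, `best_loss`, and `best_acc` are illustrative assumptions.

```python
# Rows copied from the training log above: (step, validation_loss, cosine_accuracy).
# This is a subset chosen for illustration, not the full log.
ROWS = [
    (100, 0.0495, 0.5090),
    (1000, 0.0423, 0.6189),
    (2000, 0.0437, 0.6087),
    (3300, 0.0375, 0.6675),
    (3700, 0.0380, 0.6957),
    (4700, 0.0399, 0.6957),
    (5800, 0.0406, 0.6726),
]

TOTAL_STEPS = 5871  # step on the last row of the table
NUM_EPOCHS = 3      # epoch on the last row of the table
STEPS_PER_EPOCH = TOTAL_STEPS // NUM_EPOCHS  # 5871 is divisible by 3


def epoch_at(step: int) -> float:
    """Fractional epoch reached after `step` optimizer steps."""
    return step / STEPS_PER_EPOCH


# min/max return the first matching row when several rows tie.
best_loss = min(ROWS, key=lambda row: row[1])
best_acc = max(ROWS, key=lambda row: row[2])

print(f"steps per epoch: {STEPS_PER_EPOCH}")
print(f"epoch at step 100: {epoch_at(100):.4f}")
print(f"lowest validation loss: step {best_loss[0]} ({best_loss[1]})")
print(f"highest cosine accuracy: step {best_acc[0]} ({best_acc[2]})")
```

In the full table, validation loss is also lowest at step 3300 (0.0375) and mostly drifts upward afterwards while training loss keeps falling, so a checkpoint from the second epoch may generalize at least as well as the final one.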
+
+
+ ### Framework Versions
+ - Python: 3.10.12
+ - Sentence Transformers: 3.3.1
+ - Transformers: 4.48.0.dev0
+ - PyTorch: 2.5.1+cu121
+ - Accelerate: 1.2.1
+ - Datasets: 3.2.0
+ - Tokenizers: 0.21.0
+
+ ## Citation
+
+ ### BibTeX
+
+ #### Sentence Transformers
+ ```bibtex
+ @inproceedings{reimers-2019-sentence-bert,
+     title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
+     author = "Reimers, Nils and Gurevych, Iryna",
+     booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
+     month = "11",
+     year = "2019",
+     publisher = "Association for Computational Linguistics",
+     url = "https://arxiv.org/abs/1908.10084",
+ }
+ ```
+
+ #### TripletLoss
+ ```bibtex
+ @misc{hermans2017defense,
+     title={In Defense of the Triplet Loss for Person Re-Identification},
+     author={Alexander Hermans and Lucas Beyer and Bastian Leibe},
+     year={2017},
+     eprint={1703.07737},
+     archivePrefix={arXiv},
+     primaryClass={cs.CV}
+ }
+ ```
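The TripletLoss citation refers to a loss of the form max(d(a, p) - d(a, n) + margin, 0), which pushes the anchor closer to the positive than to the negative by at least a margin. The sketch below illustrates that formula in plain Python. The Euclidean distance and the margin of 1.0 are assumptions for illustration only; the loss parameters actually used for this model are stated in the training section of the card, not here.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """max(d(a, p) - d(a, n) + margin, 0); the margin value is an assumption."""
    gap = euclidean(anchor, positive) - euclidean(anchor, negative) + margin
    return max(gap, 0.0)


anchor = [0.0, 0.0]
near = [0.0, 1.0]  # distance 1 from the anchor
far = [3.0, 4.0]   # distance 5 from the anchor

# Positive already much closer than negative: the loss is clipped to zero.
easy = triplet_loss(anchor, positive=near, negative=far)
# Roles swapped: the "positive" is farther away, so the loss is 5 - 1 + 1.
hard = triplet_loss(anchor, positive=far, negative=near)

print(easy, hard)
```

The `cosine_accuracy` columns in the training log measure a related quantity: the fraction of evaluation triplets where the anchor is more similar to the positive than to the negative.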
+
+ <!--
+ ## Glossary
+
+ *Clearly define terms in order to be accessible across audiences.*
+ -->
+
+ <!--
+ ## Model Card Authors
+
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+ -->
+
+ <!--
+ ## Model Card Contact
+
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+ -->
config.json ADDED
@@ -0,0 +1,46 @@
+ {
+   "_name_or_path": "answerdotai/ModernBERT-large",
+   "architectures": [
+     "ModernBertModel"
+   ],
+   "attention_bias": false,
+   "attention_dropout": 0.0,
+   "bos_token_id": 50281,
+   "classifier_activation": "gelu",
+   "classifier_bias": false,
+   "classifier_dropout": 0.0,
+   "classifier_pooling": "mean",
+   "cls_token_id": 50281,
+   "decoder_bias": true,
+   "deterministic_flash_attn": false,
+   "embedding_dropout": 0.0,
+   "eos_token_id": 50282,
+   "global_attn_every_n_layers": 3,
+   "global_rope_theta": 160000.0,
+   "gradient_checkpointing": false,
+   "hidden_activation": "gelu",
+   "hidden_size": 1024,
+   "initializer_cutoff_factor": 2.0,
+   "initializer_range": 0.02,
+   "intermediate_size": 2624,
+   "layer_norm_eps": 1e-05,
+   "local_attention": 128,
+   "local_rope_theta": 10000.0,
+   "max_position_embeddings": 8192,
+   "mlp_bias": false,
+   "mlp_dropout": 0.0,
+   "model_type": "modernbert",
+   "norm_bias": false,
+   "norm_eps": 1e-05,
+   "num_attention_heads": 16,
+   "num_hidden_layers": 28,
+   "pad_token_id": 50283,
+   "position_embedding_type": "absolute",
+   "reference_compile": false,
+   "sep_token_id": 50282,
+   "sparse_pred_ignore_index": -100,
+   "sparse_prediction": false,
+   "torch_dtype": "float32",
+   "transformers_version": "4.48.0.dev0",
+   "vocab_size": 50368
+ }
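A few derived quantities make this config easier to read. The sketch below computes the per-head dimension and the split between global and local attention layers. It assumes the convention that layer `i` uses global attention when `i % global_attn_every_n_layers == 0` and local sliding-window attention otherwise; that rule is an assumption about the ModernBERT implementation, not something the config file itself states.

```python
# Values copied from config.json above.
CONFIG = {
    "hidden_size": 1024,
    "num_attention_heads": 16,
    "num_hidden_layers": 28,
    "global_attn_every_n_layers": 3,
    "local_attention": 128,
}

# Each attention head works on an equal slice of the hidden state.
head_dim = CONFIG["hidden_size"] // CONFIG["num_attention_heads"]

# Assumed rule: every n-th layer (starting at layer 0) attends globally.
global_layers = [
    i
    for i in range(CONFIG["num_hidden_layers"])
    if i % CONFIG["global_attn_every_n_layers"] == 0
]
local_layers = [
    i for i in range(CONFIG["num_hidden_layers"]) if i not in global_layers
]

print(f"head dimension: {head_dim}")
print(f"global attention layers ({len(global_layers)}): {global_layers}")
print(f"local attention layers: {len(local_layers)}")
```

Under that assumption, roughly a third of the layers see the full 8192-token context, and the rest attend within a 128-token window.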
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "__version__": {
+     "sentence_transformers": "3.3.1",
+     "transformers": "4.48.0.dev0",
+     "pytorch": "2.5.1+cu121"
+   },
+   "prompts": {},
+   "default_prompt_name": null,
+   "similarity_fn_name": "cosine"
+ }
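`similarity_fn_name` is set to `cosine`, so two embeddings are compared by their dot product divided by the product of their norms. The following pure-Python sketch shows that computation on short lists. The function name is illustrative, and the library itself operates on tensors rather than lists.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))   # orthogonal vectors
print(cosine_similarity([1.0, 2.0], [2.0, 4.0]))   # same direction
print(cosine_similarity([1.0, 0.0], [-1.0, 0.0]))  # opposite direction
```

Because the score depends only on direction, scaling an embedding does not change its similarity to other embeddings.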
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:71e71472a3c609416e461aa8cb9282f4344259ee060ba62a1ee4291e5fd52190
+ size 1579143688
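The LFS pointer gives only the file size, but together with `"torch_dtype": "float32"` from config.json it allows a rough parameter-count estimate. The sketch below assumes every stored tensor is float32 and ignores the safetensors header, so the result is a slight overestimate rather than an exact count.

```python
SIZE_BYTES = 1579143688  # "size" field of the Git LFS pointer above
BYTES_PER_FLOAT32 = 4    # float32, per "torch_dtype" in config.json

# Upper bound: the file also contains a small JSON header with tensor metadata.
approx_params = SIZE_BYTES // BYTES_PER_FLOAT32

print(f"approximately {approx_params / 1e6:.0f}M parameters")
print(f"approximately {SIZE_BYTES / 1e9:.2f} GB on disk")
```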
modules.json ADDED
@@ -0,0 +1,14 @@
+ [
+   {
+     "idx": 0,
+     "name": "0",
+     "path": "",
+     "type": "sentence_transformers.models.Transformer"
+   },
+   {
+     "idx": 1,
+     "name": "1",
+     "path": "1_Pooling",
+     "type": "sentence_transformers.models.Pooling"
+   }
+ ]
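modules.json chains a Transformer module into a Pooling module, and `1_Pooling/config.json` earlier in this commit enables `pooling_mode_mean_tokens`. The sketch below shows what masked mean pooling does to one sequence of token vectors, using plain lists. The function name and the toy inputs are illustrative assumptions; the library performs the same averaging on batched tensors.

```python
def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask entry is 1, ignoring padding."""
    kept = [vec for vec, keep in zip(token_embeddings, attention_mask) if keep]
    if not kept:
        raise ValueError("attention mask selects no tokens")
    dim = len(kept[0])
    return [sum(vec[d] for vec in kept) / len(kept) for d in range(dim)]


# Three token vectors; the last one is padding and must not affect the result.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]

sentence_embedding = mean_pool(tokens, mask)
print(sentence_embedding)
```

For this model the real token vectors have 1024 dimensions (`word_embedding_dimension`), so the pooled sentence embedding has 1024 dimensions as well.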
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "max_seq_length": 8192,
+   "do_lower_case": false
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,37 @@
+ {
+   "cls_token": {
+     "content": "[CLS]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "mask_token": {
+     "content": "[MASK]",
+     "lstrip": true,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": {
+     "content": "[PAD]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "sep_token": {
+     "content": "[SEP]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "unk_token": {
+     "content": "[UNK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,945 @@
1
+ {
2
+ "added_tokens_decoder": {
3
+ "0": {
4
+ "content": "|||IP_ADDRESS|||",
5
+ "lstrip": false,
6
+ "normalized": true,
7
+ "rstrip": false,
8
+ "single_word": false,
9
+ "special": false
10
+ },
11
+ "1": {
12
+ "content": "<|padding|>",
13
+ "lstrip": false,
14
+ "normalized": false,
15
+ "rstrip": false,
16
+ "single_word": false,
17
+ "special": true
18
+ },
19
+ "50254": {
20
+ "content": " ",
21
+ "lstrip": false,
22
+ "normalized": true,
23
+ "rstrip": false,
24
+ "single_word": false,
25
+ "special": false
26
+ },
27
+ "50255": {
28
+ "content": " ",
29
+ "lstrip": false,
30
+ "normalized": true,
31
+ "rstrip": false,
32
+ "single_word": false,
33
+ "special": false
34
+ },
35
+ "50256": {
36
+ "content": " ",
37
+ "lstrip": false,
38
+ "normalized": true,
39
+ "rstrip": false,
40
+ "single_word": false,
41
+ "special": false
42
+ },
43
+ "50257": {
44
+ "content": " ",
45
+ "lstrip": false,
46
+ "normalized": true,
47
+ "rstrip": false,
48
+ "single_word": false,
49
+ "special": false
50
+ },
51
+ "50258": {
52
+ "content": " ",
53
+ "lstrip": false,
54
+ "normalized": true,
55
+ "rstrip": false,
56
+ "single_word": false,
57
+ "special": false
58
+ },
59
+ "50259": {
60
+ "content": " ",
61
+ "lstrip": false,
62
+ "normalized": true,
63
+ "rstrip": false,
64
+ "single_word": false,
65
+ "special": false
66
+ },
67
+ "50260": {
68
+ "content": " ",
69
+ "lstrip": false,
70
+ "normalized": true,
71
+ "rstrip": false,
72
+ "single_word": false,
73
+ "special": false
74
+ },
75
+ "50261": {
76
+ "content": " ",
77
+ "lstrip": false,
78
+ "normalized": true,
79
+ "rstrip": false,
80
+ "single_word": false,
81
+ "special": false
82
+ },
83
+ "50262": {
84
+ "content": " ",
85
+ "lstrip": false,
86
+ "normalized": true,
87
+ "rstrip": false,
88
+ "single_word": false,
89
+ "special": false
90
+ },
91
+ "50263": {
92
+ "content": " ",
93
+ "lstrip": false,
94
+ "normalized": true,
95
+ "rstrip": false,
96
+ "single_word": false,
97
+ "special": false
98
+ },
99
+ "50264": {
100
+ "content": " ",
101
+ "lstrip": false,
102
+ "normalized": true,
103
+ "rstrip": false,
104
+ "single_word": false,
105
+ "special": false
106
+ },
107
+ "50265": {
108
+ "content": " ",
109
+ "lstrip": false,
110
+ "normalized": true,
111
+ "rstrip": false,
112
+ "single_word": false,
113
+ "special": false
114
+ },
115
+ "50266": {
116
+ "content": " ",
117
+ "lstrip": false,
118
+ "normalized": true,
119
+ "rstrip": false,
120
+ "single_word": false,
121
+ "special": false
122
+ },
123
+ "50267": {
124
+ "content": " ",
125
+ "lstrip": false,
126
+ "normalized": true,
127
+ "rstrip": false,
128
+ "single_word": false,
129
+ "special": false
130
+ },
131
+ "50268": {
132
+ "content": " ",
133
+ "lstrip": false,
134
+ "normalized": true,
135
+ "rstrip": false,
136
+ "single_word": false,
137
+ "special": false
138
+ },
139
+ "50269": {
140
+ "content": " ",
141
+ "lstrip": false,
142
+ "normalized": true,
143
+ "rstrip": false,
144
+ "single_word": false,
145
+ "special": false
146
+ },
147
+ "50270": {
148
+ "content": " ",
149
+ "lstrip": false,
150
+ "normalized": true,
151
+ "rstrip": false,
152
+ "single_word": false,
153
+ "special": false
154
+ },
155
+ "50271": {
156
+ "content": " ",
157
+ "lstrip": false,
158
+ "normalized": true,
159
+ "rstrip": false,
160
+ "single_word": false,
161
+ "special": false
162
+ },
163
+ "50272": {
164
+ "content": " ",
165
+ "lstrip": false,
166
+ "normalized": true,
167
+ "rstrip": false,
168
+ "single_word": false,
169
+ "special": false
170
+ },
171
+ "50273": {
172
+ "content": " ",
173
+ "lstrip": false,
174
+ "normalized": true,
175
+ "rstrip": false,
176
+ "single_word": false,
177
+ "special": false
178
+ },
179
+ "50274": {
180
+ "content": " ",
181
+ "lstrip": false,
182
+ "normalized": true,
183
+ "rstrip": false,
184
+ "single_word": false,
185
+ "special": false
186
+ },
187
+ "50275": {
188
+ "content": " ",
189
+ "lstrip": false,
190
+ "normalized": true,
191
+ "rstrip": false,
192
+ "single_word": false,
193
+ "special": false
194
+ },
195
+ "50276": {
196
+ "content": " ",
197
+ "lstrip": false,
198
+ "normalized": true,
199
+ "rstrip": false,
200
+ "single_word": false,
201
+ "special": false
202
+ },
203
+ "50277": {
204
+ "content": "|||EMAIL_ADDRESS|||",
205
+ "lstrip": false,
206
+ "normalized": true,
207
+ "rstrip": false,
208
+ "single_word": false,
209
+ "special": false
210
+ },
211
+ "50278": {
212
+ "content": "|||PHONE_NUMBER|||",
213
+ "lstrip": false,
214
+ "normalized": true,
215
+ "rstrip": false,
216
+ "single_word": false,
217
+ "special": false
218
+ },
219
+ "50279": {
220
+ "content": "<|endoftext|>",
221
+ "lstrip": false,
222
+ "normalized": false,
223
+ "rstrip": false,
224
+ "single_word": false,
225
+ "special": true
226
+ },
227
+ "50280": {
228
+ "content": "[UNK]",
229
+ "lstrip": false,
230
+ "normalized": false,
231
+ "rstrip": false,
232
+ "single_word": false,
233
+ "special": true
234
+ },
235
+ "50281": {
236
+ "content": "[CLS]",
237
+ "lstrip": false,
238
+ "normalized": false,
239
+ "rstrip": false,
240
+ "single_word": false,
241
+ "special": true
242
+ },
243
+ "50282": {
244
+ "content": "[SEP]",
245
+ "lstrip": false,
246
+ "normalized": false,
247
+ "rstrip": false,
248
+ "single_word": false,
249
+ "special": true
250
+ },
251
+ "50283": {
252
+ "content": "[PAD]",
253
+ "lstrip": false,
254
+ "normalized": false,
255
+ "rstrip": false,
256
+ "single_word": false,
257
+ "special": true
258
+ },
259
+ "50284": {
260
+ "content": "[MASK]",
261
+ "lstrip": true,
262
+ "normalized": false,
263
+ "rstrip": false,
264
+ "single_word": false,
265
+ "special": true
266
+ },
267
+ "50285": {
268
+ "content": "[unused0]",
269
+ "lstrip": false,
270
+ "normalized": true,
271
+ "rstrip": false,
272
+ "single_word": false,
273
+ "special": false
274
+ },
275
+ "50286": {
276
+ "content": "[unused1]",
277
+ "lstrip": false,
278
+ "normalized": true,
279
+ "rstrip": false,
280
+ "single_word": false,
281
+ "special": false
282
+ },
283
+ "50287": {
284
+ "content": "[unused2]",
285
+ "lstrip": false,
286
+ "normalized": true,
287
+ "rstrip": false,
288
+ "single_word": false,
289
+ "special": false
290
+ },
291
+ "50288": {
292
+ "content": "[unused3]",
293
+ "lstrip": false,
294
+ "normalized": true,
295
+ "rstrip": false,
296
+ "single_word": false,
297
+ "special": false
298
+ },
299
+ "50289": {
300
+ "content": "[unused4]",
301
+ "lstrip": false,
302
+ "normalized": true,
303
+ "rstrip": false,
304
+ "single_word": false,
305
+ "special": false
306
+ },
307
+ "50290": {
308
+ "content": "[unused5]",
309
+ "lstrip": false,
310
+ "normalized": true,
311
+ "rstrip": false,
312
+ "single_word": false,
313
+ "special": false
314
+ },
315
+ "50291": {
316
+ "content": "[unused6]",
317
+ "lstrip": false,
318
+ "normalized": true,
319
+ "rstrip": false,
320
+ "single_word": false,
321
+ "special": false
322
+ },
323
+ "50292": {
324
+ "content": "[unused7]",
325
+ "lstrip": false,
326
+ "normalized": true,
327
+ "rstrip": false,
328
+ "single_word": false,
329
+ "special": false
330
+ },
331
+ "50293": {
332
+ "content": "[unused8]",
333
+ "lstrip": false,
334
+ "normalized": true,
335
+ "rstrip": false,
336
+ "single_word": false,
337
+ "special": false
338
+ },
339
+ "50294": {
340
+ "content": "[unused9]",
341
+ "lstrip": false,
342
+ "normalized": true,
343
+ "rstrip": false,
344
+ "single_word": false,
345
+ "special": false
346
+ },
347
+ "50295": {
348
+ "content": "[unused10]",
349
+ "lstrip": false,
350
+ "normalized": true,
351
+ "rstrip": false,
352
+ "single_word": false,
353
+ "special": false
354
+ },
355
+ "50296": {
356
+ "content": "[unused11]",
357
+ "lstrip": false,
358
+ "normalized": true,
359
+ "rstrip": false,
360
+ "single_word": false,
361
+ "special": false
362
+ },
363
+ "50297": {
364
+ "content": "[unused12]",
365
+ "lstrip": false,
366
+ "normalized": true,
367
+ "rstrip": false,
368
+ "single_word": false,
369
+ "special": false
370
+ },
371
+ "50298": {
372
+ "content": "[unused13]",
373
+ "lstrip": false,
374
+ "normalized": true,
375
+ "rstrip": false,
376
+ "single_word": false,
377
+ "special": false
378
+ },
379
+ "50299": {
380
+ "content": "[unused14]",
381
+ "lstrip": false,
382
+ "normalized": true,
383
+ "rstrip": false,
384
+ "single_word": false,
385
+ "special": false
386
+ },
387
+ "50300": {
388
+ "content": "[unused15]",
389
+ "lstrip": false,
390
+ "normalized": true,
391
+ "rstrip": false,
392
+ "single_word": false,
393
+ "special": false
394
+ },
395
+ "50301": {
396
+ "content": "[unused16]",
397
+ "lstrip": false,
398
+ "normalized": true,
399
+ "rstrip": false,
400
+ "single_word": false,
401
+ "special": false
402
+ },
403
+ "50302": {
404
+ "content": "[unused17]",
405
+ "lstrip": false,
406
+ "normalized": true,
407
+ "rstrip": false,
408
+ "single_word": false,
409
+ "special": false
410
+ },
411
+ "50303": {
412
+ "content": "[unused18]",
413
+ "lstrip": false,
414
+ "normalized": true,
415
+ "rstrip": false,
416
+ "single_word": false,
417
+ "special": false
418
+ },
419
+ "50304": {
420
+ "content": "[unused19]",
421
+ "lstrip": false,
422
+ "normalized": true,
423
+ "rstrip": false,
424
+ "single_word": false,
425
+ "special": false
426
+ },
427
+ "50305": {
428
+ "content": "[unused20]",
429
+ "lstrip": false,
430
+ "normalized": true,
431
+ "rstrip": false,
432
+ "single_word": false,
433
+ "special": false
434
+ },
435
+ "50306": {
436
+ "content": "[unused21]",
437
+ "lstrip": false,
438
+ "normalized": true,
439
+ "rstrip": false,
440
+ "single_word": false,
441
+ "special": false
442
+ },
443
+ "50307": {
444
+ "content": "[unused22]",
445
+ "lstrip": false,
446
+ "normalized": true,
447
+ "rstrip": false,
448
+ "single_word": false,
449
+ "special": false
450
+ },
451
+ "50308": {
452
+ "content": "[unused23]",
453
+ "lstrip": false,
454
+ "normalized": true,
455
+ "rstrip": false,
456
+ "single_word": false,
457
+ "special": false
458
+ },
459
+ "50309": {
460
+ "content": "[unused24]",
461
+ "lstrip": false,
462
+ "normalized": true,
463
+ "rstrip": false,
464
+ "single_word": false,
465
+ "special": false
466
+ },
467
+ "50310": {
468
+ "content": "[unused25]",
469
+ "lstrip": false,
470
+ "normalized": true,
471
+ "rstrip": false,
472
+ "single_word": false,
473
+ "special": false
474
+ },
475
+ "50311": {
476
+ "content": "[unused26]",
477
+ "lstrip": false,
478
+ "normalized": true,
479
+ "rstrip": false,
480
+ "single_word": false,
481
+ "special": false
482
+ },
483
+ "50312": {
484
+ "content": "[unused27]",
485
+ "lstrip": false,
486
+ "normalized": true,
487
+ "rstrip": false,
488
+ "single_word": false,
489
+ "special": false
490
+ },
491
+ "50313": {
492
+ "content": "[unused28]",
493
+ "lstrip": false,
494
+ "normalized": true,
495
+ "rstrip": false,
496
+ "single_word": false,
497
+ "special": false
498
+ },
499
+ "50314": {
500
+ "content": "[unused29]",
501
+ "lstrip": false,
502
+ "normalized": true,
503
+ "rstrip": false,
504
+ "single_word": false,
505
+ "special": false
506
+ },
507
+ "50315": {
508
+ "content": "[unused30]",
509
+ "lstrip": false,
510
+ "normalized": true,
511
+ "rstrip": false,
512
+ "single_word": false,
513
+ "special": false
514
+ },
515
+ "50316": {
516
+ "content": "[unused31]",
517
+ "lstrip": false,
518
+ "normalized": true,
519
+ "rstrip": false,
520
+ "single_word": false,
521
+ "special": false
522
+ },
523
+ "50317": {
524
+ "content": "[unused32]",
525
+ "lstrip": false,
526
+ "normalized": true,
527
+ "rstrip": false,
528
+ "single_word": false,
529
+ "special": false
530
+ },
531
+ "50318": {
532
+ "content": "[unused33]",
533
+ "lstrip": false,
534
+ "normalized": true,
535
+ "rstrip": false,
536
+ "single_word": false,
537
+ "special": false
538
+ },
539
+ "50319": {
540
+ "content": "[unused34]",
541
+ "lstrip": false,
542
+ "normalized": true,
543
+ "rstrip": false,
544
+ "single_word": false,
545
+ "special": false
546
+ },
547
+ "50320": {
548
+ "content": "[unused35]",
549
+ "lstrip": false,
550
+ "normalized": true,
551
+ "rstrip": false,
552
+ "single_word": false,
553
+ "special": false
554
+ },
555
+ "50321": {
556
+ "content": "[unused36]",
557
+ "lstrip": false,
558
+ "normalized": true,
559
+ "rstrip": false,
560
+ "single_word": false,
561
+ "special": false
562
+ },
563
+ "50322": {
564
+ "content": "[unused37]",
565
+ "lstrip": false,
566
+ "normalized": true,
567
+ "rstrip": false,
568
+ "single_word": false,
569
+ "special": false
570
+ },
571
+ "50323": {
572
+ "content": "[unused38]",
573
+ "lstrip": false,
574
+ "normalized": true,
575
+ "rstrip": false,
576
+ "single_word": false,
577
+ "special": false
578
+ },
579
+ "50324": {
580
+ "content": "[unused39]",
581
+ "lstrip": false,
582
+ "normalized": true,
583
+ "rstrip": false,
584
+ "single_word": false,
585
+ "special": false
586
+ },
587
+ "50325": {
588
+ "content": "[unused40]",
589
+ "lstrip": false,
590
+ "normalized": true,
591
+ "rstrip": false,
592
+ "single_word": false,
593
+ "special": false
594
+ },
595
+ "50326": {
596
+ "content": "[unused41]",
597
+ "lstrip": false,
598
+ "normalized": true,
599
+ "rstrip": false,
600
+ "single_word": false,
601
+ "special": false
602
+ },
603
+ "50327": {
604
+ "content": "[unused42]",
605
+ "lstrip": false,
606
+ "normalized": true,
607
+ "rstrip": false,
608
+ "single_word": false,
609
+ "special": false
610
+ },
611
+ "50328": {
612
+ "content": "[unused43]",
613
+ "lstrip": false,
614
+ "normalized": true,
615
+ "rstrip": false,
616
+ "single_word": false,
617
+ "special": false
618
+ },
619
+ "50329": {
620
+ "content": "[unused44]",
621
+ "lstrip": false,
622
+ "normalized": true,
623
+ "rstrip": false,
624
+ "single_word": false,
625
+ "special": false
626
+ },
627
+ "50330": {
628
+ "content": "[unused45]",
629
+ "lstrip": false,
630
+ "normalized": true,
631
+ "rstrip": false,
632
+ "single_word": false,
633
+ "special": false
634
+ },
635
+ "50331": {
636
+ "content": "[unused46]",
637
+ "lstrip": false,
638
+ "normalized": true,
639
+ "rstrip": false,
640
+ "single_word": false,
641
+ "special": false
642
+ },
643
+ "50332": {
644
+ "content": "[unused47]",
645
+ "lstrip": false,
646
+ "normalized": true,
647
+ "rstrip": false,
648
+ "single_word": false,
649
+ "special": false
650
+ },
651
+ "50333": {
652
+ "content": "[unused48]",
653
+ "lstrip": false,
654
+ "normalized": true,
655
+ "rstrip": false,
656
+ "single_word": false,
657
+ "special": false
658
+ },
659
+ "50334": {
660
+ "content": "[unused49]",
661
+ "lstrip": false,
662
+ "normalized": true,
663
+ "rstrip": false,
664
+ "single_word": false,
665
+ "special": false
666
+ },
667
+ "50335": {
668
+ "content": "[unused50]",
669
+ "lstrip": false,
670
+ "normalized": true,
671
+ "rstrip": false,
672
+ "single_word": false,
673
+ "special": false
674
+ },
675
+ "50336": {
676
+ "content": "[unused51]",
677
+ "lstrip": false,
678
+ "normalized": true,
679
+ "rstrip": false,
680
+ "single_word": false,
681
+ "special": false
682
+ },
683
+ "50337": {
684
+ "content": "[unused52]",
685
+ "lstrip": false,
686
+ "normalized": true,
687
+ "rstrip": false,
688
+ "single_word": false,
689
+ "special": false
690
+ },
691
+ "50338": {
692
+ "content": "[unused53]",
693
+ "lstrip": false,
694
+ "normalized": true,
695
+ "rstrip": false,
696
+ "single_word": false,
697
+ "special": false
698
+ },
699
+ "50339": {
700
+ "content": "[unused54]",
701
+ "lstrip": false,
702
+ "normalized": true,
703
+ "rstrip": false,
704
+ "single_word": false,
705
+ "special": false
706
+ },
707
+ "50340": {
708
+ "content": "[unused55]",
709
+ "lstrip": false,
710
+ "normalized": true,
711
+ "rstrip": false,
712
+ "single_word": false,
713
+ "special": false
714
+ },
715
+ "50341": {
716
+ "content": "[unused56]",
717
+ "lstrip": false,
718
+ "normalized": true,
719
+ "rstrip": false,
720
+ "single_word": false,
721
+ "special": false
722
+ },
723
+ "50342": {
724
+ "content": "[unused57]",
725
+ "lstrip": false,
726
+ "normalized": true,
727
+ "rstrip": false,
728
+ "single_word": false,
729
+ "special": false
730
+ },
731
+ "50343": {
732
+ "content": "[unused58]",
733
+ "lstrip": false,
734
+ "normalized": true,
735
+ "rstrip": false,
736
+ "single_word": false,
737
+ "special": false
738
+ },
739
+ "50344": {
740
+ "content": "[unused59]",
741
+ "lstrip": false,
742
+ "normalized": true,
743
+ "rstrip": false,
744
+ "single_word": false,
745
+ "special": false
746
+ },
747
+ "50345": {
748
+ "content": "[unused60]",
749
+ "lstrip": false,
750
+ "normalized": true,
751
+ "rstrip": false,
752
+ "single_word": false,
753
+ "special": false
754
+ },
755
+ "50346": {
756
+ "content": "[unused61]",
757
+ "lstrip": false,
758
+ "normalized": true,
759
+ "rstrip": false,
760
+ "single_word": false,
761
+ "special": false
762
+ },
763
+ "50347": {
764
+ "content": "[unused62]",
765
+ "lstrip": false,
766
+ "normalized": true,
767
+ "rstrip": false,
768
+ "single_word": false,
769
+ "special": false
770
+ },
771
+ "50348": {
772
+ "content": "[unused63]",
773
+ "lstrip": false,
774
+ "normalized": true,
775
+ "rstrip": false,
776
+ "single_word": false,
777
+ "special": false
778
+ },
779
+ "50349": {
780
+ "content": "[unused64]",
781
+ "lstrip": false,
782
+ "normalized": true,
783
+ "rstrip": false,
784
+ "single_word": false,
785
+ "special": false
786
+ },
787
+ "50350": {
788
+ "content": "[unused65]",
789
+ "lstrip": false,
790
+ "normalized": true,
791
+ "rstrip": false,
792
+ "single_word": false,
793
+ "special": false
794
+ },
795
+ "50351": {
796
+ "content": "[unused66]",
797
+ "lstrip": false,
798
+ "normalized": true,
799
+ "rstrip": false,
800
+ "single_word": false,
801
+ "special": false
802
+ },
803
+ "50352": {
804
+ "content": "[unused67]",
805
+ "lstrip": false,
806
+ "normalized": true,
807
+ "rstrip": false,
808
+ "single_word": false,
809
+ "special": false
810
+ },
811
+ "50353": {
812
+ "content": "[unused68]",
813
+ "lstrip": false,
814
+ "normalized": true,
815
+ "rstrip": false,
816
+ "single_word": false,
817
+ "special": false
818
+ },
819
+ "50354": {
820
+ "content": "[unused69]",
821
+ "lstrip": false,
822
+ "normalized": true,
823
+ "rstrip": false,
824
+ "single_word": false,
825
+ "special": false
826
+ },
827
+ "50355": {
828
+ "content": "[unused70]",
829
+ "lstrip": false,
830
+ "normalized": true,
831
+ "rstrip": false,
832
+ "single_word": false,
833
+ "special": false
834
+ },
835
+ "50356": {
836
+ "content": "[unused71]",
837
+ "lstrip": false,
838
+ "normalized": true,
839
+ "rstrip": false,
840
+ "single_word": false,
841
+ "special": false
842
+ },
843
+ "50357": {
844
+ "content": "[unused72]",
845
+ "lstrip": false,
846
+ "normalized": true,
847
+ "rstrip": false,
848
+ "single_word": false,
849
+ "special": false
850
+ },
851
+ "50358": {
852
+ "content": "[unused73]",
853
+ "lstrip": false,
854
+ "normalized": true,
855
+ "rstrip": false,
856
+ "single_word": false,
857
+ "special": false
858
+ },
859
+ "50359": {
860
+ "content": "[unused74]",
861
+ "lstrip": false,
862
+ "normalized": true,
863
+ "rstrip": false,
864
+ "single_word": false,
865
+ "special": false
866
+ },
867
+ "50360": {
868
+ "content": "[unused75]",
869
+ "lstrip": false,
870
+ "normalized": true,
871
+ "rstrip": false,
872
+ "single_word": false,
873
+ "special": false
874
+ },
875
+ "50361": {
876
+ "content": "[unused76]",
877
+ "lstrip": false,
878
+ "normalized": true,
879
+ "rstrip": false,
880
+ "single_word": false,
881
+ "special": false
882
+ },
883
+ "50362": {
884
+ "content": "[unused77]",
885
+ "lstrip": false,
886
+ "normalized": true,
887
+ "rstrip": false,
888
+ "single_word": false,
889
+ "special": false
890
+ },
891
+ "50363": {
892
+ "content": "[unused78]",
893
+ "lstrip": false,
894
+ "normalized": true,
895
+ "rstrip": false,
896
+ "single_word": false,
897
+ "special": false
898
+ },
899
+ "50364": {
900
+ "content": "[unused79]",
901
+ "lstrip": false,
902
+ "normalized": true,
903
+ "rstrip": false,
904
+ "single_word": false,
905
+ "special": false
906
+ },
907
+ "50365": {
908
+ "content": "[unused80]",
909
+ "lstrip": false,
910
+ "normalized": true,
911
+ "rstrip": false,
912
+ "single_word": false,
913
+ "special": false
914
+ },
915
+ "50366": {
916
+ "content": "[unused81]",
917
+ "lstrip": false,
918
+ "normalized": true,
919
+ "rstrip": false,
920
+ "single_word": false,
921
+ "special": false
922
+ },
923
+ "50367": {
924
+ "content": "[unused82]",
925
+ "lstrip": false,
926
+ "normalized": true,
927
+ "rstrip": false,
928
+ "single_word": false,
929
+ "special": false
930
+ }
931
+ },
932
+ "clean_up_tokenization_spaces": true,
933
+ "cls_token": "[CLS]",
934
+ "extra_special_tokens": {},
935
+ "mask_token": "[MASK]",
936
+ "model_input_names": [
937
+ "input_ids",
938
+ "attention_mask"
939
+ ],
940
+ "model_max_length": 1000000000000000019884624838656,
941
+ "pad_token": "[PAD]",
942
+ "sep_token": "[SEP]",
943
+ "tokenizer_class": "PreTrainedTokenizerFast",
944
+ "unk_token": "[UNK]"
945
+ }
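Two details of this tokenizer config are easy to misread. First, `model_max_length` is not a real limit: the value equals the float `1e30` converted to an integer, which appears to act as a "no limit stored" sentinel, so the practical cap is the `max_seq_length` of 8192 in sentence_bert_config.json. Second, the `[unusedN]` placeholders occupy a contiguous id range ending at the last vocabulary slot. The sketch below checks both; the helper name `unused_token_id` is a hypothetical introduced for illustration.

```python
MODEL_MAX_LENGTH = 1000000000000000019884624838656  # from tokenizer_config.json
MAX_SEQ_LENGTH = 8192                                # from sentence_bert_config.json
VOCAB_SIZE = 50368                                   # from config.json

# The huge value is exactly what int(1e30) produces in double precision.
is_sentinel = MODEL_MAX_LENGTH == int(1e30)
effective_limit = min(MODEL_MAX_LENGTH, MAX_SEQ_LENGTH)

FIRST_UNUSED_ID = 50285  # id of "[unused0]" in added_tokens_decoder
LAST_UNUSED_INDEX = 82   # "[unused82]" is the final entry


def unused_token_id(n: int) -> int:
    """Token id of "[unusedN]" under the contiguous layout shown above."""
    if not 0 <= n <= LAST_UNUSED_INDEX:
        raise ValueError(f"no [unused{n}] token in this tokenizer")
    return FIRST_UNUSED_ID + n


print(is_sentinel, effective_limit)
print(unused_token_id(0), unused_token_id(LAST_UNUSED_INDEX))
```

The last placeholder id is 50367, one below `vocab_size`, so the added tokens fill the embedding table exactly.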