Commit a363f36 (verified) by truongcongminh · 1 Parent(s): 6601575

Upload 13 files

1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "word_embedding_dimension": 768,
+   "pooling_mode_cls_token": false,
+   "pooling_mode_mean_tokens": true,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false,
+   "pooling_mode_weightedmean_tokens": false,
+   "pooling_mode_lasttoken": false,
+   "include_prompt": true
+ }
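The pooling config above enables only `pooling_mode_mean_tokens`, so the sentence vector is the average of the token vectors. Below is a minimal sketch of that averaging in plain Python, assuming padding positions are excluded through an attention mask. The function name `mean_pool` and the toy 3-dimensional vectors are illustrative and not part of this commit; the real model pools 768-dimensional vectors.

```python
def mean_pool(token_embeddings, attention_mask):
    """Average token vectors, skipping positions where the mask is 0."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                summed[i] += value
    # Guard against an all-padding input to avoid dividing by zero.
    count = max(count, 1)
    return [s / count for s in summed]


# Hypothetical toy input: three token vectors, the last one is padding.
tokens = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0], [9.0, 9.0, 9.0]]
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)
print(pooled)
```

Only the two unmasked vectors contribute to the average, which is why padding does not skew the pooled vector.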
2_Dense/config.json ADDED
@@ -0,0 +1 @@
+ {"in_features": 768, "out_features": 512, "bias": true, "activation_function": "torch.nn.modules.activation.Tanh"}
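The Dense config above describes a linear projection from 768 to 512 features followed by a tanh activation. As a rough illustration of that computation, here is a sketch in plain Python. The helper `dense_tanh` and the toy 3-to-2 weights are hypothetical stand-ins; the weight layout `[out_features][in_features]` follows the usual PyTorch `Linear` convention.

```python
import math


def dense_tanh(x, weight, bias):
    """Compute tanh(W x + b) for a single input vector.

    weight is laid out as [out_features][in_features];
    bias has one entry per output feature.
    """
    return [
        math.tanh(sum(w * v for w, v in zip(row, x)) + b)
        for row, b in zip(weight, bias)
    ]


# Toy 3 -> 2 projection standing in for the real 768 -> 512 layer.
x = [0.5, -1.0, 2.0]
weight = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]
bias = [0.0, -1.0]
y = dense_tanh(x, weight, bias)
print(y)
```

Because tanh squashes each output into the open interval (-1, 1), every component of the final 512-dimensional embedding is bounded.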
2_Dense/model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:986dd5160d6aa2e107d67874926028f49d5cf8a1f67874e2b2171d11b746384b
+ size 1575072
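The three lines above are a Git LFS pointer, not the weights themselves: the diff records only the object hash and byte size. The sketch below parses such a pointer and checks the size against the Dense layer shape from `2_Dense/config.json`. It assumes the parameters are stored as float32, which this diff does not state; `parse_lfs_pointer` is an illustrative helper, not part of any library.

```python
def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields


pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:986dd5160d6aa2e107d67874926028f49d5cf8a1f67874e2b2171d11b746384b
size 1575072
"""
info = parse_lfs_pointer(pointer)
file_size = int(info["size"])

# Assumption: a 512 x 768 weight matrix plus a 512-entry bias, 4 bytes each (float32).
param_bytes = (768 * 512 + 512) * 4
overhead = file_size - param_bytes
print(param_bytes, overhead)
```

Under the float32 assumption the parameters account for 1,574,912 bytes, and the remaining 160 bytes would be consistent with a small safetensors header, although the diff itself does not show the file layout.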
README.md ADDED
@@ -0,0 +1,1007 @@
+ ---
+ base_model: sentence-transformers/distiluse-base-multilingual-cased-v1
+ library_name: sentence-transformers
+ pipeline_tag: sentence-similarity
+ tags:
+ - sentence-transformers
+ - sentence-similarity
+ - feature-extraction
+ - generated_from_trainer
+ - dataset_size:5108
+ - loss:CosineSimilarityLoss
+ widget:
13
+ - source_sentence: 'EDUCATION
14
+
15
+ University of Roseton
16
+
17
+ Bachelor of Science in Computer Science | Graduated: 2016
18
+
19
+
20
+ Graduated Cum Laude with GPA: 3.9
21
+
22
+ Class Salutatorian
23
+
24
+ Computer Club President
25
+
26
+ Student Council Member
27
+
28
+ Winslough High School
29
+
30
+ High School Diploma | 2008 - 2012
31
+
32
+
33
+ Computer Club Vice President
34
+
35
+ Winslough Warriors Basketball Team Member
36
+
37
+ AP Scholar Award
38
+
39
+ National Honor Society Member
40
+
41
+ SKILLS AND ABILITIES
42
+
43
+ Programming Languages: Proficient in multiple programming languages
44
+
45
+ Problem Solving: Strong problem-solving and critical thinking skills
46
+
47
+ Expertise: Expertise in software systems and computer operating structures
48
+
49
+ Communication: Effective communication and teamwork skills'
50
+ sentences:
51
+ - 'Eduvator
52
+
53
+ Skills and expertise:
54
+
55
+ Degree:
56
+
57
+
58
+ University graduate, major in Computer Science or Information Technology.
59
+
60
+ Priority is given to candidates who graduated from universities such as University
61
+ of Natural Sciences, University of Information Technology - VNU-HCM, University
62
+ of Technology, etc.
63
+
64
+ Experience:
65
+
66
+
67
+ At least 2 years of experience developing software for enterprise solutions, especially
68
+ using NodeJS.
69
+
70
+ Professional skills:
71
+
72
+
73
+ Experience with NodeJS and familiarity with ExpressJS (priority is given to those
74
+ with experience with Apollo Server).
75
+
76
+ Good knowledge of SQL and NoSQL databases, ability to write complex queries (priority
77
+ is given to those who know how to use MongoDB).
78
+
79
+ Proficient or experienced with Docker.
80
+
81
+ Good problem-solving skills, proficient in handling and manipulating strings/characters
82
+ and arrays.
83
+
84
+ Strong knowledge of the git source code management system.
85
+
86
+ Learning ability:
87
+
88
+
89
+ Ability to learn new technologies quickly.'
90
+ - 'Skills & Expertise:
91
+
92
+
93
+ Requirements:
94
+
95
+
96
+ Education:
97
+
98
+
99
+ Graduated from a regular university (with a degree), priority is given to majors:
100
+ Information Technology, Telecommunications, Information Systems, Computer Science,
101
+ etc.
102
+
103
+ Experience:
104
+
105
+
106
+ At least 3 years of experience as a Java programmer.
107
+
108
+ Professional knowledge:
109
+
110
+
111
+ Strong knowledge of Object Oriented Programming (OOP), Web Development, Database
112
+ Management System (DBMS), ORM, Design Pattern.
113
+
114
+ Proficient in Restful API, HTML, CSS, JavaScript (Jquery, ...), Bootstrap, JSON,
115
+ XML.
116
+
117
+ Proficient in PostgreSQL (or MySQL, MSSQL, Oracle), priority is given to candidates
118
+ with the ability to design DB and optimize system performance.
119
+
120
+ Understand and apply MVC model.
121
+
122
+ Proficient in Git.'
123
+ - 'Skills & Expertise:
124
+
125
+
126
+ Requirements:
127
+
128
+
129
+ Education:
130
+
131
+
132
+ University degree or higher in Information Technology or Electronics - Telecommunications.
133
+
134
+ Professional certificates:
135
+
136
+
137
+ CCNA or CCNP certificate.
138
+
139
+ Priority is given to candidates with certificates such as: MCSA, Azure, VMware,
140
+ CEH, OSCE, OSCP, COMPTIA Pentest+ or equivalent.
141
+
142
+ Foreign languages:
143
+
144
+
145
+ English equivalent to Toeic 400 points or higher.
146
+
147
+ Office skills:
148
+
149
+
150
+ Proficient in MS Office, Visio.'
151
+ - source_sentence: 'EDUCATION Bachelor of Technology Really Great University Really
152
+ Great Company 2016 - Present Responsible for database administration and website
153
+ design.
154
+
155
+ Developed the logic for a streamlined ad-serving platform that scales effectively
156
+ for educational institutions and online classroom management.
157
+
158
+
159
+ SKILLS Web Design Design Thinking Wireframe Creation Front-End Coding Back-End
160
+ Technology Problem Solving Computer Literacy Project Management Tools Strong Communication
161
+ Skills'
162
+ sentences:
163
+ - 'ABBANK
164
+
165
+
166
+ Skill requirements:
167
+
168
+
169
+ University or master''s degree in: Information Security, Information Technology,
170
+ Software Engineering, Computer Science.
171
+
172
+ At least 5 years of experience in the field of information security.
173
+
174
+ At least 2 years of experience in the field of DevOps / DevSecOps.
175
+
176
+ At least 2 years of experience working with Cloud platforms (AWS, Azure, GCP).
177
+
178
+ Deep knowledge of IT in the field of DevSecOps, solutions in On-premise, Cloud
179
+ native and Hybrid environments (Container, Kubernetes, Docker, Git, CI/CD Jenkins,
180
+ GitLab CI, GitHub...).
181
+
182
+ Skills related to exploiting vulnerabilities, security weaknesses.
183
+
184
+ Understanding of OWASP, MITRE Attack.
185
+
186
+ Deep knowledge of CI/CD pipeline model, deploying integrated security automation
187
+ solutions in the development process.
188
+
189
+ Develop documentation and standard processes to deploy and operate security solutions
190
+ on Cloud/DevSecOps in a stable and secure manner.
191
+
192
+ Have certificates such as CEH, OSCP, CCSP... or equivalent; having a certificate
193
+ in GCP Cloud Security is an advantage.
194
+
195
+ Education requirements:
196
+
197
+
198
+ Graduate from university or master''s degree in related field.
199
+
200
+ Language requirements:
201
+
202
+
203
+ Being able to communicate in English is an advantage.'
204
+ - 'FPT
205
+
206
+ Skills and qualifications required:
207
+
208
+
209
+ General standards:
210
+
211
+
212
+ Be a Vietnamese citizen, have a permanent residence in Vietnam.
213
+
214
+ Be under 35 years old.
215
+
216
+ Be in good health to perform the job.
217
+
218
+ Have good moral qualities, no criminal record, no detention, no prison sentence,
219
+ suspended sentence, non-custodial reform, no local education status, medical treatment,
220
+ drug rehabilitation, etc.
221
+
222
+ Specific standards:
223
+
224
+
225
+ Professional qualifications:
226
+
227
+
228
+ Graduate from university or higher, regular system (including second university,
229
+ not including university transfer) at domestic universities or graduate from university
230
+ or higher at foreign universities, affiliated universities.
231
+
232
+ Major: Graduated from majors such as Information Technology, Information Security,
233
+ Telecommunications Electronics, Information Electronics, Mathematics - Informatics
234
+ or equivalent majors.
235
+
236
+ Foreign Language:
237
+
238
+
239
+ Having one of the following certificates: TOEIC 600/990, TOEFL PBT/IPT 500/677,
240
+ TOEFL CBT 173/300, TOEFL iBT 61/120, IELTS 5.5/9.0, Cambridge Exam First (FCE),
241
+ B2 European Framework, 4/6 6-level Foreign Language Proficiency Framework for
242
+ Vietnam. Accepting additional English certificates within 24 months from the date
243
+ of recruitment.
244
+
245
+ Knowledge, skills and experience:
246
+
247
+
248
+ Experience in managing server systems and storage systems is an advantage.
249
+
250
+ Having one of the following certificates: MCITP-SA/Oracle Certified Associate/Oracle
251
+ Solaris 11 System Administrator Certification/Linux Professional Institute LPIC-1
252
+ is an advantage.
253
+
254
+ Withstand high work pressure, accept to monitor, track systems and handle incidents
255
+ according to job requirements. Have a serious, careful and enthusiastic working
256
+ attitude.
257
+
258
+ Ability to research and learn technology quickly.
259
+
260
+ Good communication and problem-solving skills.
261
+
262
+ Ability to work in a team or independently.'
263
+ - 'Professional qualifications:
264
+
265
+ Experience: At least 1 year of participating in projects related to Machine Learning
266
+ and image processing.
267
+
268
+ Programming languages: Proficiency in Python and using frameworks such as TensorFlow,
269
+ PyTorch, Keras.
270
+
271
+ Programming skills: C/C++ skills are an advantage.
272
+
273
+ CAD experience: Experience in AI for CAD (AutoCAD, SolidWorks,...) and reading
274
+ CAD drawings in Python/C++ is an advantage.
275
+
276
+ CAD knowledge: Knowledge of 2D to 3D, 3D CAD and Gaussian Splatting is an advantage.
277
+
278
+ Awards: Priority is given to those who have IT/mathematical awards or scientific
279
+ papers in conferences on Image Processing, Image Recognition.
280
+
281
+ Language: Good English if the candidate is capable and can work full-time.'
282
+ - source_sentence: 'PROFESSIONAL EXPERIENCE
283
+
284
+ Giggling Platypus Co.
285
+
286
+ Software Engineer
287
+
288
+ 01 Jun 2052 - Present
289
+
290
+
291
+ Designed and implemented a new microservice architecture using Borcelle, improving
292
+ application performance by 20%.
293
+
294
+ Developed and maintained critical features for the company''s core product, resulting
295
+ in a 15% increase in user engagement.
296
+
297
+ Lead the development of a new automated testing framework using Rimberio, reducing
298
+ manual testing time by 30%.
299
+
300
+ Developed and implemented a real-time data processing pipeline using Rimberio
301
+ to handle a 10x increase in data volume.
302
+
303
+ Data Thynk Unlimited
304
+
305
+ Software Engineer
306
+
307
+ 01 Aug 2050 - 30 Apr 2052
308
+
309
+
310
+ Developed custom software solutions to improve application performance.
311
+
312
+ Worked closely with cross-functional teams to ensure requirements and quality
313
+ standards were met.
314
+
315
+ EDUCATION
316
+
317
+ Fauget University
318
+
319
+ Bachelor of Science in Information Technology
320
+
321
+ June 2050
322
+
323
+
324
+ SKILLS & COMPETENCIES
325
+
326
+ Programming Languages: Rimberio, Borcelle, Java, Python
327
+
328
+ Software Development: System Design, Automated Testing, CI/CD
329
+
330
+ Analytical and Problem Solving Skills: Requirements Analysis, Performance Optimization
331
+
332
+ Communication and Collaboration: Teamwork, Agile Project Management
333
+
334
+ Adaptability and Continuous Learning: Quickly Assimilate New Technologies
335
+
336
+ CERTIFICATIONS
337
+
338
+ Cloud Certified Rimberio | 2050
339
+
340
+ Software Lifecycle Professional Borcelle | 2050
341
+
342
+ Scrum Developer Borcelle | 2050'
343
+ sentences:
344
+ - 'Skills and expertise:
345
+
346
+
347
+ Education:
348
+
349
+
350
+ University graduate majoring in Information Technology or related field.
351
+
352
+ Experience:
353
+
354
+
355
+ Minimum 2 years of front-end development experience using the following technologies:
356
+
357
+ Proficient in front-end development with JavaScript.
358
+
359
+ Experience developing applications for Windows operating systems.
360
+
361
+ Passionate about developing applications for Windows platforms.'
362
+ - 'Required Skills:
363
+
364
+
365
+ Experience:
366
+
367
+
368
+ Minimum 3 years of experience working with .NET or .NET Core.
369
+
370
+ Proficient in back-end technologies: ASP.NET Core, EF6 Code First, Identity Server
371
+ 4, RESTful API, SQL Server 2016, C#, Unit Test.
372
+
373
+ Good knowledge of ASP.NET MVC frameworks (3,4,5,6), JavaScript, jQuery, JSON,
374
+ Web API and web application security.
375
+
376
+ Additional Skills:
377
+
378
+
379
+ Experience with NodeJS or WPF, WinForms is an advantage.
380
+
381
+ Preferably with Microsoft MCP/MCSD certification in web applications or Azure
382
+ solutions.
383
+
384
+ Understanding of Agile/Scrum development methodology.
385
+
386
+ Personal Skills:
387
+
388
+
389
+ Honest and confident when working directly with customers.
390
+
391
+ Passionate about developing innovative products and willing to learn new technologies
392
+ from Microsoft Tech Stack and Azure cloud services.
393
+
394
+ Ability to analyze and translate business needs into system design and technical
395
+ solutions.'
396
+ - 'Qualifications:
397
+
398
+ Experience:
399
+
400
+ Minimum 2-3 years of experience working with Fullstack technologies, including:
401
+
402
+ Frontend: Vue.js (TypeScript), React Native or Flutter.
403
+
404
+ Backend: Node.js, C# .NET Core.
405
+
406
+ Database: MySQL, SQL Server.
407
+
408
+ Architecture: Microservices.
409
+
410
+ Advantages:
411
+
412
+ Knowledge of Kubernetes (K8s) and ArgoCD.
413
+
414
+ Experience working with CI/CD and DevOps systems is a big plus.
415
+
416
+ Good problem-solving skills, ability to work independently and collaborate effectively
417
+ in a team.
418
+
419
+ Logical thinking and willingness to learn new technologies.'
420
+ - source_sentence: 'EDUCATION Bachelor of Technology Really Great University Really
421
+ Great Company 2016 - Present Responsible for database administration and website
422
+ design.
423
+
424
+ Developed the logic for a streamlined ad-serving platform that scales effectively
425
+ for educational institutions and online classroom management.
426
+
427
+
428
+ SKILLS Web Design Design Thinking Wireframe Creation Front-End Coding Back-End
429
+ Technology Problem Solving Computer Literacy Project Management Tools Strong Communication
430
+ Skills'
431
+ sentences:
432
+ - 'LG CNS Vietnam
433
+
434
+
435
+ Skill Requirements
436
+
437
+ University graduate majoring in IT.
438
+
439
+ Good English communication skills.
440
+
441
+ At least 1 year of experience with C#, .NET framework.
442
+
443
+ Experience with SQL and DB Function/Procedure.
444
+
445
+ Experience with complex SQL optimization (PL/SQL preferred).
446
+
447
+ Experience developing Zebra print (will be trained if joining the company).
448
+
449
+ Experience with C++, Unix/Linux (will be trained if joining the company).
450
+
451
+ Ability to read/write Korean is an advantage.
452
+
453
+ Education Requirements
454
+
455
+ University graduate majoring in related field.
456
+
457
+ Language Requirements
458
+
459
+ Good English communication skills.
460
+
461
+ Ability to read/write Korean is an advantage.'
462
+ - 'CA Advance
463
+
464
+ Skills & Expertise:
465
+
466
+ General Requirements:
467
+
468
+ Education:
469
+
470
+
471
+ College degree or higher.
472
+
473
+ Development Experience:
474
+
475
+
476
+ 2+ years of experience developing Web systems.
477
+
478
+ Source Code Management:
479
+
480
+
481
+ 2+ years of experience using Git, GitHub or GitLab.
482
+
483
+ Learning Ability:
484
+
485
+
486
+ Willing to learn and develop new languages.
487
+
488
+ Specific Experience:
489
+
490
+ 2 years of experience in:
491
+
492
+
493
+ Using HTML, CSS and Bootstrap.
494
+
495
+ Using JavaScript.
496
+
497
+ Using Next.js, React.js and Redux.
498
+
499
+ Using Restful API.
500
+
501
+ 1 year of experience in:
502
+
503
+
504
+ Developing on public cloud (AWS, Azure, GCP).
505
+
506
+ Developing using Agile methodology.
507
+
508
+ Experience developing team projects.'
509
+ - 'Professional Qualifications:
510
+
511
+ Education:
512
+
513
+ University degree in Information Technology, Software Engineering or related fields.
514
+
515
+ Experience:
516
+
517
+ More than 4 years of experience as a software developer.
518
+
519
+ Skills:
520
+
521
+ Experience managing configuration management platforms such as Git, GitHub.
522
+
523
+ Familiar with both WinForm and WebForm applications.
524
+
525
+ Understanding SQL (MSSQL).
526
+
527
+ Using tools: Confluence, Jira, GitHub, CI/CD.
528
+
529
+ Understanding of software development processes: Agile, Waterfall.
530
+
531
+ Good communication skills (preferred).
532
+
533
+ Experience with .NET both WinForm and WebForm, Node.js, Angular.
534
+
535
+ Good skills to have:
536
+
537
+ Self-management mindset.
538
+
539
+ Experience with Information Management Systems.'
540
+ - source_sentence: 'EDUCATION
541
+
542
+ BA in Management Information Systems
543
+
544
+ Duy Tan University (2021 - Expected completion 05/2025)
545
+
546
+ GPA: 3.6/4.0
547
+
548
+ TECHNICAL SKILLS
549
+
550
+ Frontend:
551
+
552
+
553
+ Languages & Frameworks: HTML, CSS, JavaScript, TypeScript
554
+
555
+ Libraries & Tools: TailwindCSS, React.js, Next.js
556
+
557
+ Backend:
558
+
559
+
560
+ Main Framework: NestJS, ExpressJS
561
+
562
+ Database & ORM: PostgreSQL, MongoDB, TypeOrm, Mongoose
563
+
564
+ Cloud Services: AWS, Elasticsearch
565
+
566
+
567
+ Messaging & Streaming: KafkaJS, WebSocket
568
+
569
+
570
+ Container & Deployment: Docker
571
+
572
+
573
+ Other Tools:
574
+
575
+
576
+ Version Control: Git, GitHub
577
+
578
+ CI/CD: Vercel
579
+
580
+ PERSONAL PROJECTS
581
+
582
+ Nestgres
583
+
584
+ GitHub Repository: Learning Project with NestJS and PostgreSQL. Integrate AWS
585
+ S3 for file storage, Docker for containerization, and Elasticsearch for advanced
586
+ search. This project helped me gain a deep understanding of building a scalable
587
+ backend system and handling big data.
588
+
589
+
590
+ Simple Todo
591
+
592
+ Live Demo: A simple to-do list application to reinforce my knowledge of React,
593
+ including component architecture, state management, and efficient rendering.
594
+
595
+
596
+ Nestactube
597
+
598
+ GitHub Repository: A full-stack video platform combining NestJS and React, serving
599
+ video streaming from backend to frontend. Focusing on handling large media files
600
+ and ensuring smooth video playback.
601
+
602
+
603
+ MindForge
604
+
605
+ GitHub Repository - Live Demo: Clone of Notion with note creation and editing
606
+ features. Using Convex for secure data storage and Clerk for user authentication.
607
+ The project helped me develop a friendly interface and handle complex data structures.
608
+
609
+
610
+ AWARDS & ACHIEVEMENTS
611
+
612
+ Excellent Academic Performance - 2022, 2023
613
+
614
+ Boeing Scholarship - 2022, 2023
615
+
616
+ Third Prize - Duy Tan Informatics Competition, 2023
617
+
618
+ Third Prize - Informatics Olympiad (non-specialist group), 2023
619
+
620
+ CERTIFICATIONS
621
+
622
+ Foundations of User Experience (UX) Design'
623
+ sentences:
624
+ - 'Skills and Qualifications Required:
625
+
626
+
627
+ Qualifications:
628
+
629
+
630
+ Bachelor''s degree in Computer Science, Computer Networking or related fields
631
+ from Universities such as University of Science, University of Natural Sciences,
632
+ University of Information Technology, Vietnam National University, Ho Chi Minh
633
+ City, or Vietnam National University, Hanoi.
634
+
635
+ Only candidates with academic background and practical experience directly related
636
+ to Information Technology will be considered (candidates from short-term or non-major
637
+ programs will not be accepted).
638
+
639
+ Grade Point Average (GPA):
640
+
641
+
642
+ Minimum GPA: 7.0 (on a scale of 10) or 2.8 (on a scale of 4)
643
+
644
+
645
+ Technical Skills:
646
+
647
+
648
+ Strong programming skills with C/C++, along with knowledge of object-oriented
649
+ programming.
650
+
651
+ Have practical experience (1 year or more) working with C/C++.
652
+
653
+ Basic knowledge of operating systems such as Windows, Linux, and MacOS.
654
+
655
+ Understanding of network protocols and security principles. Strong team working
656
+ skills and problem solving ability.'
657
+ - 'Qualifications:
658
+
659
+
660
+ Expected Skills:
661
+
662
+
663
+ Programming Languages: Proficiency in C# and experience working with Unity3D.
664
+
665
+ Game Development: Solid understanding of game mechanics, UI/UX, and physics.
666
+
667
+ Cybersecurity Tools: Experience with cyber security tools with a defensive/offensive
668
+ mindset.
669
+
670
+ Performance Optimization: Solid skills in game optimization and memory management.
671
+
672
+ Version Management Knowledge: Familiarity with Git or similar version control
673
+ systems.
674
+
675
+ Preferred Skills:
676
+
677
+
678
+ AR/VR development experience is a plus.
679
+
680
+ Knowledge of multi-player and networking concepts, along with creativity.
681
+
682
+ Education:
683
+
684
+
685
+ Bachelor''s degree in Computer Science, Game Development, or related field, or
686
+ equivalent hands-on experience.'
687
+ - 'Professional qualifications:
688
+
689
+ Experience: At least 1 year of participating in projects related to Machine Learning
690
+ and image processing.
691
+
692
+ Programming languages: Proficiency in Python and using frameworks such as TensorFlow,
693
+ PyTorch, Keras.
694
+
695
+ Programming skills: C/C++ skills are an advantage.
696
+
697
+ CAD experience: Experience in AI for CAD (AutoCAD, SolidWorks,...) and reading
698
+ CAD drawings in Python/C++ is an advantage.
699
+
700
+ CAD knowledge: Knowledge of 2D to 3D, 3D CAD and Gaussian Splatting is an advantage.
701
+
702
+ Awards: Priority is given to those who have IT/mathematical awards or scientific
703
+ papers in conferences on Image Processing, Image Recognition.
704
+
705
+ Language: Good English if the candidate is capable and can work full-time.'
706
+ ---
+
+ # SentenceTransformer based on sentence-transformers/distiluse-base-multilingual-cased-v1
+
+ This is a [sentence-transformers](https://www.SBERT.net) model fine-tuned from [sentence-transformers/distiluse-base-multilingual-cased-v1](https://huggingface.co/sentence-transformers/distiluse-base-multilingual-cased-v1). It maps sentences and paragraphs to a 512-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
+
+ ## Model Details
+
+ ### Model Description
+ - **Model Type:** Sentence Transformer
+ - **Base model:** [sentence-transformers/distiluse-base-multilingual-cased-v1](https://huggingface.co/sentence-transformers/distiluse-base-multilingual-cased-v1) <!-- at revision 30487a83b196a60bdf70abff0505ec70f112213c -->
+ - **Maximum Sequence Length:** 128 tokens
+ - **Output Dimensionality:** 512 dimensions
+ - **Similarity Function:** Cosine Similarity
+ <!-- - **Training Dataset:** Unknown -->
+ <!-- - **Language:** Unknown -->
+ <!-- - **License:** Unknown -->
+
+ ### Model Sources
+
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
+ - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
+
+ ### Full Model Architecture
+
+ ```
+ SentenceTransformer(
+   (0): Transformer({'max_seq_length': 128, 'do_lower_case': False}) with Transformer model: DistilBertModel
+   (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+   (2): Dense({'in_features': 768, 'out_features': 512, 'bias': True, 'activation_function': 'torch.nn.modules.activation.Tanh'})
+ )
+ ```
+
+ ## Usage
+
+ ### Direct Usage (Sentence Transformers)
+
+ First install the Sentence Transformers library:
+
+ ```bash
+ pip install -U sentence-transformers
+ ```
+
+ Then you can load this model and run inference.
+ ```python
+ from sentence_transformers import SentenceTransformer
+
+ # Download from the 🤗 Hub
+ # "sentence_transformers_model_id" is a placeholder: replace it with this repository's model ID.
+ model = SentenceTransformer("sentence_transformers_model_id")
+ # Run inference
+ sentences = [
+     'EDUCATION\nBA in Management Information Systems\nDuy Tan University (2021 - Expected completion 05/2025)\nGPA: 3.6/4.0\nTECHNICAL SKILLS\nFrontend:\n\nLanguages & Frameworks: HTML, CSS, JavaScript, TypeScript\nLibraries & Tools: TailwindCSS, React.js, Next.js\nBackend:\n\nMain Framework: NestJS, ExpressJS\nDatabase & ORM: PostgreSQL, MongoDB, TypeOrm, Mongoose\nCloud Services: AWS, Elasticsearch\n\nMessaging & Streaming: KafkaJS, WebSocket\n\nContainer & Deployment: Docker\n\nOther Tools:\n\nVersion Control: Git, GitHub\nCI/CD: Vercel\nPERSONAL PROJECTS\nNestgres\nGitHub Repository: Learning Project with NestJS and PostgreSQL. Integrate AWS S3 for file storage, Docker for containerization, and Elasticsearch for advanced search. This project helped me gain a deep understanding of building a scalable backend system and handling big data.\n\nSimple Todo\nLive Demo: A simple to-do list application to reinforce my knowledge of React, including component architecture, state management, and efficient rendering.\n\nNestactube\nGitHub Repository: A full-stack video platform combining NestJS and React, serving video streaming from backend to frontend. Focusing on handling large media files and ensuring smooth video playback.\n\nMindForge\nGitHub Repository - Live Demo: Clone of Notion with note creation and editing features. Using Convex for secure data storage and Clerk for user authentication. The project helped me develop a friendly interface and handle complex data structures.\n\nAWARDS & ACHIEVEMENTS\nExcellent Academic Performance - 2022, 2023\nBoeing Scholarship - 2022, 2023\nThird Prize - Duy Tan Informatics Competition, 2023\nThird Prize - Informatics Olympiad (non-specialist group), 2023\nCERTIFICATIONS\nFoundations of User Experience (UX) Design',
+     "Skills and Qualifications Required:\n\nQualifications:\n\nBachelor's degree in Computer Science, Computer Networking or related fields from Universities such as University of Science, University of Natural Sciences, University of Information Technology, Vietnam National University, Ho Chi Minh City, or Vietnam National University, Hanoi.\nOnly candidates with academic background and practical experience directly related to Information Technology will be considered (candidates from short-term or non-major programs will not be accepted).\nGrade Point Average (GPA):\n\nMinimum GPA: 7.0 (on a scale of 10) or 2.8 (on a scale of 4)\n\nTechnical Skills:\n\nStrong programming skills with C/C++, along with knowledge of object-oriented programming.\nHave practical experience (1 year or more) working with C/C++.\nBasic knowledge of operating systems such as Windows, Linux, and MacOS.\nUnderstanding of network protocols and security principles. Strong team working skills and problem solving ability.",
+     "Qualifications:\n\nExpected Skills:\n\nProgramming Languages: Proficiency in C# and experience working with Unity3D.\nGame Development: Solid understanding of game mechanics, UI/UX, and physics.\nCybersecurity Tools: Experience with cyber security tools with a defensive/offensive mindset.\nPerformance Optimization: Solid skills in game optimization and memory management.\nVersion Management Knowledge: Familiarity with Git or similar version control systems.\nPreferred Skills:\n\nAR/VR development experience is a plus.\nKnowledge of multi-player and networking concepts, along with creativity.\nEducation:\n\nBachelor's degree in Computer Science, Game Development, or related field, or equivalent hands-on experience.",
+ ]
+ embeddings = model.encode(sentences)
+ print(embeddings.shape)
+ # [3, 512]
+
+ # Get the similarity scores for the embeddings
+ similarities = model.similarity(embeddings, embeddings)
+ print(similarities.shape)
+ # [3, 3]
+ ```
+
+ <!--
+ ### Direct Usage (Transformers)
+
+ <details><summary>Click to see the direct usage in Transformers</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Downstream Usage (Sentence Transformers)
+
+ You can finetune this model on your own dataset.
+
+ <details><summary>Click to expand</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Out-of-Scope Use
+
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
+ -->
+
+ <!--
+ ## Bias, Risks and Limitations
+
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
+ -->
+
+ <!--
+ ### Recommendations
+
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
+ -->
+
808
+ ## Training Details
809
+
810
+ ### Training Dataset
811
+
812
+ #### Unnamed Dataset
813
+
814
+
815
+ * Size: 5,108 training samples
816
+ * Columns: <code>sentence_0</code>, <code>sentence_1</code>, and <code>label</code>
817
+ * Approximate statistics based on the first 1000 samples:
818
+ | | sentence_0 | sentence_1 | label |
819
+ |:--------|:-------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------|:----------------------------------------------------------------|
820
+ | type | string | string | float |
821
+ | details | <ul><li>min: 44 tokens</li><li>mean: 106.09 tokens</li><li>max: 128 tokens</li></ul> | <ul><li>min: 46 tokens</li><li>mean: 117.55 tokens</li><li>max: 128 tokens</li></ul> | <ul><li>min: 0.5</li><li>mean: 0.66</li><li>max: 0.74</li></ul> |
822
+ * Samples:
823
+ | sentence_0 | sentence_1 | label |
824
+ |:-----------|:-----------|:------------------|
825
+ | <code>EDUCATION<br>Software Engineer (CMU)<br>Duy Tan University, 09/2021 - Present<br>GPA: 3.6/4.0<br>OUTSTANDING PROJECTS<br>Fashion Store Website<br>Time: 17/01 - 25/02<br>Role: Backend Developer<br><br>Features:<br><br>Register, log in, log out (customer rights)<br>Log in and manage with admin rights<br>Add, edit, delete products and vouchers (admin)<br>Display product and voucher list<br>Technology:<br><br>Frontend: Customer page & admin dashboard (edit from available template)<br>Backend: Typescript (NestJS)<br>Database: MongoDB (Mongoose)<br>GitHub Link: Fashion Store Project<br><br>SKILLS<br>Front-end: HTML, CSS, JavaScript, Bootstrap 5, ReactJS<br>Back-end: ExpressJS, NestJS (TypeScript)<br>Database: MySQL, MongoDB<br>OTHER TOOLS & TECHNOLOGIES<br>Tools: Git (GitHub), Docker, Postman<br>Other languages: C# (ASP.NET), Java (Spring Boot)<br>LANGUAGES<br>English: Good communication and reading skills</code> | <code>Professional Qualifications:<br>Experience: Minimum 2 years of experience working with Java, proficient in Spring Boot.<br>Frontend Framework: Proficient in using one of the Frontend frameworks (such as Vue, React,...) to develop interactive user interfaces.<br>Database: Experience working with relational databases such as MS SQL Server, Oracle, PostgreSQL, MongoDB.<br>Microservice Architecture: Experience with microservice architecture and containerization technologies such as Kubernetes/Docker.<br>Multitasking: Ability to handle multitasking, multithreading, multiprocessing, and mechanisms such as hash tables, file processing mechanisms.<br>API Integration: Experience in integrating/developing RESTful, SOAP, TCP/IP APIs.<br>Operating System: Experience working with Enterprise Linux and Windows Server operating systems.</code> | <code>0.65</code> |
826
+ | <code>EDUCATION HISTORY University of Roseton Master of Science in Software Engineering | Graduated: 2020 Best Thesis Awardee Berou Solutions Scholarship Recipient De Loureigh University Bachelor of Science in Computer Science | Graduated: 2016 (Cum Laude) Founder of DLU Programming Club Hackathon Champion Beechtown 2015 RELEVANT SKILLS Programming Languages: JavaScript, C/C++, Java, Python, Kotlin, Go Core Skills: Problem Solving, Team Communication</code> | <code>FPT<br><br>Skill and qualification requirements:<br><br>General standards:<br><br>Be a Vietnamese citizen, have a permanent residence in Vietnam.<br>Be under 35 years old.<br>Be in good health to perform the job.<br>Have good moral qualities, no criminal record, no detention, no prison sentence, suspended sentence, non-custodial reform, no local education status, medical treatment, drug rehabilitation, etc.<br>Specific standards:<br><br>Professional qualifications:<br><br>Graduate from university or higher, regular system (including second university, not including university transfer) at domestic universities or graduate from university or higher at foreign universities, affiliated universities.<br>Major: Graduated from majors such as Information Technology, Information Security, Telecommunications Electronics, Information Electronics, Mathematics - Informatics or equivalent majors.<br>Foreign Language:<br><br>Having one of the following certificates: TOEIC 600/990, TOEFL PBT/IPT 500/677, TOEFL CBT 173/300, TOEFL iBT 61/120, IELTS 5.5/9.0, Cambridge Exam First (FCE), B2 European Framework, 4/6 6-level Foreign Language Proficiency Framework for Vietnam. Accepting additional English certificates within 24 months from the date of recruitment.<br>Knowledge, skills and experience:<br><br>Experience in managing server systems and storage systems is an advantage.<br>Having one of the following certificates: MCITP-SA/Oracle Certified Associate/Oracle Solaris 11 System Administrator Certification/Linux Professional Institute LPIC-1 is an advantage.<br>Withstand high work pressure, accept to monitor, track systems and handle incidents according to job requirements. Have a serious, careful and enthusiastic working attitude.<br>Ability to research and learn technology quickly.<br>Good communication and problem-solving skills.<br>Ability to work in a team or independently.</code> | <code>0.66</code> |
827
+ | <code>EDUCATION Bachelor of Science in Computer Science Rutgers University ? New Brunswick, NJ 2008 - 2012 SKILLS HTML CSS JavaScript React jQuery Angular.js Vue.js Enzyme Jest Git</code> | <code>ABBANK<br><br>Skill requirements:<br><br>Minimum 5 years of experience as a BA in CLIMS system.<br>Experience in lending activities.<br>Experience in developing test scenarios and test plans for software projects.<br>Understanding of software development processes.<br>Ensure compliance with application development standards and quality processes.<br>Ability to analyze and resolve complex technical issues, provide feasible and timely solutions.<br>Effective communication and teamwork skills when collaborating with different functional groups.<br>Focus on Agile implementation and willingness to learn and grow from experience.<br>Education requirements:<br><br>University degree in IT, banking, finance or related fields.<br>Language requirements:<br><br>Being able to communicate in English is an advantage.</code> | <code>0.62</code> |
828
+ * Loss: [<code>CosineSimilarityLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosinesimilarityloss) with these parameters:
829
+ ```json
830
+ {
831
+ "loss_fct": "torch.nn.modules.loss.MSELoss"
832
+ }
833
+ ```
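As a rough illustration of the objective named above, `CosineSimilarityLoss` with `MSELoss` compares the cosine similarity of each embedded `(sentence_0, sentence_1)` pair against its float `label` and averages the squared errors. The following is a minimal NumPy sketch of that idea, not the library's implementation. The toy 2-dimensional embeddings and the function name `cosine_similarity_loss` are assumptions made for illustration.

```python
import numpy as np


def cosine_similarity_loss(u: np.ndarray, v: np.ndarray, labels: np.ndarray) -> float:
    """Mean squared error between cos(u_i, v_i) and the target score labels_i."""
    cos = np.sum(u * v, axis=1) / (
        np.linalg.norm(u, axis=1) * np.linalg.norm(v, axis=1)
    )
    return float(np.mean((cos - labels) ** 2))


# Hypothetical toy embeddings for two (sentence_0, sentence_1) pairs.
u = np.array([[1.0, 0.0], [1.0, 1.0]])
v = np.array([[1.0, 0.0], [1.0, -1.0]])   # pair 1 is identical, pair 2 is orthogonal
labels = np.array([0.65, 0.62])           # label values taken from the sample rows

loss = cosine_similarity_loss(u, v, labels)
print(loss)
```

Because the labels in this dataset sit in a narrow band (min 0.5, mean 0.66, max 0.74), a model that predicts a near-constant similarity can already reach a small MSE, which is worth keeping in mind when reading the training loss.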
834
+
835
+ ### Training Hyperparameters
836
+ #### Non-Default Hyperparameters
837
+
838
+ - `per_device_train_batch_size`: 16
839
+ - `per_device_eval_batch_size`: 16
840
+ - `multi_dataset_batch_sampler`: round_robin
841
+
842
+ #### All Hyperparameters
843
+ <details><summary>Click to expand</summary>
844
+
845
+ - `overwrite_output_dir`: False
846
+ - `do_predict`: False
847
+ - `eval_strategy`: no
848
+ - `prediction_loss_only`: True
849
+ - `per_device_train_batch_size`: 16
850
+ - `per_device_eval_batch_size`: 16
851
+ - `per_gpu_train_batch_size`: None
852
+ - `per_gpu_eval_batch_size`: None
853
+ - `gradient_accumulation_steps`: 1
854
+ - `eval_accumulation_steps`: None
855
+ - `torch_empty_cache_steps`: None
856
+ - `learning_rate`: 5e-05
857
+ - `weight_decay`: 0.0
858
+ - `adam_beta1`: 0.9
859
+ - `adam_beta2`: 0.999
860
+ - `adam_epsilon`: 1e-08
861
+ - `max_grad_norm`: 1
862
+ - `num_train_epochs`: 3
863
+ - `max_steps`: -1
864
+ - `lr_scheduler_type`: linear
865
+ - `lr_scheduler_kwargs`: {}
866
+ - `warmup_ratio`: 0.0
867
+ - `warmup_steps`: 0
868
+ - `log_level`: passive
869
+ - `log_level_replica`: warning
870
+ - `log_on_each_node`: True
871
+ - `logging_nan_inf_filter`: True
872
+ - `save_safetensors`: True
873
+ - `save_on_each_node`: False
874
+ - `save_only_model`: False
875
+ - `restore_callback_states_from_checkpoint`: False
876
+ - `no_cuda`: False
877
+ - `use_cpu`: False
878
+ - `use_mps_device`: False
879
+ - `seed`: 42
880
+ - `data_seed`: None
881
+ - `jit_mode_eval`: False
882
+ - `use_ipex`: False
883
+ - `bf16`: False
884
+ - `fp16`: False
885
+ - `fp16_opt_level`: O1
886
+ - `half_precision_backend`: auto
887
+ - `bf16_full_eval`: False
888
+ - `fp16_full_eval`: False
889
+ - `tf32`: None
890
+ - `local_rank`: 0
891
+ - `ddp_backend`: None
892
+ - `tpu_num_cores`: None
893
+ - `tpu_metrics_debug`: False
894
+ - `debug`: []
895
+ - `dataloader_drop_last`: False
896
+ - `dataloader_num_workers`: 0
897
+ - `dataloader_prefetch_factor`: None
898
+ - `past_index`: -1
899
+ - `disable_tqdm`: False
900
+ - `remove_unused_columns`: True
901
+ - `label_names`: None
902
+ - `load_best_model_at_end`: False
903
+ - `ignore_data_skip`: False
904
+ - `fsdp`: []
905
+ - `fsdp_min_num_params`: 0
906
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
907
+ - `fsdp_transformer_layer_cls_to_wrap`: None
908
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
909
+ - `deepspeed`: None
910
+ - `label_smoothing_factor`: 0.0
911
+ - `optim`: adamw_torch
912
+ - `optim_args`: None
913
+ - `adafactor`: False
914
+ - `group_by_length`: False
915
+ - `length_column_name`: length
916
+ - `ddp_find_unused_parameters`: None
917
+ - `ddp_bucket_cap_mb`: None
918
+ - `ddp_broadcast_buffers`: False
919
+ - `dataloader_pin_memory`: True
920
+ - `dataloader_persistent_workers`: False
921
+ - `skip_memory_metrics`: True
922
+ - `use_legacy_prediction_loop`: False
923
+ - `push_to_hub`: False
924
+ - `resume_from_checkpoint`: None
925
+ - `hub_model_id`: None
926
+ - `hub_strategy`: every_save
927
+ - `hub_private_repo`: False
928
+ - `hub_always_push`: False
929
+ - `gradient_checkpointing`: False
930
+ - `gradient_checkpointing_kwargs`: None
931
+ - `include_inputs_for_metrics`: False
932
+ - `eval_do_concat_batches`: True
933
+ - `fp16_backend`: auto
934
+ - `push_to_hub_model_id`: None
935
+ - `push_to_hub_organization`: None
936
+ - `mp_parameters`:
937
+ - `auto_find_batch_size`: False
938
+ - `full_determinism`: False
939
+ - `torchdynamo`: None
940
+ - `ray_scope`: last
941
+ - `ddp_timeout`: 1800
942
+ - `torch_compile`: False
943
+ - `torch_compile_backend`: None
944
+ - `torch_compile_mode`: None
945
+ - `dispatch_batches`: None
946
+ - `split_batches`: None
947
+ - `include_tokens_per_second`: False
948
+ - `include_num_input_tokens_seen`: False
949
+ - `neftune_noise_alpha`: None
950
+ - `optim_target_modules`: None
951
+ - `batch_eval_metrics`: False
952
+ - `eval_on_start`: False
953
+ - `eval_use_gather_object`: False
954
+ - `batch_sampler`: batch_sampler
955
+ - `multi_dataset_batch_sampler`: round_robin
956
+
957
+ </details>
958
+
959
+ ### Training Logs
960
+ | Epoch | Step | Training Loss |
961
+ |:------:|:----:|:-------------:|
962
+ | 1.5625 | 500 | 0.0023 |
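The fractional epoch in the log can be checked from the numbers reported in this card. The sketch below assumes a single device, `gradient_accumulation_steps` of 1 and `dataloader_drop_last` set to False (the latter two are listed in the hyperparameters; the single device is an assumption that the log is consistent with).

```python
import math

num_samples = 5108   # training set size reported in this card
batch_size = 16      # per_device_train_batch_size
num_epochs = 3       # num_train_epochs

# Assumption: one device, so the effective batch size equals batch_size.
steps_per_epoch = math.ceil(num_samples / batch_size)  # last partial batch is kept
total_steps = steps_per_epoch * num_epochs
epoch_at_step_500 = 500 / steps_per_epoch

print(steps_per_epoch, total_steps, epoch_at_step_500)  # 320 960 1.5625
```

Under these assumptions the run takes 960 optimizer steps in total, so the single logged row at step 500 covers only about half of training.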
963
+
964
+
965
+ ### Framework Versions
966
+ - Python: 3.10.12
967
+ - Sentence Transformers: 3.2.1
968
+ - Transformers: 4.44.2
969
+ - PyTorch: 2.5.0+cu121
970
+ - Accelerate: 0.34.2
971
+ - Datasets: 3.1.0
972
+ - Tokenizers: 0.19.1
973
+
974
+ ## Citation
975
+
976
+ ### BibTeX
977
+
978
+ #### Sentence Transformers
979
+ ```bibtex
980
+ @inproceedings{reimers-2019-sentence-bert,
981
+ title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
982
+ author = "Reimers, Nils and Gurevych, Iryna",
983
+ booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
984
+ month = "11",
985
+ year = "2019",
986
+ publisher = "Association for Computational Linguistics",
987
+ url = "https://arxiv.org/abs/1908.10084",
988
+ }
989
+ ```
990
+
991
+ <!--
992
+ ## Glossary
993
+
994
+ *Clearly define terms in order to be accessible across audiences.*
995
+ -->
996
+
997
+ <!--
998
+ ## Model Card Authors
999
+
1000
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
1001
+ -->
1002
+
1003
+ <!--
1004
+ ## Model Card Contact
1005
+
1006
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
1007
+ -->
config.json ADDED
@@ -0,0 +1,24 @@
1
+ {
2
+ "_name_or_path": "sentence-transformers/distiluse-base-multilingual-cased-v1",
3
+ "activation": "gelu",
4
+ "architectures": [
5
+ "DistilBertModel"
6
+ ],
7
+ "attention_dropout": 0.1,
8
+ "dim": 768,
9
+ "dropout": 0.1,
10
+ "hidden_dim": 3072,
11
+ "initializer_range": 0.02,
12
+ "max_position_embeddings": 512,
13
+ "model_type": "distilbert",
14
+ "n_heads": 12,
15
+ "n_layers": 6,
16
+ "pad_token_id": 0,
17
+ "qa_dropout": 0.1,
18
+ "seq_classif_dropout": 0.2,
19
+ "sinusoidal_pos_embds": false,
20
+ "tie_weights_": true,
21
+ "torch_dtype": "float32",
22
+ "transformers_version": "4.44.2",
23
+ "vocab_size": 119547
24
+ }
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
1
+ {
2
+ "__version__": {
3
+ "sentence_transformers": "3.2.1",
4
+ "transformers": "4.44.2",
5
+ "pytorch": "2.5.0+cu121"
6
+ },
7
+ "prompts": {},
8
+ "default_prompt_name": null,
9
+ "similarity_fn_name": null
10
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:698b2fe793ff27e94ca8b6eb4470fd4f00592dff8530dd5d22b3466459d93a6e
3
+ size 538947416
modules.json ADDED
@@ -0,0 +1,20 @@
1
+ [
2
+ {
3
+ "idx": 0,
4
+ "name": "0",
5
+ "path": "",
6
+ "type": "sentence_transformers.models.Transformer"
7
+ },
8
+ {
9
+ "idx": 1,
10
+ "name": "1",
11
+ "path": "1_Pooling",
12
+ "type": "sentence_transformers.models.Pooling"
13
+ },
14
+ {
15
+ "idx": 2,
16
+ "name": "2",
17
+ "path": "2_Dense",
18
+ "type": "sentence_transformers.models.Dense"
19
+ }
20
+ ]
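The three modules listed above run in order: the Transformer produces one 768-dimensional vector per token, the Pooling module averages them (the pooling config earlier in this commit enables `pooling_mode_mean_tokens`), and the Dense module projects 768 to 512 features with a Tanh activation. The NumPy sketch below mirrors only the last two stages. The random token embeddings and Dense weights are hypothetical placeholders for the real transformer output and the trained parameters.

```python
import numpy as np


def mean_pool(token_embeddings: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Average token vectors per sequence, ignoring padded positions."""
    mask = attention_mask[..., None].astype(token_embeddings.dtype)
    summed = (token_embeddings * mask).sum(axis=1)
    counts = np.clip(mask.sum(axis=1), 1e-9, None)  # guard against empty sequences
    return summed / counts


def dense_tanh(x: np.ndarray, weight: np.ndarray, bias: np.ndarray) -> np.ndarray:
    """Linear projection followed by Tanh, as described in the Dense config."""
    return np.tanh(x @ weight.T + bias)


rng = np.random.default_rng(0)
batch, seq_len, hidden, out_dim = 2, 5, 768, 512

# Hypothetical transformer output and Dense parameters (random, not the trained weights).
token_embeddings = rng.normal(size=(batch, seq_len, hidden))
attention_mask = np.array([[1, 1, 1, 0, 0], [1, 1, 1, 1, 1]])
weight = rng.normal(scale=0.02, size=(out_dim, hidden))
bias = np.zeros(out_dim)

pooled = mean_pool(token_embeddings, attention_mask)
sentence_embeddings = dense_tanh(pooled, weight, bias)
print(sentence_embeddings.shape)  # (2, 512)
```

The Tanh bounds every output component to the interval (-1, 1), and the 512-dimensional result matches the embedding shape printed in the README usage example.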
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
1
+ {
2
+ "max_seq_length": 128,
3
+ "do_lower_case": false
4
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,37 @@
1
+ {
2
+ "cls_token": {
3
+ "content": "[CLS]",
4
+ "lstrip": false,
5
+ "normalized": false,
6
+ "rstrip": false,
7
+ "single_word": false
8
+ },
9
+ "mask_token": {
10
+ "content": "[MASK]",
11
+ "lstrip": false,
12
+ "normalized": false,
13
+ "rstrip": false,
14
+ "single_word": false
15
+ },
16
+ "pad_token": {
17
+ "content": "[PAD]",
18
+ "lstrip": false,
19
+ "normalized": false,
20
+ "rstrip": false,
21
+ "single_word": false
22
+ },
23
+ "sep_token": {
24
+ "content": "[SEP]",
25
+ "lstrip": false,
26
+ "normalized": false,
27
+ "rstrip": false,
28
+ "single_word": false
29
+ },
30
+ "unk_token": {
31
+ "content": "[UNK]",
32
+ "lstrip": false,
33
+ "normalized": false,
34
+ "rstrip": false,
35
+ "single_word": false
36
+ }
37
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
tokenizer_config.json ADDED
@@ -0,0 +1,58 @@
1
+ {
2
+ "added_tokens_decoder": {
3
+ "0": {
4
+ "content": "[PAD]",
5
+ "lstrip": false,
6
+ "normalized": false,
7
+ "rstrip": false,
8
+ "single_word": false,
9
+ "special": true
10
+ },
11
+ "100": {
12
+ "content": "[UNK]",
13
+ "lstrip": false,
14
+ "normalized": false,
15
+ "rstrip": false,
16
+ "single_word": false,
17
+ "special": true
18
+ },
19
+ "101": {
20
+ "content": "[CLS]",
21
+ "lstrip": false,
22
+ "normalized": false,
23
+ "rstrip": false,
24
+ "single_word": false,
25
+ "special": true
26
+ },
27
+ "102": {
28
+ "content": "[SEP]",
29
+ "lstrip": false,
30
+ "normalized": false,
31
+ "rstrip": false,
32
+ "single_word": false,
33
+ "special": true
34
+ },
35
+ "103": {
36
+ "content": "[MASK]",
37
+ "lstrip": false,
38
+ "normalized": false,
39
+ "rstrip": false,
40
+ "single_word": false,
41
+ "special": true
42
+ }
43
+ },
44
+ "clean_up_tokenization_spaces": true,
45
+ "cls_token": "[CLS]",
46
+ "do_basic_tokenize": true,
47
+ "do_lower_case": false,
48
+ "mask_token": "[MASK]",
49
+ "max_len": 512,
50
+ "model_max_length": 128,
51
+ "never_split": null,
52
+ "pad_token": "[PAD]",
53
+ "sep_token": "[SEP]",
54
+ "strip_accents": null,
55
+ "tokenize_chinese_chars": true,
56
+ "tokenizer_class": "DistilBertTokenizer",
57
+ "unk_token": "[UNK]"
58
+ }
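To make the effect of `model_max_length: 128` concrete, the sketch below shows how a sequence of token ids might be wrapped in `[CLS]` (id 101) and `[SEP]` (id 102), truncated, and padded with `[PAD]` (id 0), using the ids from `added_tokens_decoder` above. It is a simplified illustration, not the `DistilBertTokenizer` implementation: the function name `build_input_ids` and the example token ids are hypothetical, and padding to the full 128 positions is chosen here for simplicity, whereas real batches are often padded only to the longest sequence.

```python
def build_input_ids(token_ids, max_length=128, cls_id=101, sep_id=102, pad_id=0):
    """Wrap token ids in [CLS]/[SEP], then truncate and pad to max_length."""
    body = token_ids[: max_length - 2]  # leave room for the two special tokens
    ids = [cls_id] + body + [sep_id]
    attention_mask = [1] * len(ids) + [0] * (max_length - len(ids))
    ids = ids + [pad_id] * (max_length - len(ids))
    return ids, attention_mask


# Hypothetical token ids: a short input and one longer than the limit.
short_ids, short_mask = build_input_ids([5000, 5001, 5002])
long_ids, long_mask = build_input_ids(list(range(1000, 1300)))

print(short_ids[:6], sum(short_mask))
print(len(long_ids), long_ids[0], long_ids[-1])
```

This is why the dataset statistics in the README report a maximum of 128 tokens for both text columns: anything beyond the limit is cut off before it reaches the model, so the tail of a long resume or job description does not influence the embedding.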
vocab.txt ADDED
The diff for this file is too large to render. See raw diff