SentenceTransformer based on sentence-transformers/paraphrase-multilingual-mpnet-base-v2

This is a sentence-transformers model fine-tuned from sentence-transformers/paraphrase-multilingual-mpnet-base-v2. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

Model Type: Sentence Transformer
Base model: sentence-transformers/paraphrase-multilingual-mpnet-base-v2
Maximum Sequence Length: 128 tokens
Output Dimensionality: 768 dimensions
Similarity Function: Cosine Similarity

Model Sources

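Documentation: Sentence Transformers Documentation (https://www.sbert.net)
Repository: Sentence Transformers on GitHub (https://github.com/UKPLab/sentence-transformers)
Hugging Face: Sentence Transformers on Hugging Face (https://huggingface.co/models?library=sentence-transformers)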

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 128, 'do_lower_case': False}) with Transformer model: XLMRobertaModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
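The Pooling module above has only pooling_mode_mean_tokens enabled, so each sentence embedding is the average of its token embeddings. A minimal sketch of that idea in plain Python, assuming padding positions are marked with 0 in an attention mask. The name mean_pool and the toy 2-dimensional vectors are illustrative and not part of the library:

```python
def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask entry is 1 (hypothetical helper)."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vec, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vec):
                summed[i] += value
    count = max(count, 1)  # avoid division by zero for all-padding input
    return [s / count for s in summed]


# Three token vectors; the last one is padding and is ignored.
tokens = [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]
sentence_vec = mean_pool(tokens, mask)
print(sentence_vec)  # [2.0, 3.0]
```

In the real model the same averaging would run over 768-dimensional vectors for up to 128 tokens.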

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("ylv02/paraphrase-multilingual-mpnet-base-v2")
# Run inference
sentences = [
    'Power Distribution System Intern (for Students Only)',
    'Corporate Services CS provides Intel employees the infrastructure and environment to create breakthrough technology that makes amazing experiences possible. Our scope is vast, whether we are providing services that help employees stay productive and satisfied, or transforming how we work by providing ultra-pure water/chemicals to factory process tools, shuttling employees between sites, protecting the environment, keeping employees safe, or building and maintaining offices, labs, and factories. Our employees create a better tomorrow for all Intel employees around the world.\r\nThe Power Distribution System Intern will work closely under the direct supervision of a senior engineer to perform specific engineering tasks of analysis or test nature in specialized engineering fields. The intern will apply theoretical knowledge and engineering techniques to the solution of basic analytical engineering problems. Other responsibilities include:\r\n•\tOwning safety within an area of influence; ensuring systems operate reliably to avoid manufacturing, facility (or operations) impacts by meeting uptime goals.\r\n•\tEffective collaboration with construction and operations teams to coordinate work in the field overlooking projects to meet design criteria, scope, budget, and schedule; oversight of planning, design, reconfiguration, construction, maintenance, and modifications to equipment, machinery, facilities systems, buildings, and structures as required to support business requirements.\r\n•\tDelivering operational efficiency, maintenance, testing, and commissioning solutions\r\n•\tIdentify and implement opportunities to achieve the lowest cost of ownership to key stakeholders while maintaining reliability.\r\n•\tCollaboration with the Site Facility management team, suppliers, and key stakeholders, and through an in-depth understanding of the operating systems to determine and ensure compliance with existing standards, specifications, building codes, and processes.',
    'Thực hiện các nghiệp vụ kế toán liên quan đến doanh thu, công nợ.\r\nTheo dõi, quản lý tài sản, công cụ và các nghiệp vụ liên quan\r\nThực hiện tổng hợp số liệu và cân đối số liệu kế toán.\r\nGiao dịch với ngân hàng\r\nLập báo cáo thuế GTGT\r\nThực hiện các công việc khác theo sự phân công của CBQL.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
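Since the card lists cosine similarity as the similarity function, the scores can also be reasoned about by hand. The sketch below uses toy 2-dimensional vectors in place of real 768-dimensional embeddings to rank candidates against a query, as one might rank job descriptions against a job title. The helpers cos_sim and rank_by_similarity are hypothetical names introduced for illustration:

```python
import math


def cos_sim(a, b):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, candidate_vecs):
    """Return candidate indices sorted from most to least similar."""
    scores = [cos_sim(query_vec, c) for c in candidate_vecs]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


# Toy stand-ins for embeddings (assumption: real ones come from model.encode).
query = [1.0, 0.0]
candidates = [[0.0, 1.0], [1.0, 1.0], [2.0, 0.0]]
order = rank_by_similarity(query, candidates)
print(order)  # [2, 1, 0]
```

Cosine similarity ignores vector length, which is why the candidate [2.0, 0.0] scores the same as an exact match would.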

Training Details

Training Dataset

Unnamed Dataset

Size: 710,308 training samples
Columns: anchor and positive
Approximate statistics based on the first 1000 samples:
	anchor	positive
type	string	string
details	min: 3 tokens, mean: 88.58 tokens, max: 128 tokens	min: 3 tokens, mean: 91.76 tokens, max: 128 tokens

Samples:

anchor	positive
`- Tham gia xây dựng và thực hiện kế hoạch nhân sự, tuyển dụng, đào tạo, thực hiện các công việc liên quan đến bảo hiểm xã hội, chế độ, chính sách, an toàn lao động`
- Tham gia tham mưu xu hướng tuyển dụng, đào tạo, chế độ đối với người lao động hiệu quả, phù hợp
- Tham gia xây dựng chính sách, quy định liên quan đến nhân sự: Chế độ học việc, thử việc, hợp đồng lao động, lương, thưởng, phép năm, quyết định nhân sự, hợp đồng lao động, chấm công tính lương, phép,...
- Tham gia xây dựng cơ cấu tổ chức, hệ thống các quy định, quy chế, qui trình và phối hợp với các bộ phận trong công ty giám sát việc chấp hành
- Các công việc hành chính theo sự phân công của lãnh đạo công ty và trưởng bộ phận.
Quyền lợi:
- Mức lương, thưởng, đãi ngộ hấp dẫn dựa trên năng lực và kinh nghiệm làm việc
- Chế độ về BHXH, BHYT, BHTN theo quy định
- Chế độ nghỉ phép, nghỉ hàng tuần và nghỉ Lễ theo quy định.	`- Trình độ học vấn: Tốt nghiệp Đại học trở lên chuyên ngành Quản trị nhân lực, Kinh tế, Kế toán….`
- Có kinh nghiệm làm việc ở vị trí tương đương
- Nhanh nhẹn, trung thực, cẩn thận trong công việc
- Có kiến thức tốt về quản trị nhân lực, tâm lý học, chính sách tiền lương, bảo hiểm xã hội
- Có kinh nghiệm về an toàn lao động là lợi thế
• Education: Bachelor's degree in Food Science, Engineering, Business Administration, or a related field. Advanced degree preferred. • Experience: Minimum of 1-2 years of experience in production management, with a focus on dried fruits or similar food products. • Leadership Skills : Proven leadership and team management abilities, with a track record of building and leading high-performing teams. • Technical Knowledge: Strong understanding of dried fruits production processes, equipment, and quality control standards.	• Production Oversight: Manage and oversee all aspects of dried fruits production, ensuring efficient and cost-effective operations. • Team Management: Lead, mentor, and develop the production team, fostering a culture of continuous improvement and teamwork. • Quality Control: Ensure that all products meet established quality standards and regulatory requirements. • Process Optimization: Identify and implement process improvements to enhance production efficiency, reduce waste, and maximize yield. • Resource Management: Manage inventory levels, equipment maintenance, and supply chain logistics to ensure uninterrupted production. • Safety Compliance: Ensure compliance with all safety regulations and company policies, promoting a safe working environment. • Budget Management: Develop and manage the production budget, tracking expenses and optimizing resource allocation. • Reporting: Prepare and present regular reports on production metrics, performance, and improvement initiatives to senior management.
`YÊU CẦU CÔNG VIỆC`
Nam/nữ tốt nghiệp cao đẳng, đại học chuyên ngành về chẩn đoán hình ảnh, kỹ thuật hình ảnh, kỹ sư y sinh hoặc các chuyên ngành liên quan
Thành thạo tiếng Anh 4 kỹ năng nghe, nói, đọc, viết
Có kinh nghiệm làm về các sản phẩm XQ, CT, MRI. Từng tham gia các hoạt động Presale là 1 lợi thế
Có kinh nghiệm 2 năm trở lên làm vị trí tương đương hoặc kỹ thuật viên chẩn đoán hình ảnh
Kỹ năng giao tiếp, thuyết trình tốt, ham học hỏi

* Quyền lợi
- Lương: 15 – 25 tr/tháng
- Thưởng lễ tết, tháng lương 13, thưởng dự án, cống hiến
- BHXH, BHSK
- Định kỳ xét tăng lương, Du lịch
- Các chế độ khác theo quy định công ty
- Thời gian loàm việc: T2 – T6 và 2 ngày thứ 7 trong tháng
- Địa điểm làm việc:
+ 5 – A2 Ngõ 158, Nguyễn Khánh Toàn, Cầu Giấy, Hà Nội
+ HCM: 195 Đỗ Pháp Thuận, An Phú, Thủ Đức	`- Hướng dẫn sử dụng các thiết bị mảng CT, MRI, XQ`
- Đọc dịch tài liệu sản phẩm, kho tư liệu xây dựng kho tư liệu học, hình ảnh, chương trình chụp
- Phối hợp cùng bộ phận kinh doanh giới thiệu cho khách hàng về tính năng, cấu hình, cách sử dụng sản phẩm.
- Chăm sóc, hỗ trợ, tư vấn, tạo mối quan hệ sâu sắc với các đơn vị đang sử dụng thiết bị
- Tương tác với hãng sản xuất để giải đáp các thắc mắc về sản phẩm
- Báo cáo công việc trực tiếp và định kì hằng tuần với trưởng bộ phận

Loss: MultipleNegativesRankingLoss with these parameters:

{
    "scale": 20.0,
    "similarity_fct": "cos_sim"
}
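To make the two parameters concrete: with this loss, each anchor is scored against every positive in the batch, the cosine similarities are multiplied by scale, and a cross-entropy is taken with the anchor's own positive as the correct class. The following is a simplified plain-Python sketch of that computation, not the library's implementation; mnrl_loss and the toy vectors are illustrative assumptions:

```python
import math


def cos_sim(a, b):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def mnrl_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where positives[i] is the target for anchors[i]
    and all other positives in the batch act as negatives."""
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, p) for p in positives]
        peak = max(logits)  # subtract the max for a stable log-sum-exp
        log_sum_exp = peak + math.log(sum(math.exp(l - peak) for l in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)


anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[1.0, 0.0], [0.0, 1.0]]   # each anchor matches its own positive
swapped = [[0.0, 1.0], [1.0, 0.0]]   # each anchor matches the other positive

loss_aligned = mnrl_loss(anchors, aligned)
loss_swapped = mnrl_loss(anchors, swapped)
```

Here loss_aligned is close to zero and loss_swapped is close to the scale value of 20, which shows how a larger scale penalizes a wrong in-batch match more sharply.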

Evaluation Dataset

Unnamed Dataset

Size: 78,924 evaluation samples
Columns: anchor and positive
Approximate statistics based on the first 1000 samples:
	anchor	positive
type	string	string
details	min: 3 tokens, mean: 88.18 tokens, max: 128 tokens	min: 3 tokens, mean: 88.68 tokens, max: 128 tokens

Samples:

anchor	positive
JOB REQUIREMENT ₋ Bachelor's degree or higher in Economics, Marketing, or Business Administration. ₋ +3 years of experience as a Product Marketing Officer or similar Marketing role, preferably in Banking and Financial Service. ₋ Customer-oriented thinking and digital, technological mindset ₋ Hands-on experience with marketing technology platforms, applications, UX/UI ₋ Metrics-driven marketing mind with eyes for creativity ₋ Effective communication and interpersonal skills with a proven ability to work with cross-functional team ₋ Ability in time management skills and work well under high pressure ₋ Creativity and problem-solving skills ₋ Proficiency in English BENEFITS – Attractive Salary (full insurance) + Allowance. – Salary of probation: full 100% – An attractive package of leave: 15 annual leave, birthday leave (gifts). – Salary review based on work performance and company’s performance. – Performance bonus, 13th- month salary. – Health care package; Gym package; Annual health check-up. – Nice & modern working space with young, dynamic & friendly colleagues and free coffee, tea, drinks. – Yearly company trip; Year-End Party. – Working hours: 8:00 – 17:00 from Monday – Friday.	1. Product Marketing: • Driving results - launching products and features with maximum impact and driving sustained adoption and engagement over time • Determining and measuring success metrics for launch, growth, and ongoing adoption. Generate actionable insights and comprehensive reports on campaign performance. • Championing your product - keeping the broader team (Marketing, Business, Credit, Dealer…) aware of your product and campaigns. • Writing and briefing creative - creating messaging and design briefs for different mediums and audiences. 2. Product Development: • Design and implement sophisticated product features, pricing strategies, marketing campaigns, and other programs on banking systems & integrated platforms, ensuring all setups are fully integrated. 
• Collaborate with Business Development, IT, other divisions, and vendors to address system issues and enhancements, ensuring timely resolution. • Participate in developing appropriate new financial products, enhancing existing products, and coordinating with related stakeholders to execute. • Partner effectively with marketing, IT, and data science teams to ensure optimal integration and utilization of Martech solutions. 3. Others: • Participate in developing sales support tools including sales kit, brochure, leaflet… • Monitor budget spending to ensure cost efficiency and effectiveness. • Fulfill other tasks and/or assignments requested by Head or Line Manager.
`- Xây dựng đầy đủ các quy trình, quy định cho các nghiệp vụ kế toán, tài chính phát sinh tại công ty, đảm bảo kiểm soát được hoạt động thu chi tài chính của công ty.`
- Xây dựng đầy đủ các hướng dẫn nghiệp vụ kế toán, đảm bảo nghiệp vụ kế toán được ghi nhận đúng, đủ tuân thủ theo đúng quy định của nhà nước.
- Lập báo cáo và cung cấp kịp thời, đầy đủ báo cáo theo đúng thời hạn quy định, tuân thủ đúng quy định, chuẩn mực kế toán.
- Kiểm tra, soát xét chứng từ kế toán, đảm bảo hạn chế các rủi ro và tuân thủ theo đúng quy định của Nhà nước.
- Đánh giá hiệu quả đầu tư/phương án kinh doanh.
- 100% các chi phí được kiểm soát và thực hiện đúng quy định.
- Lập và kiểm soát ngân sách bộ phận đảm bảo việc thu chi theo đúng ngân sách đã được phê duyệt.	`- Bằng cấp, trình độ: Tốt nghiệp đại học chính qui các chuyên ngành Tài chính kế toán`
- Kinh nghiệm:Tối thiểu 3 năm kinh nghiệm ở vị trí tương đương hoặc làm tối thiểu 3 năm kiểm toán.
- Kiến thức:
+ Có kiến thức rộng về quản trị tài chính doanh nghiệp, phân tích tài chính.
+Am hiểu về các phần mềm quản lý doanh nghiệp, tài chính kế toán.
+ Nắm vứng kiến thức về luật tài chính, kế toán, thuế… của Nhà nước
- Kỹ năng:
+ Kỹ năng làm việc thành công trong môi trường làm việc nhóm.
+ Kỹ năng ra quyết định, giải quyết vấn đề.
+ Kỹ năng tổ chức công việc và quản lý thời gian, kỹ năng lập kế hoạch, báo cáo.
+ Kỹ năng truyền đạt và giao tiếp tốt.
`1. Trình độ (Education): Đại học trở lên chuyên ngành Tài chính, Ngân hàng, Bất động sản`
2. Kiến thức (Knowledge): Có kiến thức chuyên sâu trong các lĩnh vực:
- Tài chính/kế toán/ngân hàng
- Bất động sản
- Đầu tư
3. Kinh nghiệm chuyên môn (Professional experience): 10 năm kinh nghiệm trong lĩnh vực tài chính
4. Kinh nghiệm quản lý (Management experience): 5 năm kinh nghiệm quản lý ở vị trí tương đương	`I. Nhiệm vụ 1: Kiểm soát hiệu quả dự án & xây dựng báo cáo Quản trị`
- Lập/cập nhật FS dự án định kỳ/theo yêu cầu, kiểm soát chương trình bán hàng, phân tích doanh thu, chi phí, dòng tiền dự án và đề xuất các cảnh báo sớm nhằm đảm hiệu quả dự án.
- Quản lý việc thiết lập các báo cáo quản trị định kỳ gửi đúng hạn đến BGĐ, báo cáo phân tích hiệu quả toàn dự án (FS dự án) gửi GMD và các báo cáo quản trị khác theo yêu cầu của Ban TGĐ. Đưa ra các cảnh báo & đề xuất giải pháp nhằm đảm bảo hiệu quả chi phí, đạt được mục tiêu lợi nhuận và hiệu quả dòng tiền.
II. Nhiệm vụ 2: Xây dựng quy trình, quy định kiểm soát ngân sách
- Xây dựng quy trình, quy định liên quan đến việc lập & kiểm soát ngân sách đảm bảo hiệu quả sử dụng chi phí và tuân thủ quy định của Tập đoàn
- Hỗ trợ xây dựng, phân tích và thực hiện chiến lược đầu tư, bao gồm xây dựng các mô hình và thẩm định hiệu quả tài chính của các dự án đầu tư.
III.Nhiệm vụ 3: Lập và kiểm soát ngân sách
- Hướng dẫn cho các Giám đốc PB trong suốt quy trình lập dự toán ngân sách hàng năm, và phát triển các giả định dự toán ngân sách để xây dựng kế hoạch kinh doanh và dự toán ngân sách toàn diện trình phê duyệt.
- Quản lý và kiểm tra kết quả thực hiện ngân sách (hàng tháng, quý và năm) và đưa ra các khuyến nghị phù hợp cho việc cải thiện nhằm đạt được các kế hoạch đề ra.
IV.Nhiệm vụ 4: Xây dựng hệ thống kiểm soát ngân sách đảm kiểm soát ngân sách hiệu quả, linh hoạt

Loss: MultipleNegativesRankingLoss with these parameters:

{
    "scale": 20.0,
    "similarity_fct": "cos_sim"
}

Training Hyperparameters

Non-Default Hyperparameters

eval_strategy: steps
per_device_train_batch_size: 16
per_device_eval_batch_size: 16
learning_rate: 2e-05
num_train_epochs: 4
warmup_ratio: 0.1
fp16: True
batch_sampler: no_duplicates
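With learning_rate 2e-05, warmup_ratio 0.1, and the linear scheduler, the learning rate would be expected to climb linearly during the first 10% of steps and then decay linearly to zero. A rough sketch of that shape, assuming the usual linear-warmup, linear-decay rule; linear_schedule_lr is a hypothetical helper and the 1,000-step run is a made-up example:

```python
import math


def linear_schedule_lr(step, total_steps, base_lr=2e-5, warmup_ratio=0.1):
    """Learning rate at a given step for linear warmup then linear decay."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Warmup phase: scale up from 0 to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay phase: scale down from base_lr to 0 at the final step.
    remaining = total_steps - step
    return base_lr * max(0.0, remaining / max(1, total_steps - warmup_steps))


# Illustrative 1,000-step run (not the step count of the actual training).
lr_start = linear_schedule_lr(0, 1000)
lr_peak = linear_schedule_lr(100, 1000)
lr_mid_decay = linear_schedule_lr(550, 1000)
lr_end = linear_schedule_lr(1000, 1000)
```

In this example the peak of 2e-05 is reached at step 100, and the rate is back to half of the peak at step 550.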

All Hyperparameters


overwrite_output_dir: False
do_predict: False
eval_strategy: steps
prediction_loss_only: True
per_device_train_batch_size: 16
per_device_eval_batch_size: 16
per_gpu_train_batch_size: None
per_gpu_eval_batch_size: None
gradient_accumulation_steps: 1
eval_accumulation_steps: None
torch_empty_cache_steps: None
learning_rate: 2e-05
weight_decay: 0.0
adam_beta1: 0.9
adam_beta2: 0.999
adam_epsilon: 1e-08
max_grad_norm: 1.0
num_train_epochs: 4
max_steps: -1
lr_scheduler_type: linear
lr_scheduler_kwargs: {}
warmup_ratio: 0.1
warmup_steps: 0
log_level: passive
log_level_replica: warning
log_on_each_node: True
logging_nan_inf_filter: True
save_safetensors: True
save_on_each_node: False
save_only_model: False
restore_callback_states_from_checkpoint: False
no_cuda: False
use_cpu: False
use_mps_device: False
seed: 42
data_seed: None
jit_mode_eval: False
use_ipex: False
bf16: False
fp16: True
fp16_opt_level: O1
half_precision_backend: auto
bf16_full_eval: False
fp16_full_eval: False
tf32: None
local_rank: 0
ddp_backend: None
tpu_num_cores: None
tpu_metrics_debug: False
debug: []
dataloader_drop_last: False
dataloader_num_workers: 0
dataloader_prefetch_factor: None
past_index: -1
disable_tqdm: False
remove_unused_columns: True
label_names: None
load_best_model_at_end: False
ignore_data_skip: False
fsdp: []
fsdp_min_num_params: 0
fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
fsdp_transformer_layer_cls_to_wrap: None
accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
deepspeed: None
label_smoothing_factor: 0.0
optim: adamw_torch
optim_args: None
adafactor: False
group_by_length: False
length_column_name: length
ddp_find_unused_parameters: None
ddp_bucket_cap_mb: None
ddp_broadcast_buffers: False
dataloader_pin_memory: True
dataloader_persistent_workers: False
skip_memory_metrics: True
use_legacy_prediction_loop: False
push_to_hub: False
resume_from_checkpoint: None
hub_model_id: None
hub_strategy: every_save
hub_private_repo: False
hub_always_push: False
gradient_checkpointing: False
gradient_checkpointing_kwargs: None
include_inputs_for_metrics: False
eval_do_concat_batches: True
fp16_backend: auto
push_to_hub_model_id: None
push_to_hub_organization: None
mp_parameters:
auto_find_batch_size: False
full_determinism: False
torchdynamo: None
ray_scope: last
ddp_timeout: 1800
torch_compile: False
torch_compile_backend: None
torch_compile_mode: None
dispatch_batches: None
split_batches: None
include_tokens_per_second: False
include_num_input_tokens_seen: False
neftune_noise_alpha: None
optim_target_modules: None
batch_eval_metrics: False
eval_on_start: False
use_liger_kernel: False
eval_use_gather_object: False
batch_sampler: no_duplicates
multi_dataset_batch_sampler: proportional

Training Logs

Epoch	Step	Training Loss	Validation Loss
0.4505	20000	0.4569	0.2730
0.9010	40000	0.2368	0.1940
1.3515	60000	0.1738	0.1548
1.8020	80000	0.114	0.1328
2.2525	100000	0.0843	0.1115
2.7030	120000	0.0564	0.0972
3.1535	140000	0.0404	0.0841
3.6040	160000	0.0282	0.0760
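The Epoch column can be cross-checked against the dataset size and batch size reported earlier. The sketch below assumes a single device, no gradient accumulation, and that the last partial batch is kept (dataloader_drop_last: False); epoch_at is an illustrative helper name:

```python
import math

TRAIN_SAMPLES = 710_308  # training set size from the card
BATCH_SIZE = 16          # per_device_train_batch_size

# Number of optimizer steps needed to see every sample once.
steps_per_epoch = math.ceil(TRAIN_SAMPLES / BATCH_SIZE)


def epoch_at(step):
    """Fractional epoch reached after a given number of steps."""
    return round(step / steps_per_epoch, 4)


print(steps_per_epoch)   # 44395
print(epoch_at(20000))   # 0.4505
```

Under these assumptions the computed values line up with the logged epochs, and four epochs come to about 177,580 steps in total.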

Framework Versions

Python: 3.10.15
Sentence Transformers: 3.2.0
Transformers: 4.45.2
PyTorch: 2.4.1
Accelerate: 1.0.1
Datasets: 3.0.1
Tokenizers: 0.20.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
