---
base_model: jh8416/my_ewha_model_2024_1
datasets: []
language: []
library_name: sentence-transformers
pipeline_tag: sentence-similarity
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:97764
  - loss:MultipleNegativesRankingLoss
widget:
  - source_sentence: 미디어 언어 중간시험 중간시험 강평 제부
    sentences:
      - 적분의 정의교육관B동  교시이 강의에서는 리만적분의 정의와 유용한 여러 가지 적분법
      - 창립 주년 기념일 성격심리학 입문 제부 성향적 영역 제부 성향적 영역
      - career paths for the DIS graduates Ewha cyber campus How to
  - source_sentence: >-
      hierarchies through relationality in the ethics of care International
      Journal of
    sentences:
      - economy culture and law to ethics
      - >-
        Instructor Bae Movie WIT Values ethics and advocacy Lecture group
        discussion
      - 깊이 이해할  있는 지름길일 것입니다
  - source_sentence: 주차별 강의 내용은 사정에 따라 변동될
    sentences:
      - 조순경 여성직종의 외주화와 간접차별 KTX 승무원 간접고용을 통해 
      - 욕구를 만족시키기 위해 기본적인 마케팅전략의 개념과 이론을 학습하고 여러 성공적인 마케팅전략의
      -  리더십 성공사례 연구 내용 정신전력교육구술평가 임관종합평가 대비 정신전력 교육 조직속에서
  - source_sentence: 상형미를 기반으로 하는 조형미를 이해하여 궁극적으로는 서화동원을 이해하고 동양예술에서 추구한 획과
    sentences:
      - 장흔들리는 마음 수업자료스타트업얼라이언스가이드북시리즈초보창업자를위한 HR가이드북  장초기 단계 재무관리 핵심 공략 장초기
      - 중간시험 리만적분 연습문제
      - 같은 다층적이며 종합적인 접근을 통해 궁극적으로는 생태계가 유지되고 작동하는 원리 그리고
  - source_sentence: 제작 과정 이해  실습 석고 몰드 캐스팅 기법을 이용한 개별
    sentences:
      - 선거 Mould 제작 Slip Casting 석고원형 제작 원형완성  검사 Project
      - 역사적 고찰 세기 교수법 초급 피아노 교수법 기초  유아과정 중급
      - 발전과제 학교문화와 풍토 주교재  학교문화의 개념  특징 조직문화 이론Ouchi의
---

SentenceTransformer based on jh8416/my_ewha_model_2024_1

This is a sentence-transformers model finetuned from jh8416/my_ewha_model_2024_1. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: jh8416/my_ewha_model_2024_1
  • Maximum Sequence Length: 128 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity

Model Sources

  • Documentation: Sentence Transformers Documentation (https://sbert.net)
  • Repository: Sentence Transformers on GitHub (https://github.com/UKPLab/sentence-transformers)
  • Hugging Face: Sentence Transformers on Hugging Face (https://huggingface.co/models?library=sentence-transformers)

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 128, 'do_lower_case': False}) with Transformer model: RobertaModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
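
The modules above can be inspected directly from the loaded model. A minimal sketch (assuming the checkpoint loads as published) that reads back the sequence length, embedding dimension, and pooling mode listed in the architecture:

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("jh8416/my_ewha_model_2024_1")
print(model.max_seq_length)                      # 128, from the Transformer module
print(model.get_sentence_embedding_dimension())  # 768, from the Pooling module
print(model[1].get_pooling_mode_str())           # "mean": average over token embeddings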

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("jh8416/my_ewha_model_2024_1")
# Run inference
sentences = [
    '제작 과정 이해 및 실습 석고 몰드 캐스팅 기법을 이용한 개별',
    '선거 Mould 제작 Slip Casting 석고원형 제작 원형완성 및 검사 Project',
    '발전과제 학교문화와 풍토 주교재 장 학교문화의 개념 및 특징 조직문화 이론Ouchi의',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
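
The same embeddings support the semantic-search use case mentioned above. A small retrieval sketch, reusing fragments from the widget examples as a stand-in corpus (the query string is a hypothetical example, not taken from the training data):

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("jh8416/my_ewha_model_2024_1")

corpus = [
    '중간시험 리만적분 연습문제',
    '욕구를 만족시키기 위해 기본적인 마케팅전략의 개념과 이론을 학습하고',
    '석고 몰드 캐스팅 기법을 이용한 개별',
]
query = '리만적분의 정의와 적분법'  # hypothetical query

corpus_embeddings = model.encode(corpus)
query_embedding = model.encode([query])

# model.similarity applies the model's similarity function (cosine) and returns a torch tensor
scores = model.similarity(query_embedding, corpus_embeddings)  # shape [1, 3]
best = scores[0].argmax().item()
print(corpus[best], scores[0, best].item())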

Training Details

Training Dataset

Unnamed Dataset

  • Size: 97,764 training samples
  • Columns: sentence_0 and sentence_1
  • Approximate statistics based on the first 1000 samples:
    • sentence_0: string; min 6 tokens, mean 17.88 tokens, max 30 tokens
    • sentence_1: string; min 3 tokens, mean 18.09 tokens, max 41 tokens
  • Samples:
    • sentence_0: 자신을 닫아놓으면서도 다른 한 편으론 보석처럼 반짝이는 돌은 매력이 있다
      sentence_1: 작품의 용도 설정 보석함 필구함 기타 IV
    • sentence_0: 자신을 닫아놓으면서도 다른 한 편으론 보석처럼 반짝이는 돌은 매력이 있다
      sentence_1: 발표 및 제출인쇄물A포맷 도안밑그림 이미지 크기보석함과 기타 함의 크기를 기준으로 함
    • sentence_0: 자신을 닫아놓으면서도 다른 한 편으론 보석처럼 반짝이는 돌은 매력이 있다
      sentence_1: 밑그림에 채색 및 기법 표시 보석함 크기외경 xx
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
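
These parameters map directly onto the MultipleNegativesRankingLoss constructor in sentence-transformers. A minimal sketch of how the loss would be instantiated for this model (shown for illustration; the actual training script is not published):

from sentence_transformers import SentenceTransformer, losses, util

model = SentenceTransformer("jh8416/my_ewha_model_2024_1")
# scale=20.0 and cosine similarity correspond to the parameters listed above
loss = losses.MultipleNegativesRankingLoss(model, scale=20.0, similarity_fct=util.cos_sim)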
    

Training Hyperparameters

Non-Default Hyperparameters

  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • num_train_epochs: 1
  • multi_dataset_batch_sampler: round_robin
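
A hedged sketch of a training run that would reproduce these non-default settings with the sentence-transformers v3 trainer API. The dataset below is a hypothetical placeholder, since the 97,764 (sentence_0, sentence_1) pairs are not published, and the output_dir name is illustrative:

from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
    losses,
)

model = SentenceTransformer("jh8416/my_ewha_model_2024_1")

# Hypothetical two-pair placeholder for the unpublished (sentence_0, sentence_1) data
train_dataset = Dataset.from_dict({
    "sentence_0": ["미디어 언어 중간시험 중간시험 강평", "주차별 강의 내용은 사정에 따라 변동될"],
    "sentence_1": ["중간시험 리만적분 연습문제", "욕구를 만족시키기 위해 기본적인 마케팅전략의 개념과 이론을 학습하고"],
})

args = SentenceTransformerTrainingArguments(
    output_dir="my_ewha_model_2024_1",  # illustrative output path
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    num_train_epochs=1,
    multi_dataset_batch_sampler="round_robin",
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    loss=losses.MultipleNegativesRankingLoss(model),  # defaults: scale=20.0, cosine similarity
)
trainer.train()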

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: no
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 1
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • eval_use_gather_object: False
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin

Training Logs

Epoch Step Training Loss
0.0818 500 1.0712
0.1636 1000 0.9295
0.2455 1500 0.8423
0.3273 2000 0.8157
0.4091 2500 0.794
0.4909 3000 0.7058
0.5727 3500 0.6726
0.6546 4000 0.6664
0.7364 4500 0.6302
0.8182 5000 0.6029
0.9000 5500 0.5936
0.9818 6000 0.5873

Framework Versions

  • Python: 3.12.0
  • Sentence Transformers: 3.0.1
  • Transformers: 4.43.3
  • PyTorch: 2.4.0+cu121
  • Accelerate: 0.33.0
  • Datasets: 2.20.0
  • Tokenizers: 0.19.1
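
To approximate this environment, the listed versions can be pinned directly; the cu121 index URL assumes a CUDA 12.1 machine, and CPU-only installs can drop the second line:

pip install sentence-transformers==3.0.1 transformers==4.43.3 accelerate==0.33.0 datasets==2.20.0 tokenizers==0.19.1
pip install torch==2.4.0 --index-url https://download.pytorch.org/whl/cu121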

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply}, 
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}