bikalnetomi/RLHF-PPO-RewardModel-LLama3-3B-v1

Text Generation

Generated from Trainer

text-generation-inference

1 contributor

History: 3 commits

Latest commit: Create config.json (28dfa45, verified, 9 months ago)

File                       Size       LFS  Last commit                                    Updated
.gitattributes             1.57 kB         bikalnetomi/RLHF-PPO-RewardModel-LLama3-3B-v1  9 months ago
README.md                  1.77 kB         bikalnetomi/RLHF-PPO-RewardModel-LLama3-3B-v1  9 months ago
adapter_config.json        774 Bytes       bikalnetomi/RLHF-PPO-RewardModel-LLama3-3B-v1  9 months ago
adapter_model.safetensors  73.4 MB    LFS  bikalnetomi/RLHF-PPO-RewardModel-LLama3-3B-v1  9 months ago
config.json                878 Bytes       Create config.json                             9 months ago
special_tokens_map.json    434 Bytes       bikalnetomi/RLHF-PPO-RewardModel-LLama3-3B-v1  9 months ago
tokenizer.json             17.2 MB    LFS  bikalnetomi/RLHF-PPO-RewardModel-LLama3-3B-v1  9 months ago
tokenizer_config.json      54.7 kB         bikalnetomi/RLHF-PPO-RewardModel-LLama3-3B-v1  9 months ago
training_args.bin          5.37 kB    LFS  bikalnetomi/RLHF-PPO-RewardModel-LLama3-3B-v1  9 months ago

training_args.bin: Detected Pickle imports (9)
- torch.device
- transformers.trainer_utils.IntervalStrategy
- transformers.trainer_utils.HubStrategy
- transformers.trainer_pt_utils.AcceleratorConfig
- trl.trainer.reward_config.RewardConfig
- accelerate.state.PartialState
- accelerate.utils.dataclasses.DistributedType
- transformers.training_args.OptimizerNames
- transformers.trainer_utils.SchedulerType
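
The pickle imports reported for training_args.bin can be reproduced locally without unpickling the file, which avoids executing any code it references. The following is a minimal sketch using only the Python standard library. The helper names `read_pickle_bytes` and `list_pickle_imports` are hypothetical, the `STACK_GLOBAL` handling is a simple heuristic that ignores memoized strings, and it assumes the file is either a raw pickle stream or a zip archive (as written by recent `torch.save`) with an entry whose name ends in `data.pkl`.

```python
import pickle
import pickletools
import zipfile
from collections import OrderedDict


def read_pickle_bytes(path):
    """Return the raw pickle stream stored at `path` (hypothetical helper).

    Assumption: a zip archive holds the pickle in an entry ending in 'data.pkl';
    any other file is treated as a plain pickle stream.
    """
    if zipfile.is_zipfile(path):
        with zipfile.ZipFile(path) as zf:
            for name in zf.namelist():
                if name.endswith("data.pkl"):
                    return zf.read(name)
        raise ValueError("no data.pkl entry found in archive")
    with open(path, "rb") as f:
        return f.read()


def list_pickle_imports(data):
    """Collect 'module.name' references from a pickle stream without loading it."""
    imports = set()
    strings = []  # string arguments seen so far, used to resolve STACK_GLOBAL
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name == "GLOBAL":
            # Protocols 0-3: the argument is "module name" separated by a space.
            module, name = arg.split(" ", 1)
            imports.add(f"{module}.{name}")
        elif opcode.name == "STACK_GLOBAL":
            # Protocol 4+: module and name were pushed as the two preceding strings.
            # Heuristic only: strings fetched back from the memo are not tracked.
            if len(strings) >= 2:
                imports.add(f"{strings[-2]}.{strings[-1]}")
        elif isinstance(arg, str):
            strings.append(arg)
    return imports


# Demo on an in-memory pickle, so nothing needs to be downloaded.
demo = pickle.dumps(OrderedDict(a=1), protocol=4)
print(sorted(list_pickle_imports(demo)))

# For a local copy of the file (path is an assumption):
# print(sorted(list_pickle_imports(read_pickle_bytes("training_args.bin"))))
```

Scanning opcodes rather than calling `pickle.load` is the point of the design: the listing is produced by reading the byte stream only, so no class from `torch`, `transformers`, `trl`, or `accelerate` has to be importable.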