bikalnetomi/RLHF-PPO-PPOModel-LLama3-1B-v1.0

Tags: Text Generation, Generated from Trainer, text-generation-inference, Inference Endpoints


1 contributor

History: 5 commits

Latest commit: End of training (98e1741, verified, 27 days ago)

File                      Size       LFS  Last commit
.gitattributes            1.57 kB         End of training, 27 days ago
README.md                 2.44 kB         End of training, 27 days ago
config.json               926 Bytes       End of training, 27 days ago
generation_config.json    124 Bytes       End of training, 27 days ago
model.safetensors         4.94 GB    LFS  End of training, 27 days ago
special_tokens_map.json   434 Bytes       End of training, 27 days ago
tokenizer.json            17.2 MB    LFS  End of training, 27 days ago
tokenizer_config.json     54.8 kB         End of training, 27 days ago
training_args.bin         6.07 kB    LFS  End of training, 27 days ago

training_args.bin: Detected Pickle imports (9)
- transformers.trainer_utils.IntervalStrategy
- transformers.trainer_utils.HubStrategy
- transformers.trainer_pt_utils.AcceleratorConfig
- trl.trainer.ppo_config.PPOConfig
- accelerate.utils.dataclasses.DistributedType
- transformers.training_args.OptimizerNames
- accelerate.state.PartialState
- torch.device
- transformers.trainer_utils.SchedulerType
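
The pickle imports reported for training_args.bin can, in principle, be checked locally without executing the pickle. The sketch below is one way to do that with the standard-library pickletools module. It is not the Hub's scanner, and the helper name list_pickle_globals is hypothetical. It assumes a raw pickle byte stream written with protocol 3 or lower, where each import appears as a GLOBAL opcode. Newer protocols use STACK_GLOBAL, which this sketch does not handle. A file written by torch.save is likely a zip archive with the pickle stored inside it, so the pickle bytes would need to be extracted first.

```python
import pickle
import pickletools
from fractions import Fraction


def list_pickle_globals(data: bytes) -> list[str]:
    """Return the dotted names a pickle would import, without unpickling it.

    Only GLOBAL opcodes (pickle protocols 0-3) are inspected; STACK_GLOBAL
    (protocol 4+) is ignored in this simplified sketch.
    """
    found = []
    # genops walks the opcode stream and never imports or runs anything.
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name == "GLOBAL":
            # pickletools reports the argument as "module name".
            dotted = arg.replace(" ", ".", 1)
            if dotted not in found:
                found.append(dotted)
    return found


# Illustrative data only: a small object pickled with protocol 2.
sample = pickle.dumps({"lr": Fraction(1, 3)}, protocol=2)
print(list_pickle_globals(sample))
```

Reading opcodes instead of calling pickle.load matters here because loading a pickle imports and may execute whatever the file references.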