`aux_loss_alpha` should be 1e-4 instead of 1e-3?

#60

According to Section 4.2 of the DeepSeek-V3 technical report:

> For the balance loss, we set 𝛼 to 0.0001
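As a rough illustration of the proposed change, the sketch below sets `aux_loss_alpha` to the quoted value in a JSON config. The helper name `with_paper_alpha` and the one-key example config are hypothetical stand-ins; the real `config.json` has many other fields, which this approach would leave untouched.

```python
import json

# Value quoted from the report: "we set alpha to 0.0001"
PAPER_ALPHA = 1e-4


def with_paper_alpha(config_text: str) -> str:
    """Return the config JSON text with aux_loss_alpha set to PAPER_ALPHA."""
    config = json.loads(config_text)
    config["aux_loss_alpha"] = PAPER_ALPHA
    return json.dumps(config, indent=2)


# Hypothetical one-key stand-in for config.json, using the current 1e-3 value
example = '{"aux_loss_alpha": 0.001}'
updated = with_paper_alpha(example)
print(updated)
```

All other keys pass through unchanged, so the resulting diff against the original file should touch only the `aux_loss_alpha` line (assuming the original uses the same indentation).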
