  1. README.md +34 -0
README.md CHANGED
@@ -18,6 +18,40 @@ license_link: https://huggingface.co/tencent/Tencent-Hunyuan-Large/blob/main/LIC
  </p><p align="center">
  <a href="https://arxiv.org/abs/2411.02265" style="color: blue;"><b>Technical Report</b></a>&nbsp&nbsp|&nbsp&nbsp <a href="https://huggingface.co/spaces/tencent/Hunyuan-Large"><b>Demo</b></a>&nbsp&nbsp&nbsp|&nbsp&nbsp <a href="https://cloud.tencent.com/document/product/851/112032" style="color: blue;"><b>Tencent Cloud TI</b></a>&nbsp&nbsp&nbsp</p>

+
+
+ <p>
+ <table align="center">
+ <tbody>
+ <tr align="center">
+ <td align="center" colspan="3"><strong>Download Models</strong></td>
+ </tr>
+ <tr align="center">
+ <td align="center" style="width: 200px;"><strong>Models</strong></td>
+ <td align="center" style="width: 400px;"><strong>Huggingface Download URL</strong></td>
+ <td align="center" style="width: 400px;"><strong>Tencent Cloud Download URL</strong></td>
+ </tr>
+ <tr align="center">
+ <td align="center" style="width: 200px;">Hunyuan-A52B-Instruct-FP8</td>
+ <td style="width: 400px;"><a href="https://huggingface.co/tencent/Tencent-Hunyuan-Large/tree/main/Hunyuan-A52B-Instruct-FP8">Hunyuan-A52B-Instruct-FP8</a></td>
+ <td style="width: 400px;"><a href="https://hunyuan-large-model-1258344703.cos.ap-guangzhou.myqcloud.com/Hunyuan-A52B-Instruct-128k-fp8.zip">Hunyuan-A52B-Instruct-FP8</a></td>
+ </tr>
+ <tr align="center">
+ <td align="center" style="width: 200px;">Hunyuan-A52B-Instruct</td>
+ <td style="width: 400px;"><a href="https://huggingface.co/tencent/Tencent-Hunyuan-Large/tree/main/Hunyuan-A52B-Instruct">Hunyuan-A52B-Instruct</a></td>
+ <td style="width: 400px;"><a href="https://hunyuan-large-model-1258344703.cos.ap-guangzhou.myqcloud.com/Hunyuan-A52B-Instruct-128k.zip">Hunyuan-A52B-Instruct</a></td>
+ </tr>
+ <tr align="center">
+ <td align="center" style="width: 200px;">Hunyuan-A52B-Pretrain</td>
+ <td style="width: 400px;"><a href="https://huggingface.co/tencent/Tencent-Hunyuan-Large/tree/main/Hunyuan-A52B-Pretrain">Hunyuan-A52B-Pretrain</a></td>
+ <td style="width: 400px;"><a href="https://hunyuan-large-model-1258344703.cos.ap-guangzhou.myqcloud.com/Hunyuan-A52B-Pretrain-256k.zip">Hunyuan-A52B-Pretrain</a></td>
+ </tr>
+ </tbody>
+ </table>
+ </p>
+
+
+
  ### Model Introduction

  With the rapid development of artificial intelligence technology, large language models (LLMs) have made significant progress in fields such as natural language processing, computer vision, and scientific tasks. However, as the scale of these models increases, optimizing resource consumption while maintaining high performance has become a key challenge. To address this challenge, we have explored Mixture of Experts (MoE) models. The newly unveiled Hunyuan-Large (Hunyuan-MoE-A52B) model is currently the largest open-source Transformer-based MoE model in the industry, with 389 billion total parameters and 52 billion active parameters.
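
The download table added in this change links each model variant to a subfolder of the `tencent/Tencent-Hunyuan-Large` repository. As a rough illustration of fetching just one of those subfolders, here is a minimal sketch. It assumes the `huggingface_hub` package is installed and that the subfolder names match the links in the table. The helper names `allow_patterns_for` and `download_variant` are hypothetical and are not part of the README.

```python
# Repository and subfolder names are taken from the Hugging Face links in the
# download table; everything else here is an illustrative assumption.
REPO_ID = "tencent/Tencent-Hunyuan-Large"

VARIANTS = (
    "Hunyuan-A52B-Instruct-FP8",
    "Hunyuan-A52B-Instruct",
    "Hunyuan-A52B-Pretrain",
)


def allow_patterns_for(variant: str) -> list[str]:
    """Return a glob list that selects only the files of one variant."""
    if variant not in VARIANTS:
        raise ValueError(f"unknown variant: {variant!r}")
    # The trailing "/" keeps "Hunyuan-A52B-Instruct" from also matching
    # the "Hunyuan-A52B-Instruct-FP8" subfolder.
    return [f"{variant}/*"]


def download_variant(variant: str, local_dir: str) -> str:
    """Download one variant's subfolder (hypothetical helper).

    Imported lazily so the pattern helper works without huggingface_hub.
    """
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=REPO_ID,
        allow_patterns=allow_patterns_for(variant),
        local_dir=local_dir,
    )
```

Calling `download_variant("Hunyuan-A52B-Instruct-FP8", "./hunyuan-fp8")` would then fetch only that subfolder, provided enough disk space is available for the weights.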