---
license: cc-by-4.0
license_name: creative-commons-attribution-4.0-international
license_link: https://creativecommons.org/licenses/by/4.0/
pipeline_tag: image-classification
tags:
- knowledge-distillation
- modular-neural-architecture
---
# m2mKD

This repository contains the checkpoints for [m2mKD: Module-to-Module Knowledge Distillation for Modular Transformers](https://arxiv.org/abs/2402.16918).

## Released checkpoints

For usage of the checkpoints listed below, please refer to the instructions in our [GitHub repo](https://github.com/kamanphoebe/m2mKD).
- `nac_scale_tinyimnet.pth`/`nac_scale_imnet.pth`: NAC models with a scale-free prior trained using m2mKD on Tiny-ImageNet and ImageNet, respectively.
- `vmoe_base.pth`: V-MoE-Base model trained using m2mKD.
- `FT_huge`: a directory containing DeiT-Huge teacher modules for NAC model training.
- `nac_tinyimnet_students`: a directory containing NAC student modules for Tiny-ImageNet.
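
As a rough sketch of how one of these checkpoints might be fetched and inspected, assuming this repository's Hub ID is `kamanphoebe/m2mKD` (an assumption, not stated above) and that the `.pth` files are standard PyTorch checkpoints:

```python
# Hedged example: download one released checkpoint from the Hub and peek inside it.
# The repo_id below is an assumption; substitute this repository's actual ID.
import torch
from huggingface_hub import hf_hub_download

ckpt_path = hf_hub_download(
    repo_id="kamanphoebe/m2mKD",  # assumed Hub repository ID
    filename="vmoe_base.pth",     # one of the released checkpoints listed above
)

# Load on CPU; the file is assumed to contain a state dict (or a dict wrapping one).
state = torch.load(ckpt_path, map_location="cpu")
if isinstance(state, dict):
    print(list(state.keys())[:10])  # inspect the top-level keys
```

Model construction and the exact state-dict layout depend on the m2mKD codebase, so please follow the GitHub instructions above when restoring the full models.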

## Acknowledgement

Our implementation is mainly based on [Deep-Incubation](https://github.com/LeapLabTHU/Deep-Incubation). 

## Citation

If you use the checkpoints, please cite our paper:
```bibtex
@misc{lo2024m2mkd,
    title={m2mKD: Module-to-Module Knowledge Distillation for Modular Transformers}, 
    author={Ka Man Lo and Yiming Liang and Wenyu Du and Yuantao Fan and Zili Wang and Wenhao Huang and Lei Ma and Jie Fu},
    year={2024},
    eprint={2402.16918},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}
```