extra_gated_prompt: You agree not to use the dataset for commercial purposes.
extra_gated_fields:
  Company: text
  Name: text
  Purpose_of_data: text
  Contact_Info: text
  I want to use this model for:
    type: select
    options:
      - Research
      - Education
      - label: Other
        value: other
  I agree to use this dataset for non-commercial use ONLY: checkbox
extra_gated_heading: Acknowledge license to accept the repository
extra_gated_description: Our team may take 2-3 days to process your request
extra_gated_button_content: Acknowledge license

Acquisition Process

  • Please fill out all required information truthfully.
  • Personal verification will be completed within two days.
  • Once approved, you will be granted access to download the content.

CPICANN Pretrained Models Repository

This repository contains the pretrained models for the method described in the CPICANN paper, available on GitHub.
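Once access has been granted, the files can be fetched programmatically. The following is a minimal sketch, not an official download script. It assumes the `huggingface_hub` package is installed, that you are logged in (or pass a token), and that the repository id is `caobin/pretrainCPICANN`. That id is inferred from the page header and is an assumption, and `download_pretrained` is a hypothetical helper name.

```python
# ASSUMPTION: repository id inferred from the page header; adjust if it differs.
REPO_ID = "caobin/pretrainCPICANN"


def download_pretrained(local_dir="pretrainCPICANN", token=None):
    """Download every file of the gated repository into `local_dir`.

    `token` is a Hugging Face access token for an account whose access
    request has been approved. If it is None, huggingface_hub falls back
    to the token stored by `huggingface-cli login` or the HF_TOKEN
    environment variable.
    """
    # Imported lazily so that importing this module does not require the package.
    from huggingface_hub import snapshot_download

    # Returns the local folder path that holds the downloaded snapshot.
    return snapshot_download(repo_id=REPO_ID, local_dir=local_dir, token=token)


if __name__ == "__main__":
    print("Files downloaded to:", download_pretrained())
```

The download fails with an authorization error until the access request described under "Acquisition Process" has been approved.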

Ablation Study on Self-Attention Configuration

In Table 1 below, we present the results of an ablation study on self-attention configuration:

| No. | Model Configuration | Trainable Parameters | Accuracy on Validation Set (%) |
|-----|---------------------|----------------------|--------------------------------|
| 1 | ED: 128, HN: 8, SL: 4 | 13,725,793 | 86.16 |
| 2 | ED: 128, HN: 8, SL: 6 (CPICANN) | 14,385,505 | 87.50 |
| 3 | ED: 128, HN: 8, SL: 8 | 15,045,217 | 86.94 |
| 4 | ED: 256, HN: 8, SL: 6 | 17,243,873 | 85.51 |
| 5 | ED: 384, HN: 8, SL: 6 | 20,872,161 | 86.14 |
| 6 | ED: 128, HN: 4, SL: 6 | 14,385,505 | 86.43 |
| 7 | ED: 384, HN: 6, SL: 6 | 20,872,161 | 85.78 |

Guided by validation-set accuracy, the self-attention module was tuned over the following ranges: 4, 6, or 8 self-attention layers; embedding dimensions of 128, 256, or 384; and 4, 6, or 8 attention heads. The results are listed in Table 1, where ED denotes the embedding dimension, HN the head number, and SL the number of self-attention layers. The ablation study identifies ED: 128, HN: 8, SL: 6 as the optimal configuration of CPICANN, with a validation accuracy of 87.50%.
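The selection rule can be illustrated by encoding the rows of Table 1 and taking the configuration with the highest validation accuracy. This is only a sketch for reading the table. The class and field names (`AblationRun`, `embed_dim`, and so on) are hypothetical and do not come from the CPICANN code base; the numbers are copied from Table 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AblationRun:
    number: int            # row number in Table 1
    embed_dim: int         # ED
    num_heads: int         # HN
    num_layers: int        # SL
    trainable_params: int
    val_accuracy: float    # percent


# Values copied from Table 1.
RUNS = [
    AblationRun(1, 128, 8, 4, 13_725_793, 86.16),
    AblationRun(2, 128, 8, 6, 14_385_505, 87.50),
    AblationRun(3, 128, 8, 8, 15_045_217, 86.94),
    AblationRun(4, 256, 8, 6, 17_243_873, 85.51),
    AblationRun(5, 384, 8, 6, 20_872_161, 86.14),
    AblationRun(6, 128, 4, 6, 14_385_505, 86.43),
    AblationRun(7, 384, 6, 6, 20_872_161, 85.78),
]

# Pick the configuration with the highest validation accuracy.
best = max(RUNS, key=lambda run: run.val_accuracy)

# Parameter cost of one extra self-attention layer at ED: 128, HN: 8,
# derived from rows 1 and 3 (SL: 4 versus SL: 8).
by_number = {run.number: run for run in RUNS}
extra_per_layer = (
    by_number[3].trainable_params - by_number[1].trainable_params
) // (by_number[3].num_layers - by_number[1].num_layers)

print(f"Best run: No. {best.number} "
      f"(ED: {best.embed_dim}, HN: {best.num_heads}, SL: {best.num_layers})")
print(f"Parameters per additional layer at ED 128: {extra_per_layer:,}")
```

The table also shows that changing only the head number (rows 2 and 6, rows 5 and 7) leaves the trainable parameter count unchanged, so the accuracy differences between those rows are not explained by model size.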

CNNonly and ATTENTIONonly Models

Two additional models, CNNonly and ATTENTIONonly, isolate the CNN and attention parts of CPICANN, respectively.

Datasets Tested

CPICANN is evaluated on four distinct datasets, denoted D1, D2, D3, and D4, with the following characteristics:

  • D1: 0% background ratio and Gaussian noise (σ=0.25) (the setting chosen in the paper)
  • D2: 3% background ratio and Gaussian noise (σ=0.25)
  • D3: 0% background ratio and Gaussian noise (σ=1)
  • D4: 0% background ratio and Gaussian noise (σ=3)
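To make the four settings concrete, the sketch below stores them in a small table and applies them to a synthetic one-peak pattern with NumPy. It is an illustration under stated assumptions, not the authors' data pipeline: the names `DatasetSetting` and `perturb` are hypothetical, the background is assumed to be a constant offset equal to the background ratio times the maximum intensity, and the intensity scale (peak height 100) is also assumed.

```python
from dataclasses import dataclass

import numpy as np


@dataclass(frozen=True)
class DatasetSetting:
    name: str
    background_ratio: float  # fraction, e.g. 0.03 for 3%
    noise_sigma: float       # standard deviation of the Gaussian noise


# Settings as listed above for D1 to D4.
SETTINGS = {
    "D1": DatasetSetting("D1", 0.00, 0.25),
    "D2": DatasetSetting("D2", 0.03, 0.25),
    "D3": DatasetSetting("D3", 0.00, 1.0),
    "D4": DatasetSetting("D4", 0.00, 3.0),
}


def perturb(pattern, setting, rng):
    """Return `pattern` with background and Gaussian noise added.

    ASSUMPTION: the background is modelled as a constant offset of
    background_ratio * max(pattern); the source does not define it.
    """
    background = setting.background_ratio * pattern.max()
    noise = rng.normal(0.0, setting.noise_sigma, size=pattern.shape)
    return pattern + background + noise


# Synthetic example: a single Gaussian peak on an assumed 0-100 intensity scale.
rng = np.random.default_rng(seed=0)
two_theta = np.linspace(10.0, 80.0, 4500)
clean = 100.0 * np.exp(-0.5 * ((two_theta - 30.0) / 0.1) ** 2)
noisy = {name: perturb(clean, setting, rng) for name, setting in SETTINGS.items()}
```

Under these assumptions, D2 differs from D1 only by the raised baseline, while D3 and D4 keep a zero background and increase the noise level.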

Contributions and suggestions are always welcome. You can also contact the authors about research collaboration.