The Cluj-Napoca series is mostly an experiment.
Steps to replicate:
Use laserQlora.ipynb from cognitivecomputations/laserRMT to determine which layers should be eliminated.
Adapt the script for Mihaiii/Pallas-0.5 by replacing model_name = "mistralai/Mistral-7B-v0.1" with model_name = "Mihaiii/Pallas-0.5", and layer_numbers = list(range(31, -1, -1)) with layer_numbers = list(range(59, -1, -1)), 59 being the index of the last layer in Pallas-0.5.
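As a minimal sketch, the two substitutions look like this. The `num_layers` variable is a hypothetical convenience added here for illustration; the notebook itself hard-codes the range.

```python
# Values to substitute in laserQlora.ipynb.
model_name = "Mihaiii/Pallas-0.5"

# Hypothetical helper (not in the notebook): Pallas-0.5 has layers 0..59.
num_layers = 60

# Scan from the last layer down to the first, equivalent to
# list(range(59, -1, -1)).
layer_numbers = list(range(num_layers - 1, -1, -1))
```

Deriving the range from a layer count makes it harder to get the upper bound off by one when adapting the script to another model.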
The scan prints the following output:
Calculating Signal to Noise Ratio at layer model.layers.0.self_attn.k_proj Signal to Noise Ratio at layer model.layers.0.self_attn.k_proj = 0.34616405651386795
Calculating Signal to Noise Ratio at layer model.layers.0.self_attn.v_proj Signal to Noise Ratio at layer model.layers.0.self_attn.v_proj = 15.35865625718883
Calculating Signal to Noise Ratio at layer model.layers.1.self_attn.k_proj Signal to Noise Ratio at layer model.layers.1.self_attn.k_proj = 0.7206548634038767
Calculating Signal to Noise Ratio at layer model.layers.1.self_attn.v_proj Signal to Noise Ratio at layer model.layers.1.self_attn.v_proj = 3.2591477935986704
Calculating Signal to Noise Ratio at layer model.layers.2.self_attn.k_proj Signal to Noise Ratio at layer model.layers.2.self_attn.k_proj = 0.5311484408046
Calculating Signal to Noise Ratio at layer model.layers.2.self_attn.v_proj Signal to Noise Ratio at layer model.layers.2.self_attn.v_proj = 5.109442630946979
Calculating Signal to Noise Ratio at layer model.layers.3.self_attn.k_proj Signal to Noise Ratio at layer model.layers.3.self_attn.k_proj = 0.4341506575442939
Calculating Signal to Noise Ratio at layer model.layers.3.self_attn.v_proj Signal to Noise Ratio at layer model.layers.3.self_attn.v_proj = 7.519101868970723
Calculating Signal to Noise Ratio at layer model.layers.4.self_attn.k_proj Signal to Noise Ratio at layer model.layers.4.self_attn.k_proj = 0.43156326950369167
Calculating Signal to Noise Ratio at layer model.layers.4.self_attn.v_proj Signal to Noise Ratio at layer model.layers.4.self_attn.v_proj = 3.3721301592636337
Calculating Signal to Noise Ratio at layer model.layers.5.self_attn.k_proj Signal to Noise Ratio at layer model.layers.5.self_attn.k_proj = 0.4146416750704863
Calculating Signal to Noise Ratio at layer model.layers.5.self_attn.v_proj Signal to Noise Ratio at layer model.layers.5.self_attn.v_proj = 17.88975706822606
Calculating Signal to Noise Ratio at layer model.layers.6.self_attn.k_proj Signal to Noise Ratio at layer model.layers.6.self_attn.k_proj = 0.4311999332093549
Calculating Signal to Noise Ratio at layer model.layers.6.self_attn.v_proj Signal to Noise Ratio at layer model.layers.6.self_attn.v_proj = 32.20151585537659
Calculating Signal to Noise Ratio at layer model.layers.7.self_attn.k_proj Signal to Noise Ratio at layer model.layers.7.self_attn.k_proj = 0.4152094643425305
Calculating Signal to Noise Ratio at layer model.layers.7.self_attn.v_proj Signal to Noise Ratio at layer model.layers.7.self_attn.v_proj = inf
Calculating Signal to Noise Ratio at layer model.layers.8.self_attn.k_proj Signal to Noise Ratio at layer model.layers.8.self_attn.k_proj = 0.3623575163597641
Calculating Signal to Noise Ratio at layer model.layers.8.self_attn.v_proj Signal to Noise Ratio at layer model.layers.8.self_attn.v_proj = inf
Calculating Signal to Noise Ratio at layer model.layers.9.self_attn.k_proj Signal to Noise Ratio at layer model.layers.9.self_attn.k_proj = 0.44609016848062005
Calculating Signal to Noise Ratio at layer model.layers.9.self_attn.v_proj Signal to Noise Ratio at layer model.layers.9.self_attn.v_proj = 1230.8526493095455
Calculating Signal to Noise Ratio at layer model.layers.10.self_attn.k_proj Signal to Noise Ratio at layer model.layers.10.self_attn.k_proj = 0.5036779136885361
Calculating Signal to Noise Ratio at layer model.layers.10.self_attn.v_proj Signal to Noise Ratio at layer model.layers.10.self_attn.v_proj = 1225.9565161503585
Calculating Signal to Noise Ratio at layer model.layers.11.self_attn.k_proj Signal to Noise Ratio at layer model.layers.11.self_attn.k_proj = 0.8464746929570776
Calculating Signal to Noise Ratio at layer model.layers.11.self_attn.v_proj Signal to Noise Ratio at layer model.layers.11.self_attn.v_proj = 249.73542526059745
Calculating Signal to Noise Ratio at layer model.layers.12.self_attn.k_proj Signal to Noise Ratio at layer model.layers.12.self_attn.k_proj = 0.7472833813081716
Calculating Signal to Noise Ratio at layer model.layers.12.self_attn.v_proj Signal to Noise Ratio at layer model.layers.12.self_attn.v_proj = 1475.741913325959
Calculating Signal to Noise Ratio at layer model.layers.13.self_attn.k_proj Signal to Noise Ratio at layer model.layers.13.self_attn.k_proj = 0.6900561437886662
Calculating Signal to Noise Ratio at layer model.layers.13.self_attn.v_proj Signal to Noise Ratio at layer model.layers.13.self_attn.v_proj = inf
Calculating Signal to Noise Ratio at layer model.layers.14.self_attn.k_proj Signal to Noise Ratio at layer model.layers.14.self_attn.k_proj = 0.879488259102746
Calculating Signal to Noise Ratio at layer model.layers.14.self_attn.v_proj Signal to Noise Ratio at layer model.layers.14.self_attn.v_proj = inf
Calculating Signal to Noise Ratio at layer model.layers.15.self_attn.k_proj Signal to Noise Ratio at layer model.layers.15.self_attn.k_proj = 0.8212827221029891
Calculating Signal to Noise Ratio at layer model.layers.15.self_attn.v_proj Signal to Noise Ratio at layer model.layers.15.self_attn.v_proj = inf
Calculating Signal to Noise Ratio at layer model.layers.16.self_attn.k_proj Signal to Noise Ratio at layer model.layers.16.self_attn.k_proj = 0.939714841037408
Calculating Signal to Noise Ratio at layer model.layers.16.self_attn.v_proj Signal to Noise Ratio at layer model.layers.16.self_attn.v_proj = inf
Calculating Signal to Noise Ratio at layer model.layers.17.self_attn.k_proj Signal to Noise Ratio at layer model.layers.17.self_attn.k_proj = 1.1122911986074888
Calculating Signal to Noise Ratio at layer model.layers.17.self_attn.v_proj Signal to Noise Ratio at layer model.layers.17.self_attn.v_proj = inf
Calculating Signal to Noise Ratio at layer model.layers.18.self_attn.k_proj Signal to Noise Ratio at layer model.layers.18.self_attn.k_proj = 0.9121383292266945
Calculating Signal to Noise Ratio at layer model.layers.18.self_attn.v_proj Signal to Noise Ratio at layer model.layers.18.self_attn.v_proj = 788.8261618785485
Calculating Signal to Noise Ratio at layer model.layers.19.self_attn.k_proj Signal to Noise Ratio at layer model.layers.19.self_attn.k_proj = 0.9715624891930363
Calculating Signal to Noise Ratio at layer model.layers.19.self_attn.v_proj Signal to Noise Ratio at layer model.layers.19.self_attn.v_proj = 565.6062067127933
Calculating Signal to Noise Ratio at layer model.layers.20.self_attn.k_proj Signal to Noise Ratio at layer model.layers.20.self_attn.k_proj = 0.9658735932092948
Calculating Signal to Noise Ratio at layer model.layers.20.self_attn.v_proj Signal to Noise Ratio at layer model.layers.20.self_attn.v_proj = 173.68213657649758
Calculating Signal to Noise Ratio at layer model.layers.21.self_attn.k_proj Signal to Noise Ratio at layer model.layers.21.self_attn.k_proj = 1.0208128327398873
Calculating Signal to Noise Ratio at layer model.layers.21.self_attn.v_proj Signal to Noise Ratio at layer model.layers.21.self_attn.v_proj = inf
Calculating Signal to Noise Ratio at layer model.layers.22.self_attn.k_proj Signal to Noise Ratio at layer model.layers.22.self_attn.k_proj = 0.8767115421156565
Calculating Signal to Noise Ratio at layer model.layers.22.self_attn.v_proj Signal to Noise Ratio at layer model.layers.22.self_attn.v_proj = 1690.9373147427925
Calculating Signal to Noise Ratio at layer model.layers.23.self_attn.k_proj Signal to Noise Ratio at layer model.layers.23.self_attn.k_proj = 0.9917777373667964
Calculating Signal to Noise Ratio at layer model.layers.23.self_attn.v_proj Signal to Noise Ratio at layer model.layers.23.self_attn.v_proj = 1506.6032364420512
Calculating Signal to Noise Ratio at layer model.layers.24.self_attn.k_proj Signal to Noise Ratio at layer model.layers.24.self_attn.k_proj = 1.0207218414788868
Calculating Signal to Noise Ratio at layer model.layers.24.self_attn.v_proj Signal to Noise Ratio at layer model.layers.24.self_attn.v_proj = 146.62625418833036
Calculating Signal to Noise Ratio at layer model.layers.25.self_attn.k_proj Signal to Noise Ratio at layer model.layers.25.self_attn.k_proj = 0.9707599015919387
Calculating Signal to Noise Ratio at layer model.layers.25.self_attn.v_proj Signal to Noise Ratio at layer model.layers.25.self_attn.v_proj = 257.9292799096513
Calculating Signal to Noise Ratio at layer model.layers.26.self_attn.k_proj Signal to Noise Ratio at layer model.layers.26.self_attn.k_proj = 0.8617543423891454
Calculating Signal to Noise Ratio at layer model.layers.26.self_attn.v_proj Signal to Noise Ratio at layer model.layers.26.self_attn.v_proj = 34.81369296505358
Calculating Signal to Noise Ratio at layer model.layers.27.self_attn.k_proj Signal to Noise Ratio at layer model.layers.27.self_attn.k_proj = 0.8801045544411704
Calculating Signal to Noise Ratio at layer model.layers.27.self_attn.v_proj Signal to Noise Ratio at layer model.layers.27.self_attn.v_proj = 10.606090192242721
Calculating Signal to Noise Ratio at layer model.layers.28.self_attn.k_proj Signal to Noise Ratio at layer model.layers.28.self_attn.k_proj = 0.7758175782347406
Calculating Signal to Noise Ratio at layer model.layers.28.self_attn.v_proj Signal to Noise Ratio at layer model.layers.28.self_attn.v_proj = 15.045700293750533
Calculating Signal to Noise Ratio at layer model.layers.29.self_attn.k_proj Signal to Noise Ratio at layer model.layers.29.self_attn.k_proj = 0.6950855099687395
Calculating Signal to Noise Ratio at layer model.layers.29.self_attn.v_proj Signal to Noise Ratio at layer model.layers.29.self_attn.v_proj = 8.911400115023547
Calculating Signal to Noise Ratio at layer model.layers.30.self_attn.k_proj Signal to Noise Ratio at layer model.layers.30.self_attn.k_proj = 0.8502166964551224
Calculating Signal to Noise Ratio at layer model.layers.30.self_attn.v_proj Signal to Noise Ratio at layer model.layers.30.self_attn.v_proj = 39.16454811852842
Calculating Signal to Noise Ratio at layer model.layers.31.self_attn.k_proj Signal to Noise Ratio at layer model.layers.31.self_attn.k_proj = 0.8114261748000102
Calculating Signal to Noise Ratio at layer model.layers.31.self_attn.v_proj Signal to Noise Ratio at layer model.layers.31.self_attn.v_proj = 15.232121720528768
Calculating Signal to Noise Ratio at layer model.layers.32.self_attn.k_proj Signal to Noise Ratio at layer model.layers.32.self_attn.k_proj = 0.8171534747659152
Calculating Signal to Noise Ratio at layer model.layers.32.self_attn.v_proj Signal to Noise Ratio at layer model.layers.32.self_attn.v_proj = 44.24568579763897
Calculating Signal to Noise Ratio at layer model.layers.33.self_attn.k_proj Signal to Noise Ratio at layer model.layers.33.self_attn.k_proj = 1.0559033041558032
Calculating Signal to Noise Ratio at layer model.layers.33.self_attn.v_proj Signal to Noise Ratio at layer model.layers.33.self_attn.v_proj = 44.04153996123169
Calculating Signal to Noise Ratio at layer model.layers.34.self_attn.k_proj Signal to Noise Ratio at layer model.layers.34.self_attn.k_proj = 0.8710953229091645
Calculating Signal to Noise Ratio at layer model.layers.34.self_attn.v_proj Signal to Noise Ratio at layer model.layers.34.self_attn.v_proj = 68.64244557504348
Calculating Signal to Noise Ratio at layer model.layers.35.self_attn.k_proj Signal to Noise Ratio at layer model.layers.35.self_attn.k_proj = 0.9532579825557792
Calculating Signal to Noise Ratio at layer model.layers.35.self_attn.v_proj Signal to Noise Ratio at layer model.layers.35.self_attn.v_proj = 77.22896365544904
Calculating Signal to Noise Ratio at layer model.layers.36.self_attn.k_proj Signal to Noise Ratio at layer model.layers.36.self_attn.k_proj = 0.7857943800481151
Calculating Signal to Noise Ratio at layer model.layers.36.self_attn.v_proj Signal to Noise Ratio at layer model.layers.36.self_attn.v_proj = 41.764676631172684
Calculating Signal to Noise Ratio at layer model.layers.37.self_attn.k_proj Signal to Noise Ratio at layer model.layers.37.self_attn.k_proj = 0.9566414094295352
Calculating Signal to Noise Ratio at layer model.layers.37.self_attn.v_proj Signal to Noise Ratio at layer model.layers.37.self_attn.v_proj = 197.90757310006273
Calculating Signal to Noise Ratio at layer model.layers.38.self_attn.k_proj Signal to Noise Ratio at layer model.layers.38.self_attn.k_proj = 0.8376618883945027
Calculating Signal to Noise Ratio at layer model.layers.38.self_attn.v_proj Signal to Noise Ratio at layer model.layers.38.self_attn.v_proj = 29.87200982970284
Calculating Signal to Noise Ratio at layer model.layers.39.self_attn.k_proj Signal to Noise Ratio at layer model.layers.39.self_attn.k_proj = 1.1301710981992348
Calculating Signal to Noise Ratio at layer model.layers.39.self_attn.v_proj Signal to Noise Ratio at layer model.layers.39.self_attn.v_proj = 1675.7645711321682
Calculating Signal to Noise Ratio at layer model.layers.40.self_attn.k_proj Signal to Noise Ratio at layer model.layers.40.self_attn.k_proj = 1.5244403389879522
Calculating Signal to Noise Ratio at layer model.layers.40.self_attn.v_proj Signal to Noise Ratio at layer model.layers.40.self_attn.v_proj = 406.13928327811595
Calculating Signal to Noise Ratio at layer model.layers.41.self_attn.k_proj Signal to Noise Ratio at layer model.layers.41.self_attn.k_proj = 1.6595441320443285
Calculating Signal to Noise Ratio at layer model.layers.41.self_attn.v_proj Signal to Noise Ratio at layer model.layers.41.self_attn.v_proj = 630.0868624694981
Calculating Signal to Noise Ratio at layer model.layers.42.self_attn.k_proj Signal to Noise Ratio at layer model.layers.42.self_attn.k_proj = 1.037089746602981
Calculating Signal to Noise Ratio at layer model.layers.42.self_attn.v_proj Signal to Noise Ratio at layer model.layers.42.self_attn.v_proj = 5.865115179753161
Calculating Signal to Noise Ratio at layer model.layers.43.self_attn.k_proj Signal to Noise Ratio at layer model.layers.43.self_attn.k_proj = 1.2167307353377796
Calculating Signal to Noise Ratio at layer model.layers.43.self_attn.v_proj Signal to Noise Ratio at layer model.layers.43.self_attn.v_proj = 14.493857040997593
Calculating Signal to Noise Ratio at layer model.layers.44.self_attn.k_proj Signal to Noise Ratio at layer model.layers.44.self_attn.k_proj = 1.3254801456913765
Calculating Signal to Noise Ratio at layer model.layers.44.self_attn.v_proj Signal to Noise Ratio at layer model.layers.44.self_attn.v_proj = 929.5507948184927
Calculating Signal to Noise Ratio at layer model.layers.45.self_attn.k_proj Signal to Noise Ratio at layer model.layers.45.self_attn.k_proj = 0.8799221460946477
Calculating Signal to Noise Ratio at layer model.layers.45.self_attn.v_proj Signal to Noise Ratio at layer model.layers.45.self_attn.v_proj = 138.24150062697706
Calculating Signal to Noise Ratio at layer model.layers.46.self_attn.k_proj Signal to Noise Ratio at layer model.layers.46.self_attn.k_proj = 1.4364369040069944
Calculating Signal to Noise Ratio at layer model.layers.46.self_attn.v_proj Signal to Noise Ratio at layer model.layers.46.self_attn.v_proj = inf
Calculating Signal to Noise Ratio at layer model.layers.47.self_attn.k_proj Signal to Noise Ratio at layer model.layers.47.self_attn.k_proj = 1.5039953326988464
Calculating Signal to Noise Ratio at layer model.layers.47.self_attn.v_proj Signal to Noise Ratio at layer model.layers.47.self_attn.v_proj = 916.1727358213857
Calculating Signal to Noise Ratio at layer model.layers.48.self_attn.k_proj Signal to Noise Ratio at layer model.layers.48.self_attn.k_proj = 1.3774147345025962
Calculating Signal to Noise Ratio at layer model.layers.48.self_attn.v_proj Signal to Noise Ratio at layer model.layers.48.self_attn.v_proj = inf
Calculating Signal to Noise Ratio at layer model.layers.49.self_attn.k_proj Signal to Noise Ratio at layer model.layers.49.self_attn.k_proj = 1.496570053548836
Calculating Signal to Noise Ratio at layer model.layers.49.self_attn.v_proj Signal to Noise Ratio at layer model.layers.49.self_attn.v_proj = 816.8708843069953
Calculating Signal to Noise Ratio at layer model.layers.50.self_attn.k_proj Signal to Noise Ratio at layer model.layers.50.self_attn.k_proj = 1.1144650796270612
Calculating Signal to Noise Ratio at layer model.layers.50.self_attn.v_proj Signal to Noise Ratio at layer model.layers.50.self_attn.v_proj = 1013.1682247787271
Calculating Signal to Noise Ratio at layer model.layers.51.self_attn.k_proj Signal to Noise Ratio at layer model.layers.51.self_attn.k_proj = 3.7913550246540635
Calculating Signal to Noise Ratio at layer model.layers.51.self_attn.v_proj Signal to Noise Ratio at layer model.layers.51.self_attn.v_proj = inf
Calculating Signal to Noise Ratio at layer model.layers.52.self_attn.k_proj Signal to Noise Ratio at layer model.layers.52.self_attn.k_proj = 1.680754165581029
Calculating Signal to Noise Ratio at layer model.layers.52.self_attn.v_proj Signal to Noise Ratio at layer model.layers.52.self_attn.v_proj = inf
Calculating Signal to Noise Ratio at layer model.layers.53.self_attn.k_proj Signal to Noise Ratio at layer model.layers.53.self_attn.k_proj = 3.064423507932819
Calculating Signal to Noise Ratio at layer model.layers.53.self_attn.v_proj Signal to Noise Ratio at layer model.layers.53.self_attn.v_proj = inf
Calculating Signal to Noise Ratio at layer model.layers.54.self_attn.k_proj Signal to Noise Ratio at layer model.layers.54.self_attn.k_proj = 2.3201283603647047
Calculating Signal to Noise Ratio at layer model.layers.54.self_attn.v_proj Signal to Noise Ratio at layer model.layers.54.self_attn.v_proj = inf
Calculating Signal to Noise Ratio at layer model.layers.55.self_attn.k_proj Signal to Noise Ratio at layer model.layers.55.self_attn.k_proj = 3.9188910045391916
Calculating Signal to Noise Ratio at layer model.layers.55.self_attn.v_proj Signal to Noise Ratio at layer model.layers.55.self_attn.v_proj = inf
Calculating Signal to Noise Ratio at layer model.layers.56.self_attn.k_proj Signal to Noise Ratio at layer model.layers.56.self_attn.k_proj = 2.8077111768801046
Calculating Signal to Noise Ratio at layer model.layers.56.self_attn.v_proj Signal to Noise Ratio at layer model.layers.56.self_attn.v_proj = inf
Calculating Signal to Noise Ratio at layer model.layers.57.self_attn.k_proj Signal to Noise Ratio at layer model.layers.57.self_attn.k_proj = 2.24360670610018
Calculating Signal to Noise Ratio at layer model.layers.57.self_attn.v_proj Signal to Noise Ratio at layer model.layers.57.self_attn.v_proj = 381.6422403317739
Calculating Signal to Noise Ratio at layer model.layers.58.self_attn.k_proj Signal to Noise Ratio at layer model.layers.58.self_attn.k_proj = 1.6971178916519492
Calculating Signal to Noise Ratio at layer model.layers.58.self_attn.v_proj Signal to Noise Ratio at layer model.layers.58.self_attn.v_proj = 182.5246839720645
Calculating Signal to Noise Ratio at layer model.layers.59.self_attn.k_proj Signal to Noise Ratio at layer model.layers.59.self_attn.k_proj = 1.7869714644485775
Calculating Signal to Noise Ratio at layer model.layers.59.self_attn.v_proj Signal to Noise Ratio at layer model.layers.59.self_attn.v_proj = 73.06378101191355
Finished laserRMT scanning.
Then look for the layer indexes where the self_attn.v_proj SNR is inf (56, 55, 54, 53, 52, 51, 48, 46, 21, 17, 16, 15, 14, 13, 8, 7) and eliminate those layers using mergekit.
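Reading the indexes off the log by hand is error-prone, so here is a minimal sketch of extracting them automatically. It assumes the scan output has been saved as text in the format shown above; the function name and regular expression are illustrative and not part of laserRMT.

```python
import math
import re

# Matches result entries such as:
# "Signal to Noise Ratio at layer model.layers.7.self_attn.v_proj = inf"
# The "Calculating ..." entries have no " = value" part, so they are skipped.
SNR_ENTRY = re.compile(
    r"Signal to Noise Ratio at layer model\.layers\.(\d+)\.self_attn\.v_proj = (\S+)"
)

def infinite_v_proj_layers(log_text: str) -> list[int]:
    """Return the sorted layer indexes whose v_proj SNR is reported as inf."""
    layers = set()
    for index, value in SNR_ENTRY.findall(log_text):
        if math.isinf(float(value)):
            layers.add(int(index))
    return sorted(layers)
```

Calling `infinite_v_proj_layers(open("scan.log").read())` on a saved log (the file name is hypothetical) returns the indexes in ascending order, the reverse of how they are listed above.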
Here is the mergekit config:
slices:
  - sources:
    - model: "Mihaiii/Pallas-0.5"
      layer_range: [0, 7]
  - sources:
    - model: "Mihaiii/Pallas-0.5"
      layer_range: [9, 13]
  - sources:
    - model: "Mihaiii/Pallas-0.5"
      layer_range: [18, 21]
  - sources:
    - model: "Mihaiii/Pallas-0.5"
      layer_range: [22, 46]
  - sources:
    - model: "Mihaiii/Pallas-0.5"
      layer_range: [47, 48]
  - sources:
    - model: "Mihaiii/Pallas-0.5"
      layer_range: [49, 51]
  - sources:
    - model: "Mihaiii/Pallas-0.5"
      layer_range: [57, 60]
merge_method: passthrough
dtype: bfloat16
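The layer_range values in the config follow from the list of removed layers: each range covers a run of kept layers, with an exclusive end index (so [0, 7] keeps layers 0 through 6). A minimal sketch of that derivation, using a hypothetical helper that is not part of mergekit:

```python
def kept_layer_ranges(num_layers: int, removed: list[int]) -> list[list[int]]:
    """Group the layers that are NOT removed into [start, end) ranges."""
    removed_set = set(removed)
    ranges = []
    start = None  # first layer of the current run of kept layers
    for layer in range(num_layers):
        if layer in removed_set:
            if start is not None:
                ranges.append([start, layer])  # close the run before this layer
                start = None
        elif start is None:
            start = layer
    if start is not None:
        ranges.append([start, num_layers])  # run that reaches the last layer
    return ranges

removed = [56, 55, 54, 53, 52, 51, 48, 46, 21, 17, 16, 15, 14, 13, 8, 7]
ranges = kept_layer_ranges(60, removed)
print(ranges)
# [[0, 7], [9, 13], [18, 21], [22, 46], [47, 48], [49, 51], [57, 60]]
```

The seven ranges match the seven slices in the config, and together they keep 44 of the original 60 layers.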
The resulting model (the mergekit output) is this model, Cluj-Napoca-0.2.
Cluj-Napoca versions 0.3 through 0.5 (inclusive) are each fine-tuned from the previous version.
Cluj-Napoca version 0.6 is a pruned-down version of 0.5.
Cluj-Napoca versions 0.7 through 0.11 (inclusive) are each fine-tuned from the previous version.
Prompt Format:
SYSTEM: <ANY SYSTEM CONTEXT>
USER:
ASSISTANT:
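A minimal sketch of filling in this template programmatically. The function name is hypothetical, and placing the user message directly after "USER: " with nothing after "ASSISTANT:" is an assumption based on the template above.

```python
def build_prompt(system: str, user: str) -> str:
    """Fill the three-line template; the model continues after 'ASSISTANT:'."""
    # Assumed layout: one line per role, separated by single newlines.
    return f"SYSTEM: {system}\nUSER: {user}\nASSISTANT:"

prompt = build_prompt("You are a helpful assistant.", "What is mergekit?")
```

The resulting string can be passed to whatever tokenizer and generation call you already use for the model.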