====== Perplexity statistics ======
Mean PPL(Q)                   : 7.990955 ± 0.051527
Mean PPL(base)                : 7.534124 ± 0.048206
Cor(ln(PPL(Q)), ln(PPL(base))): 99.03%
Mean ln(PPL(Q)/PPL(base))     : 0.058868 ± 0.000897
Mean PPL(Q)/PPL(base)         : 1.060635 ± 0.000951
Mean PPL(Q)-PPL(base)         : 0.456831 ± 0.007702

====== KL divergence statistics ======
Mean    KLD:  0.051341 ± 0.000211
Maximum KLD:  6.532913
99.9%   KLD:  0.841224
99.0%   KLD:  0.299998
Median  KLD:  0.038058
10.0%   KLD:  0.002190
 5.0%   KLD:  0.000690
 1.0%   KLD:  0.000095
Minimum KLD: -0.000027

====== Token probability statistics ======
Mean    Δp:  -1.276 ± 0.017 %
Maximum Δp:  97.255%
99.9%   Δp:  27.515%
99.0%   Δp:  15.140%
95.0%   Δp:   7.440%
90.0%   Δp:   4.054%
75.0%   Δp:   0.520%
Median  Δp:  -0.129%
25.0%   Δp:  -2.686%
10.0%   Δp:  -8.215%
 5.0%   Δp: -12.358%
 1.0%   Δp: -22.757%
 0.1%   Δp: -44.994%
Minimum Δp: -85.147%
RMS Δp    :   6.512 ± 0.032 %
Same top p:  87.756 ± 0.086 %
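
The three derived perplexity figures above can be cross-checked against the two mean perplexities. The sketch below is only a sanity check, not the tool's own code. It assumes that the reported ratio is the exponential of the mean log-ratio and that the reported difference is a plain subtraction of the two means. The variable names are illustrative; the numeric inputs are copied from the report.

```python
import math

# Values copied from the "Perplexity statistics" block of the report.
ppl_q = 7.990955           # Mean PPL(Q)
ppl_base = 7.534124        # Mean PPL(base)
mean_log_ratio = 0.058868  # Mean ln(PPL(Q)/PPL(base))

# Ratio computed directly from the two mean perplexities.
ratio_from_means = ppl_q / ppl_base

# Ratio recovered from the mean log-ratio (assumed relationship).
ratio_from_log = math.exp(mean_log_ratio)

# Absolute perplexity gap between the two models.
difference = ppl_q - ppl_base

print(f"PPL(Q)/PPL(base) from means : {ratio_from_means:.6f}")
print(f"exp(mean ln ratio)          : {ratio_from_log:.6f}")
print(f"PPL(Q)-PPL(base)            : {difference:.6f}")
```

If the three printed values agree with the reported 1.060635 and 0.456831 to within rounding, the perplexity block is internally consistent.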