File size: 18,680 Bytes
7885a28
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
"""
.. _statsrefmanual:

==========================================
Statistical functions (:mod:`scipy.stats`)
==========================================

.. currentmodule:: scipy.stats

This module contains a large number of probability distributions,
summary and frequency statistics, correlation functions and statistical
tests, masked statistics, kernel density estimation, quasi-Monte Carlo
functionality, and more.

Statistics is a very large area, and there are topics that are out of scope
for SciPy and are covered by other packages. Some of the most important ones
are:

- `statsmodels <https://www.statsmodels.org/stable/index.html>`__:
  regression, linear models, time series analysis, extensions to topics
  also covered by ``scipy.stats``.
- `Pandas <https://pandas.pydata.org/>`__: tabular data, time series
  functionality, interfaces to other statistical languages.
- `PyMC <https://docs.pymc.io/>`__: Bayesian statistical
  modeling, probabilistic machine learning.
- `scikit-learn <https://scikit-learn.org/>`__: classification, regression,
  model selection.
- `Seaborn <https://seaborn.pydata.org/>`__: statistical data visualization.
- `rpy2 <https://rpy2.github.io/>`__: Python to R bridge.


Probability distributions
=========================

Each univariate distribution is an instance of a subclass of `rv_continuous`
(`rv_discrete` for discrete distributions):

.. autosummary::
   :toctree: generated/

   rv_continuous
   rv_discrete
   rv_histogram

Continuous distributions
------------------------

.. autosummary::
   :toctree: generated/

   alpha             -- Alpha
   anglit            -- Anglit
   arcsine           -- Arcsine
   argus             -- Argus
   beta              -- Beta
   betaprime         -- Beta Prime
   bradford          -- Bradford
   burr              -- Burr (Type III)
   burr12            -- Burr (Type XII)
   cauchy            -- Cauchy
   chi               -- Chi
   chi2              -- Chi-squared
   cosine            -- Cosine
   crystalball       -- Crystalball
   dgamma            -- Double Gamma
   dpareto_lognorm   -- Double Pareto Lognormal
   dweibull          -- Double Weibull
   erlang            -- Erlang
   expon             -- Exponential
   exponnorm         -- Exponentially Modified Normal
   exponweib         -- Exponentiated Weibull
   exponpow          -- Exponential Power
   f                 -- F (Snecdor F)
   fatiguelife       -- Fatigue Life (Birnbaum-Saunders)
   fisk              -- Fisk
   foldcauchy        -- Folded Cauchy
   foldnorm          -- Folded Normal
   genlogistic       -- Generalized Logistic
   gennorm           -- Generalized normal
   genpareto         -- Generalized Pareto
   genexpon          -- Generalized Exponential
   genextreme        -- Generalized Extreme Value
   gausshyper        -- Gauss Hypergeometric
   gamma             -- Gamma
   gengamma          -- Generalized gamma
   genhalflogistic   -- Generalized Half Logistic
   genhyperbolic     -- Generalized Hyperbolic
   geninvgauss       -- Generalized Inverse Gaussian
   gibrat            -- Gibrat
   gompertz          -- Gompertz (Truncated Gumbel)
   gumbel_r          -- Right Sided Gumbel, Log-Weibull, Fisher-Tippett, Extreme Value Type I
   gumbel_l          -- Left Sided Gumbel, etc.
   halfcauchy        -- Half Cauchy
   halflogistic      -- Half Logistic
   halfnorm          -- Half Normal
   halfgennorm       -- Generalized Half Normal
   hypsecant         -- Hyperbolic Secant
   invgamma          -- Inverse Gamma
   invgauss          -- Inverse Gaussian
   invweibull        -- Inverse Weibull
   irwinhall         -- Irwin-Hall
   jf_skew_t         -- Jones and Faddy Skew-T
   johnsonsb         -- Johnson SB
   johnsonsu         -- Johnson SU
   kappa4            -- Kappa 4 parameter
   kappa3            -- Kappa 3 parameter
   ksone             -- Distribution of Kolmogorov-Smirnov one-sided test statistic
   kstwo             -- Distribution of Kolmogorov-Smirnov two-sided test statistic
   kstwobign         -- Limiting Distribution of scaled Kolmogorov-Smirnov two-sided test statistic.
   landau            -- Landau
   laplace           -- Laplace
   laplace_asymmetric    -- Asymmetric Laplace
   levy              -- Levy
   levy_l
   levy_stable
   logistic          -- Logistic
   loggamma          -- Log-Gamma
   loglaplace        -- Log-Laplace (Log Double Exponential)
   lognorm           -- Log-Normal
   loguniform        -- Log-Uniform
   lomax             -- Lomax (Pareto of the second kind)
   maxwell           -- Maxwell
   mielke            -- Mielke's Beta-Kappa
   moyal             -- Moyal
   nakagami          -- Nakagami
   ncx2              -- Non-central chi-squared
   ncf               -- Non-central F
   nct               -- Non-central Student's T
   norm              -- Normal (Gaussian)
   norminvgauss      -- Normal Inverse Gaussian
   pareto            -- Pareto
   pearson3          -- Pearson type III
   powerlaw          -- Power-function
   powerlognorm      -- Power log normal
   powernorm         -- Power normal
   rdist             -- R-distribution
   rayleigh          -- Rayleigh
   rel_breitwigner   -- Relativistic Breit-Wigner
   rice              -- Rice
   recipinvgauss     -- Reciprocal Inverse Gaussian
   semicircular      -- Semicircular
   skewcauchy        -- Skew Cauchy
   skewnorm          -- Skew normal
   studentized_range    -- Studentized Range
   t                 -- Student's T
   trapezoid         -- Trapezoidal
   triang            -- Triangular
   truncexpon        -- Truncated Exponential
   truncnorm         -- Truncated Normal
   truncpareto       -- Truncated Pareto
   truncweibull_min  -- Truncated minimum Weibull distribution
   tukeylambda       -- Tukey-Lambda
   uniform           -- Uniform
   vonmises          -- Von-Mises (Circular)
   vonmises_line     -- Von-Mises (Line)
   wald              -- Wald
   weibull_min       -- Minimum Weibull (see Frechet)
   weibull_max       -- Maximum Weibull (see Frechet)
   wrapcauchy        -- Wrapped Cauchy

The ``fit`` method of the univariate continuous distributions uses
maximum likelihood estimation to fit the distribution to a data set.
The ``fit`` method can accept regular data or *censored data*.
Censored data is represented with instances of the `CensoredData`
class.

.. autosummary::
   :toctree: generated/

   CensoredData


Multivariate distributions
--------------------------

.. autosummary::
   :toctree: generated/

   multivariate_normal    -- Multivariate normal distribution
   matrix_normal          -- Matrix normal distribution
   dirichlet              -- Dirichlet
   dirichlet_multinomial  -- Dirichlet multinomial distribution
   wishart                -- Wishart
   invwishart             -- Inverse Wishart
   multinomial            -- Multinomial distribution
   special_ortho_group    -- SO(N) group
   ortho_group            -- O(N) group
   unitary_group          -- U(N) group
   random_correlation     -- random correlation matrices
   multivariate_t         -- Multivariate t-distribution
   multivariate_hypergeom -- Multivariate hypergeometric distribution
   normal_inverse_gamma   -- Normal-inverse-gamma distribution
   random_table           -- Distribution of random tables with given marginals
   uniform_direction      -- Uniform distribution on S(N-1)
   vonmises_fisher        -- Von Mises-Fisher distribution

`scipy.stats.multivariate_normal` methods accept instances
of the following class to represent the covariance.

.. autosummary::
   :toctree: generated/

   Covariance             -- Representation of a covariance matrix


Discrete distributions
----------------------

.. autosummary::
   :toctree: generated/

   bernoulli                -- Bernoulli
   betabinom                -- Beta-Binomial
   betanbinom               -- Beta-Negative Binomial
   binom                    -- Binomial
   boltzmann                -- Boltzmann (Truncated Discrete Exponential)
   dlaplace                 -- Discrete Laplacian
   geom                     -- Geometric
   hypergeom                -- Hypergeometric
   logser                   -- Logarithmic (Log-Series, Series)
   nbinom                   -- Negative Binomial
   nchypergeom_fisher       -- Fisher's Noncentral Hypergeometric
   nchypergeom_wallenius    -- Wallenius's Noncentral Hypergeometric
   nhypergeom               -- Negative Hypergeometric
   planck                   -- Planck (Discrete Exponential)
   poisson                  -- Poisson
   poisson_binom            -- Poisson Binomial
   randint                  -- Discrete Uniform
   skellam                  -- Skellam
   yulesimon                -- Yule-Simon
   zipf                     -- Zipf (Zeta)
   zipfian                  -- Zipfian


An overview of statistical functions is given below.  Many of these functions
have a similar version in `scipy.stats.mstats` which work for masked arrays.

Summary statistics
==================

.. autosummary::
   :toctree: generated/

   describe          -- Descriptive statistics
   gmean             -- Geometric mean
   hmean             -- Harmonic mean
   pmean             -- Power mean
   kurtosis          -- Fisher or Pearson kurtosis
   mode              -- Modal value
   moment            -- Central moment
   lmoment
   expectile         -- Expectile
   skew              -- Skewness
   kstat             --
   kstatvar          --
   tmean             -- Truncated arithmetic mean
   tvar              -- Truncated variance
   tmin              --
   tmax              --
   tstd              --
   tsem              --
   variation         -- Coefficient of variation
   find_repeats
   rankdata
   tiecorrect
   trim_mean
   gstd              -- Geometric Standard Deviation
   iqr
   sem
   bayes_mvs
   mvsdist
   entropy
   differential_entropy
   median_abs_deviation

Frequency statistics
====================

.. autosummary::
   :toctree: generated/

   cumfreq
   percentileofscore
   scoreatpercentile
   relfreq

.. autosummary::
   :toctree: generated/

   binned_statistic     -- Compute a binned statistic for a set of data.
   binned_statistic_2d  -- Compute a 2-D binned statistic for a set of data.
   binned_statistic_dd  -- Compute a d-D binned statistic for a set of data.

.. _hypotests:

Hypothesis Tests and related functions
======================================
SciPy has many functions for performing hypothesis tests that return a
test statistic and a p-value, and several of them return confidence intervals
and/or other related information.

The headings below are based on common uses of the functions within, but due to
the wide variety of statistical procedures, any attempt at coarse-grained
categorization will be imperfect. Also, note that tests within the same heading
are not interchangeable in general (e.g. many have different distributional
assumptions).

One Sample Tests / Paired Sample Tests
--------------------------------------
One sample tests are typically used to assess whether a single sample was
drawn from a specified distribution or a distribution with specified properties
(e.g. zero mean).

.. autosummary::
   :toctree: generated/

   ttest_1samp
   binomtest
   quantile_test
   skewtest
   kurtosistest
   normaltest
   jarque_bera
   shapiro
   anderson
   cramervonmises
   ks_1samp
   goodness_of_fit
   chisquare
   power_divergence

Paired sample tests are often used to assess whether two samples were drawn
from the same distribution; they differ from the independent sample tests below
in that each observation in one sample is treated as paired with a
closely-related observation in the other sample (e.g. when environmental
factors are controlled between observations within a pair but not among pairs).
They can also be interpreted or used as one-sample tests (e.g. tests on the
mean or median of *differences* between paired observations).

.. autosummary::
   :toctree: generated/

   ttest_rel
   wilcoxon

Association/Correlation Tests
-----------------------------

These tests are often used to assess whether there is a relationship (e.g.
linear) between paired observations in multiple samples or among the
coordinates of multivariate observations.

.. autosummary::
   :toctree: generated/

   linregress
   pearsonr
   spearmanr
   pointbiserialr
   kendalltau
   chatterjeexi
   weightedtau
   somersd
   siegelslopes
   theilslopes
   page_trend_test
   multiscale_graphcorr

These association tests and are to work with samples in the form of contingency
tables. Supporting functions are available in `scipy.stats.contingency`.

.. autosummary::
   :toctree: generated/

   chi2_contingency
   fisher_exact
   barnard_exact
   boschloo_exact

Independent Sample Tests
------------------------
Independent sample tests are typically used to assess whether multiple samples
were independently drawn from the same distribution or different distributions
with a shared property (e.g. equal means).

Some tests are specifically for comparing two samples.

.. autosummary::
   :toctree: generated/

   ttest_ind_from_stats
   poisson_means_test
   ttest_ind
   mannwhitneyu
   bws_test
   ranksums
   brunnermunzel
   mood
   ansari
   cramervonmises_2samp
   epps_singleton_2samp
   ks_2samp
   kstest

Others are generalized to multiple samples.

.. autosummary::
   :toctree: generated/

   f_oneway
   tukey_hsd
   dunnett
   kruskal
   alexandergovern
   fligner
   levene
   bartlett
   median_test
   friedmanchisquare
   anderson_ksamp

Resampling and Monte Carlo Methods
----------------------------------
The following functions can reproduce the p-value and confidence interval
results of most of the functions above, and often produce accurate results in a
wider variety of conditions. They can also be used to perform hypothesis tests
and generate confidence intervals for custom statistics. This flexibility comes
at the cost of greater computational requirements and stochastic results.

.. autosummary::
   :toctree: generated/

   monte_carlo_test
   permutation_test
   bootstrap
   power

Instances of the following object can be passed into some hypothesis test
functions to perform a resampling or Monte Carlo version of the hypothesis
test.

.. autosummary::
   :toctree: generated/

   MonteCarloMethod
   PermutationMethod
   BootstrapMethod

Multiple Hypothesis Testing and Meta-Analysis
---------------------------------------------
These functions are for assessing the results of individual tests as a whole.
Functions for performing specific multiple hypothesis tests (e.g. post hoc
tests) are listed above.

.. autosummary::
   :toctree: generated/

   combine_pvalues
   false_discovery_control


The following functions are related to the tests above but do not belong in the
above categories.

Random Variables
================

.. autosummary::
   :toctree: generated/

   make_distribution
   Normal
   Uniform
   Mixture
   order_statistic
   truncate
   abs
   exp
   log

Quasi-Monte Carlo
=================

.. toctree::
   :maxdepth: 4

   stats.qmc

Contingency Tables
==================

.. toctree::
   :maxdepth: 4

   stats.contingency

Masked statistics functions
===========================

.. toctree::

   stats.mstats


Other statistical functionality
===============================

Transformations
---------------

.. autosummary::
   :toctree: generated/

   boxcox
   boxcox_normmax
   boxcox_llf
   yeojohnson
   yeojohnson_normmax
   yeojohnson_llf
   obrientransform
   sigmaclip
   trimboth
   trim1
   zmap
   zscore
   gzscore

Statistical distances
---------------------

.. autosummary::
   :toctree: generated/

   wasserstein_distance
   wasserstein_distance_nd
   energy_distance

Sampling
--------

.. toctree::
   :maxdepth: 4

   stats.sampling

Fitting / Survival Analysis
---------------------------

.. autosummary::
   :toctree: generated/

   fit
   ecdf
   logrank

Directional statistical functions
---------------------------------

.. autosummary::
   :toctree: generated/

   directional_stats
   circmean
   circvar
   circstd

Sensitivity Analysis
--------------------

.. autosummary::
   :toctree: generated/

   sobol_indices

Plot-tests
----------

.. autosummary::
   :toctree: generated/

   ppcc_max
   ppcc_plot
   probplot
   boxcox_normplot
   yeojohnson_normplot

Univariate and multivariate kernel density estimation
-----------------------------------------------------

.. autosummary::
   :toctree: generated/

   gaussian_kde

Warnings / Errors used in :mod:`scipy.stats`
--------------------------------------------

.. autosummary::
   :toctree: generated/

   DegenerateDataWarning
   ConstantInputWarning
   NearConstantInputWarning
   FitError

Result classes used in :mod:`scipy.stats`
-----------------------------------------

.. warning::

    These classes are private, but they are included here because instances
    of them are returned by other statistical functions. User import and
    instantiation is not supported.

.. toctree::
   :maxdepth: 2

   stats._result_classes

"""  # noqa: E501

from ._warnings_errors import (ConstantInputWarning, NearConstantInputWarning,
                               DegenerateDataWarning, FitError)
from ._stats_py import *
from ._variation import variation
from .distributions import *
from ._morestats import *
from ._multicomp import *
from ._binomtest import binomtest
from ._binned_statistic import *
from ._kde import gaussian_kde
from . import mstats
from . import qmc
from ._multivariate import *
from . import contingency
from .contingency import chi2_contingency
from ._censored_data import CensoredData
from ._resampling import (bootstrap, monte_carlo_test, permutation_test, power,
                          MonteCarloMethod, PermutationMethod, BootstrapMethod)
from ._entropy import *
from ._hypotests import *
from ._page_trend_test import page_trend_test
from ._mannwhitneyu import mannwhitneyu
from ._bws_test import bws_test
from ._fit import fit, goodness_of_fit
from ._covariance import Covariance
from ._sensitivity_analysis import *
from ._survival import *
from ._distribution_infrastructure import (
    make_distribution, Mixture, order_statistic, truncate, exp, log, abs
)
from ._new_distributions import Normal, Uniform
from ._mgc import multiscale_graphcorr
from ._correlation import chatterjeexi


# Deprecated namespaces, to be removed in v2.0.0
from . import (
    biasedurn, kde, morestats, mstats_basic, mstats_extras, mvn, stats
)


__all__ = [s for s in dir() if not s.startswith("_")]  # Remove dunders.

from scipy._lib._testutils import PytestTester
test = PytestTester(__name__)
del PytestTester