Statistics
Note
This module has been deprecated. See the stats module.
The statistics module in SymPy implements standard probability distributions and related tools. Its contents can be imported with the following statement:
>>> from sympy import *
>>> from sympy.statistics import *
>>> init_printing(use_unicode=False, wrap_line=False, no_global=True)
Normal distributions
Normal(mu, sigma) creates a normal distribution with mean value mu and standard deviation sigma. The Normal class defines several useful methods and properties. Various properties can be accessed directly as follows:
>>> N = Normal(0, 1)
>>> N.mean
0
>>> N.median
0
>>> N.variance
1
>>> N.stddev
1
You can generate random numbers from the desired distribution with the random method:
>>> N = Normal(10, 5)
>>> N.random()
4.914375200829805834246144514
>>> N.random()
11.84331557474637897087177407
>>> N.random()
17.22474580071733640806996846
>>> N.random()
9.864643097429464546621602494
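As a rough illustration of what drawing from Normal(10, 5) means, the sketch below uses Python's standard library instead of the deprecated sympy.statistics API. The use of random.gauss as a stand-in for N.random(), the seed, and the sample size are all assumptions made for this example. With many draws, the sample mean and standard deviation should land near mu and sigma:

```python
import random
import statistics

# Assumption: random.gauss stands in for N.random(); it is not SymPy's implementation.
rng = random.Random(12345)  # fixed seed so repeated runs produce the same draws
mu, sigma = 10, 5

# Draw many values from a normal distribution with mean 10 and standard deviation 5.
draws = [rng.gauss(mu, sigma) for _ in range(20000)]

sample_mean = statistics.fmean(draws)
sample_stddev = statistics.pstdev(draws)
print(sample_mean, sample_stddev)  # both should be close to 10 and 5
```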
The probability density function (pdf) and cumulative distribution function (cdf) of a distribution can be computed, either in symbolic form or for particular values:
>>> N = Normal(1, 1)
>>> x = Symbol('x')
>>> N.pdf(1)
    ___
  \/ 2
--------
    ____
2*\/ pi
>>> N.pdf(3).evalf()
0.0539909665131880
>>> N.cdf(x)
   /  ___        \
   |\/ 2 *(x - 1)|
erf|-------------|
   \      2      /   1
------------------ + -
        2            2
>>> N.cdf(-oo), N.cdf(1), N.cdf(oo)
(0, 1/2, 1)
>>> N.cdf(5).evalf()
0.999968328758167
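The numeric values above can be cross-checked with the closed-form normal pdf and the erf form of the cdf. This is a minimal standard-library sketch, not the sympy.statistics implementation; the helper names normal_pdf and normal_cdf are hypothetical.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of the normal distribution with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def normal_cdf(x, mu, sigma):
    """Cumulative distribution function, written with erf like the symbolic N.cdf(x)."""
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

# Same distribution as in the session above: mean 1, standard deviation 1.
pdf_at_3 = normal_pdf(3, 1, 1)
cdf_at_1 = normal_cdf(1, 1, 1)
cdf_at_5 = normal_cdf(5, 1, 1)
print(pdf_at_3, cdf_at_1, cdf_at_5)
```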
The method probability gives the total probability on a given interval (a convenient alternative syntax for cdf(b)-cdf(a)):
>>> N = Normal(0, 1)
>>> N.probability(-oo, 0)
1/2
>>> N.probability(-1, 1)
   /  ___\
   |\/ 2 |
erf|-----|
   \  2  /
>>> N.probability(-1, 1).evalf()
0.682689492137086
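The identity probability(a, b) = cdf(b) - cdf(a) can be sketched numerically for the standard normal distribution. The functions below are hypothetical helpers built on math.erf, not part of sympy.statistics.

```python
import math

def std_normal_cdf(x):
    """Cdf of the standard normal distribution (mean 0, standard deviation 1)."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def probability(a, b):
    """Probability that a standard normal value falls in [a, b]."""
    return std_normal_cdf(b) - std_normal_cdf(a)

p_lower_half = probability(-math.inf, 0)  # math.erf accepts infinite arguments
p_one_sigma = probability(-1, 1)
print(p_lower_half, p_one_sigma)
```

The one-sigma result agrees with erf(sqrt(2)/2), the symbolic form printed above.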
You can also generate a symmetric confidence interval from a desired confidence level (given as a fraction between 0 and 1). For the normal distribution, confidence levels of 68%, 95% and 99.7% correspond to approximately 1, 2 and 3 standard deviations, respectively:
>>> N = Normal(0, 1)
>>> N.confidence(0.68)
(-0.994457883209753, 0.994457883209753)
>>> N.confidence(0.95)
(-1.95996398454005, 1.95996398454005)
>>> N.confidence(0.997)
(-2.96773792534178, 2.96773792534178)
Plug the interval back in to see that the value is correct:
>>> N.probability(*N.confidence(0.95)).evalf()
0.950000000000000
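One way to obtain such an interval is through the inverse cdf: a symmetric interval with confidence level p ends at the (1 + p)/2 quantile. The sketch below uses statistics.NormalDist from the Python standard library (Python 3.8 or later). It is an independent illustration, and whether sympy.statistics computes the interval this way is not stated in this document. The helper name symmetric_confidence is hypothetical.

```python
from statistics import NormalDist

def symmetric_confidence(p, mu=0.0, sigma=1.0):
    """Return (lo, hi) centered on mu with P(lo <= X <= hi) = p, for 0 < p < 1."""
    dist = NormalDist(mu, sigma)
    hi = dist.inv_cdf((1 + p) / 2)  # upper end is the (1 + p)/2 quantile
    lo = 2 * mu - hi                # mirror the upper end around the mean
    return lo, hi

lo, hi = symmetric_confidence(0.95)
# Plug the interval back into the cdf to recover the confidence level.
coverage = NormalDist().cdf(hi) - NormalDist().cdf(lo)
print(lo, hi, coverage)
```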
Other distributions
Besides the normal distribution, uniform continuous distributions are also supported. Uniform(a, b) represents the distribution with uniform probability on the interval [a, b] and zero probability everywhere else. The Uniform class supports the same methods as the Normal class.
Additional distributions, including support for arbitrary user-defined distributions, are planned for the future.
API Reference
Sample
class sympy.statistics.distributions.Sample
Sample([x1, x2, x3, ...]) represents a collection of samples. Sample parameters like mean, variance and stddev can be accessed as properties. The sample will be sorted.
Examples
>>> from sympy.statistics.distributions import Sample
>>> Sample([0, 1, 2, 3])
Sample([0, 1, 2, 3])
>>> Sample([8, 3, 2, 4, 1, 6, 9, 2])
Sample([1, 2, 2, 3, 4, 6, 8, 9])
>>> s = Sample([1, 2, 3, 4, 5])
>>> s.mean
3
>>> s.stddev
sqrt(2)
>>> s.median
3
>>> s.variance
2
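The values in this example (variance 2, standard deviation sqrt(2)) match the population formulas, which divide by n, and not the unbiased sample formulas, which divide by n - 1. The following standard-library sketch reproduces them. It is a cross-check only, not the Sample implementation.

```python
import math
import statistics

data = sorted([1, 2, 3, 4, 5])  # Sample keeps its values sorted as well

mean = statistics.mean(data)
median = statistics.median(data)
variance = statistics.pvariance(data)  # population variance: divides by n
stddev = statistics.pstdev(data)       # population standard deviation
unbiased = statistics.variance(data)   # divides by n - 1, shown for contrast

print(mean, median, variance, stddev, unbiased)
```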
Continuous Probability Distributions
class sympy.statistics.distributions.ContinuousProbability
Base class for continuous probability distributions.
probability(s, a, b)
Calculate the probability that a random number x generated from the distribution satisfies a <= x <= b.
Examples
>>> from sympy.statistics import Normal
>>> from sympy.core import oo
>>> Normal(0, 1).probability(-1, 1)
erf(sqrt(2)/2)
>>> Normal(0, 1).probability(1, oo)
-erf(sqrt(2)/2)/2 + 1/2
random(s, n=None)
random() generates a random number from the distribution. random(n) generates a Sample of n random numbers.
Examples
>>> from sympy.statistics import Uniform
>>> x = Uniform(1, 5).random()
>>> x < 5 and x > 1
True
>>> x = Uniform(-4, 2).random()
>>> x < 2 and x > -4
True
class sympy.statistics.distributions.Normal(mu, sigma)
Normal(mu, sigma) represents the normal or Gaussian distribution with mean value mu and standard deviation sigma.
Examples
>>> from sympy.statistics import Normal
>>> from sympy import oo
>>> N = Normal(1, 2)
>>> N.mean
1
>>> N.variance
4
>>> N.probability(-oo, 1)   # probability on an interval
1/2
>>> N.probability(1, oo)
1/2
>>> N.probability(-oo, oo)
1
>>> N.probability(-1, 3)
erf(sqrt(2)/2)
>>> _.evalf()
0.682689492137086
cdf(s, x)
Return the cumulative distribution function as an expression in x.
Examples
>>> from sympy.statistics import Normal
>>> Normal(1, 2).cdf(0)
-erf(sqrt(2)/4)/2 + 1/2
>>> from sympy.abc import x
>>> Normal(1, 2).cdf(x)
erf(sqrt(2)*(x - 1)/4)/2 + 1/2
confidence(s, p)
Return a symmetric (p*100)% confidence interval. For example, p=0.95 gives a 95% confidence interval. Currently this function only handles numerical values except in the trivial case p=1.
For example, one standard deviation:
>>> from sympy.statistics import Normal
>>> N = Normal(0, 1)
>>> N.confidence(0.68)
(-0.994457883209753, 0.994457883209753)
>>> N.probability(*_).evalf()
0.680000000000000
Two standard deviations:
>>> N = Normal(0, 1)
>>> N.confidence(0.95)
(-1.95996398454005, 1.95996398454005)
>>> N.probability(*_).evalf()
0.950000000000000
static fit(sample)
Create a normal distribution fit to the mean and standard deviation of the given distribution or sample.
Examples
>>> from sympy.statistics import Normal
>>> Normal.fit([1,2,3,4,5])
Normal(3, sqrt(2))
>>> from sympy.abc import x, y
>>> Normal.fit([x, y])
Normal(x/2 + y/2, sqrt((-x/2 + y/2)**2/2 + (x/2 - y/2)**2/2))
class sympy.statistics.distributions.Uniform(a, b)
Uniform(a, b) represents a probability distribution with uniform probability density on the interval [a, b] and zero density everywhere else.
cdf(s, x)
Return the cumulative distribution function as an expression in x.
Examples
>>> from sympy.statistics import Uniform
>>> Uniform(1, 5).cdf(2)
1/4
>>> Uniform(1, 5).cdf(4)
3/4
confidence(s, p)
Generate a symmetric (p*100)% confidence interval.
>>> from sympy import Rational
>>> from sympy.statistics import Uniform
>>> U = Uniform(1, 2)
>>> U.confidence(1)
(1, 2)
>>> U.confidence(Rational(1,2))
(5/4, 7/4)
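For a uniform distribution both quantities have simple closed forms: the cdf rises linearly from 0 at a to 1 at b, and a symmetric interval with confidence level p is centered on (a + b)/2 with half-width p*(b - a)/2. The sketch below reproduces the cdf and confidence results shown for Uniform using exact fractions. The helper names are hypothetical and this is not the sympy.statistics code.

```python
from fractions import Fraction

def uniform_cdf(x, a, b):
    """Cdf of the uniform distribution on [a, b], evaluated exactly."""
    if x <= a:
        return Fraction(0)
    if x >= b:
        return Fraction(1)
    return Fraction(x - a) / Fraction(b - a)

def uniform_confidence(p, a, b):
    """Symmetric interval around the midpoint that contains probability p."""
    center = Fraction(a + b) / 2
    half_width = Fraction(p) * Fraction(b - a) / 2
    return center - half_width, center + half_width

print(uniform_cdf(2, 1, 5), uniform_cdf(4, 1, 5))
print(uniform_confidence(Fraction(1, 2), 1, 2))
```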
static fit(sample)
Create a uniform distribution fit to the mean and standard deviation of the given distribution or sample.
Examples
>>> from sympy.statistics import Uniform
>>> Uniform.fit([1, 2, 3, 4, 5])
Uniform(-sqrt(6) + 3, sqrt(6) + 3)
>>> Uniform.fit([1, 2])
Uniform(-sqrt(3)/2 + 3/2, sqrt(3)/2 + 3/2)
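These endpoints are consistent with moment matching: a uniform distribution on [a, b] has mean (a + b)/2 and standard deviation (b - a)/sqrt(12), so the half-width equals sqrt(3) times the standard deviation. The sketch below assumes the population standard deviation, which is an inference from the outputs shown here and not from the library source. The function fit_uniform is a hypothetical helper.

```python
import math
import statistics

def fit_uniform(sample):
    """Return (a, b) of the uniform distribution matching the sample's mean and stddev."""
    mean = statistics.fmean(sample)
    stddev = statistics.pstdev(sample)    # assumption: population standard deviation
    half_width = math.sqrt(3) * stddev    # because stddev = (b - a) / sqrt(12)
    return mean - half_width, mean + half_width

a, b = fit_uniform([1, 2, 3, 4, 5])
print(a, b)  # compare with 3 - sqrt(6) and 3 + sqrt(6)
```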
class sympy.statistics.distributions.PDF
PDF(func, (x, a, b)) represents a continuous probability distribution with probability density function func(x) on the interval (a, b). If func is not normalized so that integrate(func, (x, a, b)) == 1, it can be normalized using the PDF.normalize() method.
Examples
>>> from sympy import Symbol, exp, oo
>>> from sympy.statistics.distributions import PDF
>>> from sympy.abc import x
>>> a = Symbol('a', positive=True)
>>> exponential = PDF(exp(-x/a)/a, (x,0,oo))
>>> exponential.pdf(x)
exp(-x/a)/a
>>> exponential.cdf(x)
1 - exp(-x/a)
>>> exponential.mean
a
>>> exponential.variance
a**2
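The symbolic mean a and variance a**2 can be sanity-checked numerically for one concrete value of a. The sketch below integrates the exponential density with a plain midpoint rule. The choice a = 2, the truncation of the infinite interval at 60*a, and the helper midpoint_integral are all assumptions made for this illustration.

```python
import math

def midpoint_integral(f, lo, hi, steps=100_000):
    """Approximate the integral of f over [lo, hi] with the midpoint rule."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))

a = 2.0          # assumed concrete value for the symbol a
upper = 60 * a   # the density beyond this point is negligible (about exp(-60))

def density(x):
    """Exponential density exp(-x/a)/a on [0, oo)."""
    return math.exp(-x / a) / a

total = midpoint_integral(density, 0.0, upper)
mean = midpoint_integral(lambda x: x * density(x), 0.0, upper)
second_moment = midpoint_integral(lambda x: x * x * density(x), 0.0, upper)
variance = second_moment - mean ** 2

print(total, mean, variance)  # expected to be close to 1, a and a**2
```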
cdf(x)
Return the cumulative distribution function as an expression in x.
Examples
>>> from sympy.statistics.distributions import PDF
>>> from sympy import exp, oo
>>> from sympy.abc import x, y
>>> PDF(exp(-x/y), (x,0,oo)).cdf(4)
y - y*exp(-4/y)
>>> PDF(2*x + y, (x, 10, oo)).cdf(0)
-10*y - 100
normalize()
Normalize the probability distribution function so that integrate(self.pdf(x), (x, a, b)) == 1.
Examples
>>> from sympy import Symbol, exp, oo
>>> from sympy.statistics.distributions import PDF
>>> from sympy.abc import x
>>> a = Symbol('a', positive=True)
>>> exponential = PDF(exp(-x/a), (x,0,oo))
>>> exponential.normalize().pdf(x)
exp(-x/a)/a
transform(func, var)
Return the probability distribution of the random variable func(x). Currently only some simple injective functions are supported.
Examples
>>> from sympy.statistics.distributions import PDF
>>> from sympy import oo
>>> from sympy.abc import x, y
>>> PDF(2*x + y, (x, 10, oo)).transform(x, y)
PDF(0, ((_w,), x, x))