Smooth Distribution

What is a smooth distribution?

In statistics, the term smooth distribution is often used as a synonym for a continuous probability distribution. However, a smooth distribution can also be the result of smoothing, a technique in which a noisy empirical distribution is replaced by a smoother estimate to filter or dampen the noise [2]. For example, in kernel density estimation (KDE), each discrete point in a sample is replaced by an extended probability distribution called a kernel. The probability density at any point is then estimated by summing the values of all the kernels at that point and normalizing so that the total area under the curve is 1. The broader the kernel, the smoother the resulting distribution [3].

Kernel density estimation result showing a smooth distribution (the gray bell curve) [1].
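The kernel density estimate described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: it assumes a Gaussian kernel, and the function names `kde_gaussian` and `roughness`, the sample values, and the two bandwidths are all hypothetical choices made for this example.

```python
import numpy as np

def kde_gaussian(samples, grid, bandwidth):
    """Estimate a density on `grid` by summing Gaussian kernels centered on each sample."""
    samples = np.asarray(samples, dtype=float)
    grid = np.asarray(grid, dtype=float)
    # Scaled distance from every grid point to every sample: shape (len(grid), len(samples))
    z = (grid[:, None] - samples[None, :]) / bandwidth
    # Standard normal kernel evaluated at each scaled distance
    kernels = np.exp(-0.5 * z**2) / np.sqrt(2.0 * np.pi)
    # Sum over samples, then normalize by n * h so the estimate integrates to 1
    return kernels.sum(axis=1) / (samples.size * bandwidth)

def roughness(density):
    """Sum of absolute second differences: a crude measure of how jagged a curve is."""
    return np.abs(np.diff(density, n=2)).sum()

# Illustrative sample (assumed values) and an evaluation grid wide enough to cover the tails
samples = np.array([-2.1, -1.3, -0.4, 1.9, 5.1, 6.2])
grid = np.linspace(-10.0, 15.0, 1001)
dx = grid[1] - grid[0]

narrow = kde_gaussian(samples, grid, bandwidth=0.3)  # narrow kernel: bumpy estimate
wide = kde_gaussian(samples, grid, bandwidth=1.5)    # broad kernel: smoother estimate

print("area under narrow estimate:", narrow.sum() * dx)
print("area under wide estimate:", wide.sum() * dx)
print("roughness (narrow, wide):", roughness(narrow), roughness(wide))
```

Both estimates should have an area close to 1, and the broad-kernel estimate should report the lower roughness, which matches the statement that a broader kernel gives a smoother distribution.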

In both of the above definitions, the term “smooth” is informal and is used in the layperson’s sense of the word to mean even and fluid, as opposed to jagged and “noisy.” In other words, you are unlikely to find a formal definition of a smooth distribution in a probability and statistics text. In differential geometry, however, the term smooth distribution is defined precisely, and its meaning is quite different from the usage in statistics.

Smooth distribution (differential geometry)

In differential geometry, a smooth distribution assigns to each point of a manifold a subspace of the tangent space at that point, in a way that varies smoothly from point to point. (A manifold is a space that locally resembles Euclidean space, such as a curve or a surface.) More precisely, a smooth distribution is a subbundle of variable dimension inside the tangent bundle of a smooth manifold that is the pointwise span of a family of smooth vector fields [4]. Vector fields in such a distribution can be added together and multiplied by scalars, and the result still lies in the same distribution. The family of smooth vector fields that spans the distribution is said to generate it, and a finite family always suffices [4]. The dimension of the distribution at a point is the dimension of the subspace spanned there; it can differ from point to point and can be smaller than the number of generating vector fields.
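The "pointwise span" in this definition can be written out explicitly. The notation below (M for the manifold, X_i for the vector fields, I for their index set, D_p for the subspace at the point p) is chosen here for illustration and is not taken from [4]. The example on the plane is likewise only a sketch, showing how the dimension is counted point by point and can change from one point to another.

```latex
% Pointwise span of a family of smooth vector fields {X_i} on a manifold M
D_p = \operatorname{span}\{\, X_i(p) : i \in I \,\} \subseteq T_pM, \qquad p \in M.

% Illustrative example on M = \mathbb{R}^2 with the single vector field X = x\,\partial/\partial x
X(x, y) = x \,\frac{\partial}{\partial x}
\quad\Longrightarrow\quad
\dim D_{(x,y)} =
\begin{cases}
1, & x \neq 0,\\
0, & x = 0.
\end{cases}
```

Here X is a smooth vector field, but it vanishes along the line x = 0, so the subspace it spans collapses to the zero subspace there.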

References

[1] M. W. Toews, CC BY-SA 4.0 https://creativecommons.org/licenses/by-sa/4.0, via Wikimedia Commons

[2] Wang XF, Fan Z, Wang B. Estimating smooth distribution function in the presence of heteroscedastic measurement errors. Comput Stat Data Anal. 2010 Jan 1;54(1):25-36. doi: 10.1016/j.csda.2009.08.012. PMID: 20160998; PMCID: PMC2756710.

[3] Kernel density estimation

[4] Drager, L. et al. Smooth distributions are finitely generated. Ann. Global Anal. Geom. 2012;41(3):357-369. arXiv:1012.5641.

