Categorical distribution
Story
A probability is assigned to each of a set of discrete outcomes.
Example
A hen will peck at grain A with probability \(\theta_\mathrm{A}\), grain B with probability \(\theta_\mathrm{B}\), and grain C with probability \(\theta_\mathrm{C}\).
Parameters
The distribution is parametrized by the probabilities assigned to each outcome. We define \(\theta_y\) to be the probability assigned to outcome \(y\). The set of \(\theta_y\)’s constitutes the parameters, which are constrained by
\[\sum_y \theta_y = 1.\]
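As a minimal sketch of the constraint on the parameters, the check below verifies that a candidate set of \(\theta_y\)’s sums to one and, because each is a probability, that none is negative. The function name is_valid_theta, the tolerance atol, and the numerical values are assumptions made for illustration only.

```python
import numpy as np


def is_valid_theta(theta, atol=1e-8):
    """Return True if theta is a valid set of Categorical parameters.

    Hypothetical helper: checks that every entry is nonnegative and
    that the entries sum to one within an absolute tolerance.
    """
    theta = np.asarray(theta, dtype=float)
    nonnegative = np.all(theta >= 0)
    sums_to_one = np.isclose(theta.sum(), 1.0, atol=atol)
    return bool(nonnegative and sums_to_one)


# Assumed probabilities for grains A, B, and C (illustrative values)
theta = [0.5, 0.3, 0.2]
print(is_valid_theta(theta))

# These entries sum to 1.1, so they are not valid parameters
print(is_valid_theta([0.5, 0.3, 0.3]))
```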
Support
If we index the categories with sequential integers from 1 to N, the support is the set of integers from 1 to N, inclusive.
Probability mass function
\[f(y; \theta) = \theta_y,\]
where \(\theta\) denotes the set of \(\theta_y\)’s.
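A minimal sketch of the probability mass function, assuming the categories are indexed by the integers 1 to N as in the Support section, so that the PMF returns \(\theta_y\) for a category index \(y\) and zero outside the support. The function name categorical_pmf and the numerical values are assumptions made for illustration.

```python
import numpy as np


def categorical_pmf(y, theta):
    """PMF of a Categorical distribution with categories indexed 1 to N.

    Hypothetical helper: theta[0] holds the probability of category 1,
    theta[1] that of category 2, and so on.
    """
    theta = np.asarray(theta, dtype=float)
    n = len(theta)

    # Inside the support, the PMF is the probability of that category
    if isinstance(y, (int, np.integer)) and 1 <= y <= n:
        return float(theta[y - 1])

    # Outside the support, the PMF is zero
    return 0.0


# Assumed probabilities for three categories (illustrative values)
theta = [0.5, 0.3, 0.2]
print(categorical_pmf(1, theta))
print(categorical_pmf(4, theta))
```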
Moments
Moments are not defined for a Categorical distribution because the value of \(y\) is not necessarily numeric.
Usage
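Package | Syntax
---|---
NumPy | numpy.random.choice(len(theta), p=theta)
SciPy | scipy.stats.rv_discrete(values=(range(len(theta)), theta)).rvs()
Stan | categorical(theta)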
Notes
If you are using the scipy.stats module, this distribution must be constructed manually using scipy.stats.rv_discrete(). The categories need to be encoded by an index. For interactive plotting purposes, below, we need to specify a custom PMF and CDF.
To sample out of a Categorical distribution, use numpy.random.choice(), specifying the values of \(\theta\) using the p kwarg.
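The two notes above can be sketched as follows. This is an illustration under stated assumptions, not a definitive recipe: the probabilities, category labels, seeds, and sample sizes are made up, and the sampling uses the choice() method of a NumPy Generator, which takes the same p kwarg as numpy.random.choice(). The SciPy distribution is built on the indices 1 to N; whether to start the encoding at 0 or 1 is a choice left to the user.

```python
import numpy as np
import scipy.stats

# Assumed parameters and labels for the hen example (illustrative values)
theta = np.array([0.5, 0.3, 0.2])
categories = np.array(["A", "B", "C"])

rng = np.random.default_rng(seed=3252)

# Sample category labels directly, passing the probabilities via p
samples = rng.choice(categories, size=10000, p=theta)

# The observed frequencies should be close to theta
freqs = np.array([(samples == c).mean() for c in categories])
print(freqs)

# For SciPy, encode the categories by the indices 1 to N
indices = np.arange(1, len(theta) + 1)
dist = scipy.stats.rv_discrete(name="categorical", values=(indices, theta))

# The constructed distribution provides a PMF, a CDF, and sampling
print(dist.pmf(2))
print(dist.cdf(2))
scipy_samples = dist.rvs(size=10000, random_state=3252)
```

Note that the CDF depends on the arbitrary ordering of the category indices, so it is meaningful only relative to the chosen encoding.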