Ch. 3 - Discrete Distributions
Class: STAT-211
Notes:
Outline
- Random Variables
- Discrete Probability Distribution
- Cumulative Distribution Function
- Expectation and Variance
- Binomial Distribution
Random Variables
Random Variables
A real-valued random variable is a quantitative variable which takes its values depending on the outcome of a random experiment.
- It is a variable because different numerical values are possible.
- It is random because the observed value of the variable depends on the outcome of the random experiment.
Typically, we denote random variables by upper-case letters like X, Y, Z, etc., and their realizations when the random experiment is performed by lower-case letters like x, y, z, etc.
However, the lower-case letter used for a realization need not always match the upper-case letter naming the random variable.
Example
- Consider the random experiment of flipping a fair coin thrice. Here the sample space S is given by
S = {HHH, HHT, HTH, THH, HTT, THT, TTH, TTT}.
- Let X denote the number of heads that appear. Observe that
- X = 0 if and only if {TTT} occurs
- No heads appear.
- X = 1 if and only if {HTT, THT, TTH} occurs
- Exactly one head appears.
- X = 2 if and only if {HHT, HTH, THH} occurs
- Exactly two heads appear.
- X = 3 if and only if {HHH} occurs
- Exactly three heads appear.
Clearly, X is a numerical variable taking the values 0,1,2,3, depending on the outcome of the random experiment. Hence, X defined as above is a random variable.
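To make this concrete, here is a minimal Python sketch (an illustration added to these notes, not part of the course material) that lists the sample space of three flips and the value of X assigned to each outcome:

```python
from itertools import product

# Sample space of three coin flips: all 8 sequences of 'H'/'T'.
sample_space = ["".join(outcome) for outcome in product("HT", repeat=3)]

# The random variable X maps each outcome to a number: the count of heads.
X = {outcome: outcome.count("H") for outcome in sample_space}

print(X)  # {'HHH': 3, 'HHT': 2, 'HTH': 2, 'HTT': 1, 'THH': 2, 'THT': 1, 'TTH': 1, 'TTT': 0}
```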
- Note that by definition a qualitative/categorical variable can never be a random variable
Discrete Random Variables
A random variable is discrete if it takes on only countably many values (finitely or countably infinitely many).
- Number of heads in 10 flips of a fair coin
- Sum of two faces when a six–faced fair die is rolled twice
- Number of occurrences of car accidents at an intersection
- Number of data packets arriving in a network device
Continuous Random Variable
A random variable is continuous if it assumes its values within an
interval (bounded or unbounded).
- Arrival time of a message on a communication network
- Lifetime of an electrical or electronic device
- Strength of a concrete weld
Discrete Probability Distribution
Distribution of a Random Variable
The occurrence of any particular value (or set of values) of a random variable is itself an event of the underlying random experiment.
- As we discussed, a random variable assumes its different values depending on the outcome of the experiment.
- Each value will depend on one or more of the elementary outcomes.
- The occurrences of various values of a random variable X can therefore be considered as events, and you can attach probabilities to these events.
- That means you are interested in the likelihood of the occurrence of such events.
Therefore, we can attach probabilities to find the chances or the
likelihoods of the occurrences of such events.
Probability Distribution
The probability distribution of a random variable X describes how the total probability 1 is distributed over the possible values of X.
If we know the probability distribution of X, we can compute the probability of any event defined in terms of X. Probability distributions come in three broad types:
- Discrete Distributions
- Continuous Distributions
- Mixture Distributions (beyond the scope of this course)
Discrete Probability Distribution
Probability Mass Function
Let X be a discrete random variable. The probability mass function (pmf) of X is the function
f(x) = P(X = x),
that must satisfy the following two properties:
- The probability of each possible outcome is between 0 and 1: 0 ≤ f(x) ≤ 1 for every possible value x.
- The sum of these individual probabilities must be 1: Σ f(x) = 1, where the sum is taken over all possible values x.
Informally, the pmf says: “if you tell me a possible value x, I will tell you the probability that X takes that value.”
For example, f(2) = P(X = 2).
Write down the probability of the following events in terms of the pmf f:
- P(3 ≤ X ≤ 6)
- {3 ≤ X ≤ 6} = {X = 3} OR {X = 4} OR {X = 5} OR {X = 6}
- They are all disjoint; they cannot happen at the same time.
- The probability of their union is equal to the sum of their individual probabilities.
- P(3 ≤ X ≤ 6) = P(X = 3) + P(X = 4) + P(X = 5) + P(X = 6) = f(3) + f(4) + f(5) + f(6)
- P(3 < X ≤ 6)
- {3 < X ≤ 6} = {X = 4} OR {X = 5} OR {X = 6}
- P(3 < X ≤ 6) = f(4) + f(5) + f(6)
- P(3 < X < 6)
- {3 < X < 6} = {X = 4} OR {X = 5}, so P(3 < X < 6) = f(4) + f(5)
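As a quick numerical check of these formulas, here is a small Python sketch; the pmf used (X = the face shown by a fair six-sided die, so f(x) = 1/6 for x = 1, ..., 6) is an assumed example chosen only for illustration:

```python
from fractions import Fraction

# Assumed pmf for illustration: X is the face of a fair six-sided die.
f = {x: Fraction(1, 6) for x in range(1, 7)}

p1 = sum(f[x] for x in f if 3 <= x <= 6)   # P(3 <= X <= 6) = f(3)+f(4)+f(5)+f(6)
p2 = sum(f[x] for x in f if 3 < x <= 6)    # P(3 < X <= 6)  = f(4)+f(5)+f(6)
p3 = sum(f[x] for x in f if 3 < x < 6)     # P(3 < X < 6)   = f(4)+f(5)

print(p1, p2, p3)  # 2/3 1/2 1/3
```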
Example
Suppose you flip a fair coin thrice. Here, the sample space is S = {HHH, HHT, HTH, THH, HTT, THT, TTH, TTT}, with 8 equally likely outcomes.
Let X denote the number of heads that appear. Then X ∈{0,1,2,3}.
- X = 0 ⇔ {TTT} occurs ⇒ P(X = 0) = 1/8 > 0
- X = 1 ⇔ {HTT, THT, TTH} occurs ⇒ P(X = 1) = 3/8 > 0
- X = 2 ⇔ {HHT, HTH, THH} occurs ⇒ P(X = 2) = 3/8 > 0
- X = 3 ⇔ {HHH} occurs ⇒ P(X = 3) = 1/8 > 0
- You can apply the classical definition of probability to enumerate these probabilities.
The pmf of X in tabular form:

| x | 0 | 1 | 2 | 3 |
| --- | --- | --- | --- | --- |
| f(x) | 1/8 | 3/8 | 3/8 | 1/8 |
Example (Contd.)
- What is the probability of observing exactly one head?
Answer: P(X = 1) = f(1) = 3/8
- What is the probability of observing at most one head?
Answer:
- Observe that {X ≤ 1} = {X = 0} ∪ {X = 1}.
- The events {X = 0} and {X = 1} are disjoint.
- P(X ≤ 1) = P(X = 0) + P(X = 1) = 1/8 + 3/8 = 4/8 = 1/2
- What is the probability of observing at least one head?
Answer:
- {X ≥ 1} = {X = 1} ∪ {X = 2} ∪ {X = 3}.
- {X = 1}, {X = 2}, and {X = 3} are disjoint.
- P(X ≥ 1) = P(X = 1) + P(X = 2) + P(X = 3) = 3/8 + 3/8 + 1/8 = 7/8
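A short Python sketch (added for illustration) that reproduces these three answers directly from the pmf of X:

```python
from fractions import Fraction

# pmf of X = number of heads in three flips of a fair coin (values from the notes).
f = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}

print(f[1])                            # P(X = 1)  -> 3/8
print(f[0] + f[1])                     # P(X <= 1) -> 1/2
print(sum(f[x] for x in f if x >= 1))  # P(X >= 1) -> 7/8
```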
Example
Suppose we roll two six-faced fair dice simultaneously. Then the sample space S comprises the 36 equally likely ordered pairs (i, j), where i and j each range over 1, 2, ..., 6.
- Observe that the sample space here is finite because it contains 36 elementary outcomes, and all of these elementary outcomes are equally likely to occur.
Let X denote the sum of the two faces obtained. Then X takes the values 2,3,4,...,11,12, depending on the outcome of the experiment.
Observe that
- X = 2 if and only if (1,1) occurs
- X = 3 if and only if (1,2) or (2,1) occurs
- X = 4 if and only if (1,3), (2,2) or (3,1) occurs
- X = 5 if and only if (1,4), (2,3), (3,2) or (4,1) occurs
- X = 6 if and only if (1,5), (2,4), (3,3), (4,2) or (5,1) occurs
- X = 7 if and only if (1,6), (2,5), (3,4), (4,3), (5,2) or (6,1) occurs
- X = 8 if and only if (2,6), (3,5), (4,4), (5,3) or (6,2) occurs
- X = 9 if and only if (3,6), (4,5), (5,4) or (6,3) occurs
- X = 10 if and only if (4,6), (5,5) or (6,4) occurs
- X = 11 if and only if (5,6) or (6,5) occurs
- X = 12 if and only if (6,6) occurs
Now our next aim is to find the probability mass function of X.
- Here the probability mass function of X is
- f(x) = P(X = x),
- where x = 2, 3, ..., 12.
- 0 ≤ f(x) ≤ 1, for all x.
- Σ f(x) = 1, where the sum runs over x = 2, 3, ..., 12 (the probabilities must sum to 1).
For example, computing f(2), f(3), and f(4):
- f(2) = P(X=2) = P({(1,1)}) = 1/36
- Follows from the classical definition of probability
- f(3) = P(X=3) = P({(1,2), (2,1)}) = 2/36
- f(4) = P(X=4) = P({(1,3), (2,2), (3,1)}) = 3/36
The pmf of X in tabular form:

| x | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| f(x) | 1/36 | 2/36 | 3/36 | 4/36 | 5/36 | 6/36 | 5/36 | 4/36 | 3/36 | 2/36 | 1/36 |
- This is the probability mass function table for the discrete random variable X.
- In the first row you can see the different possible values of X, arranged in ascending order from left to right.
- In the second row you will find the corresponding values of the probability mass function, i.e. the corresponding probabilities of occurrence.
- What is the probability that the sum X is an even number?
Since the above events are mutually exclusive, using Axiom 3, we obtain
P(X is even)
= P({X = 2} ∪ {X = 4} ∪ {X = 6} ∪ {X = 8} ∪ {X = 10} ∪ {X = 12})
= P(X = 2) + P(X = 4) +···+ P(X = 12)
= 1/36 + 3/36 + 5/36 + 5/36 + 3/36 + 1/36 = 18/36 = 1/2.
- P(X ≤ 4) = P(X = 2) + P(X = 3) + P(X = 4) = 6/36 = 1/6
- Observe that X ≤ 4 occurs exactly when X is 2, 3, or 4.
- These are 3 disjoint events
- Follows from Axiom 3 of probability
- P(X > 4) = 1 − P(X ≤ 4) = 1 − 1/6 = 5/6
- Note that the event X > 4 is nothing but the complement of the event X ≤ 4.
- {X > 4} = {X ≤ 4}ᶜ
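These dice probabilities can also be checked by brute-force enumeration. The following Python sketch (added for illustration) builds the pmf of the sum of two fair dice and reproduces the values above:

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Enumerate the 36 equally likely ordered pairs and tally the sums.
counts = Counter(i + j for i, j in product(range(1, 7), repeat=2))
f = {x: Fraction(c, 36) for x, c in counts.items()}  # pmf of X = sum of the two faces

print(f[2], f[3], f[4])                    # 1/36 1/18 1/12 (Fractions reduce, e.g. 2/36 -> 1/18)
print(sum(f.values()))                     # 1   (the pmf sums to 1)
print(sum(f[x] for x in f if x % 2 == 0))  # 1/2 -> P(X is even)
print(sum(f[x] for x in f if x <= 4))      # 1/6 -> P(X <= 4)
print(1 - sum(f[x] for x in f if x <= 4))  # 5/6 -> P(X > 4)
```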
Cumulative Distribution Function
Cumulative Distribution Function
The cumulative distribution function (cdf) of a discrete random variable X with pmf f is defined as
F(x) = P(X ≤ x) = Σ_{y ≤ x} f(y).
- The sum is taken over all those values y (mass points of X) such that y is less than or equal to x.
- The cumulative distribution function is defined for any arbitrary real number x.
- The cdf is the cumulative probability that the random variable X is at most x.
- This means the value of the cumulative distribution function must always lie between zero and one, both inclusive, for any real number x.
For any real number x, the cdf must satisfy the following properties:
- 0 ≤ F(x) ≤ 1 for any real number x
- F(x) → 0 as x → −∞, and F(x) → 1 as x → +∞
- It is non-decreasing: if x₁ ≤ x₂, then F(x₁) ≤ F(x₂)
- F is right-continuous everywhere: F(t) → F(x) as t → x from the right, for all x
Note that if you know the cdf of a discrete random variable X, then for any two numbers x < y you can obtain P(x < X ≤ y) = F(y) − F(x).
Remarks
- The pmf can be obtained from the cdf as f(xᵢ) = F(xᵢ) − F(xᵢ₋₁), where x₁ < x₂ < ··· are the (ordered) mass points of X.
- f: PMF
- F: CDF
- The probability of events of the form {a < X ≤ b} is given in terms of the pmf as P(a < X ≤ b) = Σ_{a < x ≤ b} f(x).
- Note this sum can also be written as (for example): Σ_{x ≤ b} f(x) − Σ_{x ≤ a} f(x).
- Writing the sum as the difference of these two sums, and using the cdf, gives P(a < X ≤ b) = F(b) − F(a).
Important Remark: If X is discrete with cdf F, then the equality P(X < x) = P(X ≤ x) = F(x) need not be true, because P(X = x) may be strictly positive at a mass point.
- It may hold for some values of x, but not in general.
- However we will see later on that if X is a continuous random variable, this equality will actually always be true.
Example:
- X : discrete rv
- f : pmf of X
- F : cdf of X
- f(i) > 0 for i = 1, 2, ..., 10
- These points are called "mass points"
- P(3 < X ≤ 6) = P(X = 4) + P(X = 5) + P(X = 6)
- Equivalently, P(3 < X ≤ 6) = P(X ≤ 6) − P(X ≤ 3)
- Now what is the first term?
- Nothing but P(X ≤ 6)
- The second term?
- P(X ≤ 3)
- You can write this probability as
- F(6) − F(3)
- Since
- F(6) = P(X ≤ 6), and
- F(3) = P(X ≤ 3)
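A small Python sketch of the pmf-to-cdf relationship. The notes leave the values f(1), ..., f(10) unspecified, so a uniform pmf over the mass points 1, ..., 10 is assumed here purely for illustration:

```python
from fractions import Fraction

# Assumed pmf for illustration: uniform over the mass points 1, ..., 10.
f = {i: Fraction(1, 10) for i in range(1, 11)}

def F(x):
    """cdf: F(x) = P(X <= x) = sum of f(y) over all mass points y <= x."""
    return sum(p for y, p in f.items() if y <= x)

# P(3 < X <= 6) computed two ways: directly from the pmf, and as F(6) - F(3).
direct = f[4] + f[5] + f[6]
via_cdf = F(6) - F(3)
print(direct, via_cdf)  # 3/10 3/10
```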
Expectation and Variance
Expectation of a Discrete Random Variable
Suppose X is a discrete random variable taking the values x₁, x₂, ..., with pmf f. Then the expectation (expected value, or mean) of X is
E(X) = Σ_x x f(x),
provided the sum is finite.
(In this course, we shall always assume that this sum is finite, so that E(X) exists.)
Important:
If X takes infinitely many values, the sum defining E(X) may fail to be finite, in which case the expectation does not exist.
(In this course, we shall always assume that the expectation exists.)
Long-Term Interpretation of E(X)
The expected value is often referred to as the ‘long-term’ average or
mean. This means:
"If the random experiment is replicated a large number of times under identical conditions (a hypothetical situation), then the average of the values observed in those replications of the random experiment will be approximately E(X)."
- Replicate the same random experiment under identical conditions a sufficiently large number of times, say, N many times, where N is a large positive integer.
- Record the values of X observed in those N identical replications of the random experiment, say, x₁, x₂, ..., x_N.
- Then, (x₁ + x₂ + ··· + x_N)/N ≈ E(X), provided N is sufficiently large.
- Also applicable for continuous random variables
Note: The expectation of X is nothing but a measure of the center of a given probability distribution.
- Note that the expected value of X need not be among the possible values of X.
- It may or may not belong to the set of possible X values.
Example 1
Suppose a fair coin is flipped thrice. Let X be the number of heads that appear.
Then X takes the values 0, 1, 2, and 3, with probabilities 1/8, 3/8, 3/8, and 1/8, respectively.
| x | f(x) | x · f(x) |
| --- | --- | --- |
| 0 | 1/8 | 0 |
| 1 | 3/8 | 3/8 |
| 2 | 3/8 | 6/8 |
| 3 | 1/8 | 3/8 |
| Sum | 1 | 12/8 = 1.5 |
- For instance:
- The first entry in the third column is obtained by multiplying the value 0 by its corresponding probability of occurrence, 1/8.
- Likewise, the second entry in the third column, 3/8, is obtained by multiplying the value 1 in the first column by its corresponding probability of occurrence, 3/8, in the second column.
- Once enumerated, then you can simply take the sum of all of them
- The expected value of X would be equal to 1.5
Hence, E(X) = 0 · (1/8) + 1 · (3/8) + 2 · (3/8) + 3 · (1/8) = 12/8 = 1.5.
- This can be interpreted with the help of the long-term frequency interpretation of expected values.
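As an added illustration, the following Python sketch computes E(X) from the pmf and then checks the long-run interpretation by simulating the experiment; the number of replications N = 100,000 is an arbitrary choice:

```python
import random
from fractions import Fraction

# pmf of X = number of heads in three flips of a fair coin (values from the notes).
f = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}

# Expectation: E(X) = sum of x * f(x).
print(sum(x * p for x, p in f.items()))  # 3/2

# Long-run interpretation: average X over many simulated replications.
N = 100_000
values = [sum(random.random() < 0.5 for _ in range(3)) for _ in range(N)]
print(sum(values) / N)  # approximately 1.5
```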
Example 2
Let the random variable X denote the sum of the two faces when two six-faced fair dice are rolled simultaneously.
The pmf of X and the products x · f(x) in tabular form:

| x | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| f(x) | 1/36 | 2/36 | 3/36 | 4/36 | 5/36 | 6/36 | 5/36 | 4/36 | 3/36 | 2/36 | 1/36 |
| x · f(x) | 2/36 | 6/36 | 12/36 | 20/36 | 30/36 | 42/36 | 40/36 | 36/36 | 30/36 | 22/36 | 12/36 |
- Hence, E(X) = (2 + 6 + 12 + 20 + 30 + 42 + 40 + 36 + 30 + 22 + 12)/36 = 252/36 = 7, which is one of the possible values of X (unlike in Example 1).
Variance of discrete Random Variable
The most common measure of spread of a r.v. is its variance.
Suppose X is a discrete random variable with pmf f and mean μ = E(X).
Then the variance of X is
Var(X) = E[(X − μ)²] = Σ_x (x − μ)² f(x),
provided the sum is finite.
The standard deviation is SD(X) = √Var(X).
Important: Again, these are population level quantities, not to be confused with the sample variance or sample standard deviation.
- These are the population variance and the population standard deviation of the random variable X; this is the most accurate definition of these quantities.
- They are not to be confused with the sample variance and the sample standard deviation computed from data.
Fact:
- The variance of a discrete rv X can be written as Var(X) = E(X²) − [E(X)]², i.e. the expected value of X² minus the square of the expected value of X.
- This holds true for any probability distribution, either discrete or continuous.
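A short Python sketch (added for illustration) that computes Var(X) both from the definition and from the shortcut formula, using the coin-flip pmf of Example 1:

```python
from fractions import Fraction

# pmf of X = number of heads in three flips of a fair coin (values from the notes).
f = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}

mu = sum(x * p for x, p in f.items())  # E(X) = 3/2

var_definition = sum((x - mu) ** 2 * p for x, p in f.items())    # E[(X - mu)^2]
var_shortcut = sum(x ** 2 * p for x, p in f.items()) - mu ** 2   # E(X^2) - [E(X)]^2

print(var_definition, var_shortcut)  # 3/4 3/4
```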
Binomial Distribution
Bernoulli Trial
A random experiment whose outcomes can be classified into one of two
mutually complementary categories, either a ‘success’ or a ‘failure’, is
called a Bernoulli trial.
- Flipping a coin once
- ‘Occurrence of a Head’ = ‘Success’
- ‘Occurrence of a Tail’ = ‘Failure’
- Rolling a six faced die once
- ‘Occurrence of an even number’ = ‘Success’
- ‘Occurrence of an odd number’ = ‘Failure’
- Drawing a card at random from a well-shuffled deck of 52 cards
- ‘Drawing a Spade’ = ‘Success’
- ‘Not drawing a Spade’ = ‘Failure’
- Drawing a ball at random from an urn containing 6 blue and 4 red balls
- ‘Drawing of a blue ball’ = ‘Success’
- ‘Drawing of a red ball’ = ‘Failure’
For each of these examples you can see that the outcomes of the underlying random experiment can be classified into one of two mutually complementary categories, namely a "success" and a "failure".
Binomial Experiment
Consider a random experiment consisting of a sequence of Bernoulli trials
such that
- the number of trials is fixed, say, n, before performing the experiment;
- the trials are independent; and
- (this means the outcome of one trial should not have any influence on the outcomes of the remaining trials)
- the probability of ‘success’, say, p, remains fixed across the trials.
Such an experiment is referred to as a Binomial experiment with parameters n and p.
Examples
- A coin is flipped 10 times independently and under identical conditions, and the number of heads is recorded: Yes − Binomial
- Observe that when a coin is flipped once, then either a head or a tail appears.
- Define head as "success" and tail as "failure"
- One single flip of the coin can be regarded as a Bernoulli Trial
- There are 10 such Bernoulli Trials in total, that means the number of trials here is fixed.
- The trials are of course independent, the outcome of one flip of the coin does not have any influence on the outcome of the remaining flips
- The probability of success is fixed since the coin is flipped under identical conditions 10 times.
- Cards are drawn at random, one by one and with replacement, from a well-shuffled deck of 52 cards until we get 5 jacks: No-Number of trials is not fixed
- The drawing of each card can be regarded as a Bernoulli Trial
- The outcomes are dichotomous, either we get a jack or not
- Are the drawings independent? - yes
- Since this is done with "replacement", it means the outcome of one drawing will not have any influence on the remaining drawings
- The probability of drawing a jack remains the same across all these Bernoulli Trials.
- But here we do not know, before the experiment, how many trials will be needed to get five jacks!
- Observe that in order to get 5 jacks you need at least 5 draws.
- Theoretically, the number of draws can go up to infinity.
- 8 cards are drawn at random, one by one and with replacement, from a well-shuffled deck of 52 cards, and the number of spades is recorded: Yes − Binomial
- Is it a series of Bernoulli Trials?
- Yes, drawing of each card can be regarded as a Bernoulli Trial
- Is the number of trials defined?
- Yes, it is fixed before the random experiment is performed
- Are the trials independent?
- Yes, done at random, one by one and with replacement
- What is the probability of success in each trial?
- 13/52 = 1/4; it remains fixed across these 8 independent Bernoulli Trials.
- 5 individuals are drawn at random, one by one and without replacement, from a group of 22 males and 15 females to form a committee, and the number of females selected is recorded: No−Trials are dependent
- Is it a series of Bernoulli Trials?
- Observe that here the drawing of each committee member may result in one of two outcomes: either a male or a female is selected.
- one is success, the other is failure.
- Is the number of trials defined?
- Since 5 committee members are to be chosen, therefore there are 5 such Bernoulli Trials
- Are the trials independent?
- No, done at random, one by one, but without replacement.
- The outcome of each draw depends on the outcomes of the previous draws; as a result, the trials are not independent.
Binomial Distribution
Consider a Binomial experiment with parameters n and p:
- A finite sequence of n independent Bernoulli Trials, such that the probability of success p remains fixed across those n Bernoulli Trials.
Let X denote the number of successes in those n trials. Then X follows a Binomial distribution with parameters n and p, written X ~ Bin(n, p).
- The ~ sign means "follows".
The probability mass function (p.m.f.) of X is
f(x) = P(X = x) = C(n, x) p^x (1 − p)^(n − x), for x = 0, 1, ..., n.
- Observe that since the random variable X here is the number of successes out of n binary trials, the possible values are 0, 1, ..., n.
- This means there will be n + 1 many possible values.
For each x, the constant C(n, x) counts the number of ways of choosing which x of the n trials are the successes.
- Note we can read C(n, x) as "n choose x".
- This notation is also sometimes found as nCx, or as a binomial coefficient written with n over x in parentheses.
- The formula for evaluating this constant is the following:
- C(n, x) = n! / (x! (n − x)!),
- where n! = n · (n − 1) ··· 2 · 1 (the product of the first n natural numbers), x! = x · (x − 1) ··· 2 · 1 (the product of the first x natural numbers), and 0! = 1 (by convention).
The mean and the variance of X ~ Bin(n, p) are
- Population mean: E(X) = np
- Population variance: Var(X) = np(1 − p) = npq,
where q = 1 − p.
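As a sanity check (added to these notes), the following Python sketch evaluates the Binomial pmf with math.comb and confirms that the probabilities sum to 1 and that the mean and variance come out to np and np(1 − p); the values n = 10 and p = 0.5 are assumed only for illustration:

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) for X ~ Bin(n, p): C(n, x) * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 10, 0.5
pmf = [binom_pmf(x, n, p) for x in range(n + 1)]

print(sum(pmf))                                             # ~1.0 (probabilities sum to 1)
print(sum(x * q for x, q in enumerate(pmf)))                # ~5.0 (mean = n*p)
print(sum(x**2 * q for x, q in enumerate(pmf)) - (n*p)**2)  # ~2.5 (variance = n*p*(1-p))
```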
Examples
Suppose we toss a fair coin independently 10 times. What would be the distribution of X, the number of heads?
- There are 10 such trials.
- The trials are independent of each other.
- Is the probability of success fixed? Yes, p = 1/2 since the coin is fair.
- Thus we have a Binomial experiment, and X ~ Bin(10, 0.5).
- What is the probability of exactly 4 heads?
- P(X = 4) = 0.20508
- What is the probability of at most 3 heads?
- P(X <= 3) = 0.17188
- What is the probability of seeing between 5 and 7 heads?
- P(5 <= X <= 7)
- Note we can also write it as P(4 < X <= 7)
- And remember that P(a < X <= b) = F(b) - F(a) for any discrete rv.
- So let's rewrite it as P(X <= 7) - P(X <= 4).
- Now we can use the calculator!
- P(X <= 7) = 0.94531
- P(X <= 4) = 0.37695
- 0.94531 - 0.37695 = 0.56836
- What is the probability of seeing 2 tails?
- How can you express this event in terms of the rv X? X is the number of heads out of those 10 Bernoulli Trials.
- Then note that 10 − X is the number of tails out of those 10 Bernoulli Trials.
- So you need to find P(10 − X = 2), i.e. P(X = 8).
- P(X = 8) = 0.04395
Use this Online Binomial Distribution Calculator.
- We can use it on exams and homeworks!
- Insert n, the number of trials.
- Insert p, the probability of success.
- Insert x, the given value.
- Select between the =, <=, or >= relation.
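If you would rather compute these values offline, SciPy's binom gives the same answers (assuming SciPy is installed; this is an optional alternative to the online calculator, not the tool prescribed by the course):

```python
from scipy.stats import binom  # assumes SciPy is installed

n, p = 10, 0.5  # X ~ Bin(10, 0.5): number of heads in 10 fair coin flips

print(binom.pmf(4, n, p))                       # P(X = 4)       ~ 0.20508
print(binom.cdf(3, n, p))                       # P(X <= 3)      ~ 0.17188
print(binom.cdf(7, n, p) - binom.cdf(4, n, p))  # P(5 <= X <= 7) ~ 0.56836
print(binom.pmf(8, n, p))                       # P(X = 8)       ~ 0.04395 (i.e. 2 tails)
```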
Example
Air-USA has a policy of booking as many as 25 persons on a small
airplane that can seat only 22 passengers. Past studies have revealed
that only 85% of the booked passengers actually arrive for the flight.
Find the probability that if Air-USA books 25 seats, not enough seats will
be available.
- Assume that the tickets were purchased independently of each other, which means the passengers arrive (or not) independently of each other.
Let the r.v. X denote the number of booked passengers who actually arrive for the flight.
Then, X ~ Bin(25, 0.85):
- For each passenger there are 2 possibilities
- Either the passenger shows up ("success")
- Or the passenger doesn't show up ("failure")
- There are 25 such Bernoulli Trials, which are independent according to our assumption.
- The probability of success remains fixed because past studies have revealed that only 85% of the booked passengers actually arrive for the flight, that means the probability of "success" is 0.85, and it remains the same across all 25 passengers.
- X is nothing but the number of successes out of those 25 Bernoulli Trials.
Passenger capacity of the flight is 22, and 25 seats were booked. Hence,
not enough seats will be available if and only if X ≥ 23.
Required probability: P(X ≥ 23).
- Use the Online Binomial Distribution Calculator.
- P(X >= 23) = 0.25374
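The same probability can be reproduced offline with SciPy (again an optional alternative to the online calculator):

```python
from scipy.stats import binom  # assumes SciPy is installed

n, p = 25, 0.85  # X ~ Bin(25, 0.85): number of booked passengers who show up

# Not enough seats <=> X >= 23, i.e. 1 - P(X <= 22).
print(1 - binom.cdf(22, n, p))  # ~ 0.25374
```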