Ch. 3 - Discrete Distributions

Class: STAT-211


Notes:

Outline

Random Variables

Random Variables

A real-valued random variable is a quantitative variable whose value is determined by the outcome of a random experiment.

Typically, we denote random variables by upper-case letters like X ,Y ,Z , etc., and their different realizations when the random experiment is performed by lower-case letters like x,y ,z, etc.

However, the lower-case letter used for a realization need not match the upper-case letter that names the random variable.

Example

Flipping a fair coin three times gives the sample space

S={HHH, HHT, HTH, THH, HTT, THT, TTH, TTT}.

Discrete Random Variables

A random variable is discrete if it takes only countably many values (finitely or countably infinitely many).

  1. Number of heads in 10 flips of a fair coin
  2. Sum of two faces when a six–faced fair die is rolled twice
  3. Number of occurrences of car accidents at an intersection
  4. Number of data packets arriving in a network device

Continuous Random Variable

A random variable is continuous if it assumes its values within an
interval (bounded or unbounded).
5. Arrival time of a message on a communication network.
6. Life time of an electrical or electronic device
7. Strength of concrete weld

Discrete Probability Distribution

Distribution of a Random Variable

Any outcome involving the occurrences of various values of a random
variable X is an event.

Therefore, we can attach probabilities to find the chances or the
likelihoods of the occurrences of such events.

Probability Distribution

The probability distribution of a random variable X is a law or function that completely characterizes the probabilities of the occurrences of all possible values of X.

If we know the probability distribution of X , we know everything about
X , at least, theoretically.

Discrete Probability Distribution

Probability Mass Function
Let X be a discrete random variable taking countably many values, say,
x1,x2,..., only. Then the probability mass function (pmf) of X is a
function f defined at each xi such that

f(xi)=P(X=xi), for i ≥1

that must satisfy the following two properties:

0 ≤ f(xi) = P(X = xi) ≤ 1, for all i ≥ 1;  Σi f(xi) = Σi P(X = xi) = 1

Informally, the PMF says “if you tell me the value of x, I will give you back the probability with which X takes the value x”.

For example, f(2)=P(X=2) denotes the probability that the resulting X value is 2.
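Concretely, a pmf can be thought of as a lookup table; a minimal Python sketch, using the three-coin-flip pmf worked out in the example below:

```python
from fractions import Fraction

# A pmf as a lookup table: the pmf of X = number of heads in three
# fair coin flips (worked out in the example below).
pmf = {0: Fraction(1, 8), 1: Fraction(3, 8),
       2: Fraction(3, 8), 3: Fraction(1, 8)}

# Property 1: 0 <= f(x_i) <= 1 for every value x_i.
assert all(0 <= p <= 1 for p in pmf.values())

# Property 2: the probabilities sum to 1.
assert sum(pmf.values()) == 1

# "Tell me x, I give you back P(X = x)".
p2 = pmf[2]  # P(X = 2) = 3/8
```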


Example

Suppose you flip a fair coin thrice. Here, the sample space

S={HHH,HHT,HTH,THH,HTT,THT,TTH,TTT}.

Let X denote the number of heads that appear. Then X ∈{0,1,2,3}.

[Figure: table of the pmf f(x) of X, the number of heads in three flips]

Example (Contd.)

  1. What is the probability of observing exactly one head?

    Answer: P(X = 1) = 3/8

  2. What is the probability of observing at most one head?

    Answer:

    • Observe that, {X ≤1}= {X = 0}∪{X = 1}.
    • The events {X = 0}, and {X = 1}are disjoint.
    • P(X ≤1) = P(X = 0) + P(X = 1) = 1/8 + 3/8 = 4/8 = 1/2
  3. What is the probability of observing at least one head?

    Answer:

    • {X ≥ 1}= {X = 1} ∪ {X = 2} ∪ {X = 3}.
    • {X = 1}, {X = 2}, and {X = 3} are disjoint.
    • P(X ≥ 1) = P(X = 1) + P(X = 2) + P(X = 3) = 3/8 + 3/8 + 1/8 = 7/8
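The three answers above can be reproduced by brute-force enumeration of the sample space; a minimal sketch:

```python
from itertools import product
from fractions import Fraction

# Enumerate the 8 equally likely outcomes of three fair coin flips
# and build the pmf of X = number of heads by counting.
outcomes = ["".join(w) for w in product("HT", repeat=3)]
pmf = {x: Fraction(sum(1 for w in outcomes if w.count("H") == x), 8)
       for x in range(4)}

p_exactly_one = pmf[1]            # P(X = 1) = 3/8
p_at_most_one = pmf[0] + pmf[1]   # P(X <= 1) = 1/2
p_at_least_one = 1 - pmf[0]       # P(X >= 1) = 7/8, via the complement
```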

Example

Suppose we roll two six–faced fair dice simultaneously. Then the sample space S comprises 36 equally likely ordered pairs:

S = {(x, y) : x, y = 1, 2, 3, 4, 5, 6}.

Let X denote the sum of the two faces obtained. Then X takes the values 2,3,4,...,11,12, depending on the outcome of the experiment.

Observe that

X =
2 if & only if (1,1) occurs
3 if & only if (1,2) or (2,1) occurs
4 if & only if (1,3), (2,2) or (3,1) occurs
5 if & only if (1,4), (2,3), (3,2) or (4,1) occurs
6 if & only if (1,5), (2,4), (3,3), (4,2) or (5,1) occurs
7 if & only if (1,6), (2,5), (3,4), (4,3), (5,2) or (6,1) occurs
8 if & only if (2,6), (3,5), (4,4), (5,3) or (6,2) occurs
9 if & only if (3,6), (4,5), (5,4) or (6,3) occurs
10 if & only if (4,6), (5,5) or (6,4) occurs
11 if & only if (5,6) or (6,5) occurs
12 if & only if (6,6) occurs

Now our next aim is to find the probability mass function of X.

For example, counting the outcomes listed above gives f(2) = P(X = 2) = 1/36 and f(3) = P(X = 3) = 2/36.

[Figure: table of the pmf f(x) of X for x = 2, 3, ..., 12]

  1. What is the probability that the sum X is an even number?
{X is even} = {X = 2} ∪ {X = 4} ∪ {X = 6} ∪ {X = 8} ∪ {X = 10} ∪ {X = 12}

Since the above events are mutually exclusive, using Axiom 3, we obtain

P(X is even)
= P({X = 2} ∪ {X = 4} ∪ {X = 6} ∪ {X = 8} ∪ {X = 10} ∪ {X = 12})
= P(X = 2) + P(X = 4) +···+ P(X = 12)
= 1/36 + 3/36 + 5/36 + 5/36 + 3/36 + 1/36 = 18/36 = 1/2.

  1. P(X ≤ 4) = P(X = 2) + P(X = 3) + P(X = 4) = 6/36 = 1/6

    • Observe that if X is either 2, 3, or 4, then X <= 4.
    • These are 3 disjoint events.
    • The result follows from Axiom 3 of probability.
  2. P(X > 4) = 1 − P(X ≤ 4) = 1−1/6 = 5/6

    • Note that the event X > 4 is the complement of the event X <= 4.
      • {X > 4} = {X <= 4}^c
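The dice computations above can likewise be checked by enumerating all 36 pairs; a minimal sketch:

```python
from fractions import Fraction

# Tabulate the pmf of X = sum of two fair dice from the 36
# equally likely ordered pairs.
pmf = {s: Fraction(0) for s in range(2, 13)}
for x in range(1, 7):
    for y in range(1, 7):
        pmf[x + y] += Fraction(1, 36)

p_even = sum(pmf[s] for s in range(2, 13, 2))  # P(X is even) = 1/2
p_le_4 = pmf[2] + pmf[3] + pmf[4]              # P(X <= 4) = 1/6
p_gt_4 = 1 - p_le_4                            # P(X > 4) = 5/6
```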

Cumulative Distribution Function

Cumulative Distribution Function

The cumulative distribution function (cdf) F(x) of a discrete r.v. (random variable) X with pmf f(x) is defined as:

F(x) = P(X ≤ x) = Σ f(y), where the sum runs over all values y with y ≤ x

For any real number x, F(x) is the probability that the observed value of X will be less than or equal to x.

The CDF must satisfy the following properties:

- 0 ≤ F(x) ≤ 1 for all x, with F(x) → 0 as x → −∞ and F(x) → 1 as x → ∞
- F is non-decreasing: if a ≤ b, then F(a) ≤ F(b)
- For a discrete X, F is a step function that jumps by f(xi) at each value xi

Remarks

f(x1) = F(x1), and f(xi) = F(xi) − F(xi−1) for i = 2, 3, ... (assuming x1 < x2 < ···)

- f: PMF
- F: CDF

P(a < X ≤ b) = Σ f(xi), where the sum runs over all xi with a < xi ≤ b

- Note: this sum can be written as a difference of two cumulative sums; for example, with a = 3 and b = 6,
- [f(4) + f(5) + f(6)] = [f(1) + f(2) + f(3) + f(4) + f(5) + f(6)] − [f(1) + f(2) + f(3)]

and the cdf

P(a < X ≤ b) = F(b) − F(a)

Important Remark If X is discrete with cdf F, then

P(a ≤ X ≤ b) = F(b) − F(a)

need not be true.

Example: for the three-coin-flip X, P(1 ≤ X ≤ 2) = f(1) + f(2) = 6/8, whereas F(2) − F(1) = 7/8 − 4/8 = 3/8, because the formula drops the mass f(1) at the left endpoint. In general, P(a ≤ X ≤ b) = F(b) − F(a) + P(X = a).
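This caveat is easy to check numerically; a minimal sketch using the three-coin-flip pmf:

```python
from fractions import Fraction

# CDF built from the pmf of X = number of heads in three fair flips,
# illustrating why P(a <= X <= b) = F(b) - F(a) can fail.
pmf = {0: Fraction(1, 8), 1: Fraction(3, 8),
       2: Fraction(3, 8), 3: Fraction(1, 8)}

def F(x):
    # F(x) = P(X <= x): sum f(y) over all values y <= x.
    return sum(p for v, p in pmf.items() if v <= x)

p_closed = pmf[1] + pmf[2]   # P(1 <= X <= 2) = 6/8
diff = F(2) - F(1)           # 7/8 - 4/8 = 3/8: misses the mass at X = 1
fixed = diff + pmf[1]        # adding back f(1) recovers 6/8
```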

Expectation and Variance

Expectation of a Discrete Random Variable

Suppose X is a discrete r.v. with pmf f, and taking countably many values, say, x1,x2,.... Then, the expectation (or, expected value) of X, denoted E[X] or µX , is defined as:

µX = E[X] = Σi xi f(xi)

provided Σi |xi| f(xi) < ∞.

(In this course, we shall always assume that µX=E[X] is well-defined unless stated otherwise.)

Important: E[X] or µX is also known as the population mean or, population average of X.

If h(X) is any real-valued function of X with Σi |h(xi)| f(xi) < ∞, the expected value of h(X) is defined as

E[h(X)] = Σi h(xi) f(xi)

(In this course, we shall always assume that E[h(X)] is well-defined unless stated otherwise.)

Long-Term Interpretation of E(X)

The expected value is often referred to as the ‘long-term’ average or
mean. This means:

”If the random experiment is replicated a large number of times under
identical conditions (a hypothetical situation), then the average of the X
values observed in those large number of replications of the random
experiment will be approximately E(X).”

E(X) ≈ (1/N)[X1 + X2 + ··· + XN]

provided N is sufficiently large.

Note: The expectation of X is a measure of the center of a given probability distribution.
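The long-run interpretation can be illustrated by simulation; a minimal sketch that replicates "count heads in three fair flips" many times and compares the sample average with E(X) = 1.5 (computed in Example 1 below):

```python
import random

# Simulate N replications of "count heads in three fair coin flips"
# and compare the sample average with E(X) = 1.5.
random.seed(0)  # fixed seed so the run is reproducible
N = 100_000
total = sum(sum(random.random() < 0.5 for _ in range(3)) for _ in range(N))
average = total / N  # close to 1.5 for large N
```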

Example 1

Suppose a fair coin is flipped thrice. Let X be the number of heads that appear. Then X takes the values 0, 1, 2, and 3, with probabilities

f(0)=1/8, f(1)=3/8, f(2)=3/8, and f(3)=1/8,

respectively

[Figure: pmf table of X, the number of heads in three flips]

Hence, E(X) = Σx x f(x) = 1.5, which is not itself a possible value of X.

Example 2

Let the random variable X denote the sum of the two faces when two six-faced fair dice are rolled simultaneously.

[Figure: pmf table of X, the sum of the two faces of two fair dice]

Variance of a Discrete Random Variable

The most common measure of spread of a r.v. is its variance.

Suppose X is a discrete rv with pmf f, and taking countably many values, say, x1,x2,.... Let E[X]=µX

Then the variance of X, denoted σX2, is defined as

σX² = Var[X] = E[(X − µX)²] = Σi (xi − µX)² f(xi),

provided the sum is finite.

The standard deviation is σX = √Var[X].

Important: Again, these are population level quantities, not to be confused with the sample variance or sample standard deviation.

Fact: σX² = E[X²] − µX² = Σi xi² f(xi) − µX²
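The shortcut formula can be verified directly; a minimal sketch using the two-dice sum from the earlier example:

```python
from fractions import Fraction

# Verify Var[X] = E[X^2] - mu^2 for X = sum of two fair dice.
# The number of pairs summing to s is 6 - |s - 7|, for s = 2, ..., 12.
pmf = {s: Fraction(6 - abs(s - 7), 36) for s in range(2, 13)}

mu = sum(x * p for x, p in pmf.items())                       # E[X] = 7
var_def = sum((x - mu) ** 2 * p for x, p in pmf.items())      # definition
var_fact = sum(x ** 2 * p for x, p in pmf.items()) - mu ** 2  # shortcut
```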

Binomial Distribution

Bernoulli Trial

A random experiment whose outcomes can be classified into one of two
mutually complementary categories: either a ‘success’ or, a ‘failure’, is
called a Bernoulli trial.

  1. Flipping a coin once
    • ‘Occurrence of a Head’ = ‘Success’
    • ‘Occurrence of a Tail’ = ‘Failure’
  2. Rolling a six faced die once
    • ‘Occurrence of an even number’ = ‘Success’
    • ‘Occurrence of an odd number’ = ‘Failure’
  3. Drawing a card at random from a well-shuffled deck of 52 cards
    • ‘Drawing a Spade’ = ‘Success’
    • ‘Not drawing a Spade’ = ‘Failure’
  4. Drawing a ball at random from an urn containing 6 blue and 4 red balls
    • ‘Drawing of a blue ball’ = ‘Success’
    • ‘Drawing of a red ball’ = ‘Failure’

For each of these examples you can see that the outcomes of the underlying random experiments can be classified into one of two mutually complementary categories, namely a "success" and a "failure".

Binomial Experiment

Consider a random experiment consisting of a sequence of Bernoulli trials such that

  1. the number of trials, n, is fixed in advance;
  2. the trials are independent of one another; and
  3. the probability of success, p, remains the same in every trial.

Such an experiment is referred to as a Binomial experiment with parameters n and p.

Examples

  1. A coin is flipped 10 times independently and under identical conditions, and the number of heads is recorded: Yes−Binomial
    • Observe that when a coin is flipped once, then either a head or a tail appears.
    • Define head as "success" and tail as "failure"
      • One single flip of the coin can be regarded as a Bernoulli Trial
    • There are 10 such Bernoulli Trials in total, that means the number of trials here is fixed.
    • The trials are of course independent; the outcome of one flip of the coin does not influence the outcomes of the remaining flips
    • The probability of success is fixed since the coin is flipped under identical conditions 10 times.
  2. Cards are drawn at random, one by one and with replacement, from a well-shuffled deck of 52 cards until we get 5 jacks: No-Number of trials is not fixed
    • The drawing of each card can be regarded as a Bernoulli Trial
      • The outcomes are dichotomous, either we get a jack or not
    • Are the drawings independent? - yes
      • Since this is done with "replacement", it means the outcome of one drawing will not have any influence on the remaining drawings
    • The probability of drawing a jack remains the same across all these Bernoulli Trials.
    • But here we do not know what should be the total number of trials before we get five jacks!
      • Observe that in order to get 5 jacks you need at least 5 such draws
      • Theoretically it can go up to infinity.
  3. 8 cards are drawn at random, one by one and with replacement, from a well-shuffled deck of 52 cards, and the number of spades is recorded: Yes−Binomial
    • Is it a series of Bernoulli Trials?
      • Yes, drawing of each card can be regarded as a Bernoulli Trial
    • Is the number of trials defined?
      • Yes, it is fixed before the random experiment is performed
    • Are the trials independent?
      • Yes, done at random, one by one and with replacement
    • What is the probability of success in each trial?
      • 13/52 = 1/4 − it remains fixed across these 8 independent Bernoulli Trials.
  4. 5 individuals are drawn at random, one by one and without replacement, from a group of 22 males and 15 females to form a committee, and the number of females selected is recorded: No−Trials are dependent
    • Is it a series of Bernoulli Trials?
      • Observe that here drawing of each committee member may result in any one of the two outcomes, either it is a male or a female
      • one is success, the other is failure.
    • Is the number of trials defined?
      • Since 5 committee members are to be chosen, therefore there are 5 such Bernoulli Trials
    • Are the trials independent?
      • No, done at random, one by one, but without replacement.
      • The outcome of each draw depends on the outcomes of the previous draws; as a result the trials are not independent, they are dependent.

Binomial Distribution

Consider a Binomial experiment with parameters n and p.

Let X be the number of successes out of those n trials. We say that X
follows a Binomial distribution with parameters n and p, and write

X ∼ Binomial(n, p).

The probability mass function (p.m.f.) of X is given by

P(X = x) = C(n, x) p^x (1 − p)^(n−x), for x = 0, 1, ..., n,

where C(n, x) = n! / (x! (n − x)!) denotes the number of ways of choosing x objects from n.

The mean and the variance of X ∼ Binomial(n, p) are given by

E[X] = np and Var[X] = npq,

where q = 1 − p is the probability of failure in a single trial.
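These formulas can be checked from the pmf directly; a minimal sketch for the small case n = 10, p = 1/2:

```python
from math import comb
from fractions import Fraction

# Binomial pmf from first principles, checking that it sums to 1
# and that E[X] = np and Var[X] = npq for n = 10, p = 1/2.
n, p = 10, Fraction(1, 2)
q = 1 - p
pmf = {x: comb(n, x) * p**x * q**(n - x) for x in range(n + 1)}

assert sum(pmf.values()) == 1
mean = sum(x * f for x, f in pmf.items())
var = sum(x**2 * f for x, f in pmf.items()) - mean**2
assert mean == n * p and var == n * p * q
```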

Examples

Suppose we toss a fair coin independently 10 times. What would be the
distribution of X, the number of heads that appear?

X ∼ Binomial(n = 10, p = 0.5)
  1. What is the probability of exactly 4 heads?
    • P(X = 4) = 0.20508
  2. What is the probability of at most 3 heads?
    • P(X <= 3) = 0.17188
  3. What is the probability of seeing between 5 and 7 heads?
    • P(5 <= X <= 7)
    • Note we can also write it as P(4 < X <= 7)
    • And remember that P(a < X <= b) = P(X <= b) - P(X <= a)
      • For any discrete rv X
    • So let's rewrite it as P(X <= 7) - P(X <= 4)
    • Now we can use the calculator!
      • P(X <= 7) = 0.94531
      • P(X <= 4) = 0.37695
    • 0.94531 - 0.37695 = 0.56836
  4. What is probability of seeing 2 tails?
    • How can you express this event in terms of the rv X.
    • X is the number of heads out of those 10 Bernoulli Trials
    • Then note that n − X = 10 − X
      • 10 − X is the number of tails out of those 10 Bernoulli Trials.
    • So you need to find P(10 − X = 2) = P(X = 8)
    • P(X = 8) = 0.04395
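All four answers can also be reproduced without a calculator by summing the Binomial(10, 0.5) pmf; a minimal sketch:

```python
from math import comb

# pmf of X ~ Binomial(10, 0.5): f(x) = C(10, x) / 2^10.
def f(x):
    return comb(10, x) / 2**10

p1 = f(4)                            # P(X = 4)        = 210/1024
p2 = sum(f(x) for x in range(4))     # P(X <= 3)       = 176/1024
p3 = sum(f(x) for x in range(5, 8))  # P(5 <= X <= 7)  = 582/1024
p4 = f(8)                            # P(X = 8)        = 45/1024
```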

Use this Online Binomial Distribution Calculator.

Example

Air-USA has a policy of booking as many as 25 persons on a small
airplane that can seat only 22 passengers. Past studies have revealed
that only 85% of the booked passengers actually arrive for the flight.
Find the probability that if Air-USA books 25 seats, not enough seats will
be available.

Let the r.v. X denote the number of passengers who actually show up.
Then,

X ∼ Binomial(n = 25, p = 0.85).

Passenger capacity of the flight is 22, and 25 seats were booked. Hence,
not enough seats will be available if and only if X ≥ 23.

Required probability:

P(X ≥ 23) = P(X = 23) + P(X = 24) + P(X = 25) = 0.25374
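A quick numerical check of this answer; a minimal sketch:

```python
from math import comb

# X ~ Binomial(25, 0.85): probability that at least 23 of the 25
# booked passengers show up (i.e., not enough seats are available).
n, p = 25, 0.85
def pmf(x):
    return comb(n, x) * p**x * (1 - p)**(n - x)

prob = sum(pmf(x) for x in range(23, 26))  # about 0.25374
```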