Ch. 3 - Continuous Distributions
Class: STAT-211
Notes:
Outline
- Probability Density Function (pdf)
- Cumulative Distribution Function (cdf)
- Expectation & Variance of Continuous RVs
- Uniform Distribution
- Normal Distribution
Probability Density Function (PDF)
From Discrete to Continuous
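Recall that, if $X$ is a discrete random variable with pmf $p(\cdot)$, then $P(a \le X \le b) = \sum_{a \le x \le b} p(x)$.
It makes sense in this case to:
- Add the values of the pmf, as there are 'only a few values' $X$ can take within an interval. (That is what we mean by discrete.)
- Pay attention to the end points of the intervals. (Does the interval contain its end points?)
For a continuous random variable $X$, there are too many values within any interval to 'add' a pmf. Besides, the probability that $X$ takes any particular value is $0$.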
- We cannot use the pmf here, we need some other kind of function to model a continuous probability distribution
- This is where the probability density function comes in.
Probability Density Function (1)
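The probability density function (pdf), $f(\cdot)$, of a continuous random variable $X$ is a function such that
- $f(x) \ge 0$ for every real number $x$, and
  - Implication: the function $f$ is non-negative everywhere.
- $\int_{-\infty}^{\infty} f(x)\,dx = 1$.
  - Implication: the area under the density curve must equal 1.
If $X$ is a continuous random variable with pdf $f(\cdot)$, then for any two real numbers $a$ and $b$, with $a < b$,
$P(a \le X \le b) = \int_a^b f(x)\,dx$.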
Pictorially represented:
/STAT-211/Visual%20Aids/Pasted%20image%2020250923142907.png)
- Note that, by definition, $P(X \le a)$ is the area under the curve to the left of the point $a$.
- Also note that the area under the curve to the right of the point $b$ is $P(X > b) = 1 - P(X \le b)$.
- The area of the blue region is nothing but $P(a \le X \le b)$.
- By definition: $P(a \le X \le b) = \int_a^b f(x)\,dx$.
Facts:
- $\{X \le a\} = \{X < a\} \cup \{X = a\}$
  - They are disjoint events, and $P(X = a) = 0$, so $P(X \le a) = P(X < a)$.
  - It doesn't matter whether or not we include the endpoints of an interval.
- Area under the curve over the interval $[a, b]$
  - Isn't this area under the curve the same as the area under the curve up to the point $b$ minus the area under the curve up to the point $a$? Yes! This will be used repeatedly.
- $\{X > b\} = \{X \le b\}^c$, so $P(X > b) = 1 - P(X \le b)$.
Probability Density Function (2)
- $f(x)$, by definition, is not a probability by itself.
For a continuous random variable $X$ having pdf $f(\cdot)$, and any pair of real numbers $a$ and $b$, with $a < b$:
- $P(X = a) = 0$,
- $P(X \le a) = P(X < a)$,
- $P(X \ge a) = P(X > a) = 1 - P(X \le a)$,
- $P(a \le X \le b) = P(a \le X < b) = P(a < X \le b) = P(a < X < b)$,
- $P(a \le X \le b) = P(X \le b) - P(X \le a)$.
Fact: The pdf $f(\cdot)$ of a continuous random variable is NOT a probability (unlike the pmf). It may take values greater than 1.
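To see these facts numerically, here is a minimal sketch using a hypothetical density that does not come from the notes: $f(x) = 2x$ on $[0, 1]$ and $0$ elsewhere. The helper `integrate` is an illustrative midpoint-rule approximation, not a library function.

```python
# Hypothetical density (an assumption for illustration): f(x) = 2x on [0, 1], 0 elsewhere.
def f(x):
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0


def integrate(g, lo, hi, n=100_000):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return sum(g(lo + (i + 0.5) * h) for i in range(n)) * h


total_area = integrate(f, 0.0, 1.0)      # total area under the density, close to 1
density_at_09 = f(0.9)                   # a density value above 1, so not a probability
prob_interval = integrate(f, 0.5, 0.9)   # P(0.5 <= X <= 0.9) = 0.9^2 - 0.5^2
prob_point = integrate(f, 0.9, 0.9)      # P(X = 0.9): zero-width interval, zero area

print(total_area, density_at_09, prob_interval, prob_point)
```

The density exceeds 1 near the right end of the interval, yet every probability computed from it stays between 0 and 1, because probabilities are areas rather than heights.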
Example
Consider a continuous random variable X with pdf
/STAT-211/Visual%20Aids/Pasted%20image%2020250923144229.png)
Then
- Here the density function cannot be a probability
Example
Let the continuous random variable X have pdf
/STAT-211/Visual%20Aids/Pasted%20image%2020250923144322.png)
- Find the value of the (normalizing) constant c
- Note it cannot be negative, and it cannot be 0, since then the density function will be 0 everywhere over the entire line
- Normalizing constant has to be positive:
- Find
- Find
...
Example (2)
Let the continuous random variable X have pdf
/STAT-211/Visual%20Aids/Pasted%20image%2020250923144934.png)
- Can this normalizing constant be negative? No, it cannot!
- Can it be 0? No, it cannot!
- Find the value of the (normalizing) constant c.
- Observe that
- Find
- Find
- Find
...
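The general recipe for a normalizing constant can be sketched in code. The density used here, $f(x) = c\,x^2$ on $[0, 2]$, is a hypothetical stand-in (the pdfs of the examples above are only available as images), and `integrate` is an illustrative midpoint-rule helper.

```python
# Hypothetical unnormalized density (an assumption): g(x) = x^2 on [0, 2], 0 elsewhere.
def g(x):
    return x * x if 0.0 <= x <= 2.0 else 0.0


def integrate(func, lo, hi, n=100_000):
    """Midpoint-rule approximation of the integral of func over [lo, hi]."""
    h = (hi - lo) / n
    return sum(func(lo + (i + 0.5) * h) for i in range(n)) * h


area = integrate(g, 0.0, 2.0)   # area under g; exact value is 8/3
c = 1.0 / area                  # c must make the total area equal to 1, so c = 3/8


def f(x):
    """Normalized pdf f(x) = c * g(x)."""
    return c * g(x)


prob_le_1 = integrate(f, 0.0, 1.0)   # P(X <= 1); exact value is 1/8

print(c, prob_le_1)
```

Since the area under $g$ is positive, $c$ comes out positive, in line with the observation that a normalizing constant can be neither negative nor zero.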
Cumulative Distribution Function (CDF)
Cumulative Distribution Function
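The cumulative distribution function (CDF) for a continuous random variable $X$, denoted $F(\cdot)$, is defined as
$F(x) = P(X \le x) = \int_{-\infty}^{x} f(t)\,dt$ for every real number $x$.
- Area under the curve up to the point $x$
- By definition, the CDF is a probability.
- Observe that the CDF is non-decreasing everywhere.
From the Fundamental Theorem of Calculus, the PDF is $f(x) = F'(x)$, except possibly at a "few points".
If $X$ is a continuous random variable with CDF $F(\cdot)$, then for any two real numbers $a$ and $b$ with $a < b$, $P(a \le X \le b) = F(b) - F(a)$.
Facts:
- For all real numbers $x$: $0 \le F(x) \le 1$
- $F(x) \le F(y)$ for all $(x, y)$ with $x < y$
- $F$: continuous everywhere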
When to use PDF and when to use CDF?
- The answer is: what are you trying to find?
- Example: from the density function you can try to evaluate the CDF, then using the CDF you can evaluate requested probabilities
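As an illustration of going from a density to a CDF and then to probabilities, here is a small sketch. It assumes a hypothetical density $f(x) = 2x$ on $[0, 1]$ (not one from the notes), whose CDF works out case by case to $0$ for $x < 0$, $x^2$ for $0 \le x \le 1$, and $1$ for $x > 1$.

```python
# CDF of the hypothetical density f(x) = 2x on [0, 1] (an assumption for illustration).
def F(x):
    if x < 0.0:        # Case 1: no area has accumulated yet
        return 0.0
    if x <= 1.0:       # Case 2: integral of 2t from 0 to x equals x^2
        return x * x
    return 1.0         # Case 3: all of the area lies to the left of x


prob_interval = F(0.9) - F(0.5)   # P(0.5 <= X <= 0.9) = F(0.9) - F(0.5)
right_tail = 1.0 - F(0.5)         # P(X > 0.5) = 1 - P(X <= 0.5)

# The CDF never decreases as x grows.
grid = [i / 10 - 1.0 for i in range(31)]          # points from -1.0 to 2.0
is_non_decreasing = all(F(u) <= F(v) for u, v in zip(grid, grid[1:]))

print(prob_interval, right_tail, is_non_decreasing)
```

Once the CDF is in hand, every interval probability is a subtraction, so no further integration is needed.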
Example
Suppose X is a continuous random variable having PDF
/STAT-211/Visual%20Aids/Pasted%20image%2020250923150634.png)
Then, its CDF:
/STAT-211/Visual%20Aids/Pasted%20image%2020250923150652.png)
- How did we evaluate the CDF here?
-
Case 1:
for all -
Case 2:
-
Case 3:
-
Some probabilities:
Expectation & Variance of Continuous RVs
Expectation of a Continuous RV
Let X be a continuous random variable with p.d.f. f.
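Then the expected value (expectation) or population mean of $X$, denoted $\mu_X$ or $E(X)$, is defined as
$E(X) = \int_{-\infty}^{\infty} x f(x)\,dx$,
provided $\int_{-\infty}^{\infty} |x| f(x)\,dx < \infty$.
(In this course, we shall always assume that $\int_{-\infty}^{\infty} |x| f(x)\,dx < \infty$ unless stated otherwise.)
- The integral must be finite for $E(X)$ to exist (in this course you can always assume this).
More generally, if $h(X)$ is any real-valued function of $X$ with $\int_{-\infty}^{\infty} |h(x)| f(x)\,dx < \infty$, the expected value of $h(X)$ is defined as
...
(In this course, we shall always assume that $\int_{-\infty}^{\infty} |h(x)| f(x)\,dx < \infty$ unless stated otherwise.)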
Variance of a Continuous RV
Let X be a continuous random variable with p.d.f. f . Then the
variance of $X$, denoted $\sigma_X^2$, is defined as
...
provided $\int_{-\infty}^{\infty} (x - \mu_X)^2 f(x)\,dx < \infty$.
The standard deviation of $X$, denoted $\sigma_X$, is defined as the positive square root of its variance $\sigma_X^2$, that is,
...
Note:
...
- Observe that these are population level quantities, not sample level quantities
- Without the knowledge of probability distribution theory it is not possible to get population expectation or population variance.
Fact:
Example
Suppose X is a continuous rv having p.d.f.
/STAT-211/Visual%20Aids/Pasted%20image%2020250923151841.png)
Then,
...
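The population mean and variance can be approximated numerically as a sanity check. This sketch assumes a hypothetical density $f(x) = 2x$ on $[0, 1]$ (not the one in the example above) and uses the standard definitions $E(X) = \int x f(x)\,dx$ and $Var(X) = \int (x - \mu_X)^2 f(x)\,dx$, together with the well-known identity $Var(X) = E(X^2) - [E(X)]^2$. The `integrate` helper is an illustrative midpoint rule.

```python
import math


# Hypothetical density (an assumption): f(x) = 2x on [0, 1], 0 elsewhere.
def f(x):
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0


def integrate(g, lo, hi, n=100_000):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return sum(g(lo + (i + 0.5) * h) for i in range(n)) * h


mean = integrate(lambda x: x * f(x), 0.0, 1.0)                          # E(X); exact value 2/3
second_moment = integrate(lambda x: x * x * f(x), 0.0, 1.0)             # E(X^2); exact value 1/2
var_definition = integrate(lambda x: (x - mean) ** 2 * f(x), 0.0, 1.0)  # variance from its definition
var_shortcut = second_moment - mean ** 2                                # E(X^2) - [E(X)]^2
sd = math.sqrt(var_definition)                                          # standard deviation

print(mean, var_definition, var_shortcut, sd)
```

Both routes to the variance agree (exact value $1/18$), which is why the shortcut identity is usually the more convenient one for hand calculation.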
Uniform Distribution
Uniform Distribution
- Perhaps the simplest type of continuous probability distribution
A continuous random variable $X$ is said to have a Uniform(a,b) distribution over an interval $[a, b]$ if it has the probability density function
/STAT-211/Visual%20Aids/Pasted%20image%2020250923152213.png)
- Over the interval $[a, b]$, the density function is constant everywhere (it doesn't change its value)
  - The density function would be a horizontal line segment
We write,
/STAT-211/Visual%20Aids/Pasted%20image%2020250923152241.png)
Uniform Distribution - CDF
Suppose,
/STAT-211/Visual%20Aids/Pasted%20image%2020250923152453.png)
/STAT-211/Visual%20Aids/Pasted%20image%2020250923152512.png)
Uniform Distribution - Properties
If X∼Uniform(a,b), then
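- for any pair of points $c$ and $d$ with $a \le c < d \le b$, $P(c \le X \le d) = F(d) - F(c) = \frac{d-a}{b-a} - \frac{c-a}{b-a} = \frac{d-c}{b-a}$
- $E(X) = (a+b)/2$ and $SD(X) = (b-a)/\sqrt{12}$, because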
...
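One way to verify the mean and standard deviation formulas is the following worked derivation. It assumes the usual Uniform(a,b) density, $f(x) = \frac{1}{b-a}$ for $a \le x \le b$ and $0$ otherwise, and the identity $Var(X) = E(X^2) - [E(X)]^2$; it is a sketch and may differ from the steps used in lecture.

```latex
\begin{aligned}
E(X) &= \int_a^b \frac{x}{b-a}\,dx = \frac{b^2-a^2}{2(b-a)} = \frac{a+b}{2},\\
E(X^2) &= \int_a^b \frac{x^2}{b-a}\,dx = \frac{b^3-a^3}{3(b-a)} = \frac{a^2+ab+b^2}{3},\\
\mathrm{Var}(X) &= E(X^2) - [E(X)]^2 = \frac{a^2+ab+b^2}{3} - \frac{(a+b)^2}{4} = \frac{(b-a)^2}{12},\\
SD(X) &= \sqrt{\mathrm{Var}(X)} = \frac{b-a}{\sqrt{12}}.
\end{aligned}
```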
Uniform Distribution - Example
Let the random variable
/STAT-211/Visual%20Aids/Pasted%20image%2020250923152809.png)
- Find the probability that X will lie in the interval (3,6). Ans: 0.3
- There is no harm in including both of the endpoints or you can exclude them with no harm as well.
- Find the conditional probability that X will be smaller than 4 given that X isn't larger than 6. Ans: 2/3 ≈ 0.667.
...
- Find the probability that X = 2.47. Ans: 0
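The three answers can be reproduced with a short sketch. The parameters are an assumption: the distribution itself is only given in the image, but the stated answers (0.3 and 2/3) are consistent with $X \sim$ Uniform(0, 10), so that is what the code uses. `uniform_cdf` is an illustrative helper name.

```python
# Assumption: X ~ Uniform(0, 10), inferred from the stated answers rather than read from the notes.
a, b = 0.0, 10.0


def uniform_cdf(x, a, b):
    """CDF of Uniform(a, b): 0 to the left of a, (x - a)/(b - a) inside, 1 to the right of b."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)


# P(3 < X < 6) = F(6) - F(3); including or excluding the endpoints changes nothing.
p_3_to_6 = uniform_cdf(6.0, a, b) - uniform_cdf(3.0, a, b)

# P(X < 4 | X <= 6) = P(X < 4 and X <= 6) / P(X <= 6) = P(X < 4) / P(X <= 6)
p_conditional = uniform_cdf(4.0, a, b) / uniform_cdf(6.0, a, b)

# P(X = 2.47) = F(2.47) - F(2.47) = 0 for a continuous random variable.
p_point = uniform_cdf(2.47, a, b) - uniform_cdf(2.47, a, b)

print(p_3_to_6, p_conditional, p_point)
```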
Normal Distribution
Normal Distribution
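- The most widely used distribution; it has vast applications in statistics and science
The normal, or Gaussian, distribution plays a key role in probability and statistics. Write its pdf as:
$f(x; \mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^2/(2\sigma^2)}$, $-\infty < x < \infty$,
where $-\infty < \mu < \infty$ and $\sigma > 0$.
(The parameters are $\mu$ and $\sigma$.)
We write $X \sim N(\mu, \sigma^2)$.
- (The cdf, $F(x)$, does not have a closed form expression)
- Note:
  - $E(X) = \mu$ ($\mu$: location parameter)
  - $Var(X) = \sigma^2$ ($\sigma$: scale parameter)
- Note:
  - If $X \sim N(\mu, \sigma^2)$, then the distribution of $Z = (X - \mu)/\sigma$ is independent of $\mu$ and $\sigma$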
What is the area under the curve to the right of µ?
- It should be evenly distributed on both sides of µ
  - right area: $P(X > \mu) = 0.5$
  - left area: $P(X \le \mu) = 0.5$
- mean(X) = median = µ
  - Expected value is the same as its median (for a normal distribution)
Standard Normal Distribution
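When $\mu = 0$ and $\sigma = 1$, the distribution is called the standard normal distribution, and we write $Z \sim N(0, 1)$.
Notation:
- $\phi(z)$ is the pdf (lowercase phi)
  - $\phi(z) = \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}$
- $\Phi(z)$ is the cdf (capital phi)
  - $\Phi(z) = P(Z \le z) = \int_{-\infty}^{z} \phi(t)\,dt$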
/STAT-211/Visual%20Aids/Pasted%20image%2020250925141811.png)
Example:
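- Suppose $X \sim N(\mu, \sigma^2)$
- Define: $Z = (X - \mu)/\sigma$
  - Then $Z \sim N(0, 1)$, and that distribution is independent of $\mu$ and $\sigma$
- Observe that the converse is also true
  - If $Z \sim N(0, 1)$ then, $X = \mu + \sigma Z \sim N(\mu, \sigma^2)$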
Facts:
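- If $X \sim N(\mu, \sigma^2)$ then, the CDF of $X$ can be expressed in terms of the cdf of a standard normal distribution
  - $F(x) = P(X \le x)$
    - This is the definition of the CDF of a continuous rv
  - We can write $P(X \le x) = P(X - \mu \le x - \mu)$
    - It is the same event; you are subtracting the same quantity from both sides
  - Dividing both sides by $\sigma > 0$, the direction of the inequality does not change
    - So we can also write $P\left(\frac{X - \mu}{\sigma} \le \frac{x - \mu}{\sigma}\right)$
    - Which is the same as saying $P\left(Z \le \frac{x - \mu}{\sigma}\right)$
  - So we get that: $F(x) = \Phi\left(\frac{x - \mu}{\sigma}\right)$
- Suppose $0 < p < 1$
  - Then, given $p$:
    - A point $x_p$ is called the 100p-th percentile (or equivalently the p-th quantile) of $X$ if and only if the area under the curve towards the left of $x_p$ is $p$ and to the right it is $1 - p$
    - Probabilistically speaking we can write
      - Normal Dist.: $P(X \le x_p) = p$
      - Standard Normal Dist.: $P(Z \le z_p) = \Phi(z_p) = p$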
- If
then, - If
then, - for any x: real no. in a
.
- For any real no.
: - = phi... (complete)
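The standardization idea, that a normal CDF can be evaluated through the standard normal CDF, can be checked with Python's standard-library `statistics.NormalDist`. The values $\mu = 5$, $\sigma = 2$, $x = 7$ and $p = 0.9$ are hypothetical choices for illustration, and the percentile relation $x_p = \mu + \sigma z_p$ used in the comments is the standard consequence of standardizing, stated here as an assumption about what the notes intend.

```python
from statistics import NormalDist

# Hypothetical parameters (assumptions for illustration only).
mu, sigma = 5.0, 2.0
X = NormalDist(mu, sigma)      # NormalDist takes the standard deviation, not the variance
Z = NormalDist(0.0, 1.0)       # standard normal

x = 7.0
direct = X.cdf(x)                     # F(x) = P(X <= x)
via_z = Z.cdf((x - mu) / sigma)       # Phi((x - mu) / sigma), here Phi(1)

p = 0.9
x_p = X.inv_cdf(p)                    # 100p-th percentile of X
z_p = Z.inv_cdf(p)                    # 100p-th percentile of Z
rebuilt = mu + sigma * z_p            # should match x_p

print(direct, via_z, x_p, rebuilt)
```

Both routes give the same probability, so a single standard normal table (or calculator) is enough for every normal distribution.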
Example of Standard Normal Distribution
- Remember the '=' in $\le$ does not really matter.
- The area under the curve towards the left side of $-2$ is the same as the area under the curve towards the right side of $2$: $P(Z \le -2) = P(Z \ge 2)$.
- What is the total area under the curve? It is 1.
- What is the total area of these two shaded regions?
  - It should be twice one of them (since both are the same): $2\,P(Z \le -2) = 2\,\Phi(-2)$
- Note that oftentimes this probability is expressed by writing: $P(|Z| \ge 2)$
  - So this means: $\{Z \le -2\} \cup \{Z \ge 2\}$
  - You cannot have a value that satisfies these two conditions at once, therefore these two are disjoint sets
  - So we can get it by $P(Z \le -2) + P(Z \ge 2)$, which is basically the sum of the two shaded regions
- Note: In these facts, you can replace the real number '2' by any arbitrary positive real number, say $z$.
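The symmetry argument for the two tails can be checked numerically. This is a minimal sketch using the standard-library `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

Z = NormalDist()                 # standard normal: mean 0, standard deviation 1

left_tail = Z.cdf(-2.0)          # P(Z <= -2), area to the left of -2
right_tail = 1.0 - Z.cdf(2.0)    # P(Z >= 2), area to the right of 2

# The two events are disjoint, so P(|Z| >= 2) is the sum of the two tail areas,
# which by symmetry equals twice either one.
two_sided = left_tail + right_tail

print(left_tail, right_tail, two_sided)
```

The combined tail area comes out to roughly 0.0455, so about 95.45% of the standard normal area lies between $-2$ and $2$.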
Normal Distribution PDF
/STAT-211/Visual%20Aids/Pasted%20image%2020250925141856.png)
- Blue distribution
- This normal distribution is represented by the blue color solid density curve
- Note that it is bell shaped and it has its center at 0
- It has a small amount of variability around its center
- Has a very high peak at its center, $x = 0$
- Red distribution
- Bell shaped and symmetric
- Considerably larger variance than the blue distribution
- There must be some x values that are away from the center with higher probability
- This distribution has a smaller peak
- Yellow distribution
- Just like the first two cases, it has a bell shaped symmetric curve around its center
- There should be extreme values on the tails of the distribution because of the large variability
- The peakiness of the density curve should be smaller than the other 2.
- If variance parameter is:
- low -> high peak along the center
- large -> low peak and more flat along the center
- Green distribution
- Observe that the density curve is bell shaped around the point µ = -2
- By changing the value of µ we have shifted the location of this distribution
- Changing µ you can shift only the center of the distribution, but not its shape.
- Its location is determined by µ and its shape is governed by the scale parameter σ.
Normal Distribution CDF
$F(x)$ = area under the density curve up to the point $x$
/STAT-211/Visual%20Aids/Pasted%20image%2020250925142128.png)
- Plot of $F(x) = P(X \le x)$ as the sigmoid (S-shaped) curve
/STAT-211/Visual%20Aids/Pasted%20image%2020250925142242.png)
Example (1)
Suppose
-
Note: σ = 2.9
-
- You should select the
option in the calculator
-
- Recall that if
then,
-
- Recall that for any real number
(using normal distribution calculator)
Example (2)
Suppose
-
the 95-th percentile of
-
(95−th percentile of ) -
-
-
The area under the curve towards the left of the point
should be 0.95 by definition, and to the right it should be . - If you choose the
option, what area must you specify to get the 95-th percentile of ? - You need to insert
- You need to insert
- If you choose the
-
-
-
-
the third quartile of
-
= 75−th percentile of X
-
(75−th percentile of Z ) -
-
-
- =
- =
- =
-
Note:
- Observe that 0.67449 = $z_{0.75}$, the 75-th percentile of $Z$
- The area under the curve towards the right of $z_{0.75}$ is 0.25
- The area under the curve towards the left of $-z_{0.75}$ is 0.25 as well
- Therefore we observe that $z_{0.25} = -z_{0.75}$
- =
- (Does not require knowledge of µ)
-
Fact:
- If and only if
- for any real a. and any
- If and only if
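The percentile calculations in this example can be reproduced without a table using the standard-library `statistics.NormalDist`. The mean $\mu = 10$ is a hypothetical value, and $\sigma = 2.9$ simply echoes the value noted in Example (1); neither is confirmed as the parameters of this example. The relation $x_p = \mu + \sigma z_p$ is the standard one obtained by standardizing.

```python
from statistics import NormalDist

Z = NormalDist()                 # standard normal
z_95 = Z.inv_cdf(0.95)           # 95-th percentile of Z, about 1.645
z_75 = Z.inv_cdf(0.75)           # third quartile of Z, about 0.6745
z_25 = Z.inv_cdf(0.25)           # first quartile of Z, equal to -z_75 by symmetry

# Hypothetical parameters (assumptions): mu = 10 is made up, sigma = 2.9 echoes Example (1).
mu, sigma = 10.0, 2.9
X = NormalDist(mu, sigma)

x_95 = X.inv_cdf(0.95)                      # 95-th percentile of X, same as mu + sigma * z_95
iqr = X.inv_cdf(0.75) - X.inv_cdf(0.25)     # x_0.75 - x_0.25 = 2 * sigma * z_75, with mu cancelling

print(z_95, z_75, z_25, x_95, iqr)
```

The last quantity illustrates one way a calculation can avoid needing $\mu$: the distance between the third and first quartiles depends only on $\sigma$.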