Ch. 2 - Introduction to Probability
Class: STAT-211
Notes:
Outline:
- Sample spaces and events
- Set operations
- Probability
- Classical Definition
Sample Spaces and Events
Random Experiment
An experiment is a planned operation carried out under controlled
conditions. If the outcome of an experiment is uncertain but statistically predictable, the experiment is called a random experiment.
Examples
- Flipping a coin
- Outcome cannot be decided with certainty
- Rolling a fair six-faced die
- Only once: we know for sure that one of the six numbers will appear
- Exactly which one will appear cannot be predicted with certainty
- But there is a statistical predictability here
- Lifetime of an electronic device
- Observing the lifetime of a smartphone for example
- We know for sure that the lifetime will be any positive number between 0 and infinity
- We cannot predict exactly what this number will be
- It is often described by a statistical or mathematical model
- So there is a statistical predictability here
- Result of a Covid-19 test (Positive/Negative)
- We know for sure that the outcome of the test will be either positive or negative
- The result cannot be predicted exactly
- This is a random experiment
- Observing the number of messages to a communication device
- We know that the number of messages received by a device can be any one of the non-negative integers 0, 1, 2, 3, 4, 5, ...
- Exactly which one it will be cannot be predicted
- There is a form of statistical predictability here
Probability is a measure associated with the outcomes of a random experiment that quantifies how likely a given outcome is to occur when the experiment is performed.
Events and Sample Space
The result or outcome of a random experiment is called an event. Upper case letters like A, B, etc. are used to denote events.
- If an event cannot be written as a composition of more than one event, it is called an elementary event.
- If an event is not elementary, it is called a composite event.
The sample space S of a random experiment is the set of all possible
elementary events.
- It is a set/collection
Examples
- Flipping a fair coin once. There are only two possibilities: either a head (H) or a tail (T). The sample space is S = {H, T}.
- These are the only two possible outcomes
- They cannot be written as compositions of more than one outcome
- How many elementary events are possible here? Only these 2 outcomes
- Note the set notation for S.
- Flipping a fair coin twice. Here, S = {HH, HT , TH, TT}. The event of ‘resulting in a head in the first flip’ = {HH, HT} is composite.
- Use a tree diagram to enumerate all possible elementary outcomes
- The event of observing a head in the first flip can happen in two ways: HH or HT
- This is not an elementary event
- This event consists of more than one elementary event in the sample space
- Rolling a fair die once. Here, S = {1, 2, 3, 4, 5, 6}. The event of getting an even number {2, 4, 6} is composite.
- The occurrence of each of these numbers cannot be represented as a composition of more than one number
- These are elementary outcomes, there are 6 in the sample space
- Even number: it is a multiple of 2
- Which of the 6 elements in S are even?
- 2, 4, 6
- The event of observing an even number occurs if any one of these numbers appears
- This event consists of 3 elementary outcomes from the sample space
- This is a composite event.
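The coin-flip example can be sketched in Python, treating the sample space and events as sets. This is only an illustration; the variable names (`S`, `head_first`, `elementary`) are assumptions made for this sketch, not notation from the course.

```python
from itertools import product

# Sample space for flipping a coin twice: every ordered pair of H/T.
S = {"".join(flips) for flips in product("HT", repeat=2)}

# Composite event: a head on the first flip (two elementary outcomes).
head_first = {outcome for outcome in S if outcome[0] == "H"}

# An elementary event contains exactly one outcome.
elementary = {"TT"}

print(sorted(S))
print(sorted(head_first))
print(head_first <= S)   # an event is a subset of the sample space
```

The `<=` operator on Python sets tests the subset relation, which mirrors the statement that every event is a subset of S.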
"An event is nothing but a subset of the sample space S"
Events
When enumerating all of the outcomes in an event is burdensome, we
can just state the event.
Example
Draw a card at random from a well-shuffled deck of 52 cards:
- 4 different suits
- Row 1: Suit of Clubs
- Row 2: Suit of Spades
- Row 3: Suit of Hearts
- Row 4: Suit of Diamonds
- 13 different denominations of cards for each suit
- Ace, 2, 3, 4, 5, 6, 7, 8, 9, 10, J, Q, K
- Ace + number cards (2 to 10) + Face cards (J, Q, K)
- In a deck, there will be 4 different Aces and 12 Face cards
- Event: Drawing of each card would be an elementary outcome
- There are 52 elementary outcomes
- S = the set of all 52 cards
- Event: We are interested in whether a card is a Spade or not.
- The elements of this event would be only 13 elementary outcomes from the sample space
Write A = "the card is a Spade" for this event, which consists of 13
elementary outcomes.
An event A is a collection or set consisting of one or more elementary outcomes of a random experiment.
- Nothing but a subset of the sample space S
- For the event to occur, the outcome of the random experiment must be an element of the event
- Events are nothing but sets
For an event A to occur, the outcome of the random experiment must be contained in A.
Therefore, we need some notation/language for manipulating sets.
Set Operations
Basic Set Theoretic Operations
- The union of two events A and B (denoted A ∪ B or, A OR B) is defined as the occurrence of at least one of the events A and B.
- Occurrence of "either" event A or B
- The intersection of two events A and B (denoted A ∩ B or, A AND B) is defined as the occurrence of events A and B together.
- Occurrence of events A and B together
- Event A occurs as well as event B occurs
- The complement of an event A (denoted Ac or, A′) is defined as the non-occurrence of event A.
- Two events A and B are said to be disjoint or, mutually exclusive, denoted A ∩ B= ∅, if events A and B can never occur together. Here '∅' denotes the empty (or, the null) event.
- ∅ denotes the event that contains no outcomes, so it never occurs
- The difference A − B or, A ∩ Bc is defined as the event of the occurrence of event A, but the non-occurrence of event B.
- A - B: Event A occurs but event B doesn't occur
- B - A: Event B occurs but event A doesn't occur
- Event A is said to be a subset of event B, denoted A ⊂ B, if the occurrence of event A necessarily implies the occurrence of event B.
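These operations map directly onto Python's built-in `set` type. The sketch below uses one roll of a die as the sample space; the events `A` and `B` are illustrative choices, not taken from the notes.

```python
# Universal set: one roll of a six-sided die.
S = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}   # even number
B = {4, 5, 6}   # number greater than 3 (illustrative choice)

union = A | B            # at least one of A, B occurs
intersection = A & B     # A and B occur together
complement_A = S - A     # A does not occur
difference = A - B       # A occurs but B does not

print(difference == A & (S - B))   # A − B equals A ∩ Bc
print({2} <= A)                    # {2} is a subset of A
print(A.isdisjoint(complement_A))  # A and Ac are mutually exclusive
```

Python has no complement operator for sets, so the complement is written as a difference from the universal set `S`.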
Venn Diagrams
[Figure: Venn diagrams (Pasted image 20250909141752.png)]
- What is the largest possible set/event that can occur when a random experiment is performed?
- That is the sample space; this will be our universal set
- Consider any two events A and B (not necessarily disjoint)
- A ∪ B is the occurrence of either A or B
- Note also: either A or B or both.
- A - B
- Is the portion of A that belongs exclusively to A and does not overlap with B
- B - A
- Is the portion of B that belongs exclusively to B and does not overlap with A
- Some important observations:
- Do you see any overlap between A - B and A ∩ B?
- No, there is no overlap
- There is also no overlap between B - A and A ∩ B, nor between A - B and B - A
- This means that these three events are disjoint or mutually exclusive
- (A - B) ∪ (A ∩ B) ∪ (B - A)
- The union of these three events constitutes the event A ∪ B
- A = (A - B) ∪ (A ∩ B)
- B = (B - A) ∪ (A ∩ B)
- If we have A ∩ B = ∅
- This means A must be a subset of the complement of B
- A ∩ B = ∅ <-> A ⊂ Bc
Set Operations
- Commutative Laws:
- a) A ∪ B = B ∪ A
- b) A ∩ B = B ∩ A
- De Morgan’s Laws:
- a) (A ∪ B)c = Ac ∩ Bc
- What does it mean for (A ∪ B)c to occur?
- It means that A ∪ B doesn't occur, that is, the event that at least one of the two occurs doesn't happen
- How is this possible?
- It is possible if and only if neither of them occurs
- Written as Ac ∩ Bc.
- b) (A ∩ B)c = Ac ∪ Bc
- The event that both of them happen together doesn't occur
- Either Ac occurs or Bc occurs (or both)
- For any event A, A and Ac will always be disjoint.
- For any two non-empty events A and B, the events A − B, A ∩ B, and B − A are always disjoint, and
- A = (A − B) ∪ (A ∩ B)
- B = (A ∩ B) ∪ (B − A)
- A ∪ B = (A − B) ∪ (A ∩ B) ∪ (B − A)
- A ⊂ B if and only if A ∩ B = A
- A ∩ B = ∅ ⇔ A ⊂ Bc.
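A brute-force check can make these identities more convincing. The sketch below tests them for every pair of subsets of a small sample space. The four-element space and the helper names `subsets` and `identities_hold` are assumptions for illustration, and a check on one small space is evidence, not a proof.

```python
from itertools import combinations

S = frozenset({1, 2, 3, 4})

def subsets(universe):
    """Return every subset of `universe` as a frozenset."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def identities_hold(A, B):
    """Check De Morgan's laws and the related identities for A, B within S."""
    Ac, Bc = S - A, S - B
    return (
        S - (A | B) == Ac & Bc                       # (A ∪ B)c = Ac ∩ Bc
        and S - (A & B) == Ac | Bc                   # (A ∩ B)c = Ac ∪ Bc
        and A | B == (A - B) | (A & B) | (B - A)     # three-part decomposition
        and ((A & B == frozenset()) == (A <= Bc))    # A ∩ B = ∅ ⇔ A ⊂ Bc
        and ((A <= B) == (A & B == A))               # A ⊂ B ⇔ A ∩ B = A
    )

all_ok = all(identities_hold(A, B) for A in subsets(S) for B in subsets(S))
print(all_ok)
```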
Set Operations: Example
Let T be the event that a firm will open a branch office in Toronto, and M be the event that it will open an office in Mexico City.
Express the following events in terms of events T and M:
- The firm will not open its office in Toronto: Tc
- The firm will open its office in both the cities: T ∩ M
- The firm will not open its office in both the cities: (T ∩ M)c = Tc ∪ Mc
- The firm will open its office in at least one of the two cities: T ∪ M
- The firm will open its office in neither of the two cities: Tc ∩ Mc = (T ∪ M)c
- The firm will open its office in Toronto, but not in Mexico City: T − M = T ∩ Mc
- The firm will open its office in Mexico City, but not in Toronto: M − T = M ∩ Tc
- The firm will open its office in exactly one of the two cities: (M − T) ∪ (T − M)
- These two events will always be disjoint from each other
Probability
Definition of probability
Probability is a measure or ‘size’ associated with the outcomes of a
random experiment that quantifies how likely a given outcome is to occur when the experiment is performed.
We write the probability of an event A as P(A) or Pr(A).
The definition of probability must satisfy these three axioms:
- Axiom 1: 0 ≤ P(A) ≤ 1, for any event A
- The probability of A cannot be a negative number, neither can it be larger than 1
- Axiom 2: P(S) = 1
- Probability of the sample space is equal to 1
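- The sample space is the largest possible event, so it should have the largest possible measure of probability = 1
- Axiom 3: If A1, A2, . . . are disjoint events, then P(A1 ∪ A2 ∪ . . . ) = P(A1) + P(A2) + . . .
- The probability of a union of disjoint events is the sum of their individual probabilities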
Together, the axioms imply: P(∅) = 0 where ∅ denotes the null (or, the
empty ) event.
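One standard way to see why the axioms force P(∅) = 0 is sketched below in LaTeX. This is the usual textbook argument and may differ from the one given in lecture.

```latex
\begin{gather*}
\text{Take } A_1 = S,\; A_2 = A_3 = \dots = \emptyset
  \quad \text{(pairwise disjoint, with union } S\text{)} \\
P(S) = P(S) + \sum_{i \ge 2} P(\emptyset)
  \quad \text{(Axiom 3)} \\
\sum_{i \ge 2} P(\emptyset) = 0
  \;\Longrightarrow\; P(\emptyset) = 0
  \quad \text{(since } P(\emptyset) \ge 0 \text{ by Axiom 1)}
\end{gather*}
```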
Probability Laws
Law 1: For any event A, P(Ac) = 1 − P(A).
- P(A) = 0.65
- P(Ac) = 1 - P(A) = 1-0.65 = 0.35
Law 2: For any two events A and B (not necessarily disjoint), P(A ∪ B) = P(A) + P(B) - P(A ∩ B).
- Equivalent to saying that P(A ∩ B) = P(A) + P(B) - P(A ∪ B)
- Example:
- P(A) = 0.65, P(B) = 0.35
- P(A ∩ B) = 0.15
- P(A ∪ B) = P(A) + P(B) - P(A ∩ B) = 0.85
Law 3: If A and B are disjoint, P(A ∩ B) = 0.
Law 4: If A and B are disjoint, P(A ∪ B) = P(A) + P(B).
- Sum of individual probabilities
- Special case of law number 2
- Derived by substituting law number 3 into law number 2
- A ∩ B = ∅ -> P(A ∩ B) = 0
- P(A ∪ B) = P(A) + P(B) - P(A ∩ B) = P(A) + P(B)
- Common error:
- Using P(A ∪ B) = P(A) + P(B) when A and B are not disjoint
- In the example above, P(A ∪ B) = 0.85 ≠ P(A) + P(B) = 1
Law 5: For any two events A and B (not necessarily disjoint), P(A) = P(A - B) + P(A ∩ B).
- Likewise, P(B) = P(B - A) + P(A ∩ B)
- Implies that
- P(A - B) = P(A) - P(A ∩ B)
- P(B - A) = P(B) - P(B ∩ A)
- P((A - B) ∪ (B - A)) = P(A - B) + P(B - A)
- Example:
- P(A - B) = 0.65 - 0.15 = 0.50
- P(B - A) = 0.35 - 0.15 = 0.20
Law 6: If A ⊂ B, P(A) ≤ P(B).
- Intuitively quite obvious
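The arithmetic in the running example (P(A) = 0.65, P(B) = 0.35, P(A ∩ B) = 0.15) can be reproduced with exact fractions. This is a small sketch with assumed variable names. The last line combines the two difference probabilities, which is allowed because A - B and B - A are disjoint.

```python
from fractions import Fraction

# Values from the running example in the notes.
p_a = Fraction(65, 100)
p_b = Fraction(35, 100)
p_a_and_b = Fraction(15, 100)

p_not_a = 1 - p_a                         # Law 1
p_a_or_b = p_a + p_b - p_a_and_b          # Law 2
p_a_only = p_a - p_a_and_b                # P(A - B)
p_b_only = p_b - p_a_and_b                # P(B - A)
p_exactly_one = p_a_only + p_b_only       # Law 4 on the disjoint pieces

for name, value in [("P(Ac)", p_not_a), ("P(A or B)", p_a_or_b),
                    ("P(A - B)", p_a_only), ("P(B - A)", p_b_only),
                    ("P(exactly one)", p_exactly_one)]:
    print(f"{name} = {float(value):.2f}")
```

Using `Fraction` avoids floating-point rounding, so the equalities between the laws hold exactly.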
Example
A company has two maintenance workers, A and B. The probability that
maintenance worker A is on duty on any given day is 0.80, and the
probability that maintenance worker B is on duty on any given day is
0.75. The probability that both maintenance workers are on duty on the
same day is 0.60.
What is the probability that at least one of the maintenance workers is on duty on any given day?
Let EA and EB denote the events that worker A is on duty on any given
day, and worker B is on duty on any given day, respectively.
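According to the problem, P(EA) = 0.80, P(EB) = 0.75, and P(EA ∩ EB) = 0.60. By law 2,
P(EA ∪ EB) = P(EA) + P(EB) - P(EA ∩ EB) = 0.80 + 0.75 - 0.60 = 0.95.
What is the probability that neither of the two workers is on duty on any given day?
P((EA)c ∩ (EB)c) = P((EA ∪ EB)c) = 1 - P(EA ∪ EB) = 1 - 0.95 = 0.05.
- Applied De Morgan's law, followed by law 1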
Example 2
A college student is taking two courses. The probability she passes the first course (say, event A) is 0.64. The probability she passes the second course (say, event B) is 0.75. The probability she passes at least one of the courses is 0.85.
By the problem, P(A) = 0.64, P(B) = 0.75, and P(A ∪B) = 0.85.
- What is the probability that she passes both courses?
- Apply the union formula (law 2): P(A ∩ B) = P(A) + P(B) - P(A ∪ B) = 0.64 + 0.75 - 0.85 = 0.54
- What is the probability she does not pass either course?
- She neither passes course 1 nor passes course 2
- Follows from De Morgan's law: P(Ac ∩ Bc) = P((A ∪ B)c) = 1 - 0.85 = 0.15
- What is the probability she does not pass both courses?
- Follows from De Morgan's law: P(Ac ∪ Bc) = P((A ∩ B)c) = 1 - 0.54 = 0.46
- What is the probability she passes exactly one course?
- Passes course 1 but fails course 2: (A - B)
- Passes course 2 but fails course 1: (B - A)
- Sum of individual probabilities
- Follows from law 4
- The probability of the union of a sequence of disjoint events is equal to the sum of the individual probabilities
- Break down using law 5: P(A - B) + P(B - A) = (0.64 - 0.54) + (0.75 - 0.54) = 0.10 + 0.21 = 0.31
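The four questions in this example follow one pattern, so they can be wrapped in a small helper. `two_event_summary` is a hypothetical function written for this sketch; it assumes the inputs are P(A), P(B), and P(A ∪ B), as given in the problem.

```python
def two_event_summary(p_a, p_b, p_union):
    """Derive the standard two-event probabilities from P(A), P(B), P(A ∪ B)."""
    p_both = p_a + p_b - p_union              # law 2 rearranged
    return {
        "both": p_both,
        "neither": 1 - p_union,               # De Morgan + law 1
        "not_both": 1 - p_both,               # De Morgan + law 1
        "exactly_one": (p_a - p_both) + (p_b - p_both),  # laws 4 and 5
    }

summary = two_event_summary(0.64, 0.75, 0.85)
for label, value in summary.items():
    print(f"{label}: {value:.2f}")
```

The same helper applies to any problem that supplies the two individual probabilities and the probability of the union.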
Example 3
The probability that a firm will open a branch office in Toronto is 0.7, that it will open one in Mexico City is 0.4, and that it will open an office in at least one of the cities is 0.8.
Find the probabilities that the firm will open an office in:
- neither of the cities (Ans: 0.2)
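- Follows from De Morgan's law: P(Tc ∩ Mc) = 1 - P(T ∪ M) = 1 - 0.8 = 0.2
- both the cities (Ans: 0.3)
- The quantities needed for law 2 are already given: P(T ∩ M) = 0.7 + 0.4 - 0.8 = 0.3
- exactly one of the cities (Ans: 0.5)
- Sum of individual probabilities
- Break down using law 5: (0.7 - 0.3) + (0.4 - 0.3) = 0.5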
Note:
- Please make sure not to compute
- P(T ∩ M) = P(T)P(M)
- This is not correct! Here P(T)P(M) = 0.7 × 0.4 = 0.28, whereas law 2 gives P(T ∩ M) = 0.3
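A short sketch of Example 3 with exact fractions shows both the intended answers and why the product shortcut fails. Variable names are assumptions for illustration.

```python
from fractions import Fraction

p_t = Fraction(7, 10)        # P(T): office in Toronto
p_m = Fraction(4, 10)        # P(M): office in Mexico City
p_union = Fraction(8, 10)    # P(T ∪ M): at least one city

p_both = p_t + p_m - p_union                    # law 2 rearranged
p_neither = 1 - p_union                         # De Morgan + law 1
p_exactly_one = (p_t - p_both) + (p_m - p_both)
naive_product = p_t * p_m                       # the shortcut to avoid

print(p_neither, p_both, p_exactly_one)
print(naive_product, naive_product == p_both)
```

The comparison on the last line is false: the product of the two individual probabilities does not match the intersection probability implied by the given data.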
Classical Definition
Classical Definition of Probability
Assumptions
- The sample space S is finite.
- All the elements of S are equally likely to occur, that is, each elementary event has an equal chance of occurrence.
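Then, according to the classical definition, the probability of an event A is P(A) = (number of elementary events in A) / (number of elementary events in S).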
Examples
- Tossing a fair coin once: a head (H) and a tail (T) are equally likely to occur.
- S = {H, T}
- H denotes the occurrence of a Head
- T denotes the occurrence of a Tail
- They are equally likely to occur because the coin is fair: there is a 50/50 chance for each of them to occur.
- P(H) = 0.5 = P(T)
- Tossing a fair, six-sided die once: each of the faces 1, 2, 3, 4, 5, 6 is as likely to occur as any other face.
- S = {1, 2, 3, 4, 5, 6}
- Here the sample space is finite
- The die is fair, so each face is as likely to occur as any other face
- This means we can apply the classical definition here
- Randomly guessing the answer to a True/False question on an exam: (typically) a correct answer and an incorrect answer are equally likely to be selected.
Coin Tossing Examples
Example 1
Suppose you toss a fair coin twice. Here, the sample space S has four (2^2) equally likely elementary events: S = {HH, HT, TH, TT}.
Define the event A as ‘getting exactly one head’. Then A = {HT, TH}.
- Either we have a tail in the first flip and then a head in the second, or we have a head in the first flip and a tail in the second flip
Hence, by the classical definition, P(A) = 2/4 = 0.5.
- We can represent this with a tree diagram
Example 2
Suppose you toss a fair coin thrice. Here, the sample space S has eight (2^3) equally likely elementary events: S = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}.
Define the event A as ‘getting a head at the second flip’. Then
A = {HHH, HHT, THH, THT}.
- Use a tree diagram to visualize this
By the classical definition, P(A) = 4/8 = 0.5.
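Both coin-tossing examples can be reproduced by enumerating the sample space and counting, which is exactly what the classical definition prescribes. `classical_probability` is a hypothetical helper named for this sketch.

```python
from fractions import Fraction
from itertools import product

def classical_probability(event, sample_space):
    """P(A) = |A| / |S|, valid only when outcomes are equally likely."""
    return Fraction(len(event & sample_space), len(sample_space))

# itertools.product plays the role of the tree diagram.
two_flips = {"".join(f) for f in product("HT", repeat=2)}
three_flips = {"".join(f) for f in product("HT", repeat=3)}

exactly_one_head = {s for s in two_flips if s.count("H") == 1}
head_on_second = {s for s in three_flips if s[1] == "H"}

print(classical_probability(exactly_one_head, two_flips))
print(classical_probability(head_on_second, three_flips))
```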
Card Drawing Example
Suppose a card is drawn at random from a well-shuffled deck of 52 cards:
[Figure: deck of 52 playing cards (Pasted image 20250911142041.png)]
- Each card is equally likely to be drawn
- Both conditions of using the classical definition are satisfied
Let A and B denote the events that "the card drawn is a Spade" and
"the card drawn is a Face card" (that is, it is either "J", "Q" or "K"), respectively. By the Classical Definition, P(A) = 13/52 = 1/4, P(B) = 12/52 = 3/13, and P(A ∩ B) = 3/52.
- By A intersection B, we mean the event that consists of the common elements of A and B (the J, Q and K of Spades)
- Applying De Morgan's law, we can rewrite the event
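The card probabilities can be checked by building the deck explicitly. In the sketch below, the tuple representation `(rank, suit)` and the helper `prob` are assumptions for illustration. The last line shows one event that De Morgan's law lets us rewrite ("neither a Spade nor a Face card"), chosen here as an example rather than taken from the notes.

```python
from fractions import Fraction
from itertools import product

suits = ["Clubs", "Spades", "Hearts", "Diamonds"]
ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
deck = set(product(ranks, suits))          # 52 equally likely cards

A = {card for card in deck if card[1] == "Spades"}          # Spade
B = {card for card in deck if card[0] in {"J", "Q", "K"}}   # Face card

def prob(event):
    """Classical probability of an event within the deck."""
    return Fraction(len(event), len(deck))

print(prob(A), prob(B), prob(A & B), prob(A | B))
# De Morgan: Ac ∩ Bc is the complement of A ∪ B.
print(prob((deck - A) & (deck - B)) == 1 - prob(A | B))
```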
Rolling a Fair Die Twice
Suppose we roll a fair die twice. The sample space comprises
6 × 6 = 36 ordered pairs of outcomes, as follows:
S= {
(1, 1), (1, 2), (1, 3), (1, 4), (1, 5), (1, 6),
(2, 1), (2, 2), (2, 3), (2, 4), (2, 5), (2, 6),
(3, 1), (3, 2), (3, 3), (3, 4), (3, 5), (3, 6),
(4, 1), (4, 2), (4, 3), (4, 4), (4, 5), (4, 6),
(5, 1), (5, 2), (5, 3), (5, 4), (5, 5), (5, 6),
(6, 1), (6, 2), (6, 3), (6, 4), (6, 5), (6, 6)
}
For each pair, the first component denotes the outcome of the first roll, and the second one denotes the outcome of the second roll. Note that the probability of each outcome in S is 1/36.
- (x, y)
- x: outcome of 1st roll
- y: outcome of 2nd roll
- x, y = 1, 2, 3, 4, 5, 6
- S = {(x, y) : x, y = 1, 2, 3, 4, 5, 6}
Let A be the event that the sum of the two faces equals 7. Then,
A= {(1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1)}.
Hence, P(A) = 6/36 = 1/6.
Let B be the event that the sum of the two faces (the sum of the two components) is an even number (that is, a multiple of 2).
- Note that the sum of 2 numbers is a multiple of two iff both of them are even or both of them are odd.
- Select these elements from the sample space: 9 pairs with both components odd and 9 pairs with both components even.
B = {(1,1), (1,3), (1,5), (2,2), ...}
Hence, P(B) = 18/36 = 1/2.
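Listing the 36 pairs by hand is error-prone, so a quick enumeration is a useful check. This is a minimal sketch and the variable names are assumptions.

```python
from fractions import Fraction
from itertools import product

S = set(product(range(1, 7), repeat=2))     # 36 ordered pairs (x, y)

A = {(x, y) for (x, y) in S if x + y == 7}        # sum equals 7
B = {(x, y) for (x, y) in S if (x + y) % 2 == 0}  # sum is even

p_A = Fraction(len(A), len(S))
p_B = Fraction(len(B), len(S))
print(len(S), len(A), len(B))
print(p_A, p_B)
```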
Other examples
The following table classifies a population of 100 plastic disks in terms of their scratch resistance and shock resistance:
[Table: counts of the 100 disks by scratch resistance (high/low) and shock resistance (high/low) (Pasted image 20250911144630.png)]
- Note that the sum must be equal to 100, the population size.
- Let us compute some more numbers:
- Add an additional row and column to compute the frequency of different levels
- Shock Resistance High: 70+16 = 86
- Shock Resistance Low: 9+5 = 14
- Scratch Resistance High: 70+9 = 79
- Scratch Resistance Low: 16+5 = 21
- Total = 100.
Suppose that one of the 100 disks is randomly selected. What is the
probability that
- “the selected disk has low shock resistance and low scratch resistance” (Ans: 5/100 = 0.05)
- Take a look at the intersection of these two categories
- Their point of intersection in the table is 5
- Scratch Resistance Low and Shock Resistance Low.
- “the selected disk has low shock resistance or low scratch resistance” (Ans: 0.30)
- Union of two events: A = "disk has low shock resistance" and B = "disk has low scratch resistance"
- P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
- We know P(A ∩ B) = 0.05
- P(A) = 14/100 = 0.14
- Of the 100 disks, 14 favor A
- P(B) = 21/100 = 0.21
- Of the 100 disks, 21 favor B
- P(A ∪ B) = P(A) + P(B) - P(A ∩ B) = 0.14 + 0.21 - 0.05 = 0.30
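The disk calculation can be sketched as a lookup over the four cells of the table. The cell counts below are inferred from the row and column sums quoted in the notes (70 + 16, 9 + 5, 70 + 9, 16 + 5), so treat that layout as an assumption. `prob` is a hypothetical helper.

```python
from fractions import Fraction

# Counts keyed by (scratch resistance, shock resistance), inferred from
# the marginal sums quoted in the notes.
counts = {
    ("high", "high"): 70,
    ("high", "low"): 9,
    ("low", "high"): 16,
    ("low", "low"): 5,
}
total = sum(counts.values())

def prob(predicate):
    """Classical probability of the disks whose key satisfies `predicate`."""
    favorable = sum(n for key, n in counts.items() if predicate(*key))
    return Fraction(favorable, total)

p_and = prob(lambda scratch, shock: shock == "low" and scratch == "low")
p_or = prob(lambda scratch, shock: shock == "low" or scratch == "low")
p_shock_low = prob(lambda scratch, shock: shock == "low")
p_scratch_low = prob(lambda scratch, shock: scratch == "low")
print(float(p_and), float(p_or))
```

Counting the union directly (9 + 16 + 5 disks) agrees with the result from law 2, which is a useful sanity check.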
More examples
A large tech company is considering developing a generative artificial
intelligence chatbot. To gauge overall interest in this new project, the company conducted a survey of 50,000 employees across four different departments. Employees were asked to rate their excitement for the project on a scale of 1-5, with 5 indicating the highest level of excitement. The results of the survey are provided below.
[Table: survey results by department and excitement rating (Pasted image 20250911145634.png)]
- Sample space consists of 50,000 employees
- 1/50,000 is the chance for any one employee to be selected
- What is the probability that a randomly selected employee works in the ’Research and Development’ Department and gave a rating of 3 or higher? (Ans: 3700/50,000 = 0.074)
- Research and Development department, ratings 3, 4, and 5: 300 + 300 + 3100 = 3700 in total (which favor this event)
- What is the probability that a randomly selected employee gave a rating of 3 or higher? (Ans: 34,100/50,000 = 0.682)
- To compute this probability, you first find the marginal or total frequency for this group of people
- Nothing but the sum of the marginal frequencies for columns 3, 4, and 5
- 9100 + 9000 + 16000 = 34100 favor the occurrence of this event
Shortcomings of the Classical Definition
- The definition is circular in nature.
- The second condition requires the elements to be equally likely, that is, equally probable, so the definition of probability already relies on the notion of probability
- Equally likely elementary events are rare in practice.
- S may have countably infinitely many or uncountably many elements
Examples
- The probabilities of a head (H) and a tail (T) when a biased coin is thrown once are not the same.
- Second condition is not met
- Observing the number of messages waiting in a queue to arrive in a communication device. Here, the sample space S = {0, 1, 2, 3, . . . } (all non-negative whole numbers) is countably infinite and the elements of S are not equally likely.
- Both conditions are violated here
- The probability that a randomly chosen number from (0, 1) will lie in (0.2, 0.5) can’t be determined using the classical definition because here the sample space S = (0, 1) is uncountable.
- Simply uncountable, the sample space here is not finite
Long-term Interpretation of Probability
Let A be an event of a random experiment, and suppose we wish to find P(A), the true probability of the occurrence of event A.
- Repeat the experiment under identical conditions, a sufficiently large number of times, say, N many times, where N is a large positive integer.
- Record the number of times the event A occurred out of those N many identical replications of the experiment.
- Compute the proportion of times event A occurs: (number of times A occurred) / N
Then, for sufficiently large N, P(A) ≈ (number of times A occurred) / N.
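The long-run interpretation can be illustrated with a small simulation. The sketch below repeats the two-dice experiment and tracks how often the sum is 7; the helper names and the fixed seed are assumptions for illustration. The relative frequency should drift toward 1/6 as N grows, although any single run only approximates it.

```python
import random

def relative_frequency(event, experiment, n, seed=0):
    """Proportion of n independent replications in which `event` occurs."""
    rng = random.Random(seed)          # fixed seed so runs are repeatable
    hits = sum(event(experiment(rng)) for _ in range(n))
    return hits / n

def roll_two_dice(rng):
    """One replication of the experiment: roll a fair die twice."""
    return rng.randint(1, 6), rng.randint(1, 6)

def sum_is_seven(outcome):
    """Event A: the two faces add up to 7."""
    return sum(outcome) == 7

for n in (100, 10_000, 200_000):
    print(n, relative_frequency(sum_is_seven, roll_two_dice, n))
```

Passing the event and the experiment as functions keeps the estimator reusable for other events, such as an even sum.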