The empirical concept of probability is that of "relative frequency", the ratio of the total number of occurrences of a situation to the total number of times the experiment is repeated.
When the number of trials is large, the relative frequency provides a satisfactory measure of the probability associated with a situation of interest. This is one of the so-called laws of large numbers of probability theory.
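The idea can be illustrated with a short simulation. The sketch below is only an illustration, assuming a fair coin (probability of heads 0.5); the function name `relative_frequency` and the fixed seed are choices made for this example.

```python
import random

def relative_frequency(num_trials, seed=0):
    """Fraction of heads in num_trials simulated tosses of a fair coin."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    heads = sum(rng.random() < 0.5 for _ in range(num_trials))
    return heads / num_trials

# The relative frequency tends to settle near 0.5 as the number of trials grows.
for n in (10, 100, 10_000, 100_000):
    print(n, relative_frequency(n))
```

For a small number of trials the relative frequency can be far from 0.5; for a large number it should be close.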
-- the sample space
In probability and statistics the term experiment is used in a very wide sense and refers to any procedure that yields a collection of outcomes. A random experiment is one whose outcomes depend only on chance and that can be repeated under identical conditions.
The set whose elements are all the possible outcomes of an experiment is called the sample space of the experiment. The concept of the sample space plays a fundamental role in determining probabilities for individual outcomes of a random process. Think of the sample space as a complete listing, as detailed as possible, of all the outcomes of a random experiment.
--events
In most cases we are not interested in just one single outcome of a random phenomenon; instead we are interested in a collection of one or more outcomes.
“An event is any sub collection of simple outcomes from the sample space.”
We say that an event has occurred if one of the outcomes
that make up the event takes place.
--probabilities
Probabilities are assigned to simple outcomes in the sample space. The probability of an event can then be found by adding up the probabilities assigned to the outcomes included in that event.
The following two general rules apply to the numbers that we call probabilities:
1. A probability can never be negative and is never greater than one.
2. The sum of the probabilities of all the outcomes that are included in the sample space equals one.
--equally likely assumption
We say that the outcomes of an experiment are equally likely to happen, if they all have the same probability of occurring.
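As a small illustration, the sketch below applies the equally likely assumption to the roll of one fair die and finds the probability of an event by adding up the probabilities of its outcomes. The die and the helper name `event_probability` are assumptions made for this example.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}

# Equally likely assumption: every outcome gets the same probability, 1/6.
prob = {outcome: Fraction(1, len(sample_space)) for outcome in sample_space}

def event_probability(event):
    """Add up the probabilities of the outcomes that make up the event."""
    return sum(prob[outcome] for outcome in event)

even = {2, 4, 6}                        # the event "an even number shows up"
print(event_probability(even))          # 1/2
print(event_probability(sample_space))  # 1
```

Exact fractions are used so that the two rules for probabilities can be checked without rounding.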
LAWS OF PROBABILITY
-- disjoint or mutually exclusive events
Two or more events are said to be disjoint, or mutually
exclusive, if they do not have any outcomes in common. Consequently,
disjoint events cannot occur simultaneously.
-- the addition law of probability for disjoint events
For any two disjoint events A and B,
P(A or B) = P(A) + P(B)
-- law of the complement
Let A be an event. The event consisting of all the outcomes that are not in A is called the complement of A and is denoted by A′.
The law of the complement states that for any event A,
P(A′) = 1 - P(A)
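Both laws can be checked on a small example. The sketch below assumes one roll of a fair die with equally likely outcomes; the events A and B are chosen only for illustration.

```python
from fractions import Fraction

sample_space = frozenset(range(1, 7))   # one roll of a fair die

def P(event):
    """Probability of an event under the equally likely assumption."""
    return Fraction(len(event), len(sample_space))

A = {1, 2}
B = {5, 6}
print(A.isdisjoint(B))               # True: A and B share no outcomes

# Addition law for disjoint events: P(A or B) = P(A) + P(B)
print(P(A | B), P(A) + P(B))         # 2/3 2/3

# Law of the complement: P(not A) = 1 - P(A)
complement_of_A = sample_space - A
print(P(complement_of_A), 1 - P(A))  # 2/3 2/3
```

Note that the addition law in this form would fail if A and B overlapped, since the shared outcomes would be counted twice.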
-- independent events
We say that two events are dependent if the occurrence of one event causes a change in the probability of occurrence (or nonoccurrence) of another event.
In some situations the fact that an event A has already occurred does not influence the probabilities associated with another event B. In this case we say that events A and B are independent.
-- the multiplication law for independent events
The multiplication law of probability for independent events states that:
P(A and B) = P(A) × P(B)
i.e. the probability of the simultaneous occurrence of two independent events A and B equals the product of the probability of A and the probability of B.
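The multiplication law can be verified by listing a sample space. The sketch below assumes two fair dice rolled together, with A = "the first die is even" and B = "the second die shows 6"; these events are made up for the illustration.

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 equally likely (first die, second die) pairs.
sample_space = list(product(range(1, 7), repeat=2))

def P(event):
    """Probability of an event given as a true/false test on outcomes."""
    favorable = [outcome for outcome in sample_space if event(outcome)]
    return Fraction(len(favorable), len(sample_space))

def A(outcome):          # the first die is even
    return outcome[0] % 2 == 0

def B(outcome):          # the second die shows 6
    return outcome[1] == 6

def A_and_B(outcome):    # both events occur together
    return A(outcome) and B(outcome)

print(P(A), P(B), P(A_and_B))      # 1/2 1/6 1/12
print(P(A_and_B) == P(A) * P(B))   # True
```

The product rule holds here because the result of one die tells us nothing about the other.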
RANDOM VARIABLES
A random variable (denoted X, Y, Z, ...) assigns a numerical value to each outcome of a random process.
-- example 1
N = The number of people in line at a teller machine.
Possible Values of N = { 0, 1, 2, 3, 4, 5, . . . }
Here we say that N is a discrete random variable. The possible values of a discrete random variable form a finite set or, in some cases, a countably infinite one.
-- example 2
T = The time one has to wait in line at a teller machine.
Possible Values of T = { any positive interval of time }
Here we say that T is a continuous random variable since
every number greater than zero is a possible value with no exception.
There are no gaps in between successive values that T can take.
-- probability distributions
In the discrete case, the probability distribution of the random variable X is the assignment of probabilities to the values of X.
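Any assignment of probabilities to the values of a discrete random variable (r. v.) should comply with the following two probability axioms:
1. Each probability is a number between 0 and 1.
2. The probabilities of all the values of X add up to 1.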
-- the mean μ of a r.v. X
I tossed four coins 10 times and counted the number of heads that showed up each time. The results are shown in the following table:

If I continue to toss the four coins a very large number of times, what will the average number of heads turn out to be in the long run? (As a kid I was accused of stealing some coin money from a church (!), and I was kept locked up in a room for a whole day. I passed my time tossing four coins 2000 times! I found out that, on average, each time I tossed the coins heads showed up about 1.99 times.)
The following table shows the theoretical probability distribution for the # of heads that can show up when 4 coins are tossed.

The long-run average (also called the expected value) of the number of heads that show up each time can be found by multiplying each of the values 0, 1, 2, 3, and 4 by its corresponding theoretical probability and adding up all the products. Here the result turns out to be 2.
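The computation just described can be sketched in a few lines. The code below assumes four fair coins, so that all 16 head/tail sequences are equally likely; it builds the theoretical distribution by listing the sequences and then forms the sum of value times probability.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 16 equally likely results of tossing four fair coins.
tosses = list(product("HT", repeat=4))

# Count how many sequences give 0, 1, 2, 3, or 4 heads.
counts = Counter(toss.count("H") for toss in tosses)
distribution = {k: Fraction(c, len(tosses)) for k, c in sorted(counts.items())}

for heads, probability in distribution.items():
    print(heads, probability)

# Long-run average: multiply each value by its probability and add up.
mean = sum(heads * probability for heads, probability in distribution.items())
print(mean)   # 2
```

The exact value 2 is close to the 1.99 observed in the 2000 tosses mentioned above.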
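The long-run average of a random variable X is called the mean μ of the r. v. X.
The formula for computing the mean μ of a random variable X is:
$$\mu = \sum_i x_i \, p_i$$
where the x_i are the possible values of X and p_i = P(X = x_i) are their probabilities.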
-- The Variance of a r.v. X
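Along the same lines we can define the long-run variance σ² for a r. v. X. The square root of the long-run variance is the long-run standard deviation σ of the r. v. X.
The formula for computing the variance σ² of a r. v. X is:
$$\sigma^2 = \sum_i (x_i - \mu)^2 \, p_i$$
where the x_i are the possible values of X, p_i = P(X = x_i) are their probabilities, and μ is the mean of X.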
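As a sketch of how the variance and standard deviation are computed, the code below uses the theoretical distribution of the number of heads in four tosses, assuming fair coins (probabilities 1/16, 4/16, 6/16, 4/16, 1/16). The helper name `mean_and_variance` is made up for this example.

```python
from fractions import Fraction
from math import sqrt

# Number of heads in four fair coin tosses: value -> probability.
distribution = {
    0: Fraction(1, 16),
    1: Fraction(4, 16),
    2: Fraction(6, 16),
    3: Fraction(4, 16),
    4: Fraction(1, 16),
}

def mean_and_variance(dist):
    """Return (mean, variance) of a discrete distribution {value: probability}."""
    mean = sum(x * p for x, p in dist.items())
    variance = sum((x - mean) ** 2 * p for x, p in dist.items())
    return mean, variance

mean, variance = mean_and_variance(distribution)
standard_deviation = sqrt(variance)
print(mean, variance, standard_deviation)   # 2 1 1.0
```

The same helper can be applied to any table of values and probabilities, such as those in the exercises.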
EXERCISES
1. The probability distribution of the random variable N, ( N = the number of people in line at a teller machine) is given in the following table.

Answer: μ = 1.37, σ = 1.262
2. The probability distribution of the number of raisins N in a cookie is as follows:
Find the mean, variance, and standard deviation for the r. v. N, the number of raisins in a cookie.
Answer: μ = 2.8, σ² = 1.56, σ = 1.249
3. A fair die is rolled. If an even number shows up, you win as many dollars as the number that shows up on the die. If an odd number shows up, you lose $2.20. Let X = what you win or lose each time you play the game.
a. What is the probability distribution function of X?
b. What are the mean and the standard deviation of X?
Answer:
(a)
X (winnings in $)  |  -2.20  |  2    |  4    |  6
P(X)               |  1/2    |  1/6  |  1/6  |  1/6
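One way to check part (b) is to work directly from the rules of the game. The sketch below is not the author's worked answer; it assumes a fair die, so each even face (2, 4, 6) has probability 1/6 and the three odd faces together have probability 1/2.

```python
from math import sqrt

# Winnings X in dollars -> probability, read off from the rules of the game.
distribution = {
    -2.20: 1 / 2,   # an odd number shows up: lose $2.20
    2.0: 1 / 6,     # a 2 shows up: win $2
    4.0: 1 / 6,     # a 4 shows up: win $4
    6.0: 1 / 6,     # a 6 shows up: win $6
}

mean = sum(x * p for x, p in distribution.items())
variance = sum((x - mean) ** 2 * p for x, p in distribution.items())
standard_deviation = sqrt(variance)

print(round(mean, 2), round(standard_deviation, 3))
```

Under these assumptions the mean comes out at about $0.90 per play and the standard deviation at about $3.31.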
4. Each day a bakery bakes four cakes at a cost of $8 each and prices them at $25 each. (Any cake not sold at the end of the day is discarded.) The demand for cakes on any day has the following probability distribution.

a. Let Y be the profit from cakes sold on any day. Describe the probability distribution of the r. v. Y. (Hint: Profit can also be negative. For example, if the bakery does not sell any cakes, then the "profit" for that day will be -$32.)
b. Find the expected profit of the bakery when four cakes are baked on a day.
c. The bakery's management wonders whether baking 3 cakes or even 2 will be more profitable for the bakery in the long run. How many cakes should the bakery bake each day (4, 3, or 2) in order to make it most profitable in the long run?
Answer: No answer is given to this problem. You may e-mail me your solution and I will provide any corrections or instructions as necessary.
-- continuous random variables
The probability distribution of a continuous random variable X is described by a density curve, much like normal curves. Here events are expressed as:
{ X > a }, { X < b }, or { a < X < b }.
Probabilities for these events are found from the areas that correspond to
them, under the particular density curve.
Example
Let T = time in minutes that one has to wait in line at a teller machine.
The probability distribution of the r. v. T could be described by a curve such as the one in the figure below.

EXERCISE
The weight of food packed in certain containers is a normally distributed random variable with a mean weight of 500 lb. and a standard deviation of 5 lb. Suppose a container is picked at random. Find the probability that it contains:
a. more than 510 lb.
b. less than 498 lb.
c. between 491 and 503 lb.
Answer:
a. 0.0228
b. 0.3446
c. 0.6898
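These answers can be checked without a table. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later) with the stated mean of 500 lb. and standard deviation of 5 lb.; areas under the density curve are obtained from the cumulative distribution function.

```python
from statistics import NormalDist

weight = NormalDist(mu=500, sigma=5)

# a. P(X > 510): area to the right of 510
p_more_than_510 = 1 - weight.cdf(510)

# b. P(X < 498): area to the left of 498
p_less_than_498 = weight.cdf(498)

# c. P(491 < X < 503): area between 491 and 503
p_between = weight.cdf(503) - weight.cdf(491)

print(round(p_more_than_510, 4))   # 0.0228
print(round(p_less_than_498, 4))   # 0.3446
print(round(p_between, 4))         # 0.6898
```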
-- the Binomial Distribution
We now single out two distributions that are of special importance in statistics. One of these is a discrete distribution called the binomial distribution and the other, a continuous distribution called the normal distribution.
The binomial distribution corresponds to the situation where the outcomes of a random process can be classified into one of two categories, a "success" or a "failure". The process is repeated a given number of times, say n times, which constitutes the binomial experiment.
A fundamental assumption in all binomial experiments is that the outcome of each trial is independent of the outcome of any other trial. Furthermore, the probability of success p is known and remains the same from trial to trial.
A random variable is called a binomial random variable if its values equal the possible number of successes in a binomial experiment.
The probability distribution of any binomial random variable X can be computed from the following formula. Tables with binomial probabilities are also available for values of n up to 15 or 20 and for the most common values of p.
$$P(X = k) = \binom{n}{k} \, p^k (1-p)^{n-k}, \qquad k = 0, 1, \ldots, n$$
The mean μ and the standard deviation σ of any binomial random variable can be found from the formulas
$$\mu = np, \qquad \sigma = \sqrt{np(1-p)}$$
EXERCISES.
1. A basketball player takes 4 shots at the basket. On each throw he makes a basket with probability p = 0.7 and misses with probability 0.3. Let X = # of successful attempts.
(a). Construct the probability distribution of X. Is this a symmetric distribution or is it a skewed distribution?
(b). What is the mean and what is the standard deviation of this distribution?
(c). How often does the basketball player make fewer than 3 of his attempts?
Answer: Use the Table of Binomial probabilities in the book for parts (a) and (c). For part (b) the formulas are also in the book. (Send me your questions if you have any.)
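If no table is at hand, the probabilities can also be computed directly. The sketch below is one possible check, not the book's solution; it uses the standard binomial formula P(X = k) = C(n, k) p^k (1-p)^(n-k) together with mean np and standard deviation sqrt(np(1-p)). The function name `binomial_pmf` is made up for this example.

```python
from math import comb, sqrt

def binomial_pmf(k, n, p):
    """P(X = k) for a binomial random variable with parameters n and p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 4, 0.7

# (a) the probability distribution of X
distribution = {k: binomial_pmf(k, n, p) for k in range(n + 1)}
for k, probability in distribution.items():
    print(k, round(probability, 4))

# (b) mean and standard deviation
mean = n * p
standard_deviation = sqrt(n * p * (1 - p))
print(round(mean, 2), round(standard_deviation, 4))

# (c) P(X < 3) = P(X = 0) + P(X = 1) + P(X = 2)
p_fewer_than_3 = sum(distribution[k] for k in range(3))
print(round(p_fewer_than_3, 4))
```

Since p = 0.7 is not 0.5, the printed probabilities pile up toward the larger values of X, which is the hint needed for the symmetry question in part (a).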
-- the normal distribution
A random phenomenon follows a normal distribution if most of the observations are clustered around the center (mean) of the distribution, some (a small percentage) are to the far left, and an equal fraction of observations are to the far right.

The following features characterize the
normal distribution:
-- the standard normal distribution.
μ = 0, σ = 1
The standard normal curve describes the distribution of a normal random variable whose mean is 0 and standard deviation is 1. The random variable itself is called the standard normal variable and is denoted by Z. Areas under the standard normal density curve are given in tables and can be used to compute areas under any normal curve.
-- probabilities under a general normal curve
If X is any normal r. v. with mean μ and standard deviation σ, then it can be shown that the values of:
$$Z = \frac{X - \mu}{\sigma}$$
follow the standard normal distribution.
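If X is a binomial random variable with parameters n and p, then it can be shown that, for integers a ≤ b,
$$P(a \le X \le b) \approx P\left( \frac{a - \tfrac{1}{2} - np}{\sqrt{np(1-p)}} \le Z \le \frac{b + \tfrac{1}{2} - np}{\sqrt{np(1-p)}} \right)$$
where Z is the standard normal variable.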
NOTE: The formula gives satisfactory approximations for large values of n and values of p not near 0 or 1 (say, 0.05 < p < 0.95). In general the approximations are good if both np and n(1-p) are at least 10. The addition and subtraction of 1/2 is called the continuity correction.
EXERCISES.
1. A new vaccine was tested on 100 persons to determine its effectiveness. If the claim of the drug company is that the vaccine is 80% effective, find the probabilities that:
(a). fewer than 74 people will develop immunity
(b). between 74 and 85 people, inclusive, will develop immunity.
Answer:
a. 0.0668
b. 0.8276
Note: These answers were derived without using the
continuity correction.
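The sketch below shows how these answers arise and how the continuity correction would change them. It assumes the usual normal approximation to the binomial, with mean np and standard deviation sqrt(np(1-p)); the variable names are chosen for this example only.

```python
from math import sqrt
from statistics import NormalDist

n, p = 100, 0.8
mean = n * p                              # 80
standard_deviation = sqrt(n * p * (1 - p))  # 4
approx = NormalDist(mu=mean, sigma=standard_deviation)

# Without the continuity correction (as in the answers above).
a_plain = approx.cdf(74)                   # P(X < 74)
b_plain = approx.cdf(85) - approx.cdf(74)  # P(74 <= X <= 85)

# With the continuity correction: move each boundary by 1/2.
a_corrected = approx.cdf(73.5)                     # "fewer than 74" means X <= 73
b_corrected = approx.cdf(85.5) - approx.cdf(73.5)  # 74 through 85 inclusive

print(round(a_plain, 4), round(b_plain, 4))
print(round(a_corrected, 4), round(b_corrected, 4))
```

The uncorrected values agree with the answers above to table precision (about 0.0668 and 0.8275); the corrected values differ noticeably because n = 100 is only moderately large.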
2. When a certain seed is planted, the probability that it will sprout is 0.1. If 1000 seeds are planted, find the approximate probability that:
(a). more than 130 seeds will sprout
(b). between 90 and 95 seeds, inclusive, will sprout.
Answer:
a. 0.0008
b. 0.1512
-- Statistic
A number computed from the sample is
called a statistic.
-- In statistical practice the value of a parameter is not known. A statistic is used to estimate a parameter.
-- Example:
A telemarketing firm in L.A. uses a device that dials residential telephone numbers in that city at random. Of the first 100 numbers dialed, 48% are unlisted. This is not surprising because 52% of all L.A. residential phones are unlisted. Here 48% is a statistic and 52% is a parameter.
-- Sampling Variability:
The value of a statistic varies in repeated random sampling.
-- Sampling Distribution ( of a statistic )
It is the distribution of values taken by the statistic in all possible samples of the same size from the same population.
-- Unbiased statistics:
A statistic used to estimate a parameter is said to be unbiased if the mean of its sampling distribution is equal to the true value of the parameter being estimated.
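Sampling variability and unbiasedness can be seen in a simulation. The sketch below is an illustration only: it assumes a population in which 52% of the phones are unlisted (the parameter from the example), draws 2000 random samples of 100 numbers each, and looks at the sample proportions. The function name, the seed, and the number of samples are choices made for this example.

```python
import random
import statistics

def sample_proportion(p, n, rng):
    """Proportion of 'unlisted' numbers in one random sample of size n."""
    return sum(rng.random() < p for _ in range(n)) / n

rng = random.Random(1)   # fixed seed so the run is repeatable
proportions = [sample_proportion(0.52, 100, rng) for _ in range(2000)]

# The statistic varies from sample to sample (sampling variability) ...
print(min(proportions), max(proportions))

# ... but its values center on the parameter 0.52 (unbiasedness).
print(statistics.mean(proportions))
print(statistics.stdev(proportions))
```

A single sample giving 48% is therefore unremarkable: values a few percentage points away from 52% occur often in samples of size 100.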
-- Two population parameters that we are interested in are:
(1) the population proportion (p), and
(2) the population mean (μ).


EXAMPLES:
1. In the long run, annual real returns on common stocks have varied with mean 9% and standard deviation 28%. You plan to retire in 45 years and you are considering investing in stocks. What is the approximate probability (assuming market conditions do not change dramatically in the next 45 years) that the mean annual return on your investment over the next 45 years will:
(a) exceed 15%
(b) be lower than 5%
Answers: a. 0.0753 b. 0.1690
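A sketch of how these answers can be obtained: it assumes that the mean of n = 45 annual returns is approximately normal with mean 9% and standard deviation 28%/sqrt(45), which is the usual central-limit approximation for a sample mean. The variable names are made up for this example.

```python
from math import sqrt
from statistics import NormalDist

population_mean = 9.0    # percent
population_sd = 28.0     # percent
n = 45                   # number of years

# Approximate sampling distribution of the mean annual return.
mean_return = NormalDist(mu=population_mean, sigma=population_sd / sqrt(n))

p_exceeds_15 = 1 - mean_return.cdf(15)   # (a)
p_below_5 = mean_return.cdf(5)           # (b)

print(round(p_exceeds_15, 4), round(p_below_5, 4))   # 0.0753 0.169
```

The standard deviation of the 45-year mean is only about 4.2%, much smaller than the 28% for a single year, which is why both probabilities are modest.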
2. According to government data, 21% of American children under the age of six live in households with incomes less than the official poverty level. A study of learning in early childhood chooses a random sample of 300 children. Find the approximate probability that at least 80 of the children in the sample selected come from households with incomes less than the official poverty level.
Answer: 0.0080
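This answer can be reproduced with the same kind of calculation. The sketch below assumes that the sample proportion is approximately normal with mean p = 0.21 and standard deviation sqrt(p(1-p)/n), and that no continuity correction is applied; the variable names are made up for this example.

```python
from math import sqrt
from statistics import NormalDist

p = 0.21   # population proportion
n = 300    # sample size

# Approximate sampling distribution of the sample proportion.
sample_proportion = NormalDist(mu=p, sigma=sqrt(p * (1 - p) / n))

# "At least 80 of the 300 children" means a sample proportion of at least 80/300.
p_at_least_80 = 1 - sample_proportion.cdf(80 / n)

print(round(p_at_least_80, 4))   # 0.008
```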