-- empirical concept of probability

The empirical concept of probability is that of "relative frequency", the ratio of the total number of occurrences  of a situation to the total number of times the experiment is repeated.

When the number of trials is large, the relative frequency provides a satisfactory measure of the probability associated with a situation of interest.  This is one of the so-called laws of large numbers of probability theory.
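The idea can be tried out with a small simulation. The sketch below is only an illustration: the function name `relative_frequency`, the fixed seed, and the choice of a fair coin are my own assumptions, not part of these notes.

```python
import random

def relative_frequency(trials, seed=0):
    """Toss a simulated fair coin `trials` times; return the fraction of heads."""
    rng = random.Random(seed)  # fixed seed so that a run can be repeated
    heads = sum(1 for _ in range(trials) if rng.random() < 0.5)
    return heads / trials

# The relative frequency of heads should settle near 0.5 as the trials grow.
for n in (10, 100, 10_000, 100_000):
    print(n, relative_frequency(n))
```

With few trials the relative frequency can be far from 0.5; with many trials it should stay close to it.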

-- the sample space

In probability and statistics the term experiment is used in a very wide sense and refers to any procedure that yields a collection of outcomes. A random experiment is one whose outcomes depend only on chance and that can be repeated under identical conditions.

The set whose elements are all the possible outcomes of an experiment is called the sample space of the experiment. The concept of the sample space plays a fundamental role in determining probabilities for individual outcomes of a random process. Your understanding of the sample space should be that of a complete and as detailed as possible listing of all the outcomes of a random experiment.


In most cases we are not interested in just one single outcome of a random phenomenon; instead, we are interested in a collection of one or more outcomes.

     “An event is any sub collection of simple outcomes from the sample space.”

We say that an event has  occurred  if one of  the outcomes that make up the event takes place.


Probabilities are assigned to simple outcomes in the sample space. The probability of an event can then be found by adding up the probabilities assigned to the outcomes included in that event.

The following two general rules apply to the numbers that we call probabilities:

1.     A probability can never be negative and is never greater than one.

2.     The sum of the probabilities of all the outcomes that are included in the sample space equals one.
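As a small illustration, here is a sketch that stores a sample space together with its probabilities and finds the probability of an event by adding up outcome probabilities. The fair die, the dictionary `sample_space`, and the helper `probability` are assumptions made for this example only.

```python
from fractions import Fraction

# Sample space for one roll of a fair die (assumed example),
# with the probability assigned to each simple outcome.
sample_space = {face: Fraction(1, 6) for face in range(1, 7)}

def probability(event):
    """Add up the probabilities of the outcomes that make up the event."""
    return sum(sample_space[outcome] for outcome in event)

# Rule 1: every probability lies between 0 and 1.
assert all(0 <= p <= 1 for p in sample_space.values())
# Rule 2: the probabilities of all the outcomes add up to 1.
assert sum(sample_space.values()) == 1

even = {2, 4, 6}              # the event "an even number shows up"
print(probability(even))      # 1/2
```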

--equally likely assumption

We say that the outcomes of an experiment are equally likely to happen, if they all have the same probability of occurring.


-- disjoint or mutually exclusive events

Two  or  more  events are said to be disjoint, or mutually exclusive, if they do not have any outcomes in common.  Consequently,  disjoint events cannot occur simultaneously.

-- the addition law of probability for disjoint events

                          For any two disjoint events A and B,

   P(A or B)  =  P(A)  +  P(B)

-- law of the complement

Let A be an event. The event consisting of all the outcomes that are not in A is called the complement of A and is denoted by A′.

                        The law of the complement states that for any event A,

   P(A′)  =  1  -  P(A)
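Both laws can be checked by counting outcomes. The sketch below assumes one roll of a fair die and uses made-up events `A` and `B`; the helper `prob` is hypothetical and relies on the outcomes being equally likely.

```python
from fractions import Fraction

outcomes = set(range(1, 7))   # one roll of a fair die (assumed example)

def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event & outcomes), len(outcomes))

A = {1, 2}                    # the roll is 1 or 2
B = {5, 6}                    # the roll is 5 or 6
assert A.isdisjoint(B)        # A and B have no outcomes in common

p_or = prob(A | B)            # P(A or B)
p_not_a = prob(outcomes - A)  # P(complement of A)

print(p_or, prob(A) + prob(B))   # both sides of the addition law
print(p_not_a, 1 - prob(A))      # both sides of the law of the complement
```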

-- independent events

We say that two events are dependent  if the occurrence of  one event causes a change in the probability of occurrence (or nonoccurrence) of another event.

In some situations the fact that an event A has already occurred does not influence the probabilities associated with another event B.  In this case we say that events  A  and  B   are independent.

-- the multiplication law for independent events

The multiplication law of probability for independent events states that:

P(A and B)  =  P(A) × P(B)

i.e. the probability of the simultaneous occurrence of two independent events A and B equals the product of the probability of A and the probability of B.
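To see the multiplication law at work, one can list all 36 outcomes of rolling two fair dice and count. This is only a sketch: the two events chosen here and the helper `prob` are assumptions for illustration.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely (first die, second die) pairs.
rolls = list(product(range(1, 7), repeat=2))

def prob(condition):
    """Fraction of the 36 outcomes that satisfy the condition."""
    favourable = sum(1 for roll in rolls if condition(roll))
    return Fraction(favourable, len(rolls))

p_a = prob(lambda r: r[0] % 2 == 0)                      # A: first die is even
p_b = prob(lambda r: r[1] == 6)                          # B: second die shows 6
p_a_and_b = prob(lambda r: r[0] % 2 == 0 and r[1] == 6)  # A and B together

print(p_a, p_b, p_a_and_b)    # 1/2 1/6 1/12
```

Since 1/2 × 1/6 = 1/12, the two events behave as independent events should.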



 A random variable, denoted by X, Y, or Z, ..., assigns a numerical value to each outcome of a random process.

-- example 1

 N = The number of people in line  at a teller machine.
 Possible Values of N = { 0, 1, 2, 3, 4, 5, . . . }

Here we say that N is a discrete random variable. The set of possible values of a discrete random variable is finite or, as in this example, countably infinite.

-- example 2

 T = The time one has to wait in line at a teller machine.
 Possible Values of T = { any positive  interval of time }

Here we say that  T  is a continuous  random variable since every number greater than zero is a possible value with no exception.  There are no gaps in between successive values that  T can take.

-- probability distributions

In the discrete case, the probability distribution of the random variable  X  is the assignment of probabilities to  the values  of   X.

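Any assignment of probabilities to the values of a discrete random variable (r. v.) should comply with the following two probability axioms:

1.     Each probability P( X = x ) is a number between 0 and 1.

2.     The probabilities of all the possible values of X add up to 1.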

-- notation
 P( X = x )  =  p
 P( X = x )     :   Stands for the probability of the event that X equals the value x
         p           :    Is the probability of  occurrence of the event   { X = x }

-- the mean μ of a r.v. X

I tossed four coins 10 times and counted the number of heads that showed up each time.  The results are shown in the following table:

Using the information given in the table I computed the sample mean ( for the 10 times that I tossed the coins) and I found that 1.6  was the average number of heads that showed up each time.

If I continue to toss the four coins a very large number of times, what will the average number of heads turn out to be in the long run? (As a kid I was accused of stealing some coin money from a church (!), and I was kept locked up in a room for a whole day. I passed my time tossing four coins 2000 times! I found out that on average, each time I tossed the coins, heads showed up about 1.99 times.)

The following table shows the theoretical probability distribution for the number of heads that can show up when 4 coins are tossed.

The long-run average (also called the expected value) of the number of heads that show up each time can be found by multiplying each of the values 0, 1, 2, 3, and 4 by its corresponding theoretical probability and adding up all the products. Here the result turns out to be 2.
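The computation can be sketched in a few lines. The code below assumes four fair coins, builds the theoretical distribution by listing all 16 equally likely outcomes, and then multiplies and adds as described. The names `distribution` and `mean_heads` are my own.

```python
from fractions import Fraction
from itertools import product

# All 16 equally likely outcomes of tossing four fair coins.
tosses = list(product("HT", repeat=4))

# Probability distribution of the number of heads.
distribution = {}
for outcome in tosses:
    heads = outcome.count("H")
    distribution[heads] = distribution.get(heads, 0) + Fraction(1, len(tosses))

# Expected value: multiply each value by its probability and add the products.
mean_heads = sum(x * p for x, p in distribution.items())

for x in sorted(distribution):
    print(x, distribution[x])
print("expected number of heads:", mean_heads)   # 2
```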

The long-run average of a random variable X is called the mean μ of the r. v. X.

The formula for computing the mean μ of a random variable X is:

   μ  =  Σ x × P( X = x ),   where the sum runs over all the possible values x of X.

-- The Variance of a r.v. X

Along the same lines we can define the long-run variance σ² for a r. v. X. The square root of the long-run variance is the long-run standard deviation σ of the r. v. X.

The formula for computing the variance σ² of a r. v. X is:

   σ²  =  Σ (x - μ)² × P( X = x ),   where the sum runs over all the possible values x of X.


1.   The probability distribution of the random variable N (N = the number of people in line at a teller machine) is given in the following table.

Compute the mean μ and the standard deviation σ of the r. v. N.

Answer: μ = 1.37,   σ = 1.262

2.   The probability distribution of the number of raisins  N  in a cookie  is as follows:

Find the mean, variance, and standard deviation for the r. v. N, the number of raisins in a cookie.

Answer: μ = 2.8,  σ² = 1.56,  σ = 1.249

3.   A fair die is rolled.  If  an even number shows up,  you win as many dollars as the number that shows up  on the die.  If  an odd number shows up,  you lose  $2.20. Let  X = what you win or lose each time you play the game.

 a.   What is the probability distribution function of  X?
 b.   What are the mean and the standard deviation of  X?



(b)   μ = $0.90,  σ = $3.31
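The answer to part (b) can be checked with a short computation. The sketch below writes down the distribution implied by the rules of the game (lose $2.20 on 1, 3, or 5; win $2, $4, or $6 otherwise) and applies the usual definitions of mean and variance. The variable names are my own.

```python
from math import sqrt

# Distribution of X: an odd face (probability 3/6) loses $2.20;
# each even face (probability 1/6) wins that many dollars.
distribution = {-2.20: 3 / 6, 2.0: 1 / 6, 4.0: 1 / 6, 6.0: 1 / 6}

# Mean: sum of value times probability.
mean = sum(x * p for x, p in distribution.items())
# Variance: sum of squared deviation from the mean times probability.
variance = sum((x - mean) ** 2 * p for x, p in distribution.items())
sd = sqrt(variance)

print(round(mean, 2), round(sd, 2))   # 0.9 3.31
```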

4.   Each day a bakery bakes four cakes at a cost of $8 each, and prices them at $25 each. (Any cake not sold at the end of the day is discarded.)  The demand for cakes on any day has the following probability distribution.

 a.    Let Y be the profit from cakes sold on any day.  Describe the probability distribution of the r. v. Y. (Hint: Profit can also be negative. For example, if the bakery does not sell any cakes, then the "profit" for that day will be -$32.)

 b.   Find the expected profit of the bakery when four cakes are baked on a day.

 c.   The bakery's management wonders whether baking  3  cakes or even 2 will be more profitable for the bakery in the long run.  How many cakes  should the bakery bake each day (4, 3, or 2)  in order to make it most profitable in the long run?

Answer: No answer is given to this problem. You may e-mail me your solution and I will provide any corrections or instructions as necessary.

-- continuous random variables

The probability distribution of a continuous random variable X is described by a density curve,  much like  normal curves.   Here events are expressed as:

                           { X > a },   { X < b }, or { a < X < b }.

Probabilities for these events are found from the areas that correspond to them, under the particular density curve.


Let  T = time in minutes that one has to wait in line at a teller machine.

The probability distribution of the r. v.  T  could be described by a curve such as the one in the figure below.


The weight of food packed in certain containers is a normally distributed random variable with a mean weight of 500 lb.  and a standard deviation of 5 lb.  Suppose a container is picked at random.  Find the probability that it contains:

 a.   more than  510 lb.
 b.   less than 498 lb.
 c.   between  491  and   503  lb

Answers:
    a.   0.0228
    b.   0.3446
    c.   0.6898
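These areas can also be computed directly instead of being read from a table. The sketch below is one possible way, using the error function from Python's `math` module; the helper `normal_cdf` is my own name, not something defined in these notes.

```python
from math import erf, sqrt

def normal_cdf(x, mean=0.0, sd=1.0):
    """Area under the normal curve with the given mean and sd, to the left of x."""
    z = (x - mean) / sd
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

mean, sd = 500.0, 5.0   # container weights in lb., as stated in the problem

p_more_than_510 = 1.0 - normal_cdf(510, mean, sd)
p_less_than_498 = normal_cdf(498, mean, sd)
p_between_491_503 = normal_cdf(503, mean, sd) - normal_cdf(491, mean, sd)

print(round(p_more_than_510, 4))
print(round(p_less_than_498, 4))
print(round(p_between_491_503, 4))
```

The results agree with the table answers to four decimal places.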

-- the  Binomial  Distribution

We now single out two distributions that are of special importance in statistics. One of these is a discrete distribution called the binomial distribution and the other, a continuous distribution called the normal distribution.

The binomial distribution corresponds to the situation where the outcomes of a random process can be classified into one of two categories, a "success" or a "failure". The process is repeated a fixed number of times, say n times; these n trials constitute the binomial experiment.

A fundamental assumption in all binomial experiments is that the outcome of each trial is independent of the outcome of any other trial. Furthermore, the probability of success p is known and remains the same from trial to trial.

A random variable is called a binomial random variable if it counts the number of successes in a binomial experiment.

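The probability distribution of any binomial random variable X can be computed from the following formula. Tables with binomial probabilities are also available for values of n up to 15 or 20 and for the most common values of p.

   P( X = k )  =  C(n, k) × p^k × (1 - p)^(n - k),    k = 0, 1, ..., n

where C(n, k) = n! / ( k! (n - k)! ) is the number of ways to choose which k of the n trials are successes.

The mean μ and the standard deviation σ of any binomial random variable can be found from the formulas

   μ  =  np,      σ  =  √( np(1 - p) )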


1.   A basketball player takes 4 shots at the basket. On each throw he makes a basket with probability p = 0.7 and he misses the basket with probability 0.3. Let X = number of successful attempts.
 (a).   Construct the probability distribution of X. Is this a symmetric distribution or is it a skewed distribution?

 (b).   What are the mean and the standard deviation of this distribution?

 (c).   How often does the basketball player make fewer than 3 of his attempts?

Answer: Use the Table of Binomial probabilities in the book for part (a) and (c). For part (b) the formulas are also in the book. (Send me your questions if you have any)
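If no table is at hand, the same numbers can be worked out by computer. The sketch below assumes the standard binomial formula P(X = k) = C(n, k) × p^k × (1 - p)^(n - k) and the formulas μ = np and σ = √(np(1 - p)); the function name `binomial_pmf` is my own. It is meant as a check on your table work, not as a replacement for it.

```python
from math import comb, sqrt

def binomial_pmf(k, n, p):
    """P(X = k) for a binomial random variable with parameters n and p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 4, 0.7   # four shots, each made with probability 0.7

# Part (a): the probability distribution of X.
distribution = {k: binomial_pmf(k, n, p) for k in range(n + 1)}

# Part (b): mean and standard deviation.
mean = n * p
sd = sqrt(n * p * (1 - p))

# Part (c): fewer than 3 baskets means X = 0, 1, or 2.
p_fewer_than_3 = sum(distribution[k] for k in range(3))

for k, prob in distribution.items():
    print(k, round(prob, 4))
print(round(mean, 2), round(sd, 4), round(p_fewer_than_3, 4))
```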

-- the normal  distribution

A random phenomenon follows a normal distribution if most of the observations are clustered around the center (mean) of the distribution, some (a small percentage) are to the far left, and an equal fraction of observations are to the far right.

The following features characterize the normal distribution:

-- the standard normal distribution

  μ  =  0,   σ  =  1

The standard normal curve describes the distribution of a normal random variable whose mean is 0 and standard deviation is 1. The random variable itself is called the standard normal variable and is denoted by Z. Areas under the standard normal density curve are given in tables and can be used to compute areas under any normal curve.

-- probabilities under a general normal curve

If X is any normal r. v. with mean μ and standard deviation σ, then it can be shown that the values of:

   Z  =  (X - μ) / σ

follow the standard normal distribution.


-- normal approximation to binomial probabilities

If X is a binomial random variable with parameters n and p, then it can be shown that, for whole numbers a and b with a ≤ b,

   P( a ≤ X ≤ b )  ≈  P( (a - 1/2 - np) / √(np(1-p))  ≤  Z  ≤  (b + 1/2 - np) / √(np(1-p)) )

where Z is the standard normal variable.

The formula gives satisfactory approximations for large values of n, and values of p not near 0 or 1 (say, 0.05 < p < 0.95). In general the approximations are good if both np and n(1-p) are at least 10. The addition and subtraction of 1/2 is called the continuity correction.

1.   A new vaccine was tested on 100 persons to determine its effectiveness. If the claim of the drug company is that the vaccine is 80% effective, find the probabilities that:
 (a).   fewer than 74 people will develop immunity
 (b).   between 74 and 85 people, inclusive, will develop immunity.

Answers:
    a.   0.0668
    b.   0.8276
Note: These answers were derived without using the continuity correction.
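The sketch below reproduces these two answers by computer, again without the continuity correction, and also shows how part (a) changes when the correction is applied. The helper `normal_cdf` and the variable names are my own; the binomial mean np and standard deviation √(np(1 - p)) are the standard formulas.

```python
from math import erf, sqrt

def normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

n, p = 100, 0.8
mu = n * p                       # 80
sigma = sqrt(n * p * (1 - p))    # 4

# Without the continuity correction, as in the answers above.
p_a = normal_cdf((74 - mu) / sigma)
p_b = normal_cdf((85 - mu) / sigma) - normal_cdf((74 - mu) / sigma)

# Part (a) with the correction: "fewer than 74" means X <= 73, so use 73.5.
p_a_corrected = normal_cdf((73.5 - mu) / sigma)

print(round(p_a, 4), round(p_b, 4), round(p_a_corrected, 4))
```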

2.   When a certain seed is planted, the probability that it will sprout is 0.1. If 1000 seeds are planted, find the approximate probability that:
(a).   more than 130 seeds will sprout
(b).   between 90 and 95 seeds, inclusive, will sprout.

Answers:
    a.   0.0008
    b.   0.1512

-- Parameter
 A number that describes the population is called a parameter.

-- Statistic
 A number computed from the sample is called a statistic.

-- In statistical practice the value of a parameter is not known. A statistic is used to estimate a parameter.

-- Example:
A telemarketing firm in L.A. uses a device that dials residential telephone numbers in that city at random. Of the first 100 numbers dialed, 48% are unlisted. This is not surprising because 52% of all L.A. residential phones are unlisted. Here 48% is a statistic and 52% is a parameter.

-- Sampling Variability:

 The value of a statistic varies in repeated random sampling.

-- Sampling Distribution ( of a statistic )

 It is the distribution of values taken by the statistic in all possible samples of the same size from the same population.

-- Unbiased statistics:

 A statistic used to estimate a parameter is said to be unbiased if the mean of its sampling distribution is equal to the true value of the parameter being estimated.

-- Two population parameters that we are interested in are:
 (1)  the population proportion (p), and
 (2)  the population mean (μ).




1.    In the long run, annual real returns on common stocks have varied with mean 9% and standard deviation 28%. You plan to retire in 45 years and you are considering investing in stocks. What is the approximate probability (assuming market conditions do not change dramatically in the next 45 years) that the mean annual return on your investment over the next 45 years will:
        (a)    exceed 15%
        (b)    be lower than 5%

        Answers:    a. 0.0753    b.  0.1690
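A sketch of how these two answers can be obtained follows. It assumes the usual result that the mean of n observations is approximately normal with the same mean as the population and standard deviation equal to the population standard deviation divided by √n. The helper `normal_cdf` and the variable names are my own.

```python
from math import erf, sqrt

def normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

mean, sd, years = 9.0, 28.0, 45   # annual return in percent, number of years

# Standard deviation of the mean return over 45 years (assumed: sd / sqrt(n)).
sd_of_mean = sd / sqrt(years)

p_exceeds_15 = 1.0 - normal_cdf((15 - mean) / sd_of_mean)
p_below_5 = normal_cdf((5 - mean) / sd_of_mean)

print(round(sd_of_mean, 3), round(p_exceeds_15, 4), round(p_below_5, 4))
```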

2.    According to government data, 21% of American children under the age of six live in households with incomes less than the official poverty level. A study of learning in early childhood chooses a random sample of 300 children. Find the approximate probability that at least 80 of the children in the sample selected come from households with incomes less than the official poverty level.

    Answer:    0.0080

Nikos Psomas: April, 1998