Concept Version 9
Created by Boundless

Probability Histograms

A probability histogram is a graph that shows the probability of each outcome on the $y$-axis.

Learning Objective

  • Explain the significance of a histogram as a graphical representation of data distribution


Key Points

    • In a probability histogram, the height of each bar shows the true probability of each outcome if there were to be a very large number of trials (not the actual relative frequencies determined by actually conducting an experiment).
    • By looking at a probability histogram, one can visually see if it follows a certain distribution, such as the normal distribution.
    • As in all probability distributions, the probabilities of all the outcomes must add up to one.

Terms

  • independent

    not dependent; not contingent or depending on something else; free

  • discrete random variable

    a random variable whose values are obtained by counting and have no in-between values, such as the integers 0, 1, 2, ….


Full Text

Histograms

When examining data, it is often best to create a graphical representation of the distribution. Visual graphs, such as histograms, help one to easily see a few very important characteristics about the data, such as its overall pattern, striking deviations from that pattern, and its shape, center, and spread.

A histogram is particularly useful when there is a large number of observations. Histograms break the range of values into classes, and display only the count or percent of the observations that fall into each class. Regular histograms have a $y$-axis that is labeled with frequency. Relative frequency histograms instead have relative frequencies on the $y$-axis, with data taken from a real experiment. This chapter will focus specifically on probability histograms, which are idealizations of the relative frequency distribution.
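To make the difference between frequency and relative frequency concrete, here is a minimal sketch. The observations and the class width of 10 are hypothetical values chosen only for illustration; they do not come from the text.

```python
from collections import Counter

# Hypothetical observations (not from the text), e.g. test scores.
data = [52, 67, 71, 58, 90, 84, 76, 63, 69, 88, 95, 73]
class_width = 10  # assumed class width

# Assign each observation to the lower edge of its class, e.g. 67 -> 60.
counts = Counter((value // class_width) * class_width for value in data)
n = len(data)

for lower in sorted(counts):
    freq = counts[lower]          # height of the bar in a regular histogram
    rel_freq = freq / n           # height of the bar in a relative frequency histogram
    print(f"{lower}-{lower + class_width - 1}: frequency={freq}, relative frequency={rel_freq:.3f}")
```

The frequencies add up to the number of observations, while the relative frequencies add up to one.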

Probability Histograms

Probability histograms are similar to relative frequency histograms in that the $y$-axis is labeled with probabilities, but there are some differences to be noted. In a probability histogram, the height of each bar shows the true probability of each outcome if there were to be a very large number of trials (not the actual relative frequencies determined by actually conducting an experiment). Because the heights are all probabilities, they must add up to one. Think of these probability histograms as idealized pictures of the results of an experiment. Simply looking at a probability histogram makes it easy to see what kind of distribution the data follow.

Let's look at the following example. Suppose we want to create a probability histogram for the discrete random variable $X$ that represents the number of heads in four tosses of a coin. Let's say the coin is balanced, and each toss is independent of all the other tosses.

We know the random variable $X$ can take on the values of 0, 1, 2, 3, or 4. For $X$ to take on the value of 0, no heads would show up, meaning four tails would show up. Let's call this TTTT. For $X$ to take on the value of 1, we could have four different scenarios: HTTT, THTT, TTHT, or TTTH. For $X$ to take on a value of 2, we have six scenarios: HHTT, HTHT, HTTH, THHT, THTH, or TTHH. For $X$ to take on 3, we have: HHHT, HHTH, HTHH, or THHH. And finally, for $X$ to take on 4, we only have one scenario: HHHH.
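The scenarios listed above can be checked by brute-force enumeration. The following is a minimal sketch; the variable names `sequences` and `scenarios` are illustrative choices, not part of the original text.

```python
from itertools import product
from collections import defaultdict

# All 2**4 = 16 sequences of four tosses (H = heads, T = tails).
sequences = ["".join(toss) for toss in product("HT", repeat=4)]

# Group the sequences by the value of X, the number of heads.
scenarios = defaultdict(list)
for seq in sequences:
    scenarios[seq.count("H")].append(seq)

# Print each value of X, how many scenarios give it, and the scenarios themselves.
for x in sorted(scenarios):
    print(x, len(scenarios[x]), scenarios[x])
```

The counts for $X = 0, 1, 2, 3, 4$ come out as 1, 4, 6, 4, and 1, matching the lists in the text.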

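There are sixteen different possibilities when tossing a coin four times. Because the coin is balanced and the tosses are independent, each of these sixteen sequences has probability $\frac{1}{16}=0.0625$. The probability of each value of the random variable $X$ is the number of sequences that produce it, divided by 16: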

$\displaystyle \begin{aligned} P(X=0) &= \frac{1}{16} = 0.0625 \\ P(X=1) &= \frac{4}{16} = 0.25 \\ P(X=2) &= \frac{6}{16} = 0.375 \\ P(X=3) &= \frac{4}{16} = 0.25 \\ P(X=4) &= \frac{1}{16} = 0.0625 \end{aligned}$

Notice that just like in any other probability distribution, the probabilities all add up to one.
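As a check on the arithmetic, the same probabilities can be computed with exact fractions. This sketch assumes the standard fact that the number of four-toss sequences with exactly $k$ heads is the number of ways to choose $k$ positions out of 4, which `math.comb` computes; the name `pmf` is an illustrative choice.

```python
from fractions import Fraction
from math import comb

n_tosses = 4
total = 2 ** n_tosses  # 16 equally likely sequences

# P(X = k) = (number of sequences with k heads) / 16
pmf = {k: Fraction(comb(n_tosses, k), total) for k in range(n_tosses + 1)}

for k, p in pmf.items():
    # Fraction reduces automatically, so 4/16 is shown as 1/4.
    print(f"P(X={k}) = {p} = {float(p)}")

# The probabilities of all outcomes add up to exactly one.
assert sum(pmf.values()) == 1
```

Using `Fraction` instead of floating-point numbers keeps the sum exact, so the final check does not depend on rounding.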

To then create a probability histogram for this distribution, we would first draw two axes. The $y$-axis would be labeled with probabilities in decimal form. The $x$-axis would be labeled with the possible values of the random variable $X$: in this case, 0, 1, 2, 3, and 4. Then, rectangles of equal width should be drawn, one above each value, with heights equal to the corresponding probabilities.
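The drawing steps can be sketched in a few lines of code. To stay dependency-free, this example prints a text histogram with horizontal bars rather than the vertical rectangles described above; the function name `text_histogram` and the `width` parameter are hypothetical, chosen only for this illustration.

```python
# Probabilities from the four-toss example in the text.
pmf = {0: 0.0625, 1: 0.25, 2: 0.375, 3: 0.25, 4: 0.0625}

def text_histogram(pmf, width=48):
    """Return one line per value of X; bar length is proportional to P(X = x)."""
    lines = []
    for x in sorted(pmf):
        bar = "#" * round(pmf[x] * width)
        lines.append(f"{x} | {bar} {pmf[x]}")
    return lines

print("\n".join(text_histogram(pmf)))
```

Every bar has the same thickness (one line of text), and only its length changes with the probability, which mirrors the equal-width rectangles of the drawn histogram.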

Notice that the resulting probability histogram is symmetric, and roughly resembles the bell shape of the normal distribution. If we had instead actually tossed a coin four times in each of many trials and created a relative frequency histogram, we would have gotten a graph that looks similar to this one, but it would be unlikely to be perfectly symmetric.
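A short simulation illustrates the contrast between the idealized probabilities and the relative frequencies from an actual experiment. This is only a sketch: the seed and the number of repetitions (`n_experiments`) are arbitrary assumptions, and a pseudo-random generator stands in for a real coin.

```python
import random
from collections import Counter

rng = random.Random(0)   # fixed seed (arbitrary choice) so the run is repeatable
n_experiments = 10_000   # hypothetical number of repetitions of the four-toss experiment

# Each experiment: toss a balanced coin four times and count the heads.
counts = Counter(
    sum(rng.random() < 0.5 for _ in range(4))
    for _ in range(n_experiments)
)

rel_freq = {x: counts[x] / n_experiments for x in range(5)}
true_prob = {0: 0.0625, 1: 0.25, 2: 0.375, 3: 0.25, 4: 0.0625}

# Print each value of X with its simulated relative frequency and its true probability.
for x in range(5):
    print(x, rel_freq[x], true_prob[x])
```

The relative frequencies should land close to the true probabilities, but the values for 0 and 4 heads (or for 1 and 3 heads) will generally not match each other exactly, so the relative frequency histogram is not perfectly symmetric.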

Except where noted, content and user contributions on this site are licensed under CC BY-SA 4.0 with attribution required.