|
Home >
Exploring Excel
>
An Excel Tutorial:How to Create Normal Curves
In Excel, With Shaded Areas
You can create shaded areas in Excel charts
to specify areas of
special interest. Here, we show you how to do this using normal
curves.
by Charley Kyd
October, 2006
(Download the workbook.)
In An Introduction to Excel's Normal Distribution Functions I
presented several figures like the one below. This article explains how
to create these figures in Excel.
Even
if you have no particular reason to chart a normal curve, you might
find the techniques interesting. This is because you might need to use
similar techniques when you create other charts.
One general technique that I used in this workbook was to set up two
worksheets, a Control worksheet and a Data worksheet.
Let's address each of these separately.
First, however, you might wonder how Excel could produce the colors used
for this figure. This link explains how
to get the free workbook that uses these colors and the free
workbook that I explain how to create below.
Set Up the Control Worksheet
In general, you create a normal curve just as you create any other chart
in Excel: You set up the data and then chart it.
To set up
the data, open a new workbook with at least two worksheets. Name one of
these Control and the other Data.
In the Control worksheet, first set up the area shown here.
After you enter the labels in column A, assign them as range names in
column B. To do so, select the range A2:B8 and choose Insert, Name,
Create. Ensure that only Left Column is checked, then choose OK. Do the
same for the range A11:B14.
The numbers in the Settings section are values; enter them as shown.
Here's a short explanation of each:
Mean. The average value of your data.
StdDev.
The Standard Deviation of your data.
NumRows. The number of rows of data to be plotted.
Zmin. The smallest standard deviation to be plotted.
Zmax. The largest standard deviation to be plotted.
PctClear. The percentage of the unshaded area from the left side of
the curve, from .0001% to 99.999%.
PctShade. The percentage of the shaded area from the left side of the
curve, from .0001% to 99.999%.
The numbers in the Calculations section are returned by formulas:
ShadeLeft =NORMSINV(PctClear)*StdDev+Mean
NORMSINV is the inverse of NORMSDIST function. Given the probability
that a variable is within a certain distance of the mean, it finds the z
value...the number of standard deviations. To calculate the extent of
the shade at its left side, we multiply the NORMSINV result by the
amount of the standard deviation, then add the mean.
ShadeRight =NORMSINV(PctShade)*StdDev+Mean
We use the same logic with the PctShade area as for the PctClear area.
By doing so, we define the right side of the shaded area.
CurveMin =Zmin*StdDev+Mean
This formula defines the left side of the curve.
CurveMax =Zmax*StdDev+Mean
This formula defines the right side of the curve.
Set Up the Data Sheet
This
figure shows the top of the Data sheet. Let's take each of these columns
in turn:
The Sequence value in column A is a counter, from 1 to the number of
rows of data. The counter is for convenience only; it's not used in the
formulas. In this example, I used 100 rows.
Enter 1 in cell A2, then select the column from cell A2 through the
bottom of your spreadsheet. To do so, hold down Ctrl and Shift; and
press the Down arrow key. To enter the values, choose Edit, Insert,
Series; enter 100 as the stop value in the Series dialog; then choose
OK.
Column B returns the Z values from the lowest value to the highest
value:
B2: =Zmin
B3: =(Zmax-Zmin)/(NumRows-1)+B2
Column C calculates the chart's X value for each Z value:
C2: =B2*StdDev+Mean
Column D calculates the chart's Y value:
D2: =NORMDIST(C2,Mean,StdDev,FALSE)
Column E calculates the Y values for the area to be shaded:
E2: =IF(C2>ShadeRight,NA(),IF(C2<ShadeLeft,NA(),D2))
Copy cell B3 and the formulas in the range C2:E2 down their columns to
the last sequence number, in row 101.
To complete this step, assign the labels in row 1 of the Data sheet to
the data beneath those labels. To do so, first select the range A1:E101.
Choose Insert, Name, Create. In the Create Names dialog, make sure only
Top Row is checked; then choose OK.
Set Up the Chart
You will use one data series to generate the normal curve and another
data series to display the shaded area.
To set up the chart of the normal curve, select the range C2:D101.
Choose Insert, Chart, or click the Chart Wizard icon. Under Chart Type,
choose XY (Scatter). Under Chart Sub-Type, choose "Scatter with data
points connected by smooth Lines without markers." Then choose Finish.
If you click on the curve, your formula should look something like this:
=SERIES(,Data!$C$2:$C$101,Data!$D$2:$D$101,1)
You'll find this formula easier work with if you convert its cell
references to range names. To do so, edit the formula to:
=SERIES(,NormalCurve.xls!X,NormalCurve.xls!Y,1)
Here, NormalCurve.xls is the name of my workbook. Your SERIES formula could
have a different name.
At this
point, your chart should look something like the one shown here.
To set up the data series for the shaded area beneath the curve, it's
easiest to copy the first data series and then modify it.
To copy the data series, first select the curve in the chart. Then
select the SERIES formula in the formula bar. Copy this formula by
pressing Ctrl+C. Then press Esc to de-select the formula.
To paste the data series, first click anywhere on the edge of the chart
object. After you do so, the SERIES formula will disappear and the words
"Chart Area" will appear at the far left side of your formula bar. To
paste the formula, click in the formula bar area and then press Ctrl+V.
After you
paste the SERIES formula, your chart will look something like this.
At this point, the chart displays two identical data series, Series 1
and Series 2. Because we want the curve to be shown on top of the shaded
area, we'll modify Series 1 to display the shading, and we'll leave
Series 2 undisturbed.
To select Series 1, first click on the curve. Then look in your formula
bar to see which series you've selected. If you see Series 2, press your
Down Arrow key once to select Series 1.
With Series 1 displayed in your formula bar, change the formula so that
it looks like one of these two versions:
=SERIES(,Data!$C$2:$C$101,Data!$E$2:$E$101,1)
=SERIES(,NormalCurve.xls!X,NormalCurve.xls!Area,1)
You'll use the second version if you have assigned range names as I
suggested above.
At this point, your chart will look the same as it did before. It's
now time to change that condition.
With Series 1 selected, choose Format, Selected Data Series. In the Y
Error Bars tab, choose to display the Minus version of the error bar,
and set the Error Amount to a Percentage value of 100%. Then press OK.
Here's
your initial result. It doesn't look very good, but at least we're
heading in the right direction.
We next need to change the format the error bars in two ways. We need
to make them thicker, which will eliminate the banding in the shaded
area, and we typically will change the color of the shaded area.
To format the error bars, first click on the error-bar area in your
chart. Choose Format, Selected Error Bars. In the Patterns tab, modify
the Line settings. Set its color to any color you prefer. And set the
line's weight to its thickest setting. Then choose OK.
After
you make these changes, your chart should look something like this.
Now all you need to do is to clean up the chart slightly.
To eliminate the legend at the right, select the Legend object within
the Chart Area, and then press Delete.
To remove the shading in the Plot Area, first select the Plot Area.
Choose Format, Selected Plot Area and set the Area shading to None. Then
choose OK.
Your
chart now will look like this.
If you want to format the chart so that it looks more like the
figure at the beginning of this article, you need to take several
additional steps.
First, you assign a color patter to the Chart Area, and remove its
border. To do so, select the Chart Area. Choose Format, Selected Chart
Area, in the Patterns tab choose a color for the Area and choose None
for both the Border.
If you want to remove the gridlines, select the chart. Choose Chart,
Chart Options. In the Gridlines tab, deselect all axes. Then choose OK.
If you want to remove the border, select the Plot Area. Choose
Border, Selected Plot Area. Set the Border pattern to None, then choose
OK.
Finally, to assign a number format to the Y axis, first select the
axis. Then choose Format, Selected Axis. In the Number tab choose the
Custom category and type in this number format: #,###.00. Then choose
OK.
Your
chart should look something like the one shown here.
Now, you merely
need to adjust its size as needed.
As I mentioned at the beginning of this article,
use this link to download a free
copy of the workbook described here.
Further Information About Charting Normal Curves
I'd like to thank Jon Wittwer, PhD, for introducing me to this
method for shading curves. He understands a
lot more about techniques for statistical analysis than I do. In
Graphing a Normal Distribution Curve in Excel, he provides
a brief description of this method, and provides a download file with
additional examples.
|