Summary Reporting
Use COUNTIFS, not FREQUENCY, to Calculate Frequency Distribution Tables for Charting Histograms
Excel's FREQUENCY function was first created to calculate frequency distribution tables, which are needed for charting histograms. But the COUNTIFS function
offers more power, and it's easier to use.
by Charley Kyd, MBA Microsoft Excel MVP, 20052014
The Father of Spreadsheet Dashboard Reports

Because the Texas and California governors have been
bickering over the Texan's attempt to poach California
employers, I got curious about the distribution of the unemployment rates in the two states.
So
I downloaded data from the Federal Reserve Bank of St. Louis.
Then I summarize the data and created this chart,
technically a histogram, which I set up as
Charley's Swipe
File #58.
The source data consists of 361 unemployment rates for each state, extending
from December, 1982, through December, 2012.
The chart shows that the unemployment rates in Texas are clumped in the
upper part of the chart. That is, over the past 30 years, Texas has
tended to maintain a lower unemployment rate than California.
You could apply
the logic of this chart to calculate frequency distributions for your
own data. The distributions would compare the number of instances for
nearly any criteria you can count: orders by product by month for two
sales people, failure rates by month for two products, the number of
sick days taken by age by gender in the past
year, and so on.
Using Excel's Frequency Function
The obvious way to generate these frequency distributions is to use
Excel's FREQUENCY function, which has this syntax:
=FREQUENCY(data_array,bins_array)
In this figure, for example, the data_array is the range B3:B9, and the
bins_array is the range D4:D6.
The FREQUENCY function has three unusual characteristics.
First, you must enter the formulas as a multicell array. Here for
example, I selected the range E3:E6, typed the formula...
=FREQUENCY($B$3:$B$9,$D$4:$D$6)
...held down the Ctrl+Shift keys, then pressed Enter to enter the
formula as an array.
Later, if you decide to change the number of bins, you
first must change the
arrayentered function into a normal function, add or delete the rows
you need, then reenter the array. To change the array into a normal
function, select the cells in the array (here, E3:E6), press F2 to edit
the function, then hold down Ctrl and press Enter to enter the formula
in each cell as a normal function.
Second, you must enter one more
FREQUENCY formula than the number of bins. Here,
for example, I entered the FREQUENCY formula in a fourcell range to
reference the three Bins cells.
The extra cell, cell E3 in this case, contains the number of values less
than or equal to the lowest bin number. So in the figure, the value of 3
in cell E3, is the count of the numbers 10, 16, and 20.
Third, the FREQUENCY function returns values that are greater than one
bin and less than or equal to the nexthigher bin. Notice, for example,
that although the Data column contains the value of 20, the Freq column
shows zero for the bin value of 20.
In short, using Excel's FREQUENCY function seems a little strange.
There's got to be a better way.
Using the COUNTIFS Function to Calculate Frequency Distributions
With
the addition of the
COUNTIFS function in Excel 2007, we now have an
easier and morepowerful way to generate frequency distributions.
Here, we use three formulas to generate the same results as with the
FREQUENCY function.
The formula at the top of the frequency range is: E3:
=COUNTIFS($B$3:$B$9,"<="&$D$4)
The formula at the bottom of the frequency range is: E6:
=COUNTIFS($B$3:$B$9,">"&D6)
The remaining formulas are like this: E4:
=COUNTIFS($B$3:$B$9,">"&$D4,$B$3:$B$9,"<"&$D5) Copy this formula
downwards as necessary.
So why is it easier and more powerful to use three formulas rather than
one? There are four reasons.
First, we eliminate the multicell array formula, which allows us to add and delete
rows easily.
Second, we make
the logic obvious for each cell in the Frequency range. In cell E3, for
example, we don't have to figure out what the number 3 represents; the
formula shows us what's happening.
Third, if we don't care about values outside the range of the bins, we
don't have to show them. That is, if we don't care about values less
than 20, we could erase the formula in cell E3.
Fourth, we can change how the frequency distribution is calculated. For
example, it's confusing that although our data contains the value
20, our frequency distribution shows zero results for a bin value of 20.
Therefore,
we could change our distribution so that it shows values greater than or
equal to the bin values, as shown in this figure. This makes the
frequency values easier to understand.
That is, we now can say that there are two values less than 20, one
value in the 20s, two values in the 30s, and two values in the fourties
and above.
The relevant formulas are:
E3: =COUNTIFS($B$3:$B$9,"<"&$D$4)
E4: =COUNTIFS($B$3:$B$9,">="&$D4,$B$3:$B$9,"<"&$D5)
E6: =COUNTIFS($B$3:$B$9,">="&D6)
To see another example of this approach in use, take a closer look at
the Y axis in the chart above from
Charley's Swipe
File #58. Because I used this COUNTIFS method in the
workbook, the unemployment rates are
grouped in bins of 3.0–3.9, 4.0–4.9, and so on.
On the other hand, if I had used the FREQUENCY function to summarize
the data, the rates would have been grouped in bins of 2.1–3.0, 3.1–4.0,
and so on.
Take Your Next Steps
If you're looking for additional help with this topic, I can help you in
three ways. To learn
more, see
Excel Training, Coaching, and Consulting.
