|
Home >
Exploring Excel
>
An Excel Tutorial:Summarize Spreadsheet Data
With Excel's Array Formulas
Array formulas may be Excel's most powerful
feature for
summarizing data. However, they also are one of Excel's
least-used features. Here's how to use this hidden power.
by Charley Kyd
May, 2005
Subtotals can reveal very useful management information.
For example, managers might be interested to learn that sales
increased by 10% last month. But they would be fascinated to learn that
Pat Smith's sales of Widgets in the Northwest region doubled last month,
while the sales of all other people fell sharply.
Array formulas provide a way by which Excel users can discover such
useful information.
Introducing the Data
To
explain the power of array formulas I'll use this database. It shows
sales by Seller, Product, Region, and Customer, with Quantity and Total
Sales, in dollars.
I named each column of data with the label shown in row 1. To do so,
I selected the range A1:F15; chose Insert, Name, Create; chose Top Row;
and then chose OK.
By assigning names in this way we anchor the names in the gray border
rows, rows 2 and 15. By doing so, we can add new data to this table
between the gray rows and be confident that the names will expand as
needed.
Introducing Array Formulas, Example 1
Let's
begin our examination of this data by summarizing the sales for Jill and
Joe. Cells J3 and J4 perform this summary by using array formulas.
Here's the formula for the cell shown:
J3: {=SUM(IF(Seller=$I3,Total,0))}
Notice the braces that surround this formula. You do NOT enter those
characters when you key in the formula. Instead, you type the formula
shown within the braces. But when you're done typing you don't press
Enter. Instead, you hold down the Ctrl and Shift keys, then press Enter.
After you do so, Excel displays the formula in the formula bar with the
braces, as shown above.
Here's the key to understanding how array formulas work: Each array
formula creates temporary arrays in memory, and then the
outside function returns the results from that array.
To illustrate, the "Seller=$I3" part of the formula creates a
temporary array like this: {FALSE;TRUE;FALSE;FALSE;FALSE;TRUE;FALSE;TRUE;
FALSE;TRUE;FALSE;TRUE;FALSE;FALSE} Here, the second element in the array
is TRUE. That is, the second cell of the Seller range, cell A3 contains
the text "Joe".
The formula says that wherever TRUE appears, return the corresponding
value from the Total column. Therefore, the statement
"IF(Seller=$I3,Total,0)" returns the temporary array: {0;12600;0;0;0;
5060;0;1980;0;1540;0;4500;0;0}
Finally, the SUM function returns the sum of this temporary array:
25,680.
The second formula is similar:
J3: {=SUM(IF(Seller=$I4,Total,0))}
You can copy this formula from cell I3, or enter it for the practice. Again,
if you enter it, do NOT manually enter the braces.
Instead, when you enter the formula using Ctrl-Shift Enter, the braces will appear
automatically.
More Examples of Array Formulas
The best way to gain an understanding of array formulas is to see
several examples. All of these refer to the database at the top of this page.
Example 2
This table
summarizes sales by sales person by product. The grand total, as you can
see, matches the grand total of Example 1.
This array formula is slightly longer:
J9: {=SUM(IF((Seller=$I9)*(Prod=J$8),Total,0))}
Here, we multiply the first array ("Seller=$19") by a second array
("Prod=J8"). This returns an array that contains TRUE only when both
corresponding cells in the array are TRUE.
That is, when you work with array formulas, you multiply when you
want an AND relationship.
Also, notice that I enclosed each test in parentheses. Not only does
Excel require these parentheses, they make the formula easier to
understand when you read it.
When you copy the array formula to the remainder of this display,
you must copy in two steps, not one. With normal formulas you could copy
cell J9 to the range J9:L10. But when you try to follow the same
approach with an array formula, Excel complains. That is, Excel objects
when you try to copy an array formula to a multi-cell range that
includes itself.
Therefore, you first copy cell J9 to cell J10, then copy the range
J9:J10 to the range K9:L9.
The formulas in row 11 and in column L are merely SUM formulas for
the columns and rows respectively, as shown here:
J11: =SUM(J9:J10)
L9: =SUM(J9:K9)
For convenience, in the following examples I'll call the range of
calculations above and to the left of the SUMs the Working Range.
Example 3
We can
dig deeper into the data by adding an additional argument. Here, we
enter the label "Hats" in cell K13 to report only on sales of hats.
The array formulas for this example are slightly longer, but follow
the same pattern as before:
J15:
{=SUM(IF((Seller=$I15)*(Region=J$14)*(Prod=$K$13),Total,0))}
Copy this formula to the Working Range, J15:L16.
Example 4
Array
formulas work with more functions than merely the SUM function. And you
can perform additional calculations within the formula.
For example, this formula returns the largest price for Hats that
Joe has sold:
J28: {=MAX(IF((Seller=$I28)*(Prod=J$27),Total/Qty,0))}
Copy this formula to the Working Range.
Example 5
In
several of the following examples we're going to need the units sold by
sales person by product. So let's calculate those numbers now.
The formula is:
P3: {=SUM(IF((Seller=$O3)*(Prod=P$2),Qty,0))}
As before, copy this formula to the Working Range.
Example 6
We can
calculate the average price that Joe has charged for Hats.
Here's one way to do this:
P9: {=AVERAGE(IF((Seller=$O9)*(Prod=P$8),Total/Qty,""))}
Notice the null string in the last argument of this formula. If we
had used a zero, the AVERAGE function would have included all those
zeros in the calculation of its average, which we don't want. Instead, by using a null string, we cause the
AVERAGE function to ignore the cells that don't satisfy our criteria. Alternatively,
we could have used FALSE, like this:
P9: {=AVERAGE(IF((Seller=$O9)*(Prod=P$8),Total/Qty,FALSE))}
No matter which way we calculate this formula, it has a problem: Its
average is not weighted by the number of units sold. Therefore, if you
multiply each average price by each number of units sold in Example 5,
you won't
get a correct grand total.
Example 7
This
version calculates a weighted average price by sales person by product.
When we multiply each value by the equivalent number of units sold from Example 5, and
total the results, we'll get the grand totals shown in Examples 1 and 2.
The key formula is:
P15: {=SUM(IF((Seller=$O15)*(Prod=P$14),Total,""))/
SUM(IF((Seller=$O15)*(Prod=P$14),Qty,""))}
(Because it's too long to fit in one row, I'm showing this formula
in two rows. You would enter it in one long line, of course.)
In this specific formula, we calculate Joe's total sales for hats
and then divide by the total number of hats that Joe sold. This gives us
the weighted-average price for the hats that Joe sold.
As always, copy this formula to the Working Range.
Example 8
This
display checks our results. The first formula is simply:
P21: =P3*P15
This formula multiplies the number of hats that Joe sold (Example 5)
by the weighted average sales price (Example 7) to calculate Joe's total
sales of hats, in dollars. As always, copy this formula to the Working
Range. The bottom-right cell in this display shows that the total of all
sales matches the grant total of the data at the beginning of this
article.
Example 9
In
Example 2, I said that multiplying two tests creates an AND
relationship. That is, the result is TRUE only if all values that you
multiply also are TRUE.
Similarly, summing two results creates an OR relationship.
To illustrate, suppose the sales manager is thinking about merging
the North and the West regions. So he wants a report that treats those
regions as one region.
Here's the first formula:
P29: {=SUM(IF((Seller=$O29)*
((Region=P$27)+(Region=P$28)),Total,0))}
(Because it's too long to fit in one row, I'm showing this formula
in two rows. You would enter it in one long line, of course.)
Here, if the Seller is Joe and if the Region is either North or
West, the array returns the Total; otherwise, it returns 0. As usual,
the SUM function returns the total of the array.
When you copy the formula to cell Q29, the formula returns the total
of Joe's sales for Regions that contain "South" or that are blank.
Extending Array Formulas
Array formulas can be extended in some surprising ways. Specifically,
the Excel functions CHOOSE and OFFSET offer some options to keep in
mind.
To illustrate, consider this formula:
D20: {=SUM(IF(Seller=$A$20,Total,0))}
Suppose you want to expand this formula. Suppose that sometimes you
want to return the Total when Seller equals the label in cell A20, and
at other times you want to return the Total when Region equals the label
in cell B20. You want to specify the test to use by entering the value 1
(for Seller) or 2 (for Region) in cell C20.
To set up this summary, you could use the CHOOSE function in two
different ways:
D20: {=SUM(IF(CHOOSE($C20,Seller=$A$20,Region=$B20),Total,0))}
D20: {=CHOOSE($C20,SUM(IF(Seller=$A$20,Total,0)),
SUM(IF(Region=$B$20,Total,0)))}
(Because it's too long to fit in one row, I'm showing the second formula
in two rows. You would enter it in one long line, of course.)
Either option works the same in this specific example. But in real
life, you could use these options in different ways. For example, the
second approach would allow you to choose completely different formulas
merely by changing the number in cell C20.
Similarly, the OFFSET function allows you to choose different rows
or columns to test or sum. For example:
E20: {=SUM(IF(OFFSET(Seller,0,$A22)=$B22,Total,0))}
Here if the value in cell A22 is 1, the OFFSET function causes the
formula to compare the label in cell B22 to the first column to the
right of Seller, or the Prod column in the example. If the value in cell
A22 is 2, the OFFSET function compares the label to the second column,
which is the Region column in the example.
In Conclusion
Array formulas are the ideal tool to use whenever you need to
summarize a spreadsheet database. Once you start to experiment with
them, you'll see what I mean.
|