Averaging Transaction Data for Accurate Growth Rate Calculations

Click For Summary
SUMMARY

The discussion focuses on calculating accurate growth rates from weekly transaction data collected by a company. The current method involves averaging weekly transaction counts, which can lead to inaccuracies due to seasonal fluctuations, particularly during holiday periods. A more reliable approach is to utilize a weighted average that considers the number of transactions in each week, thereby providing a more accurate representation of growth over time. The importance of avoiding the naive averaging of averages is emphasized, as it can distort the true growth rate.

PREREQUISITES
  • Understanding of basic statistical concepts, specifically averages and weighted averages.
  • Familiarity with data visualization techniques for representing time-series data.
  • Knowledge of seasonal trends in transaction data analysis.
  • Experience with SQL or data manipulation tools to manage and analyze transaction datasets.
NEXT STEPS
  • Research the implementation of weighted averages in data analysis.
  • Explore SQL functions for calculating averages and weighted averages.
  • Learn about time-series analysis techniques to account for seasonal variations.
  • Investigate data visualization tools that effectively represent transaction trends over time.
USEFUL FOR

Data analysts, business intelligence professionals, and anyone involved in financial reporting or transaction data analysis will benefit from this discussion.

Diffy
Messages
441
Reaction score
0
Hi, this contracted company that I am working with has this data. Basically they collect the number of transactions each day. Then they graph these transactions out by Week, always monday - sunday.

The data is then stored in a table where the x-axis is the date of the monday the week started and the y-axis is the average of the transactions for that week. They like it that way because it smooths out the weekends.

The problem is they are reporting a growth rate. Since the last couple weeks of the year are generally slower because of holidays, they appear to pick the second or third last full week of the year, and the second or third last week of the previous year, divide, and then come up with a rate.

Given the data, and the structure that they keep it, is there a better way?

I was thinking that they could average the weekly averages of each year. But I remember an old college professor always used to warn against averaging averages.

Any input is appreciated.
 
Physics news on Phys.org
You cannot "average averages" in the "unweighted" sense of "add them all and divide by their number.

For an easy example, suppose all 10 of the first set are "2":
{2, 2, 2, 2, 2, 2, 2, 2, 2, 2}.
The average is, of course, 2.

Now suppose all 8 of the second are "3":
{3, 3, 3, 3, 3, 3, 3, 3}.
The average is, of course, 3.

Suppose, finally, that all 8 of the final set are "4":
{4, 4, 4, 4, 4, 4, 4, 4}
The average is, of course, 4,
If you just naively average those three numbers, you get (2+ 3+ 4)/3= 9/3= 3.

But in fact, you have a total of 26 numbers:
{2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4}
which add to 76 so their average is 76/26= 36/13= 2.92, approximately, not quite 3.

Of course, I didn't actually add those numbers to get 76, I argued that the 10 "2"s will add to 20, the 8 "3"s to 24, and the 8 "4" to 32 so the total is 20+ 24+ 32= 76.

That's why we could use, instead, a "weighted" average. Instead of keeping all the numbers to get a grand average, keep the average of each set and the number of entries in each set. Here, our first average was 2 and there were 10 numbers, the second was 3 and there were 8 numbers, the last average was 4 and there were 8 numbers. We can get the total of each by multiplying the average of each set by the number of terms in the set and add those:
[tex]\frac{2(10)+ 3(8)+ 4(8)}{10+ 8+ 8}= \frac{20+ 24+ 32}{26}= \frac{76}{26}= 36/13[/tex]

Notice that we could also write that as
[tex]\frac{10}{26}(2)+ \frac{8}{26}(3)+ \frac{8}{26}(4)= \frac{5}{13}(2)+ \frac{4}{13}(3)+ \frac{4}{13}(4)[/tex]
"weighting" each average by the fraction of total numbers that was based on.
 
Actually, I don't see the problem with the company's current method of comparing the number of transactions in a given week with the number a transactions at the same period a year ago. That naturally allows for seasonal fluctuations and public holidays, etc.
 
Last edited:

Similar threads

  • · Replies 4 ·
Replies
4
Views
2K
  • · Replies 4 ·
Replies
4
Views
4K
  • · Replies 8 ·
Replies
8
Views
2K
  • · Replies 14 ·
Replies
14
Views
2K
  • · Replies 2 ·
Replies
2
Views
2K
  • · Replies 0 ·
Replies
0
Views
3K
  • · Replies 15 ·
Replies
15
Views
3K
  • · Replies 7 ·
Replies
7
Views
4K
  • · Replies 3 ·
Replies
3
Views
4K
  • · Replies 6 ·
Replies
6
Views
6K