I have a set of retail data showing how many sales each store made in each of the past 52 weeks. I want to group the stores whose weekly sales track each other most closely over the course of the year. Let's say there will be 10-20 stores in each group, although this number will be refined based on the analysis. These groups will then be used to create location-specific distribution curves. By grouping, I can smooth out anomalies while creating a bigger dataset to draw from. I also want to find the stores that deviate most from the mean.

It would also be nice to find the variance within each group, and to know what is statistically significant in these groupings.

I did a simple analysis: I created a curve for each store and compared it to the mean curve, then found the percent difference for each week and ranked each store by its average percent difference.
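For concreteness, here is a minimal sketch of that simple analysis in Python with NumPy. The function name, the toy data, and two details are assumptions made for illustration, not my exact procedure: each store's curve is taken as its share of annual sales per week (so store size does not dominate), and the percent difference is taken in absolute value (so positive and negative weeks do not cancel out).

```python
import numpy as np

def rank_by_mean_pct_diff(sales):
    """Rank stores by average absolute percent difference from the mean curve.

    sales: array-like of shape (n_stores, n_weeks) with weekly sales counts.
    Returns (order, score): store indices from most to least deviant,
    and the per-store score in percent.
    """
    sales = np.asarray(sales, dtype=float)
    # Assumption: a store's "curve" is its weekly share of annual sales.
    curves = sales / sales.sum(axis=1, keepdims=True)
    # Mean curve across all stores, one value per week.
    mean_curve = curves.mean(axis=0)
    # Assumption: absolute percent difference per store and week.
    pct_diff = np.abs(curves - mean_curve) / mean_curve * 100.0
    # Average over the weeks to get one score per store.
    score = pct_diff.mean(axis=1)
    # Largest average deviation first.
    order = np.argsort(score)[::-1]
    return order, score

# Hypothetical toy data: three stores share one seasonal shape at
# different volumes; the fourth peaks at a different time of year.
weeks = np.arange(52)
base = 100 + 20 * np.sin(2 * np.pi * weeks / 52)
shifted = 100 + 20 * np.cos(2 * np.pi * weeks / 52)
sales = np.vstack([base, 2 * base, 0.5 * base, shifted])

order, score = rank_by_mean_pct_diff(sales)
print("stores, most to least deviant:", order)
print("average percent difference:", np.round(score, 2))
```

With this toy data the fourth store (index 3) comes out as the most deviant, since the other three have identical curves once volume is normalized away.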

What is a better way to analyze each curve with respect to each other to create groups?

# Grouping Stores based on past retail sales.

