Histogram with # on edge of bin?

SUMMARY

The discussion centers on how to choose histogram bins when a data point, here the value 270, falls exactly on a bin edge. The replies agree that bins should be non-overlapping. For integer data, a common choice is bins of 260-269 and 270-279. The same integer ranges can also be written with half-integer edges, 259.5 to 269.5 and 269.5 to 279.5, so that the bars cover the axis with no visible gaps. For arbitrary real-valued data, landing exactly on an edge has probability zero, so it is enough to pick one convention for edge values and apply it consistently.


doubled
Let's say I have 10 data points, and one of them is 270.

Then let's say two of my bins are 260-270 and 270-280. Which bin would you put the 270 in?
Or is that choice of bin ranges inappropriate, so that new ranges have to be chosen?
 
When you construct your bins, they should be strictly non-overlapping; how you do that is up to you. If your values are all integers, most people would probably find it more natural for the bins to be 260-269 and 270-279. If your values are arbitrary real numbers, then the probability that you pulled exactly 270 is zero, so a value landing right on an edge means you should rethink what is going on (or, more likely, just pick arbitrarily whether edge values always go into the lower bin or the upper one, and apply that consistently).
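To make the "pick a convention and stick to it" point concrete, here is a minimal sketch in Python using NumPy. The ten sample values are made up for illustration (the question does not list its data), and the choice of NumPy is an assumption, not something from the thread. `np.histogram` treats each bin as closed on the left and open on the right, except the last bin, which also includes its right edge; `np.digitize` lets you pick either convention explicitly.

```python
import numpy as np

# Hypothetical sample: 10 integer readings, one of which (270) sits on a bin edge.
data = np.array([262, 263, 265, 268, 270, 271, 274, 275, 277, 279])
edges = np.array([260, 270, 280])

# np.histogram uses half-open bins [left, right); only the last bin is [left, right].
# Under this convention 270 is counted in the 270-280 bin, and only there.
counts, _ = np.histogram(data, bins=edges)

# np.digitize returns the 1-based index of the bin a value falls into.
# right=False: edges[i-1] <= x < edges[i]  (edge value goes to the upper bin)
# right=True:  edges[i-1] < x <= edges[i]  (edge value goes to the lower bin)
bin_if_left_closed = int(np.digitize(270, edges, right=False))
bin_if_right_closed = int(np.digitize(270, edges, right=True))

print(counts)                                   # counts per bin
print(bin_if_left_closed, bin_if_right_closed)  # bin index under each convention
```

Either convention is fine; what matters is that every value is assigned to exactly one bin, so the counts still sum to the number of data points.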

I have often seen histograms described with intervals such as 259.5 to 269.5 and 269.5 to 279.5, which cover the same integer ranges but leave no visible gaps between the bars.
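A short check of the half-integer idea, again as a sketch with made-up integer data and NumPy as an assumed tool: bins with edges 259.5, 269.5, 279.5 give the same counts as the integer ranges 260-269 and 270-279, and no integer value can land on an edge.

```python
import numpy as np

# Hypothetical integer sample (not from the thread), including the value 270.
data = np.array([262, 263, 265, 268, 270, 271, 274, 275, 277, 279])

# Half-integer edges: adjacent bins share an edge, so the bars touch,
# but no integer data point can ever equal an edge.
half_edges = np.array([259.5, 269.5, 279.5])
counts_half, _ = np.histogram(data, bins=half_edges)

# The same bins written as inclusive integer ranges 260-269 and 270-279.
count_260_269 = int(np.sum((data >= 260) & (data <= 269)))
count_270_279 = int(np.sum((data >= 270) & (data <= 279)))

print(counts_half, count_260_269, count_270_279)
```

The two descriptions are the same binning; the half-integer form just makes it obvious on a plot that the bins tile the axis.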
 
