- #1

- 84

- 8

- Thread starter BillKet

- #2

BvU

Science Advisor

Homework Helper

2019 Award

- 13,576

- 3,280

BillKet said: "I want to plot a rate vs energy plot"

Doesn't that mean you want to keep the two energies separated?

BillKet said: "19.1 counts per second"

That's an unweighted average.

The math, if you insist on merging, is simple:

You have ##r_1 = {100\pm\sqrt{100}\over 10} = 10 \pm 1## events with ##E_1## per unit of time and ##r_2 = {2000\pm \sqrt{2000}\over 100} \approx 20 \pm 0.45## with ##E_2##, so $$ r_E = {\sum {r_i\over \sigma_{r_i}^2} \over \sum {1\over\sigma_{r_i}^2} }, \qquad \sigma_{r_E} = {1 \over \sqrt{\sum {1\over\sigma_{r_i}^2}}}$$ which gives ##18.3 \pm 0.4##, a bit less than 19.1.

(In other words: the value and the error are mainly determined by the more accurate counting rate, which weighs 5 times heavier.)

For ##E## you get $$E={\sum {r_i E_i\over \sigma_{r_i}^2} \over \sum {r_i\over \sigma_{r_i}^2}}$$ In other words: as above.
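For anyone who wants to check the arithmetic, here is a minimal Python sketch of the inverse-variance weighted formulas above. It assumes Poisson errors, ##\sigma_r = \sqrt{N}/t##, and the bin centers 25 and 75 are hypothetical values chosen only for illustration; the function names are not from any library.

```python
import math

def poisson_rate(counts, seconds):
    """Rate and its Poisson error, sigma_r = sqrt(N) / t."""
    return counts / seconds, math.sqrt(counts) / seconds

def weighted_mean(values, sigmas):
    """Inverse-variance weighted mean and its standard error."""
    weights = [1.0 / s**2 for s in sigmas]
    total = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / total
    return mean, math.sqrt(1.0 / total)

def weighted_energy(rates, sigmas, energies):
    """E = sum(r_i E_i / sigma_i^2) / sum(r_i / sigma_i^2)."""
    num = sum(r * e / s**2 for r, e, s in zip(rates, energies, sigmas))
    den = sum(r / s**2 for r, s in zip(rates, sigmas))
    return num / den

r1, s1 = poisson_rate(100, 10)     # 10 +/- 1 cps
r2, s2 = poisson_rate(2000, 100)   # 20 +/- about 0.45 cps
rate, rate_err = weighted_mean([r1, r2], [s1, s2])
# Hypothetical bin centers, assumed for illustration only.
energy = weighted_energy([r1, r2], [s1, s2], [25.0, 75.0])
print(f"merged rate = {rate:.2f} +/- {rate_err:.2f} cps at E = {energy:.1f}")
```

The weights are ##1/\sigma^2##, so the bin with the smaller relative error dominates both the merged rate and the merged energy.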

- #3

Stephen Tashi

Science Advisor

- 7,388

- 1,367

BillKet said: "I want to plot a rate vs energy plot,"

BillKet said: "Assume I have 2 data points in the bin (so 2 energy values), ##E_1## and ##E_2##"

I don't know what physical interpretation your graph has. Let's suppose there is a smooth function ##R(x)## that gives "rate" ##R## as a function of energy ##x##. You want to accurately represent that function on some interval ("bin") of energy given by ##a \le E_1 \lt E_2 \le b##. You intend to plot one point ##(x_0, R(x_0))## for some point ##x_0## in the interval ##[a,b]##.

BillKet said: "But I am not sure what energy to assign to this bin."

With only that goal in mind, it doesn't matter what energy ##x_0## in ##[a,b]## you select as long as you plot the correct ##R(x_0)## that goes with that energy.

@BvU has suggested a method that treats both the rates and the energies in the data as realizations of random variables. @BvU seems to recognize problems in physics from scanty descriptions, so perhaps he guessed what you are doing.

I myself don't understand your data. For example, are the energies ##E_1## and ##E_2## precisely measured values?

BillKet said: "(so 2 energy values), E1 and E2 and the first one has 100 counts measured over 10 seconds"

Was the energy ##E_1## constant over the 10 seconds? Or is ##E_1## an average energy for the 10 second interval?

- #4

gleem

Science Advisor

Education Advisor

- 1,702

- 1,040


- #5

gleem

Assume the rate is linear in energy across the two bins, ##r_i = aE_i + b##, and compare areas:

$$Area_{r_1} = r_1\Delta E_1$$

$$Area_{r_2} = r_2\Delta E_2$$

$$Area_{r_1} + Area_{r_2} = r_1\Delta E_1 + r_2\Delta E_2$$

$$Area_{r_1+r_2} = (aE_1+b)\Delta E_1 + (aE_2+b)\Delta E_2 = a(E_1\Delta E_1 + E_2\Delta E_2) + b(\Delta E_1+\Delta E_2)$$

$$Area_{r_1+r_2}/(\Delta E_1+\Delta E_2) = a(E_1\Delta E_1 + E_2\Delta E_2)/(\Delta E_1+\Delta E_2) + b$$

The coefficient of ##a## is then the value of the energy associated with the merged bins.

Energy of the merged bins = ##(E_1\Delta E_1 + E_2\Delta E_2)/(\Delta E_1+\Delta E_2)##
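A quick numerical check of this width-weighted energy, as a sketch only: the slope, intercept, bin centers and widths below are made-up numbers, and the function names are hypothetical. For a rate that is exactly linear in energy, the merged point should fall back on the same line.

```python
def merged_bin_energy(energies, widths):
    """Width-weighted energy assigned to the merged bin."""
    return sum(e * w for e, w in zip(energies, widths)) / sum(widths)

def merged_rate_per_width(rates, widths):
    """Total area under the per-bin rates divided by the total width."""
    return sum(r * w for r, w in zip(rates, widths)) / sum(widths)

# Made-up linear rate r(E) = a*E + b and two adjacent bins:
# [5, 15] centered at 10 and [15, 45] centered at 30.
a, b = 2.0, 5.0
energies = [10.0, 30.0]
widths = [10.0, 30.0]
rates = [a * e + b for e in energies]

e_merged = merged_bin_energy(energies, widths)
r_merged = merged_rate_per_width(rates, widths)
print(e_merged, r_merged, a * e_merged + b)  # 25.0 55.0 55.0
```

If the true rate is not linear over the merged range, the merged point is only an approximation.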

- #6

BillKet

BvU said: "[...] The math if you insist on merging is simple: [...]"

Thank you for your reply! I am a bit confused about the way you calculate the rate when rebinning. Why do you use the error in that calculation? To use toy numbers: if I know that I have 100 counts in 10 seconds between 0 and 50 (in some arbitrary energy units) and 2000 counts in 100 seconds between 50 and 100, then I know for sure, based on my experiment, that I have 2100 events in 110 seconds between 0 and 100. Why can't I just assign to that 0 to 100 bin a rate of 2100/110? What I am trying to say is that if I had decided from the beginning to use bins of 100, that's what I would get: 2100/110. So, based on your formula, using bins of size 50 and then doubling the bin size would give me a different result than using bins of size 100 directly. That doesn't really make sense to me. Am I missing something?

- #7

BvU

If it's just a matter of merging two adjacent bins with centers ##E_1## and ##E_2##, then your new counting rate is indeed ##2100/110## at ##E_1+E_2\over 2##. But why do such a thing and throw away information? ##100\pm 10## counts per unit of time next to ##200\pm 5## is reason for alarm. Merging the bins sweeps that under the rug.

- #8

gleem

BillKet said: "if I know that I have 100 counts in 10 seconds between 0 and 50 (in some arbitrary energy units) and 2000 counts in 100 seconds between 50 and 100, then I know for sure, based on my experiment, that I have 2100 events in 110 seconds between 0 and 100. Why can't I just assign to that 0 to 100 bin a rate of 2100/110?"

You must be careful. Look at the example where the rates in the two bins are actually the same. You count for 100 sec in one and 10 sec in the other, yielding say 2000 in one and 200 in the other. If you combine the counts into 2200 and the times into 110 sec, you get a combined rate of 20 cps, which is great. Now repeat for a count of 2000 in 100 sec and 100 in 10 sec, so that the rate in the second bin is not the same as in the first. Combining the two gives 2100 counts in 110 sec, and we get about 19 cps. If we had counted the second bin for a full 100 sec, we would have expected 1000 counts. Combining these two, we get 3000 total counts in 200 sec, giving an average rate of 15 cps!

The lesson is that you should normalize the counting times. That is, the rates must be computed from the same counting time.
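The two examples above can be reproduced with a short Python sketch. The function names are hypothetical, and the scaling to a common counting time ignores statistical fluctuations, so treat it as arithmetic only.

```python
def pooled_rate(counts, times):
    """Total counts divided by total counting time."""
    return sum(counts) / sum(times)

def normalized_rate(counts, times, common_time):
    """Scale every bin to the same counting time, then pool."""
    scaled = [n * common_time / t for n, t in zip(counts, times)]
    return sum(scaled) / (common_time * len(counts))

# Equal true rates: both methods agree.
print(pooled_rate([2000, 200], [100, 10]))           # 20.0
print(normalized_rate([2000, 200], [100, 10], 100))  # 20.0

# Different rates: pooling favours the bin that was counted longer.
print(pooled_rate([2000, 100], [100, 10]))           # about 19.09
print(normalized_rate([2000, 100], [100, 10], 100))  # 15.0
```

Normalizing to a common time and pooling is the same as taking the plain average of the individual rates, which is why the second example gives 15 cps.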

- #9

BillKet

BvU said: "If it's just a matter of merging two adjacent bins with centers ##E_1## and ##E_2##, then your new counting rate is indeed ##2100/110## at ##E_1+E_2\over 2##. [...]"

Thank you for the clarification! You're right, merging bins that are so different is not a great idea. My question was mostly about bins with similar counts, in case I want to combine them (for example 98, 100, 103 counts per second). I gave that example just to make the computations easy. So I agree with the rate you obtain when combining the bins, but I am still not 100% sure about the center of the bin. Is just the average enough? Shouldn't I weight by the time there, the way I mentioned in the original post? Given that the rate after merging is around 19, and the second bin was 20 and it was measured for longer, I feel more confident that a rate of 19 should correspond to an energy much closer to ##E_2## than to ##E_1##. For example, assuming a linear increase in the number of counts with energy, that would be the right thing to do. What do you think?

- #10

BvU

- #11

gleem

You should not combine the counts and times for the two bins because they are measuring two different situations, one at E1 and one at E2. The counts in each bin come from a different parent population. Combining them would only be permissible if you were counting the same thing, e.g. counts related to E1 for two different times.

As for determining the new rate value for the merged bins, forget about the statistical nature of the data for a moment. Rates automatically normalize the data if all the rates are counts per the same unit of time. The proper method would be to add the rates and associate the sum with the energy assigned to the merged bin. Why? If you had two counters, one set for E1 and the other for E2, you would be accumulating counts at a rate of r1 + r2.

- #12

BillKet

gleem said: "[...] If you had two counters, one set for E1 and the other for E2, you would be accumulating counts at a rate of r1 + r2."

I am not sure I understand why I would add the rates together. If both my rates are 100 counts in 10 seconds (i.e. each of them is 10 counts per second), then if I combine them I get 200 counts in 20 seconds, which also gives a rate of 10 counts per second. Based on what you said, adding the rates I would get 20 counts per second. Am I misunderstanding what you said?

- #13

Stephen Tashi

BillKet said: "If both my rates are 100 counts in 10 seconds (i.e. each of them is 10 counts per second), then if I combine them I get 200 counts in 20 seconds, which also gives a rate of 10 counts per second."

##\frac{a}{b} + \frac{c}{b} \ne \frac{a+c}{b+b}##

- #14

BillKet

Stephen Tashi said: "##\frac{a}{b} + \frac{c}{b} \ne \frac{a+c}{b+b}##"

Yeah, that's what I am saying. Adding the rates directly doesn't make sense to me.

- #15

BvU

- #16

gleem

BillKet said: "I am not sure I understand why I would add the rates together. [...] Am I misunderstanding what you said?"

My reasoning is thus: when you merge two adjacent bins, you are in fact determining the net counts from the two bins simultaneously for the same time, not sequentially, i.e., 10 sec vs 20 seconds. That said, you can renormalize it to the former energy intervals by averaging the count rates. This does not affect the uncertainties associated with the rates. It is important to make sure that your data reflects the counts for the same energy interval to start, since the interval determines the net counts and thus the rates.

Look at post #8 carefully. When count rates are from the same parent population, having the same mean and variance (if you count long enough in each bin you get rates that approach one another), you can add the counts you get over any time intervals and you get the same rate. However, if the means (and therefore variances) are different, adding the counts over the combined time gives a result that weights the longer counting time more heavily. You find that adding counts and dividing by the combined times only gives the correct rate if the rates are actually the same.

Take another example: suppose you get 2000 counts in 100 sec and 4000 in 10 sec (this is reasonable since you would count longer for a lower count rate). Adding counts and dividing by the summed times gives 38.2 cps. How can that be if one bin gives 400 cps? If you normalize to the same counting time, say 100 sec, then the second count above becomes 40000 counts in 100 sec. Adding the counts and dividing by 100 gives 420 cps. Adding the rates also gives 420 cps.

- #17

Stephen Tashi

BillKet said: "Yeah, that's what I am saying. Adding the rates directly doesn't make sense to me."

Are you saying that "adding rates" means something different than arithmetic addition of quantities that have the same physical units? For example, if we perform arithmetic addition on the rates 10 counts per second and 15 counts per second we get 25 counts per second. Similarly, if we add 10 kg/m to 15 kg/m we get 25 kg/m.

Quote: "Rates are not only per unit of time, but also: per bin width of energy."

You might be correct. I don't know why the original poster can't describe a specific situation. ("I want to plot a rate vs energy plot" doesn't attribute a specific meaning to the data or the graph, any more than saying "I want to plot velocity versus dollars" would describe a specific physical situation.)

Imagine a situation where an experimenter sets the internal energy of an object to certain values ##E_i## and measures counts ##C_i## of particles emitted from that object for ##T_i## seconds (without regard to what energies those particles have). In such a situation it is not possible to have a pair of detectors that independently make counts at different energies ##E_1, E_2## on the same object in the same time interval. And it is not necessarily true that the rate of counts at energy ##E_1+E_2## would be the sum of rates ##C_1/T_1 + C_2/T_2##.

Imagine a different situation where an object is kept in a constant state and emits particles at various energies. It is possible to have two detectors set to detect particles that have different energies ##E_1## and ##E_2## and to have these detectors in operation over exactly the same time interval. A slightly different situation is to have the two detectors in operation over different time intervals. If they operate in different time intervals, then there can be variation in the "background" count of particles emitted from sources other than the object.

I found the third edition of Bevington and Robinson's book, which says:

Multiple Peaks

"Separation of closely spaced peaks is an important problem in many research fields. Although we should not attempt to extract information from our data by sorting in bins smaller than the uncertainties of our measurements, and should not use bin widths that are so narrow that the number of events in the bins are too small to satisfy Gaussian statistics, we also should not err in the other direction and risk suppressing important details. Selecting optimum bin sizes is critical. For some data samples, different bin widths for different regions of the data sample may be appropriate."

The book doesn't define "optimum" bin sizes mathematically. It does give a specific and detailed example. I don't know if Bevington and Robinson is regarded as an authoritative text. If it is, this may explain why researchers who consult it agonize over bin sizes.

- #18

BillKet

gleem said: "[...] Adding counts and dividing by the summed times gives 38.2 cps. How can that be if one bin gives 400 cps? [...]"

But you also have 20 counts/sec in one of the bins, so I don't see the problem. But let's take a super basic example. Assume that I have a bin between 5 and 15 (in some energy units), centered at 10, with 100 counts in 1 second. The next bin is between 15 and 25, centered at 20, also with 100 counts in 1 second. Now, assuming that I want to fit a curve to these points (which is usually what one does in a counting experiment), I would have 2 points at coordinates (10,100) and (20,100). If (for some reason) I decide to double the size of my bin, the 2 bins mentioned above will become one bin between 5 and 25, centered at 15. Based on what you say, if I add the rates, I would get 100+100 = 200 counts per second, so the new point would be at (15,200). If I do it the way I suggest, I would get 200/2 = 100 counts per second, so the new point would be (15,100). So if I fit a curve to (10,100) and (20,100), I would be more confident that at 15 the number of counts is 100 rather than 200. So can you please explain to me exactly what you mean (in this case) by adding the rates? More specifically, what would your point (x,y) be on a plot after rebinning?
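One possible way to compare the two candidate points for the 5 to 15 and 15 to 25 bins is to quote the rate per unit of energy, in line with the remark quoted earlier in the thread that rates are also per bin width. The Python sketch below is an illustration under that assumption, not anyone's stated method; the function names are hypothetical, and it assumes both bins were counted over the same time window.

```python
def rate_density(counts, seconds, width):
    """Counts per second per unit of energy for one bin."""
    return counts / (seconds * width)

def merge_simultaneous(bins):
    """Merge adjacent bins that were counted over the same time window.

    bins: list of (counts, seconds, low_edge, high_edge) tuples,
    ordered by energy.
    """
    seconds = bins[0][1]
    assert all(b[1] == seconds for b in bins), "same counting time assumed"
    counts = sum(b[0] for b in bins)
    return counts, seconds, bins[0][2], bins[-1][3]

bins = [(100, 1, 5, 15), (100, 1, 15, 25)]
merged = merge_simultaneous(bins)
center = (merged[2] + merged[3]) / 2

# Counts per second collected in the wide bin: the rates add.
print(center, merged[0] / merged[1])                               # 15.0 200.0
# Counts per second per unit energy: unchanged by the merge.
print(rate_density(merged[0], merged[1], merged[3] - merged[2]))   # 10.0
print([rate_density(n, t, hi - lo) for n, t, lo, hi in bins])      # [10.0, 10.0]
```

Read this way, (15, 200) is the rate collected in a bin of width 20, while (15, 100) is that rate rescaled to the original width of 10; both correspond to the same density of 10 counts per second per energy unit.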

- #19

Stephen Tashi

- #20

WWGD

Science Advisor

Gold Member

2019 Award

- 5,349

- 3,326

This is the point I was trying to make with my example. Do you, OP, have a theory in mind that you want to test, about the distribution or otherwise? That would narrow down or even determine the binning format. Otherwise, if you're trying to go bottom-up, i.e., trying to come up with a pattern from zero, from the data alone, it is a whole new approach. Edit: maybe more clearly, without a goal in mind you have an issue of unsupervised training/classification, which is different from supervised training.

- #21

gleem

@BillKet In the case you have given, the rates are the same, which does not produce any problems when you add the counts and divide by the sum of the times for the two bins. You get the average of the rates. If the counts in one bin occur over one time and the counts in the other over a different time, and the rates they represent are significantly different, you get into a problem doing it your way.

Reread my example:

gleem said: "Take another example: suppose you get 2000 counts in 100 sec and 4000 in 10 sec [...] Adding counts and dividing by the summed times gives 38.2 cps. [...] Adding the rates also gives 420 cps."

So, in merging the two bins in my example, would you feel more comfortable quoting 38.2 cps or 420 cps for the merged bins? You see, I normalized the times of the data acquisition to a single time. I know you do not like the rate in your example coming out twice that of each bin. You can get around that by averaging the two rates, that is, dividing the sum by two. The rates depend on an experimental condition (count time) and have nothing to do with the physics. So when fitting, you must make sure that the rates you plot have been normalized to the same counting time if you want to use your method.

Another reason why your method would be incorrect: suppose you need the real count rate for a whole peak. You could set an energy window to straddle the peak and collect all the data in one counting period, say 100 sec. That would be the same as counting smaller energy intervals over the peak sequentially for 100 sec each. But if you wanted the proper count rate for all the counts in the peak, you could not add all the counts in each interval and then divide by the sum of the times of all the individual energy intervals. You would sum all the counts and divide by the common count time used for each energy interval.

- #22

BillKet

gleem said: "@BillKet In the case you have given, the rates are the same, which does not produce any problems when you add the counts and divide by the sum of the times for the two bins. [...] So, in merging the two bins in my example, would you feel more comfortable quoting 38.2 cps or 420 cps for the merged bins? [...]"

Sorry, I am still confused. First of all, I might not have explained my approach well. Given your example I would get (2000+4000)/(100+10) = 54.5 counts/second. I am not sure how to get 38.2, but that is not what I tried to explain, and given that I have 20 cps on the left and 400 cps on the right (so it is probably the rising edge of a peak), 54.5 cps at a value of the energy in between seems reasonable. 420 cps definitely seems too much.

Secondly, I don't think your approach of just multiplying the counts by 10 is right (I might be wrong though). The counts have a Poisson error associated with them. You can't just say that if in t seconds you had N counts, in kt seconds you would have kN counts, as there would be statistical fluctuations.

Lastly, again, I think I am still misunderstanding your approach. Assuming you are right, your approach should apply to my simple example, too. Let's be a bit more specific. Assume that 2 people measure the background for an experiment, at the same time, over the same range, and let's say that the true value (based on theory, for example) is 100 cps for all the energies within a certain range. The first person measures the value of the energy at 10 (in some energy units) and gets 100, while the other one measures it at 5 and 15 and gets, let's say, 99 and 101. In order to compare the measurements, the second one wants an estimate for his counts at the energy of 10. If he just adds the counts he would get 200. If he does what I suggest he would get 100. So I think that 100 would be the right answer. I am not sure why you would divide by 2 here, given that in your example you didn't divide by 2 (you got 420 cps, not 210 cps). And basically, if you divide by 2, you do what I am doing already (i.e. (99+101)/(1+1)). So I am not sure what you mean by adding rates, which would be, in this case, 99/1+101/1 = 200/1 cps.
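The Poisson point above can be made concrete with a small Python sketch. It assumes simple error propagation for a count scaled by a constant factor k: the scaled estimate kN carries an error of k times sqrt(N), not sqrt(kN). The function name is hypothetical.

```python
import math

def scaled_counts(counts, k):
    """Scale observed counts by k and propagate the Poisson error."""
    return k * counts, k * math.sqrt(counts)

n, k = 4000, 10              # 4000 counts in 10 s, scaled to 100 s
est, err = scaled_counts(n, k)
naive_err = math.sqrt(est)   # error if 40000 counts had really been observed

print(est, round(err, 1), round(naive_err, 1))            # 40000 632.5 200.0
# The relative error is fixed by the counts actually observed.
print(round(err / est, 6), round(math.sqrt(n) / n, 6))
```

So scaling to a longer time gives a usable estimate of the expected counts, but its uncertainty stays at the relative level of the original short measurement.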

- #23

gleem

BillKet said: "I am not sure how to get 38.2,"

Oops, neither do I, you are correct.

BillKet said: "Secondly, I don't think your approach of just multiplying the counts by 10 is right (I might be wrong though). The counts have a Poisson error associated with them. You can't just say that if in t seconds you had N counts, in kt seconds you would have kN counts, as there would be statistical fluctuations."

That is true, but your measurement is an estimate of the mean, albeit a crude one, so roughly you can do that. There would be some error, yes, but it is the best one can do given the data.

BillKet said: "The first person measures the value of the energy at 10 (in some energy units) and gets 100, while the other one measures it at 5 and 15 and gets, let's say, 99 and 101."

You're setting up the example to meet your requirements. Not fair. Are your numbers rates or total counts, and what are the counting times? Are they the same? If they are, there is no problem. You're assuming that the middle rate is the same as the rates bordering it. If the counting times were the same for all measurements, then you would be correct. The problem comes when the counting times are different for different energies. This could happen when the count rate gets low and you count much longer to get better statistics.

Take two intervals next to one another, one with N1 counts obtained in t1 sec and the other with N2 counts obtained in t2 sec. The rates are expected to be different. If I had measured both intervals simultaneously, my rate should have been r1 + r2. It is like two faucets filling the same pail at the same time. If I had two pails, each filled at a different rate, r1 and r2, and combined them without knowing how long each faucet was filling its pail, how could I produce a combined rate of filling? Even if you told me the sum of the times, that would in general not be accurate unless you also told me the ratio of the times. I could fill both pails to the same level, one slowly for a long time and the other quickly for a short time. If you took the contents of the pails and divided by the sum of the times, you would be in error on the low side because you would inadvertently be weighting the low rate more.

V1/t1 + V2/t2 = r1 + r2 ≠ (V1 + V2)/(t1 + t2)

BillKet said: "I am not sure why you would divide by 2 here, given that in your example you didn't divide by 2 (you got 420 cps, not 210 cps). And basically, if you divide by 2, you do what I am doing already (i.e. (99+101)/(1+1)). So I am not sure what you mean by adding rates, which would be, in this case, 99/1+101/1 = 200/1 cps."

As for the factor of two: adding two rates gives you the rate for a bin width twice the original. If you wanted to use the original bin width as reference, then divide by two. If you merged three, you would divide by three, etc.

Make up examples for yourself with different rates and times in each bin. Do it your way and then mine. (You have to divide the net rate you get using my way by 2 to compare with yours.)
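Following the suggestion to try both ways on made-up numbers, here is a minimal Python sketch that puts the two methods side by side. The function names are hypothetical labels for the two approaches discussed in this thread, and the summed rate is divided by the number of bins so the results are comparable.

```python
def pooled_rate(counts, times):
    """BillKet's way: total counts over total counting time."""
    return sum(counts) / sum(times)

def summed_rate(counts, times):
    """gleem's way: add the per-bin rates."""
    return sum(n / t for n, t in zip(counts, times))

def compare(counts, times):
    """Return (pooled rate, summed rate divided by the number of bins)."""
    pooled = pooled_rate(counts, times)
    mean_rate = summed_rate(counts, times) / len(counts)
    return pooled, mean_rate

print(compare([100, 100], [10, 10]))     # (10.0, 10.0)
print(compare([2000, 4000], [100, 10]))  # pooled about 54.5, mean rate 210.0
print(compare([2000, 100], [100, 10]))   # pooled about 19.1, mean rate 15.0
```

The two agree whenever the counting times are equal or the rates are equal; otherwise the pooled value is pulled toward the bin that was counted longer.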

- #24

BillKet

- #25

BvU

BillKet said: "you are asking what is the rate at (E1+E2)/2"

No. You are asking what the rate is for the interval ##E_1 < E < E_2##.

[Edit] Sorry: I mean from the lower end of the ##E_1## bin to the high end of the ##E_2## bin.
