Extracting data the right way

  • Thread starter kelly0303
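In summary, the precisely measured frequencies ##y_i## can constrain the ratios of the masses ##x_i## to about the precision of the frequencies, but because ##a## is unknown they cannot fix the absolute mass scale without outside information. Combining the ##N## existing mass measurements is expected (as a hunch in the thread, still to be tested) to reduce the mass uncertainty by roughly a factor of ##\frac{1}{\sqrt{N}}##.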
  • #1
kelly0303
Hello! I have a relationship of the form ##y_i=ax_i##. In my case ##y_i## is a frequency and ##x_i## is a mass. For each mass ##x_i##, I measure ##y_i## (and I get a central value and an error). The difference between ##x_i##'s is about 1 (in some arbitrary units). Is it possible to extract the exact value of ##x_i## for each mass from this information, without knowing anything about ##a##? For a bit of background, we know ##x_i## well enough to separate a given mass from the rest in the experiment, so we know for sure that a given ##y_i## corresponds to a given mass. However, we are able to measure ##y_i## extremely well (the error is very small), so I was wondering if we can use the measured ##y_i##'s to extract the masses ##x_i## with a smaller error than we currently have.
 
  • #2
There is some hope given that the masses are discrete. Based on past threads, am I right to think you're looking at a laser spectrum over multiple isotopes? If so, can you make some guesses about a few values of ##x_i## by comparing the heights of the spectral peaks and natural abundances?

You need some outside information. If you know at least one value of ##x_i##, you can estimate the slope ##a## by evaluating ##a_{est} = \frac{y_i}{x_i}## with error ##\sigma_a = \frac{\sigma_{y,i}}{x_i}## (since you're assuming you know the integer value of ##x_i## exactly). With that estimate of ##a##, you can estimate the rest of the ##x_j## (##j \neq i##) by doing ##x_{j,est} = \frac{y_j}{a_{est}}## with error ##\sigma_{x,j} = \sqrt{ \left( \frac{\sigma_{y,j}}{a_{est}} \right) ^2 + \left( \frac{y_j \sigma_a}{a_{est}^2} \right)^2 }##. Since the masses are spaced by about 1, rounding ##x_{j,est}## to the nearest allowed mass is correct when the error is below about 0.5, so if you end up with ##\sigma_{x,j} \leq 0.5##, you'll be able to identify the ##x_j## with at least 68% (##1 \sigma##) confidence.
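As a minimal sketch of the propagation formulas above, here is how they could be evaluated in Python. The function names and all numbers are hypothetical, chosen only for illustration, and the reference mass is assumed to be known exactly.

```python
import math

def estimate_slope(y_ref, sigma_y_ref, x_ref):
    """Return a_est = y_ref / x_ref and its error, treating x_ref as exact."""
    a_est = y_ref / x_ref
    sigma_a = sigma_y_ref / x_ref
    return a_est, sigma_a

def estimate_mass(y_j, sigma_y_j, a_est, sigma_a):
    """Return x_j = y_j / a_est with the two error terms added in quadrature."""
    x_est = y_j / a_est
    sigma_x = math.sqrt((sigma_y_j / a_est) ** 2
                        + (y_j * sigma_a / a_est ** 2) ** 2)
    return x_est, sigma_x

# Hypothetical measurements (arbitrary units), not taken from any experiment.
a_est, sigma_a = estimate_slope(y_ref=200.0, sigma_y_ref=0.2, x_ref=100)
x_est, sigma_x = estimate_mass(y_j=204.1, sigma_y_j=0.2,
                               a_est=a_est, sigma_a=sigma_a)

print(f"a_est = {a_est:.4f} +/- {sigma_a:.4f}")
print(f"x_j   = {x_est:.3f} +/- {sigma_x:.3f}, nearest integer {round(x_est)}")
```

Both terms in the quadrature sum matter: the second one carries the uncertainty of the reference frequency into every other mass estimate.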

If you can't identify masses by natural abundance, you might be in luck if you have any odd isotopes mixed in. You can identify them by their nuclear spin by applying a magnetic field and seeing how many Zeeman levels pop out (or by looking at the population decay versus time with a small detuning, you'll see beats in the exponential decay corresponding to Zeeman sublevel splittings).

Let me know if I'm dead wrong about the context of the experiment o:)

Edit: I crossed out a suggestion because I forgot that that method only works when the excited state is the one with the Zeeman levels.
 
  • #3
Twigg said:
There is some hope given that the masses are discrete. Based on past threads, am I right to think you're looking at a laser spectrum over multiple isotopes? [...]

You're right about the experiment! We are looking at several isotopes. However, the setup is a bit different. We do know the masses quite well (from Penning trap mass measurements). I was wondering if there is a way to use the measured frequency to extract the mass better than what we already have, using that formula. For example, if we know the masses with a relative error of ##10^{-3}## (0.1%), and the frequency with a relative error of, say, ##10^{-6}##, and we know that the parameter ##a## is the same for all isotopes, can we use all this information to extract the masses with a smaller error than ##10^{-3}##?

For example (I am just throwing this out, not sure if it makes sense from a statistics point of view), if I fit ##y=ax## to the data I have, together with the errors on ##x## and ##y##, I would get ##a## with a given error. Then, using this ##a##, I would get the individual masses by doing ##x_i=y_i/a##. My hope was that if we have enough isotopes and the error on ##y_i## is small enough, the error on ##a## would be small enough that the resulting error on ##x_i## from ##x_i=y_i/a## would be smaller than the initial error.

As I said, this is just an idea. Intuitively, I would expect that using all the information about all the isotopes at once would allow us to constrain the masses better than using ##a## and one isotope at a time. What do you think?
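A rough back-of-the-envelope sketch of this idea in Python, under assumptions that are not established in the thread: independent Gaussian errors, equal relative errors for every isotope, and no account taken of the correlation between each ##y_i## and the fitted ##a##. The function name and default values are illustrative only, using the example numbers quoted above.

```python
import math

def combined_relative_error(n_isotopes, rel_err_mass=1e-3, rel_err_freq=1e-6):
    """Naive relative error on x_i = y_i / a when a is averaged over N isotopes."""
    # Each ratio a_i = y_i / x_i carries both relative errors in quadrature.
    rel_err_single_a = math.hypot(rel_err_mass, rel_err_freq)
    # Averaging N independent, equally weighted ratios shrinks this by sqrt(N).
    rel_err_a = rel_err_single_a / math.sqrt(n_isotopes)
    # x_i = y_i / a inherits the error on a plus the error on y_i
    # (correlations between y_i and a are ignored in this sketch).
    return math.hypot(rel_err_a, rel_err_freq)

for n in (1, 4, 16, 100):
    print(n, combined_relative_error(n))
```

With these assumptions the mass error dominates, so the result tracks the mass error divided by ##\sqrt{N}## until it approaches the frequency error.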
 
  • #4
Ahh ok this makes sense. My gut instinct is that you should be able to constrain the masses, but probably not down to the ##10^{-6}## level. If you have ##N## data points, my gut feeling is that you'd be able to constrain the masses down to the ##\frac{10^{-3}}{\sqrt{N}}## level.

My thoughts:
With the N measurements ##(x_i, y_i)##, you can generate a sequence of N-1 ratios: ##\left( \frac{x_i}{x_1}, \frac{y_i}{y_1} \right) ## for ## i = 2, 3, ..., N##. You know that, up to noise, these ratios have to be equal: ##\frac{x_i}{x_1} = \frac{y_i}{y_1}## because ##y_i = a x_i##. You can use this constraint and the ##10^{-6}## precision on y to constrain ratios of the x's down to the ##10^{-6}## level. However, there is still a floating degree of freedom: the baseline mass ##x_1##. This makes sense because the frequency spectrum only tells you the relative masses; it doesn't give you a kilogram standard. In other words, you constrain the relative masses, but you still need a reference mass ##x_1##. I believe you can use the information from all N of the mass measurements ##x_i## by minimizing the squared error ## \langle \left(x_i - \frac{y_i}{y_1} \hat{x}_1 \right)^2 \rangle## where ##\hat{x}_1## is your estimate of the mass ##x_1## (this is the variable you vary to minimize the squared error). My hunch is that you will be limited to a ##\frac{1}{\sqrt{N}}## reduction in the uncertainty of individual masses because this procedure is like averaging over the information contained in the N mass measurements. That means there's a ##\frac{10^{-3}}{\sqrt{N}}## limit on the uncertainty in ##\hat{x}_1##.

I could be wrong on this. This is just my gut reaction.

Edit: Maybe it should be ##\frac{10^{-3}}{\sqrt{N-1}}##? I'm not sure. I'd put this question to a Monte Carlo test.
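One way such a Monte Carlo test could be set up is sketched below, using only the Python standard library. The true masses (100, 101, ...), the slope `a_true`, and the function names are hypothetical placeholders. The errors are assumed to be independent Gaussians with the relative sizes quoted in the thread. The least-squares estimate uses the closed form ##\hat{x}_1 = \frac{\sum_i r_i x_i}{\sum_i r_i^2}## with ##r_i = \frac{y_i}{y_1}##, which follows from minimizing the unweighted squared error above.

```python
import math
import random
import statistics

def estimate_x1(x_meas, y_meas):
    """Closed-form minimizer of sum_i (x_i - (y_i / y_1) * x1_hat)**2."""
    ratios = [y / y_meas[0] for y in y_meas]
    return (sum(r * x for r, x in zip(ratios, x_meas))
            / sum(r * r for r in ratios))

def relative_scatter(n_isotopes, trials=4000, rel_err_mass=1e-3,
                     rel_err_freq=1e-6, a_true=2.0, seed=0):
    """Standard deviation of (x1_hat / x1_true - 1) over simulated experiments."""
    rng = random.Random(seed)
    # Hypothetical true masses spaced by 1 in arbitrary units.
    x_true = [100.0 + i for i in range(n_isotopes)]
    deviations = []
    for _ in range(trials):
        # Simulated mass and frequency measurements with Gaussian relative noise.
        x_meas = [x * (1.0 + rng.gauss(0.0, rel_err_mass)) for x in x_true]
        y_meas = [a_true * x * (1.0 + rng.gauss(0.0, rel_err_freq))
                  for x in x_true]
        deviations.append(estimate_x1(x_meas, y_meas) / x_true[0] - 1.0)
    return statistics.pstdev(deviations)

for n in (1, 4, 16):
    # Columns: N, simulated scatter, and the conjectured 1e-3 / sqrt(N).
    print(n, relative_scatter(n), 1e-3 / math.sqrt(n))
```

Comparing the two printed columns for several values of N shows whether the simulated scatter follows ##\frac{10^{-3}}{\sqrt{N}}## or ##\frac{10^{-3}}{\sqrt{N-1}}## under these assumptions. A weighted fit or correlated mass errors would need a different estimator.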
 
  • #5
Twigg said:
If you have ##N## data points, my gut feeling is that you'd be able to constrain the masses down to the ##\frac{10^{-3}}{\sqrt{N}}## level. [...]

Thanks a lot! I was actually guessing that the error would go like ##\frac{10^{-3}}{\sqrt{N}}##. That was helpful, it gave me confidence to put this to a test!
 

