A Determining if a list of numbers is a result of multiplication

AI Thread Summary
To determine if a list of numbers originates from random integers or from a multiplication process, one approach is to analyze the distributions of the two collections. The second collection, generated by multiplying a random integer by an unknown fraction and rounding, may produce values outside the expected range of 0-1000. A Bayesian analysis could be employed to model the data-generating processes and compare them using Bayes factors. Additionally, a Chi-square goodness of fit test may help identify which method is more likely for a given sample. More precise information about the distributions is necessary for a conclusive analysis.
alk10
Suppose I have 2 collections of lists.

In the first collection the lists consist of random integers, with most (but not all) in the range 0-1000.
In the second collection the lists consist of integers calculated in the following way:
a. start with a random integer of similar range to the first list
b. multiply by some unknown fraction, typically (but not always) in the range 0-2.
c. round to the nearest integer

Given a particular list, I would like to be able to predict which collection it comes from.

I have tried reducing every element modulo each integer from 2 to 20 and looking at the remainders (for example, if the fraction in step b were exactly 2, every element mod 2 would be zero), but I couldn't find a noticeable difference. I would appreciate any ideas.
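For reference, the remainder check described above could be sketched as follows. This is only an illustration: the helper names `make_scaled_list` and `residue_counts` are made up, the underlying integers are assumed to be uniform on 0-1000, and "round to the nearest integer" is assumed to mean rounding halves up.

```python
import math
import random
from collections import Counter

def make_scaled_list(size, fraction, rng, n_max=1000):
    """Steps a-c: draw an integer (assumed uniform on 0..n_max), scale it, round half up."""
    return [math.floor(fraction * rng.randint(0, n_max) + 0.5) for _ in range(size)]

def residue_counts(values, modulus):
    """Count how many values fall into each remainder class 0..modulus-1."""
    counts = Counter(v % modulus for v in values)
    return [counts.get(r, 0) for r in range(modulus)]

rng = random.Random(0)
doubled = make_scaled_list(500, 2.0, rng)
print(residue_counts(doubled, 2))  # → [500, 0], since doubling makes every element even

# For a non-integer fraction the remainders are not forced into one class.
scaled = make_scaled_list(500, 0.7, rng)
for m in range(2, 6):
    print(m, residue_counts(scaled, m))
```

With a fraction of exactly 2 the pattern is unmistakable, but for a generic fraction the remainder counts need not show any such structure, which would be consistent with not seeing a difference.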
 
You have been very vague about the probabilities or distributions of the random behavior.
Phrases like "most (but not all)", "typically (but not always)", "random integers", etc. do not give us much to work with.
Since the second method might create a large number of integers that are beyond the range 0-1000, I would consider using that to make an educated guess about which method created the list. But your description is too vague to know if that is a feasible method.
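A rough sketch of that out-of-range check, assuming the nominal range really is 0-1000 (the function name and the sample list are made up for illustration):

```python
def fraction_out_of_range(values, low=0, high=1000):
    """Share of the list that falls outside the assumed nominal range [low, high]."""
    if not values:
        raise ValueError("need at least one value")
    outside = sum(1 for v in values if v < low or v > high)
    return outside / len(values)

sample = [12, 480, 1530, 999, 1710]   # made-up list, not real data
print(fraction_out_of_range(sample))  # → 0.4
```

If the underlying integers were uniform on 0-1000 and the fraction were some f > 1, roughly 1 - 1/f of the products would exceed 1000, so a large share would point toward the second method. A fraction below 1 produces no excess at all, so this check alone cannot settle the question.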
 
I would use a Bayesian approach for this (of course). "Simply" write down the data-generating models for your two possibilities and then do a Bayesian analysis for any parameters of the models. Then you can compare the models using Bayes factors or your favorite alternative Bayesian model comparison technique.
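To make that concrete, here is a minimal sketch under assumptions that are not in the original question: the underlying integers are exactly uniform on 0-1000 in both models, the unknown fraction f has a flat prior on the grid 0.01, 0.02, ..., 2.00, and rounding is half up. All function names are hypothetical.

```python
import math
from collections import Counter

N_MAX = 1000  # assumed upper bound of the underlying random integers

def round_half_up(x):
    return math.floor(x + 0.5)

def log_lik_uniform(data):
    """Model 1: each value is uniform on 0..N_MAX (an assumption)."""
    if any(x < 0 or x > N_MAX for x in data):
        return -math.inf
    return -len(data) * math.log(N_MAX + 1)

def log_lik_scaled(data, f):
    """Model 2 for a fixed fraction f: value = round_half_up(f * n), n uniform on 0..N_MAX."""
    counts = Counter(round_half_up(f * n) for n in range(N_MAX + 1))
    total = 0.0
    for x in data:
        c = counts.get(x, 0)
        if c == 0:
            return -math.inf  # x cannot be produced with this f
        total += math.log(c / (N_MAX + 1))
    return total

def log_marginal_scaled(data, grid):
    """Average the likelihood over the grid of f values (flat prior), in log space."""
    logs = [log_lik_scaled(data, f) for f in grid]
    peak = max(logs)
    if peak == -math.inf:
        return -math.inf
    return peak + math.log(sum(math.exp(v - peak) for v in logs)) - math.log(len(grid))

def log_bayes_factor(data, grid=None):
    """log BF of model 2 over model 1; positive favours the multiplication model.
    Returns nan if both models assign the data zero probability."""
    if grid is None:
        grid = [k / 100 for k in range(1, 201)]
    return log_marginal_scaled(data, grid) - log_lik_uniform(data)
```

With the "most (but not all)" caveats from the question, hard zero probabilities are too strict, so a real analysis would need a likelihood that tolerates occasional outliers.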
 
Just run a Monte Carlo simulation on some randomly generated numbers to test.
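Something along these lines, as a sketch only. It assumes the underlying integers are uniform on 0-1000 and the fraction is uniform on 0-2, and it scores one deliberately crude guessing rule based on the largest element. The rule, the `slack` threshold, and all names are invented for illustration; any other candidate rule can be dropped in.

```python
import math
import random

def uniform_list(size, rng, n_max=1000):
    """First collection: integers assumed uniform on 0..n_max."""
    return [rng.randint(0, n_max) for _ in range(size)]

def scaled_list(size, rng, n_max=1000):
    """Second collection: one fraction per list (assumed uniform on 0..2), rounded half up."""
    f = rng.uniform(0.0, 2.0)
    return [math.floor(f * rng.randint(0, n_max) + 0.5) for _ in range(size)]

def guess_is_scaled(values, n_max=1000, slack=50):
    """Crude rule: the maximum is either above n_max or well below it."""
    top = max(values)
    return top > n_max or top < n_max - slack

def estimate_accuracy(trials=200, size=200, seed=0):
    """Simulate both collections and report how often the rule guesses correctly."""
    rng = random.Random(seed)
    correct = 0
    for _ in range(trials):
        correct += int(not guess_is_scaled(uniform_list(size, rng)))
        correct += int(guess_is_scaled(scaled_list(size, rng)))
    return correct / (2 * trials)

print(estimate_accuracy())
```

The rule does well here only because the simulated distributions are so clean. Rerunning with generators closer to the real data is the point of the exercise.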
 
In general, the distributions of the two processes will be significantly different. You should be able to use a Chi-square goodness of fit test to determine which method is more likely for a given sample. Without more information about the distributions, I don't think that much more can be said.
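As an illustration of the goodness-of-fit idea, the sketch below tests a list against one assumed reference model only, namely integers uniform on 0-1000 split into 10 bins. The function name and binning are made up. The value 16.919 is the 5% critical value of the chi-square distribution with 9 degrees of freedom.

```python
import math
from collections import Counter

def chi_square_vs_uniform(values, n_max=1000, bins=10):
    """Pearson chi-square statistic of `values` against a uniform model on 0..n_max."""
    def bin_of(v):
        return v * bins // (n_max + 1)

    if any(v < 0 or v > n_max for v in values):
        return math.inf  # the uniform model gives such values zero probability

    # How many of the integers 0..n_max land in each bin (bins are nearly equal).
    sizes = Counter(bin_of(n) for n in range(n_max + 1))
    observed = Counter(bin_of(v) for v in values)
    stat = 0.0
    for b in range(bins):
        expected = len(values) * sizes[b] / (n_max + 1)
        stat += (observed.get(b, 0) - expected) ** 2 / expected
    return stat

CRITICAL_5PCT_DF9 = 16.919  # chi-square critical value, 9 degrees of freedom, 5% level

halved = [n // 2 for n in range(1001)]  # made-up list squeezed into 0..500
print(chi_square_vs_uniform(halved) > CRITICAL_5PCT_DF9)  # → True
```

Rejecting the uniform model only says the list does not look uniform on 0-1000. Deciding between the two methods would need an expected distribution for each of them.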
 