Probability of logic relationships

Click For Summary

Discussion Overview

The discussion revolves around calculating the probability of logical relationships between two binary arrays representing gene presence in different organisms. Participants explore how to determine the number of unique permutations of one array that yield at least the same number of 'true' logical relationships as observed in a specific comparison using logical connectors.

Discussion Character

  • Exploratory, Technical explanation, Debate/contested

Main Points Raised

  • One participant presents a problem involving two binary arrays and seeks a general formula for counting permutations of one array that maintain a certain number of true relationships when compared to another array.
  • Another participant suggests a multiplicative approach based on the number of true and false relationships, proposing a formula involving the product of possibilities for each relationship.
  • A participant clarifies the logical relationships using a table and provides an example of how permutations of the first array can be calculated, but expresses confusion over the initial formula proposed.
  • Further clarification is requested regarding the meaning of the table and the relationships represented, leading to a discussion about the structure of the data and the implications of the logical connector used.
  • One participant explains the context of their research in genomics, emphasizing the need to understand the significance of observed relationships by comparing permutations of gene presence across organisms.
  • Another participant expresses confusion about the representation of rows and the implications for the number of possible permutations, prompting further explanation of the data structure.
  • A participant reiterates the importance of understanding how permutations can affect the observed relationships and the need for a formula to quantify this in a broader context.

Areas of Agreement / Disagreement

Participants exhibit a mix of agreement and disagreement regarding the interpretation of the problem and the proposed solutions. There is no consensus on a definitive formula or approach, and multiple competing views remain on how to calculate the desired permutations.

Contextual Notes

Participants note limitations in understanding the problem due to unclear definitions and the complexity of the data structure. The discussion highlights the challenges of applying logical relationships in a biological context, particularly with large datasets.

riccardo
Messages
4
Reaction score
0
After much thinking I am trying to get a help from the board. Sorry for being naïve and thank you in advance for your answers.

Suppose you have to arrays made up of zeros and ones
a=(1,1,0,1,0,0)
b=(1,0,0,1,1,0)
and a logic relationship to compare the elements of the two arrays in a ordered way, for instance
a[0] & b[0] =true, a[1] & b[1] = true ... a[n] & b[n] = false

I would like to known a general formula to calculate the number of unique permutations of a having at least the same number of 'true' relationships as observed in the original comparison.

I know the number of the positions that became 'true' or 'false' when a '0' is replaced by a '1' or vice versa.


Riccardo
 
Physics news on Phys.org
If all the relations are of the form "a(k) & b(k) = x_k", where x is either true or false, then you have y_1 * y_2 * ... * y_n possibilities, where y_k is 1 when x_k is true and 3 when x_k is false. (That's 3^a, where a is the number of times "false" appears.)

Otherwise, for each k, determine how many solutions there are, and multiply these up.
 
In fact all the relations are of the form "a(k) & b(k) = x_k", where x is either true or false. An example using the logical connector AND would be
AND
a b
1 1=T
0 0=F
1 0=F
1 1=T
0 1=F

Here the number of different permutations of "a" is 5!/(3!2!) and the number of permutations giving at least two "T" is 7.
I can not obtain the same result with your formula (If I understand it well).
 
riccardo said:
I can not obtain the same result with your formula (If I understand it well).

We're clearly solving different problems. What does your table mean? It's not at all clear to me. The problem I solved would expect a string of Ts and Fs, no associated 0s and 1s.
 
CRGreathouse said:
What does your table mean? It's not at all clear to me. .

I will try to explain my problem better. I am carrying out research in genomics by analysing the logical relationships of genes in different genomes (i.e. organisms). In the following table

riccardo said:
AND
a b
1 1=T
0 0=F
1 0=F
1 1=T
0 1=F

columns represent two genes (“a”, “b”); rows represent genomes;
“1” indicates that the gene is present in a given genome,
“0” indicates that the given gene is absent.

When logical connector (AND in this case) is applied to the elements of the two vectors, row by row, a certain number of “true” or “false“ relationship is obtained.

What I would like to be able to count is the number of different permutations of the zeros and ones present in “a”, given “b” (or alternatively in “b”, given “a”) that give the at least the same number of “true” relationships as those observed. Comparing this number with the total number of permutations can be useful to distinguish between relationships arising by chance and actual relationships among genes.

In this particular case I can generate all possible permutations of “a” and see that there are 7 permutations that gives at least two “true” over a total of 5!/(3!2!) = 10 permutations. In my actual data set, however, it is hard to find a solution analytically because the number of elements of “a” is 140 and the permutations to be examined around 10^33.

I think that it should be possible to have formula to compute the number of permutations that can give an equal or greater number of true relationships as those observed, by knowing the number of the positions that can give “true” or “false” when a “0” is replaced by a “1” or vice versa.
 
Last edited:
Interesting, my (second) job is for a DNA microanalysis company at a university. It seems this should be related to the genomics you mention.

riccardo said:
columns represent two genes (“a”, “b”); rows represent genomes;
“1” indicates that the gene is present in a given genome,
“0” indicates that the given gene is absent.

When logical connector (AND in this case) is applied to the elements of the two vectors, row by row, a certain number of “true” or “false“ relationship is obtained.

Still not getting it. Your table says: "a present, b present" = true; "a absent, b absent" = false; "a present, b absent" = false; "a present, b present" = true; "a absent, b present" = false.

Now I don't understand what the rows represent, properly. Are all of these to hold for one gene sequence, at the same time? That would give only one possibility ("ab"), not the 7 you have. Does each row have its own "a" and "b"? This would give 27 possibilities:
ab/ab/ab/ab/ab
ab/ab/ab/ab/a_
ab/ab/ab/ab/__
ab/a_/ab/ab/ab
ab/a_/ab/ab/a_
ab/a_/ab/ab/__
ab/_b/ab/ab/ab
ab/_b/ab/ab/a_
ab/_b/ab/ab/__
ab/ab/_b/ab/ab
ab/ab/_b/ab/a_
ab/ab/_b/ab/__
ab/a_/_b/ab/ab
ab/a_/_b/ab/a_
ab/a_/_b/ab/__
ab/_b/_b/ab/ab
ab/_b/_b/ab/a_
ab/_b/_b/ab/__
ab/ab/__/ab/ab
ab/ab/__/ab/a_
ab/ab/__/ab/__
ab/a_/__/ab/ab
ab/a_/__/ab/a_
ab/a_/__/ab/__
ab/_b/__/ab/ab
ab/_b/__/ab/a_
ab/_b/__/ab/__
where the two characters between each set of slashes is its own row and a letterwritten is present, an underscore is absent.
 
CRGreathouse said:
Now I don't understand what the rows represent, properly. Are all of these to hold for one gene sequence, at the same time? That would give only one possibility ("ab"), not the 7 you have. Does each row have its own "a" and "b"? This would give 27 possibilities

Rows represent organisms, the order of which is kept constant in the comparison. So for example in the first row, organism 1 has the gene "a" and the gene "b", in the second row organism 2 don't have the genes "a" and "b", in the tirth row organism 3 has the gene "a" but not the gene "b" and so on. Obviously genes "a" (and "b") have different sequences in different organisms but we have reasons to believe that they represent the same gene with the same function.

The particular logical connector used here is not very important. I would to try with all possible logical connectors. The important thing for me is, once a particular result is observed (as for example two "true" relationship as in the case above), to know how "surprising" is that result. For that I would like to permute the order of the "1" and "0" of gene "a" across the column (that is equivalent to change the distribution of gene "a" across the organisms) and to count how many permutations over the total give the same or a better result. The kind of permutation which I am referring to are:

a b
1 1=T / 1 1=T / 1 1=T / 1 1=T / 1 1=T / 1 1=T / 0 1=F / 0 1=F / 0 1=F / 0 1=F /
0 0=F / 0 0=F / 0 0=F / 1 0=F / 1 0=F / 1 0=F / 1 0=F / 1 0=F / 1 0=F / 0 0=F /
1 0=F / 1 0=F / 0 0=F / 0 0=F / 0 0=F / 1 0=F / 1 0=F / 1 0=F / 0 0=F / 1 0=F /
1 1=T / 0 1=F / 1 1=T / 1 1=T / 0 1=F / 0 1=F / 1 1=T / 0 1=F / 1 1=T / 1 1=T /
0 1=F / 1 1=T / 1 1=T / 0 1=F / 1 1=T / 0 1=F / 0 1=F / 1 1=T / 1 1=T / 1 1=T /

As you can see there are 10 different permutation of "a" ("b" is kept fixed), with 7 permutations giving at least two true relationships. Riccardo
 
Last edited:

Similar threads

  • · Replies 3 ·
Replies
3
Views
2K
  • · Replies 13 ·
Replies
13
Views
3K
  • · Replies 1 ·
Replies
1
Views
3K
  • · Replies 7 ·
Replies
7
Views
2K
  • · Replies 2 ·
Replies
2
Views
2K
  • · Replies 40 ·
2
Replies
40
Views
9K
  • · Replies 3 ·
Replies
3
Views
2K
  • · Replies 29 ·
Replies
29
Views
6K
  • · Replies 7 ·
Replies
7
Views
2K
  • · Replies 5 ·
Replies
5
Views
2K