Corresponding character matching probability

  • Context: MHB
  • Thread starter: vivek1
  • Tags: Probability
SUMMARY

The discussion concerns the probability of k-mer matches in a protein dataset of 10,000 sequences. Specifically, it asks for the probability that a k-mer "b" taken from one sequence matches a k-mer "a" taken from another sequence in at least r of its k positions. Amino acid probabilities are given by their frequencies in the dataset, and the reply notes that an exact value cannot be given without numerical data.

PREREQUISITES
  • Understanding of k-mer analysis in bioinformatics
  • Familiarity with protein sequence data
  • Knowledge of probability and statistical methods
  • Experience with data extraction and analysis techniques
NEXT STEPS
  • Research statistical methods for k-mer frequency analysis
  • Explore bioinformatics tools for sequence alignment, such as BLAST
  • Learn about the implementation of probabilistic models in sequence matching
  • Investigate software for handling large protein datasets, like Bioconductor
USEFUL FOR

Bioinformaticians, molecular biologists, and researchers involved in protein sequence analysis and those interested in statistical methods for sequence matching.

vivek1
I have a protein dataset consisting of 10000 sequences, the i-th having length S_i, where 1 <= i <= 10000. I extracted a k-mer "a" from the first sequence. The probability of occurrence of each amino acid (a character of a protein sequence) is given by its frequency in the dataset. If I choose a k-mer "b" from another sequence, what is the probability that k-mer "b" matches k-mer "a" in at least r of the k positions?
 
I believe that would be the probability that k-mer "a" appears in the remaining 9999 sequences. Without numerical data, we can't give an exact value.
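Once the amino acid frequencies are known, one way to put a number on the question can be sketched as follows. This is only a sketch under assumptions that are not stated in the thread: k-mer "a" is treated as fixed, and each position of k-mer "b" is assumed to be drawn independently with the dataset frequencies. Position j of "b" then matches position j of "a" with probability p(a_j), the frequency of the amino acid a_j, and the number of matching positions is a sum of k independent yes/no trials with possibly different success probabilities. The probability of at least r matches can be accumulated position by position. The function names below are hypothetical and chosen only for illustration.

```python
from collections import Counter


def residue_frequencies(sequences):
    """Estimate each amino acid's probability as its frequency in the dataset."""
    counts = Counter()
    for seq in sequences:
        counts.update(seq)
    total = sum(counts.values())
    return {aa: n / total for aa, n in counts.items()}


def prob_at_least_r_matches(a, freqs, r):
    """P(random k-mer b agrees with the fixed k-mer a in at least r positions).

    Assumption (not from the thread): positions of b are independent draws
    from freqs, so position j matches with probability freqs[a[j]].
    """
    k = len(a)
    # dist[m] = probability of exactly m matches among the positions seen so far
    dist = [1.0] + [0.0] * k
    for j, aa in enumerate(a):
        p = freqs.get(aa, 0.0)
        # Update from high m to low m so dist[m - 1] is still the old value.
        for m in range(j + 1, 0, -1):
            dist[m] = dist[m] * (1.0 - p) + dist[m - 1] * p
        dist[0] *= 1.0 - p
    return sum(dist[max(r, 0):])
```

If every amino acid in "a" has the same frequency p, this reduces to the ordinary binomial tail, the sum over m from r to k of C(k, m) p^m (1 - p)^(k - m). The sketch also covers only a single randomly chosen k-mer "b". Asking whether any of the S_i - k + 1 overlapping k-mers in a sequence matches is a different question, because those k-mers are not independent of each other.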
 
