Let's say that you have n different symbols making up m symbols' worth of sample data. The probability of each symbol is the number of occurrences of that symbol divided by m. The probability vector is just a vector of all of these probabilities (one for each of the n symbols), so its entries sum to 1.
For 4-bit symbols you have 2^4 = 16 possible symbols (0000 through 1111). Suppose that you have the following trivial example, with m = 7 symbols of sample data:
0001 0010 0011 1000 1001 0011 0001
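then the probability vector (a column vector with one entry per symbol, ordered from 0000 to 1111) would be
0      (0000)
2/7    (0001)
1/7    (0010)
2/7    (0011)
0      (0100)
0      (0101)
0      (0110)
0      (0111)
1/7    (1000)
1/7    (1001)
0      (1010)
0      (1011)
0      (1100)
0      (1101)
0      (1110)
0      (1111)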
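If you want to compute this rather than count by hand, here is a minimal sketch in Python. The function name probability_vector and the bits parameter are my own choices for illustration, and it assumes the sample is given as whitespace-separated bit strings. It lists all 16 symbols from 0000 to 1111, so symbols that never occur get probability 0.

```python
from collections import Counter
from fractions import Fraction


def probability_vector(symbols, bits=4):
    """Return one probability per possible symbol, ordered 00..0 to 11..1."""
    counts = Counter(symbols)   # occurrences of each symbol in the sample
    total = len(symbols)        # m, the number of symbols in the sample
    # Counter returns 0 for symbols that never appear in the sample.
    return [
        Fraction(counts[format(i, f"0{bits}b")], total)
        for i in range(2 ** bits)
    ]


sample = "0001 0010 0011 1000 1001 0011 0001".split()
p = probability_vector(sample)

for i, prob in enumerate(p):
    print(format(i, "04b"), prob)
```

Fraction is used only to keep the results exact (2/7 instead of 0.2857...); plain division would work just as well if you want floats.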
Hope this helps.
