# How to compute the percentage of values based on multiple columns

• Python
I have a dataframe, as shown in the picture. I am trying to count the number of occurrences based on the values in two columns and then calculate the percentage of those occurrences. I have tried the following code, but it gives me a value of zero in the end and I don't know why.

Code:
count_a2_x = (df['a1'].str.contains('b') & df['a2'].str.contains('x')).value_counts()[True]
count_a2_y = (df['a1'].str.contains('b') & df['a2'].str.contains('y')).value_counts()[True]
acc = float(count_a2_x / (count_a2_x + count_a2_y))

The expected output is 3/6 = 0.5.

#### Attachments

• p2.png

Mark44
Mentor
Use the and keyword instead of &. The & operator is the bitwise AND operator.

When I changed & to and, it gave me this error:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()
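
For context on this error: the and keyword asks Python for a single truth value from each operand, which a pandas Series with more than one element cannot provide, whereas pandas overloads & to combine boolean Series element by element. The sketch below illustrates the difference. The DataFrame is made up for illustration, since the attached picture is not reproduced here; only the column names a1 and a2 come from the post.

```python
import pandas as pd

# Hypothetical sample data; the real frame from the attached picture is not available.
df = pd.DataFrame({
    "a1": ["b", "b", "b", "b", "b", "b", "c"],
    "a2": ["x", "x", "x", "y", "y", "y", "x"],
})

is_b = df["a1"].str.contains("b")
is_x = df["a2"].str.contains("x")

# `&` combines the two boolean Series element by element.
both = is_b & is_x
print(both.tolist())

# `and` needs one truth value per operand, which a multi-element Series cannot give.
try:
    is_b and is_x
    raised = False
except ValueError as exc:
    raised = True
    print("ValueError:", exc)
```

So & is the operator that fits an element-wise mask here, and the zero result has to come from somewhere else in the expression.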

I have now tried converting each count to float first, and it seems to work when float is applied at the beginning.

Code:
count_a2_x = float((df['a1'].str.contains('b') & df['a2'].str.contains('x')).value_counts()[True])
count_a2_y = float((df['a1'].str.contains('b') & df['a2'].str.contains('y')).value_counts()[True])
acc = count_a2_x / (count_a2_x + count_a2_y)
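
A likely explanation for the original zero, assuming the code ran under Python 2: there, / between two integers performs floor division, so 3/6 evaluates to 0 before float() is ever applied, and float(0) is 0.0. Converting to float before dividing avoids this; under Python 3, / is already true division. The sketch below is one possible variant, not the only way to write it. It uses sum() on the boolean mask, which counts the True entries and does not fail when no row matches (value_counts()[True] would raise a KeyError in that case). The DataFrame is again hypothetical.

```python
import pandas as pd

# Hypothetical sample data: six rows with a1 == "b", three of them with a2 == "x".
df = pd.DataFrame({
    "a1": ["b", "b", "b", "b", "b", "b", "c"],
    "a2": ["x", "x", "x", "y", "y", "y", "x"],
})

is_b = df["a1"].str.contains("b")

# Summing a boolean mask counts its True entries (0 if there are none).
count_a2_x = int((is_b & df["a2"].str.contains("x")).sum())
count_a2_y = int((is_b & df["a2"].str.contains("y")).sum())

total = count_a2_x + count_a2_y

# Convert the denominator to float so the division is never integer division,
# and guard against an empty selection.
acc = count_a2_x / float(total) if total else float("nan")
print(acc)
```

With this sample data the counts are 3 and 3, so acc matches the 3/6 = 0.5 expected in the question.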