Python How to compute the percentage of values based on multiple columns

AI Thread Summary
The discussion revolves around counting occurrences and computing a percentage from a DataFrame using two columns. The initial code returned zero. A reply suggested replacing the bitwise operator '&' with the logical operator 'and', but that change raised a ValueError about the ambiguous truth value of a Series. The solution kept the '&' operator and converted the counts to floats before dividing. The final code computes the expected output of 0.5 by counting the rows that match each pair of conditions.
msn009
I have a DataFrame as shown in the picture. I am trying to count the number of occurrences based on the values in two columns and then calculate the percentage of those occurrences. I tried the following code, but it gives me a zero value in the end and I don't know why.

Code:
count_a2_x = (df['a1'].str.contains('b') & df['a2'].str.contains('x')).value_counts()[True]
count_a2_y = (df['a1'].str.contains('b') & df['a2'].str.contains('y')).value_counts()[True]
acc  = float(count_a2_x/ (count_a2_x + count_a2_y))

The expected output is 3/6 = 0.5.
 

Attachment: p2.png (726 bytes)
Use 'and' instead of '&'. The '&' operator is the bitwise AND operator.
 
When I changed it to 'and', it gave me this error:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()
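
A minimal sketch of why the error appears, assuming a small made-up DataFrame (the real values in p2.png are not shown in the thread, so the data and the helper names `is_b`, `is_x`, `both` and `and_error` are only illustrative). Python's 'and' needs a single True/False for each operand, which a pandas Series will not provide, whereas '&' is overloaded by pandas to combine boolean Series row by row:

```python
import pandas as pd

# Hypothetical stand-in for the DataFrame in p2.png (assumed values).
df = pd.DataFrame({
    "a1": ["b", "b", "b", "b", "b", "b", "c", "c"],
    "a2": ["x", "x", "x", "y", "y", "y", "x", "y"],
})

is_b = df["a1"].str.contains("b")
is_x = df["a2"].str.contains("x")

# `and` asks for one truth value for the whole Series, so pandas raises ValueError.
try:
    is_b and is_x
    and_error = None
except ValueError as exc:
    and_error = str(exc)

# `&` combines the two boolean Series element by element.
both = is_b & is_x

print(and_error)
print(both.tolist())
```

So for element-wise conditions on Series, '&' (with each condition in parentheses when comparisons are involved) looks like the appropriate operator here, not 'and'.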

I then tried adding float, and it seems to work when the counts are converted to float at the start, before the division.

Code:
count_a2_x = float((df['a1'].str.contains('b') & df['a2'].str.contains('x')).value_counts()[True])
count_a2_y = float((df['a1'].str.contains('b') & df['a2'].str.contains('y')).value_counts()[True])
acc  = count_a2_x/ (count_a2_x + count_a2_y)
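
The fact that converting to float first fixes the result suggests the zero came from integer division (in Python 2, '/' between two integers discards the fractional part), not from the '&' operator. The sketch below illustrates this under that assumption, again with made-up data since the picture's values are not in the thread. It uses `.sum()` on the boolean mask as an alternative way to count matches, and `//` to mimic the floored result:

```python
import pandas as pd

# Hypothetical stand-in for the DataFrame in p2.png (assumed values).
df = pd.DataFrame({
    "a1": ["b", "b", "b", "b", "b", "b", "c", "c"],
    "a2": ["x", "x", "x", "y", "y", "y", "x", "y"],
})

is_b = df["a1"].str.contains("b")

# Summing a boolean mask counts its True entries.
count_a2_x = int((is_b & df["a2"].str.contains("x")).sum())
count_a2_y = int((is_b & df["a2"].str.contains("y")).sum())

# Floor division mimics what `/` does with two integers in Python 2.
floored = count_a2_x // (count_a2_x + count_a2_y)

# Converting to float before dividing gives the true ratio.
acc = float(count_a2_x) / (count_a2_x + count_a2_y)

print(floored, acc)  # 0 0.5
```

Unlike `value_counts()[True]`, which raises a KeyError when no row matches, `.sum()` simply returns 0 in that case.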
 