How to calculate rows where it has values in at least 3 columns

In summary: I am trying to count the number of rows that has values in at least 3 columns so the output based on the image shared should be 4. I tried using the code below but it is resulting in the same shape as the whole table which is 7.counts = df[(df[['a', 'b', 'c', 'd'] != 'no_label').count(axis=1) >= 3]sum(row.count("no_label")<2 for row in df)If your table includes the header but
  • #1
msn009
53
6
I am trying to count number of rows that has values in at least 3 columns so the output based on the image shared should be 4. I tried using the code below but it is resulting in the same shape as the whole table which is 7.

Python:
counts = df[(df[['a', 'b', 'c', 'd]] != 'no_label').count(axis=1) >= 3]
 

Attachments

  • p3.png
    p3.png
    1.5 KB · Views: 384
Technology news on Phys.org
  • #2
sum(row.count("no_label")<2 for row in df)
If your table includes the header but you don't want to count that: subtract 1.
 
  • #3
thank you for this suggestion. but what if i only want to check from these columns a,b,c, and d? as I have other columns e,f,g that also has the 'no_label' value but I don't want to consider them,
 
  • #4
sum(row[0:4].count("no_label")<2 for row in df)
 
  • #5
i wanted to do it this way so that i can select the columns that could be in other position in the dataframe:

Python:
cols = ['a', 'b', 'c', 'd'']

sum(row[cols].count("no_label") < 2 for row in df)

but it gives me this error :

TypeError: string indices must be integers, not list

what can i do to use the column names explicitly instead. thanks
 
  • #6
Where do the column labels come from? Are they the first row of you array? I wouldn't do that, it is bad style. Select them based on integers.
If you absolutely need strings convert them to integers for the sum.
 
  • #7
yes they are but in this case there are more than 10 columns which makes it difficult to determine the location of the column using integers.
 
  • #8
You keep changing the task. Why don't you show what exactly you want to do?

cols={'a':0,'b':1,'c':2,'d':3}
sum(row[cols['a']:cols['c']].count("no")>0 for row in df)

Or, if the first row of df has the column names:
Code:
cols2={}
for x,y in enumerate(df[0]):
  cols2[y]=x
 
  • Like
Likes msn009 and Tom.G
  • #9
thanks. sorry for not explaining the scenario in detail.
 

What is the purpose of calculating rows with values in at least 3 columns?

The purpose of calculating rows with values in at least 3 columns is to identify and analyze data that meets a certain criteria. This can help with data cleaning, identifying patterns, and making informed decisions based on the data.

What are the steps involved in calculating rows with values in at least 3 columns?

The steps involved in calculating rows with values in at least 3 columns will vary depending on the specific data and tools being used. However, in general, the steps may include identifying the relevant columns, filtering the data to only include rows with values in those columns, and then performing the necessary calculations or analysis on the resulting data.

Can this calculation be done manually or is it best to use a software or programming language?

This calculation can be done manually, but it may be time-consuming and prone to errors. It is usually more efficient and accurate to use a software or programming language that has functions or methods specifically designed for data analysis and manipulation.

What are some common challenges when calculating rows with values in at least 3 columns?

Some common challenges when calculating rows with values in at least 3 columns include dealing with missing or incomplete data, deciding which columns to include in the calculation, and understanding how the data is structured and organized. It is important to carefully consider these challenges and make necessary adjustments to ensure the accuracy and reliability of the results.

Are there any alternative methods for calculating rows with values in at least 3 columns?

Yes, there may be alternative methods for calculating rows with values in at least 3 columns depending on the specific data and goals of the analysis. For example, instead of using a traditional programming language, one could use a visual programming tool or an Excel spreadsheet with built-in functions. It is important to explore different methods and choose the one that best suits the needs of the analysis.

Similar threads

  • Programming and Computer Science
Replies
17
Views
1K
  • Programming and Computer Science
Replies
2
Views
972
  • Programming and Computer Science
Replies
7
Views
1K
  • Programming and Computer Science
Replies
7
Views
439
  • Programming and Computer Science
Replies
5
Views
1K
  • Programming and Computer Science
Replies
4
Views
345
  • Programming and Computer Science
Replies
5
Views
4K
  • Programming and Computer Science
Replies
29
Views
1K
  • Programming and Computer Science
Replies
2
Views
21K
  • Programming and Computer Science
Replies
9
Views
1K
Back
Top