
How to delete rows in a CSV file based on blank columns

I have a CSV file with thousands of rows in the format below; this is a shortened sample:

id,name,score1,score2,score3
1,,3.0,4.5,2.0
2,,,,
3,,4.5,3.2,4.1

I have tried to use .dropna(), but that is not working.

My desired output is

id,name,score1,score2,score3
1,,3.0,4.5,2.0
3,,4.5,3.2,4.1

All I would really need is to check if score1 is empty because if score1 is empty then the rest of the scores are empty as well.

I have also tried this but it doesn't seem to do anything.

import pandas as pd

df = pd.read_csv('dataset.csv')

df.drop(df.index[df["score1"] == ''], axis=0, inplace=True)

df.to_csv('new.csv')

Can anyone help with this?

import pandas as pd


df = pd.DataFrame([[1,3.0,4.5,2.0],[2],[3,4.5,3.2,4.1]], columns=["id","score1","score2","score3"])

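aux1 = df.dropna()                # default: drop rows that contain any NaN
aux2 = df.dropna(axis='columns')  # drop columns that contain any NaN
aux3 = df.dropna(axis='rows')     # same as the default (axis=0)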

print('=== original ===')
print(df)
print()
print('=== mode 1 ===')
print(aux1)
print()
print('=== mode 2 ===')
print(aux2)
print()
print('=== mode 3 ===')
print(aux3)
print()
print('=== mode 4 ===')
print('drop original')
# inplace=True modifies df itself instead of returning a new DataFrame
df.dropna(axis=1, inplace=True)
print(df)

After seeing your edits, I realized that dropna() doesn't work for you because every row has a missing value: the name column is empty throughout, so the default dropna() drops every row. To filter on NaN values in one specific column, I would recommend using the apply function, as in the following code. (By the way, StackOverflow.csv is just a file into which I pasted the data from your question.)

import pandas as pd
import math

df = pd.read_csv("StackOverflow.csv", index_col="id")

# Return True if the given number is not NaN
def not_nan(number):
    return not math.isnan(number)

# Keep only the rows where score1 is not NaN
df = df[df["score1"].apply(not_nan)]

What this does is iterate through the score1 column and check whether each value is NaN. If it is, the function returns False; otherwise it returns True. We then use the resulting Series of True and False values to keep only the rows where score1 is present.
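As a possible alternative to the apply approach, pandas can do the same filtering without a helper function. The sketch below is not taken from either answer. It assumes the sample data from the question, uses io.StringIO in place of the real file, and the variable names are only illustrative. It also shows why comparing against '' did nothing: read_csv parses empty fields as NaN, not as empty strings.

```python
import io
import pandas as pd

# Sample data copied from the question; io.StringIO stands in for the CSV file.
csv_text = """id,name,score1,score2,score3
1,,3.0,4.5,2.0
2,,,,
3,,4.5,3.2,4.1
"""

df = pd.read_csv(io.StringIO(csv_text))

# Empty fields are read as NaN, so a comparison against '' matches no rows.
empty_string_matches = int((df["score1"] == "").sum())

# Drop only the rows where score1 is missing; the empty name column is ignored.
by_dropna = df.dropna(subset=["score1"])

# Equivalent boolean-mask form, a vectorized version of the apply filter.
by_mask = df[df["score1"].notna()]

# index=False keeps the pandas row index out of the written CSV.
result_csv = by_dropna.to_csv(index=False)
print(result_csv)
```

With a real file, the same call would be df.dropna(subset=["score1"]).to_csv('new.csv', index=False). Passing index=False matters here because the default would add an extra unnamed index column that is not in the desired output.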
