I tried to do some basic statistics on a CSV file in Python. What I'm trying to do is build a dictionary from the headers and the values of particular columns. But some of the values are NaN, which causes an error in the following code:
import csv
import numpy as np

# f is the already-opened CSV file
reader = csv.reader(f, delimiter=',')
header = next(reader)
dataset = []
for line in reader:
    d = dict(zip(header, line))
    for field in ['Reviews', 'Rating']:
        np.isnan('Rating', 'Reviews')
        d[field] = int(float(d[field]))
    dataset.append(d)
I tried to use numpy.isnan to remove the NaN values, but I got this error:
return arrays must be of ArrayType
How can I remove the NaN values?
Not sure what your data looks like, but I'm guessing the NaN values are strings.
You can do
d = {k: v for k, v in zip(header, line) if v != "NaN"}  # filter the pairs so each header stays aligned with its value
while you're reading them in, to drop the NaNs.
But it's best to post a sample of your data so we can actually see what you're working with.
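For what it's worth, the error in the question comes from the call np.isnan('Rating','Reviews'): the second positional argument of a NumPy ufunc is the output array, and a string isn't one. np.isnan also doesn't accept strings as input, so the cell has to be converted to a float first. Here is a minimal sketch of that check using only the csv module. It assumes the missing values show up as the text "NaN" or as empty cells, and that rows with a missing value should be skipped entirely; the sample data and the helper name load_rows are made up for illustration.

```python
import csv
import io
import math

FIELDS = ["Reviews", "Rating"]  # numeric columns named in the question


def load_rows(f, fields=FIELDS):
    """Read CSV rows as dicts, skipping rows with a missing numeric field."""
    dataset = []
    for row in csv.DictReader(f):
        try:
            # float("NaN") parses to nan; float("") raises ValueError,
            # and a short row gives None, which raises TypeError
            values = {name: float(row[name]) for name in fields}
        except (TypeError, ValueError):
            continue  # empty or non-numeric cell: skip the row
        if any(math.isnan(v) for v in values.values()):
            continue  # literal NaN: skip the row
        # same int(float(...)) conversion as in the question (truncates)
        row.update({name: int(v) for name, v in values.items()})
        dataset.append(row)
    return dataset


# Made-up sample standing in for the real file
sample = io.StringIO(
    "App,Reviews,Rating\n"
    "A,10,4.5\n"
    "B,NaN,3.0\n"
    "C,7,\n"
    "D,3,2.0\n"
)
rows = load_rows(sample)
print(rows)
```

With a real file you would pass the open file object instead of the StringIO. If you'd rather keep the row and just leave the bad field out, replace the two continue statements with per-field handling.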
It depends on whether the NaN values are NaN or "NaN". You can use:
df = df.dropna()  # keep only the rows that have no missing (NaN) values
df = df[(df != "NaN").all(axis=1)]  # keep only the rows where no column holds the string "NaN"
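A slightly fuller sketch of the pandas route, assuming pandas is installed and the columns are named Reviews and Rating as in the question (the sample data is made up). By default pd.read_csv already parses cells like "NaN" and empty cells as missing; pd.to_numeric with errors="coerce" additionally turns any other non-numeric text into a real NaN, so a single dropna covers both cases.

```python
import io

import pandas as pd

# Made-up sample standing in for the real file
sample = io.StringIO(
    "App,Reviews,Rating\n"
    "A,10,4.5\n"
    "B,NaN,3.0\n"
    "C,7,unknown\n"
    "D,3,2.0\n"
)

cols = ["Reviews", "Rating"]  # numeric columns named in the question

df = pd.read_csv(sample)

# Anything that cannot be parsed as a number becomes NaN
df[cols] = df[cols].apply(pd.to_numeric, errors="coerce")

# Drop rows where either numeric column is missing
clean = df.dropna(subset=cols)

# Same truncating integer conversion as int(float(...)) in the question
clean = clean.astype({"Reviews": int, "Rating": int})

# List of dicts, like the dataset built in the question
dataset = clean.to_dict("records")
print(dataset)
```

Using subset=cols keeps rows that only have gaps in columns you don't care about; a bare dropna() would discard those as well.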