I have two CSV files: one holds the original data, which contains many NaNs and empty cells, and the other holds the replacement values for those NaNs.
How do I replace ONLY the NaNs with the contents of the second CSV file, without changing any of the original values? Is there an easy solution with pandas?
import pandas as pd
import numpy as np
a = pd.read_csv('training.csv', header=0)
b = pd.read_csv('training_predict.csv')
print('input shapes', a.shape, b.shape)
# Positional slicing on a DataFrame needs .iloc; a[:, :29] raises an error
a.iloc[:, :29] = np.where(np.isnan(a.iloc[:, :29].values), b.values, a.iloc[:, :29].values)
a.to_csv('training_new.csv')
I also tried
a = a.fillna(b, inplace=True)
but it does not work.
Assuming that the two files are aligned and you just want to fill the cells, you could use where or combine_first, depending on your preference:
>>> a = pd.DataFrame([[10.0, 20.0, np.nan], [30.0, np.nan, 60.0], [np.nan, 80.0, 90.0]], columns=["a","b","c"])
>>> b = pd.DataFrame([[1,2,3],[4,5,6],[7,8,9]], columns=["a","b","c"])
>>> a
    a   b   c
0  10  20 NaN
1  30 NaN  60
2 NaN  80  90
>>> b
   a  b  c
0  1  2  3
1  4  5  6
2  7  8  9
>>> a.where(a.notnull(), b)
    a   b   c
0  10  20   3
1  30   5  60
2   7  80  90
>>> a.combine_first(b)
    a   b   c
0  10  20   3
1  30   5  60
2   7  80  90
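As for the fillna attempt in the question: a likely reason it failed is that, in the pandas versions of that time, fillna called with inplace=True returned None, so assigning the result back replaced a with None. A minimal sketch of the non-inplace form, reusing the toy frames from above and assuming both frames share the same index and column labels:

```python
import numpy as np
import pandas as pd

# Toy frames standing in for training.csv and training_predict.csv
a = pd.DataFrame(
    [[10.0, 20.0, np.nan], [30.0, np.nan, 60.0], [np.nan, 80.0, 90.0]],
    columns=["a", "b", "c"],
)
b = pd.DataFrame([[1, 2, 3], [4, 5, 6], [7, 8, 9]], columns=["a", "b", "c"])

# Without inplace, fillna returns a new frame in which only the NaN
# cells are taken from b (matched by index and column label);
# every existing value in a is left untouched.
filled = a.fillna(b)
print(filled)
```

For aligned frames this should give the same filled values as the where and combine_first calls above.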
This worked for me :)
import pandas as pd
import numpy as np
a = pd.read_csv('training.csv', header=0, nrows=7048)
b = pd.read_csv('training_predict.csv')
#a = a.where(a.notnull(), b)
#a = a.combine_first(b)
#a = a.where(~np.isnan(a), other = b, inplace = True)
# np.where returns a plain ndarray, so the column labels of a are lost here
a = np.where(np.isnan(a.values), b.values, a.values)
df = pd.DataFrame(a)
# axis=1 appends the columns of b to the right of the filled data
df = pd.concat([df,b],axis=1)
df.info()
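If the column labels should survive and only the leading columns need filling (the question slices the first 29), a label-preserving variant might look like the sketch below. The frames, the column names and n_cols = 2 are made up for illustration, and it assumes the prediction file uses the same column names and row order as the original data:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the two CSV files
a = pd.DataFrame(
    {"x": [1.0, np.nan, 3.0], "y": [np.nan, 5.0, 6.0], "label": ["p", None, "r"]}
)
b = pd.DataFrame({"x": [0.1, 0.2, 0.3], "y": [0.4, 0.5, 0.6]})

n_cols = 2  # hypothetical; the question's code uses 29

# Fill NaNs only in the first n_cols columns, leaving the rest as read
cols = a.columns[:n_cols]
result = a.copy()
result[cols] = a[cols].fillna(b)
print(result)
```

Because the fill goes through fillna rather than np.where on raw arrays, result keeps the original column names and the untouched columns stay exactly as they were.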