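Can I use the pandas Python module to drop every row that contains NA and, in the rows that remain, replace another known character with 0?
I searched online and could not find a way to satisfy both conditions at once.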
Example:
This input (where NA is either a specific character or whitespace, and X is another character, known a priori):
NA, 1, 2, X, 5, 6
5, 6, 7, 8, 9, 10
NA, 3, 4, 5, 6, 7
9, 8, 7, 6, 5, X
should become:
5, 6, 7, 8, 9, 10
9, 8, 7, 6, 5, 0
To drop the rows with NA, you can do:
df.dropna()
To restrict the check for NaNs to certain columns, pass the subset
keyword argument; see http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.dropna.html
To replace a certain value, you can do:
df.replace('X', 0)
Full example:
In [14]: df
Out[14]:
     0  1  2  3  4   5
0  NaN  1  2  X  5   6
1    5  6  7  8  9  10
2  NaN  3  4  5  6   7
3    9  8  7  6  5   X

In [15]: df.dropna(subset=[0,1])
Out[15]:
   0  1  2  3  4   5
1  5  6  7  8  9  10
3  9  8  7  6  5   X

In [16]: df.dropna(subset=[0,1]).replace('X', 0)
Out[16]:
   0  1  2  3  4   5
1  5  6  7  8  9  10
3  9  8  7  6  5   0
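The session above starts from an existing df. As a rough, self-contained sketch of the whole pipeline, the snippet below assumes the data arrives as comma-separated text with the literal marker NA (the variable names raw and result are only illustrative). Because read_csv leaves the columns that contain X as strings, this sketch swaps in the string "0" and then converts with pd.to_numeric, a small variation on the replace('X', 0) call shown above.

```python
import io

import pandas as pd

# Assumed input: the question's example as comma-separated text.
raw = """NA, 1, 2, X, 5, 6
5, 6, 7, 8, 9, 10
NA, 3, 4, 5, 6, 7
9, 8, 7, 6, 5, X
"""

# header=None: the file has no header row.
# skipinitialspace=True: ignore the blank after each comma.
# na_values=["NA"]: treat the literal NA as missing (adjust to your marker).
df = pd.read_csv(
    io.StringIO(raw),
    header=None,
    skipinitialspace=True,
    na_values=["NA"],
)

result = (
    df.dropna()                # drop every row that has a missing value
      .replace("X", "0")       # columns holding X were read as strings
      .apply(pd.to_numeric)    # turn the string columns back into numbers
      .astype(int)             # safe here: no NaN is left after dropna
)

print(result)
```

Only the rows with index 1 and 3 should remain, and every column ends up with an integer dtype.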
As an aside, it is not very efficient to have strings like 'X' in numeric columns (this makes the column object dtype instead of int or float).
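To make the dtype point concrete, here is a small sketch using a hypothetical Series s that mixes integers with the placeholder 'X'. It assumes 'X' is the only non-numeric value present; pd.to_numeric would raise on any other stray string.

```python
import pandas as pd

# A column that mixes integers with the placeholder string 'X'.
s = pd.Series([6, 10, 7, "X"])
print(s.dtype)  # object, because of the mixed contents

# Replace the placeholder, then convert the column to a numeric dtype.
cleaned = pd.to_numeric(s.replace("X", 0))
print(cleaned.dtype)
```

Once the placeholder is gone, the column can be stored as plain integers, which is more compact and faster to compute on than an object column.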