Check if a dataframe column contains string type in Python
A list comprehension might come in handy in this case:
import pandas as pd

df = pd.DataFrame()
df[0] = ["some", "dummy", "data", "filling", "up"]
df[1] = ["0", "foo", "2", "3", "bar"]
df[2] = [9, 8, 7, 6, 5]
# Build column 3: take column 1 when its string is numeric, otherwise column 2.
df[3] = [row[1] if row[1].isnumeric() else row[2] for _, row in df.iterrows()]
This assigns a new column 3 that takes the item from column 1 if it is numeric, and otherwise the item from column 2.
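One caveat with the snippet above: str.isnumeric() only accepts strings made of numeric characters, so "-2" or "3.5" would be treated as non-numeric. Below is a rough sketch of one way around that. The is_number helper is hypothetical (not part of pandas) and the sample values in column 1 are made up for illustration.

```python
import pandas as pd


def is_number(value):
    """Hypothetical helper: True if float() can parse the value."""
    try:
        float(value)
        return True
    except (TypeError, ValueError):
        return False


# Illustrative data; column 1 now includes a negative and a decimal string.
df = pd.DataFrame({
    0: ["some", "dummy", "data", "filling", "up"],
    1: ["0", "foo", "-2", "3.5", "bar"],
    2: [9, 8, 7, 6, 5],
})

# Keep the item from column 1 when it parses as a number, else use column 2.
df[3] = [a if is_number(a) else b for a, b in zip(df[1], df[2])]
```

Note that float() also parses strings such as "nan" and "inf", so this check is more permissive than isnumeric() in that direction.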
This is what you need.
I assume your dataframe is named df.
for i, a in enumerate(df[1]):
    # If the value in column 1 is a string, overwrite it with column 2.
    if isinstance(a, str):
        df.iloc[i, 1] = df.iloc[i, 2]
df
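If you only want to know whether the column contains strings at all, or would rather avoid the explicit loop, a boolean mask might do the same job. This is a minimal sketch on made-up data; the names is_str and has_strings are just for illustration.

```python
import pandas as pd

# Illustrative data: column 1 mixes strings and integers.
df = pd.DataFrame({
    0: ["some", "random", "text"],
    1: ["foo", -2, "bar"],
    2: [1, 2, 3],
})

# True for each element of column 1 that is a Python str.
is_str = df[1].map(lambda v: isinstance(v, str))

# Does the column contain at least one string?
has_strings = bool(is_str.any())

# Replace the string entries with the matching values from column 2.
df.loc[is_str, 1] = df.loc[is_str, 2]
```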
Assuming your desired output is to replace all non-numerical values in column 1 with the ones from column 2, this is how you do it.
Assume your initial dataframe is this:
>>> df
        0     1  2
0    some   foo  1
1  random    -2  2
2    text   bar  3
3    some   NaN  4
4  random    -5  5
5    text    -6  6
6    some  None  7
You first call pandas.to_numeric on your desired column with errors='coerce', so that non-numeric values are turned into NaN.
After this, you fill these NaNs with the matching elements from column 2 and (optionally) cast the resulting Series to int.
>>> df['1'] = pd.to_numeric(df['1'], errors='coerce').fillna(df['2']).astype(int)
>>> df
        0   1  2
0    some   1  1
1  random  -2  2
2    text   3  3
3    some   4  4
4  random  -5  5
5    text  -6  6
6    some   7  7
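For anyone who wants to run this end to end, here is the same idea split into separate steps. It is only a sketch: the dataframe is rebuilt by hand from the example above, assuming string column names '0', '1', '2' and real NaN/None values in column '1'. The names coerced and filled are illustrative.

```python
import pandas as pd

# Rebuild the example dataframe (assumed layout, see the note above).
df = pd.DataFrame({
    "0": ["some", "random", "text", "some", "random", "text", "some"],
    "1": ["foo", -2, "bar", float("nan"), -5, -6, None],
    "2": [1, 2, 3, 4, 5, 6, 7],
})

# Step 1: anything that cannot be parsed as a number becomes NaN.
coerced = pd.to_numeric(df["1"], errors="coerce")

# Step 2: fill each NaN with the value from column '2' at the same index.
filled = coerced.fillna(df["2"])

# Step 3 (optional): cast back to int, which is safe once no NaN remains.
df["1"] = filled.astype(int)
```

Keep in mind that values which were already NaN or None are filled as well, not just the strings.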