
How do I replace all string values with NaN (Dynamically)?

I want to find all the strings in my dataframe and replace them with NaN, so that I can then remove the affected rows with df.dropna(). For example, if I have the following data set:

x = np.array([1,2,np.nan,4,5,6,7,8,9,10])
z = np.array([1,2,np.nan,4,5,np.nan,7,8,9,"My Name is Jeff"])
y = np.array(["Hello World",2,3,4,5,6,7,8,9,10])

I should first be able to dynamically replace all strings with np.nan, so my output should be:

x = np.array([1,2,np.nan,4,5,6,7,8,9,10])
z = np.array([1,2,np.nan,4,5,np.nan,7,8,9,np.nan])
y = np.array([np.nan,2,3,4,5,6,7,8,9,10])

Running df.dropna() afterwards (assume that x, y and z are columns of a data frame rather than separate variables) should then leave me with:

x = np.array([2,4,5,7,8,9])
z = np.array([2,4,5,7,8,9])
y = np.array([2,4,5,7,8,9])

Since you tagged pandas:

pd.to_numeric(x,errors='coerce')
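As a minimal sketch of how that one-liner could be applied to the whole data set from the question: the frame below is built from plain Python lists (an assumption on my part, since np.array would turn every element of a mixed array into a string), and the column names x, z and y simply mirror the question's variables.

```python
import numpy as np
import pandas as pd

# The question's data as Python lists, so the mixed types survive.
df = pd.DataFrame({
    "x": [1, 2, np.nan, 4, 5, 6, 7, 8, 9, 10],
    "z": [1, 2, np.nan, 4, 5, np.nan, 7, 8, 9, "My Name is Jeff"],
    "y": ["Hello World", 2, 3, 4, 5, 6, 7, 8, 9, 10],
})

# Coerce column by column: anything that cannot be parsed as a number becomes NaN.
numeric = df.apply(pd.to_numeric, errors="coerce")

# Drop every row that contains at least one NaN.
cleaned = numeric.dropna()
print(cleaned)
```

The rows that remain hold the values 2, 4, 5, 7, 8 and 9 in every column, which matches the result asked for in the question (as floats, because NaN forces a float dtype).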

I think this works:

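import numpy as np
import pandas as pd

df = pd.DataFrame(data={'A': [1, 2, 'str'], 'B': ['name', 2, 2]})
for column in df.columns:
    # Replace every str value with NaN; leave all other values unchanged.
    df[column] = df[column].apply(lambda x: np.nan if isinstance(x, str) else x)
print(df)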
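One difference from the pd.to_numeric approach is worth seeing side by side: a type check also blanks out strings that happen to contain numbers, such as "2", while coercion parses them. The sketch below uses a made-up sample Series (not data from the question) to compare the two.

```python
import numpy as np
import pandas as pd

# Hypothetical sample: an int, a numeric string, a real string, a float.
s = pd.Series([1, "2", "str", 4.5])

# Type check: every str becomes NaN, including the numeric string "2".
by_type = s.map(lambda v: np.nan if isinstance(v, str) else v)

# Coercion: "2" is parsed to 2.0, only "str" becomes NaN.
by_parse = pd.to_numeric(s, errors="coerce")

print(by_type.isna().tolist())
print(by_parse.isna().tolist())
```

Which behaviour is right depends on whether numeric-looking strings in your data should count as numbers or as strings to be removed.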

I think the following is the simplest rendition. The function cleanData takes the loaded data (a DataFrame, named file in the code) and a list of columns that you may want to ignore. It replaces every string in the remaining columns with NaN and then drops the rows that contain NaN.

def cleanData(file, ignore=None):
    # Columns listed in `ignore` are left untouched.
    ignore = [] if ignore is None else ignore
    for column in file.columns:
        if column not in ignore:
            # Values that cannot be parsed as numbers become NaN.
            file[column] = pd.to_numeric(file[column], errors='coerce')
    # Drop every row that still contains a NaN.
    return file.dropna()
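To illustrate what the ignore list is for, here is a small sketch of a variant that works on a copy, so the caller's frame is not modified. The function name clean_numeric and the sample columns name and score are hypothetical and not part of the answer above.

```python
import pandas as pd

def clean_numeric(frame, ignore=()):
    """Hypothetical variant: coerce columns outside `ignore`, then drop NaN rows."""
    result = frame.copy()  # leave the caller's frame untouched
    for column in result.columns:
        if column not in ignore:
            # Values that cannot be parsed as numbers become NaN.
            result[column] = pd.to_numeric(result[column], errors="coerce")
    return result.dropna()

# Made-up sample: "name" holds legitimate text, "score" has one bad entry.
raw = pd.DataFrame({
    "name": ["a", "b", "c"],
    "score": [1, "oops", 3],
})

cleaned = clean_numeric(raw, ignore=["name"])
print(cleaned)
```

Without ignore=["name"], every row would be dropped, because the text in the name column would be coerced to NaN as well.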

Try the following:

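df = pd.DataFrame([x, y, z])  # x, y and z become rows, not columns

def Replace(i):
    # Return the value as a float, or NaN if it cannot be converted.
    try:
        return float(i)
    except (TypeError, ValueError):
        return np.nan

# applymap is deprecated since pandas 2.1 in favor of DataFrame.map.
df = df.applymap(func=Replace)
# The data is stored row-wise, so drop the columns that contain NaN.
df.dropna(axis=1)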
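It can help to check what a float()-based converter does with a few edge cases before applying it to a whole frame. The sketch below uses only the standard library; the helper name to_float_or_nan and the sample values are illustrative assumptions, not taken from the answer.

```python
import math

def to_float_or_nan(value):
    """Hypothetical helper: float(value) if it parses, otherwise NaN."""
    try:
        return float(value)
    except (TypeError, ValueError):
        # ValueError: unparsable string; TypeError: e.g. None.
        return math.nan

samples = ["Hello World", "3.5", "1e3", None, "nan", 7]
converted = [to_float_or_nan(v) for v in samples]
print(converted)
```

Note that numeric strings such as "3.5" and "1e3" are converted rather than discarded, and the literal string "nan" parses to NaN directly without raising.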

