Python Pandas: remove blank whitespace after importing data from Excel

I imported some data from Excel into a DataFrame. In Excel, some cells are blank but not empty: someone pressed the spacebar to fill the cell, so it contains only whitespace and still looks blank. I tried to clean the DataFrame with the line below, but the cells do not show up as NaN afterwards. Is there a function that can clean them?

df.columns = df.columns.str.strip()

I can't reply to your comment because I don't have enough reputation :(.

If I understand you correctly, you want to place a NaN value wherever a cell contains only spaces?

I tried the following and it seems to work. Let me know if this helps.

import pandas as pd
import numpy as np

df = pd.DataFrame({'Names': ['betty', 'chris',' ',  'steve', 'carly']})

# Set only the 'Names' value to NaN in rows where it is a single space
df.loc[df['Names'] == ' ', 'Names'] = np.nan

If you need to go over every column, you can put the df.loc assignment in a loop like the following.

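df = pd.DataFrame({'Names': ['betty', 'chris', ' ', 'steve', 'carly'],
                   'Age': ['40', ' ', '32', '44', '69']})

# For each column, set cells that hold a single space to NaN.
# Using df.loc[mask, col] avoids chained assignment on df[col].
for col in df.columns:
    df.loc[df[col] == ' ', col] = np.nan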
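Note that comparing against ' ' only catches cells holding exactly one space. If a cell might hold several spaces, or be an empty string, a variant of the loop that strips each value first may be more forgiving. This is only a sketch: the sample data is made up, and it assumes every column holds text (the .str accessor raises an error on numeric columns).

```python
import numpy as np
import pandas as pd

# Made-up sample data; assumes every column contains strings.
df = pd.DataFrame({
    'Names': ['betty', '   ', '', 'steve'],
    'Age': ['40', ' ', '32', '44'],
})

for col in df.columns:
    # True where the cell is empty once surrounding whitespace is removed.
    blank = df[col].str.strip() == ''
    # Assign NaN only to those cells of this column.
    df.loc[blank, col] = np.nan

print(df)
```

Cells with real content keep their original value, including any leading or trailing spaces, because only the blank cells are overwritten.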

Are you sure df.columns = df.columns.str.strip() is what you want? That only changes the column names. If you want to change the values inside the cells, consider replace:

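# Replace cells that contain only whitespace with NaN.
# replace returns a new DataFrame, so assign the result back.
df = df.replace(r'^\s+$', np.nan, regex=True)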
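For completeness, here is a small self-contained run of the replace approach, which you could use to check the result. The DataFrame is a made-up stand-in for the sheet imported from Excel, and the names cleaned and missing_per_column are just for illustration.

```python
import numpy as np
import pandas as pd

# Made-up stand-in for the data imported from Excel.
df = pd.DataFrame({
    'Names': ['betty', 'chris', ' ', 'steve', 'carly'],
    'Age': ['40', '   ', '32', '44', '69'],
})

# Cells consisting only of whitespace (one or more characters) become NaN.
cleaned = df.replace(r'^\s+$', np.nan, regex=True)

# Count the missing values per column to confirm the cleanup.
missing_per_column = cleaned.isna().sum()
print(missing_per_column)
```

The pattern ^\s+$ requires at least one whitespace character, so truly empty strings are left alone. Changing it to ^\s*$ should catch those as well.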
