Reading a pandas column with numeric values and missing data as strings

Question

I have an Id column in my data frame like this:

a = pandas.DataFrame([12673, 44, 847])

This data has some missing values. If I set keep_default_na=True, the missing values are filled with NaN and the column is read as float, so the values change to

12673.0, 44.0, 847.0

which is not what I want (I want to drop the NA values and convert the column to str/object, because an id can be of any length). If I set keep_default_na=False, then other columns (such as booleans) all become object, and I have to compare string values to find out the true/false values.

Answer 1

If you want NaN values, the column has to be float: with pandas' default dtypes, an integer column cannot hold NaN. See https://stackoverflow.com/a/38003951/3841261

Use keep_default_na=True, then, after dropping the NaNs, convert the column to integers.
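A minimal sketch of that suggestion, assuming a small in-memory CSV in place of the asker's file (the sample rows and the flag column are made up for illustration). It also adds a final conversion to str, since the question asks for string ids:

```python
import io
import pandas as pd

# Hypothetical sample data standing in for the asker's file; the second row has no id.
csv_text = "id,flag\n12673,True\n,False\n44,True\n847,False\n"

# Default NaN handling: the missing id becomes NaN, so the id column is float64.
df = pd.read_csv(io.StringIO(csv_text), keep_default_na=True)

# Drop the rows without an id, then go float -> int -> str to lose the ".0".
df = df.dropna(subset=["id"])
df["id"] = df["id"].astype(int).astype(str)

print(df["id"].tolist())  # ['12673', '44', '847']
print(df.dtypes)
```

Because the values pass through float64, very long ids (beyond 2**53) may not survive the round trip exactly. Passing dtype={"id": str} to read_csv is one way to avoid that, although it is not part of this answer.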

Answer 2

Without a better sample of your data I can't be sure, but maybe this will help:

First you read your data with the dtypes preserved, then you basically read it again to get the right id. If your boolean columns also have missing values (empty strings), you will need to cast those columns with df.astype("bool").

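import pandas as pd
# First read: default NaN handling keeps the dtypes; drop the incomplete rows.
df1 = pd.read_csv("test.csv", keep_default_na=True).dropna()
# Second read: empty fields stay empty strings, so id is read as text.
df2 = pd.read_csv("test.csv", keep_default_na=False)
# Replace the float ids with the string ids from the same rows.
df1["id"] = df2.loc[df1.index, "id"]
df = pd.DataFrame(df1.to_dict())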

If you don't want to read the file twice, you could read it with keep_default_na=False, then filter out the rows that contain empty strings and cast every column to its desired dtype, or rebuild the frame with df = pd.DataFrame(df1.to_dict()).
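The single-read variant could look roughly like the sketch below. The sample data and the column names id, flag and score are assumptions, and dtype=str is added here only so that every column is read as text in a predictable way:

```python
import io
import pandas as pd

# Hypothetical sample data; the column names id, flag and score are assumptions.
csv_text = (
    "id,flag,score\n"
    "12673,True,1.5\n"
    ",False,2.0\n"
    "44,,3.5\n"
    "847,True,4.0\n"
)

# Read every field as text; with keep_default_na=False empty fields stay "".
raw = pd.read_csv(io.StringIO(csv_text), keep_default_na=False, dtype=str)

# Keep only the rows in which no field is an empty string.
complete = (raw != "").all(axis=1)
clean = raw[complete].copy()

# Cast each column explicitly; id is left as a string on purpose.
clean["flag"] = clean["flag"].map({"True": True, "False": False}).astype(bool)
clean["score"] = clean["score"].astype(float)

print(clean.dtypes)
```

The booleans are mapped from their text form rather than cast directly, because a plain astype(bool) on strings treats every non-empty string, including "False", as True.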
