
Pandas: Recommendation on how to handle missing decimal

I have a scenario where one of the records in the dataset contains an empty value (simplified below for ease of understanding). There are two records in data, one with 0.1 and the other with None. When I serialize df1, I get the response I want, i.e. null for the second record.

import pandas as pd
import numpy as np
import decimal


data = [{'A': 0.1}, {'A': None}]
df1 = pd.DataFrame(data)
print(df1.to_json(orient='records'))

prints [{"A":0.1},{"A":null}]

I want to treat A as a decimal, like below:

df3 = df1.copy()
df3['A'] = df1['A'].apply(lambda x: decimal.Decimal(x))  # df1 is the only frame defined above
print(df3.to_json(orient='records')) # this throws exception

prints OverflowError: Invalid Nan value when encoding double
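A likely explanation, offered as a sketch rather than a confirmed diagnosis: pandas stores the None in a numeric column as a float NaN, and decimal.Decimal turns a float NaN into Decimal('NaN') rather than a missing value. The snippet below only inspects those two facts. It does not call to_json.

```python
import decimal

import pandas as pd

df1 = pd.DataFrame([{'A': 0.1}, {'A': None}])

# The None was stored as a float NaN, so the column dtype is float64.
print(df1['A'].dtype)

# Converting that NaN yields a Decimal NaN, not None.
missing_as_decimal = decimal.Decimal(float(df1['A'].iloc[1]))
print(missing_as_decimal.is_nan())
```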

I would like to get the same result as with df1, i.e. null for the missing decimal in the JSON. Note that this works if I use float instead of decimal, but that is not an option for me.

df3['A'] = df1['A'].apply(lambda x: (decimal.Decimal(x) if not pd.isnull(x) else None))
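A minimal end-to-end sketch of this null-guard approach follows. The helper name to_decimal_or_none is hypothetical, and going through str() is an added assumption: it makes the result Decimal('0.1') rather than the long binary expansion of the float. The JSON is parsed back with the standard json module so the missing value can be checked.

```python
import decimal
import json

import pandas as pd


def to_decimal_or_none(value):
    """Hypothetical helper: Decimal for real values, None for missing ones."""
    if pd.isnull(value):
        return None
    # str() first, so 0.1 becomes Decimal('0.1') instead of its binary expansion.
    return decimal.Decimal(str(value))


df1 = pd.DataFrame([{'A': 0.1}, {'A': None}])
df3 = df1.copy()
df3['A'] = df1['A'].apply(to_decimal_or_none)

# Serialize, then parse back to inspect how the missing value was encoded.
records = json.loads(df3.to_json(orient='records'))
print(records)
```

If the guard works as intended, the second record should come back with A set to None, i.e. null in the JSON, matching the df1 result.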
