Not filling NaN values in dataframe

Let's say I have the following df:

      quantity#1    taxsubtotal#1    taxrate#1    quantity#2    taxsubtotal#2    taxrate#2
--  ------------  ---------------  -----------  ------------  ---------------  -----------
 0           nan             1.05           21           nan            nan            nan
 2             1             2.1            21             1              1.8            9
 6             1             0               0           nan              nan            nan
13             1             0.9             9             1              1.8            9
21             1            23.4             9             1              2.7            9

I don't want to write the NaN values to the columns of a new df:

df3 = pd.DataFrame({
'InvoiceLine1':"""
    <cbc:ID>1</cbc:ID>
    <cbc:InvoicedQuantity unitCode="ZZ">"""+dftaxitems1['quantity#1'].astype(str)+"""</cbc:InvoicedQuantity>
        <cbc:TaxAmount currencyID="EUR">"""+dftaxitems1['taxsubtotal#1'].astype(str)+"""</cbc:TaxAmount>
          <cbc:Percent>"""+dftaxitems1['taxrate#1'].astype(str)+"""</cbc:Percent>""",
'InvoiceLine2':"""
    <cbc:ID>2</cbc:ID>
    <cbc:InvoicedQuantity unitCode="ZZ">"""+dftaxitems1['quantity#2'].astype(str)+"""</cbc:InvoicedQuantity>
        <cbc:TaxAmount currencyID="EUR">"""+dftaxitems1['taxsubtotal#2'].astype(str)+"""</cbc:TaxAmount>
          <cbc:Percent>"""+dftaxitems1['taxrate#2'].astype(str)+"""</cbc:Percent>""",
})

Checking the type of the NaN values:

type(dftaxitems1['quantity#2'][0])
numpy.float64

I get the following output:

    InvoiceLine1                                       InvoiceLine2
0   \n <cbc:ID>1</cbc:ID>\n <cbc:InvoicedQua... \n <cbc:ID>2</cbc:ID>\n <cbc:InvoicedQua...
2   \n <cbc:ID>1</cbc:ID>\n <cbc:InvoicedQua... \n <cbc:ID>2</cbc:ID>\n <cbc:InvoicedQua...
6   \n <cbc:ID>1</cbc:ID>\n <cbc:InvoicedQua... \n <cbc:ID>2</cbc:ID>\n <cbc:InvoicedQua...
13  \n <cbc:ID>1</cbc:ID>\n <cbc:InvoicedQua... \n <cbc:ID>2</cbc:ID>\n <cbc:InvoicedQua...
21  \n <cbc:ID>1</cbc:ID>\n <cbc:InvoicedQua... \n <cbc:ID>2</cbc:ID>\n <cbc:InvoicedQua...

Desired output:

    InvoiceLine1                                       InvoiceLine2
0   \n <cbc:ID>1</cbc:ID>\n <cbc:InvoicedQua... 
2   \n <cbc:ID>1</cbc:ID>\n <cbc:InvoicedQua... \n <cbc:ID>2</cbc:ID>\n <cbc:InvoicedQua...
6   \n <cbc:ID>1</cbc:ID>\n <cbc:InvoicedQua... 
13  \n <cbc:ID>1</cbc:ID>\n <cbc:InvoicedQua... \n <cbc:ID>2</cbc:ID>\n <cbc:InvoicedQua...
21  \n <cbc:ID>1</cbc:ID>\n <cbc:InvoicedQua... \n <cbc:ID>2</cbc:ID>\n <cbc:InvoicedQua...

df3.fillna('') did not work!

What do you think could help? :)

I've tried to convert all missing values to np.nan so that they can be reliably removed in the new df.

Please help!

Try converting the values to strings first, then replace empty strings with missing values:

df = df.astype(str).replace('', np.nan)

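and then drop the later .astype(str) calls, i.e. write dftaxitems1['quantity#1'] instead of dftaxitems1['quantity#1'].astype(str). Concatenating a string with NaN gives NaN, so the whole cell stays missing and can be filled with '' afterwards.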
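Note that the replacement above only matches missing values stored as empty strings. If they are real float NaN, as the numpy.float64 type in the question suggests, astype(str) turns them into the literal string 'nan', which replace('', np.nan) does not touch. A minimal sketch of one way to keep them missing, assuming it is acceptable to mask the converted frame with the original notna() result (the helper name to_str_keep_nan and the sample data are hypothetical):

```python
import numpy as np
import pandas as pd


def to_str_keep_nan(df: pd.DataFrame) -> pd.DataFrame:
    """Convert every value to str, but leave missing values as NaN.

    Hypothetical helper, not part of pandas.
    """
    # astype(str) alone would turn NaN into the string 'nan';
    # where() keeps the string only where the original value was present.
    return df.astype(str).where(df.notna())


# Small sample shaped like the second invoice line in the question
dftaxitems1 = pd.DataFrame(
    {"quantity#2": [np.nan, 1.0, np.nan], "taxrate#2": [np.nan, 9.0, np.nan]},
    index=[0, 2, 6],
)
as_text = to_str_keep_nan(dftaxitems1)
print(as_text)
```

With the missing values preserved this way, string concatenation on as_text propagates NaN in the same manner as in the answer.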

Test:

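import numpy as np
import pandas as pd

dftaxitems1 = pd.DataFrame({'quantity#1': ['', 1.0, 1.0, 1.0, 1.0]})
# Convert to strings, then turn empty strings into missing values
dftaxitems1 = dftaxitems1.astype(str).replace('', np.nan)

# Concatenating a string with NaN yields NaN, so the whole cell stays missing
s = """<cbc:InvoicedQuantity unitCode="ZZ">"""+dftaxitems1['quantity#1']+"""</cbc:InvoicedQuantity>"""

print(s)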
0                                                  NaN
1    <cbc:InvoicedQuantity unitCode="ZZ">1.0</cbc:I...
2    <cbc:InvoicedQuantity unitCode="ZZ">1.0</cbc:I...
3    <cbc:InvoicedQuantity unitCode="ZZ">1.0</cbc:I...
4    <cbc:InvoicedQuantity unitCode="ZZ">1.0</cbc:I...
Name: quantity#1, dtype: object
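The desired output in the question keeps InvoiceLine1 for row 0 even though quantity#1 is NaN there, and only blanks InvoiceLine2 where all three of its fields are missing. The following is a rough sketch of that variant, not the approach tested above: it assumes a line should be emptied only when every source field for it is NaN. The helper invoice_line and the shortened three-row sample are hypothetical.

```python
import numpy as np
import pandas as pd


def invoice_line(df: pd.DataFrame, n: int) -> pd.Series:
    """Build the XML fragment for invoice line n (hypothetical helper)."""
    cols = [f"quantity#{n}", f"taxsubtotal#{n}", f"taxrate#{n}"]
    text = df[cols].astype(str)  # NaN becomes the string 'nan' here
    line = (
        f"\n    <cbc:ID>{n}</cbc:ID>\n"
        + '    <cbc:InvoicedQuantity unitCode="ZZ">' + text[cols[0]]
        + '</cbc:InvoicedQuantity>\n        <cbc:TaxAmount currencyID="EUR">'
        + text[cols[1]]
        + "</cbc:TaxAmount>\n          <cbc:Percent>" + text[cols[2]]
        + "</cbc:Percent>"
    )
    # Assumption: blank the whole line only where all three fields are missing
    return line.mask(df[cols].isna().all(axis=1), "")


# First three rows of the table in the question
dftaxitems1 = pd.DataFrame(
    {
        "quantity#1": [np.nan, 1.0, 1.0],
        "taxsubtotal#1": [1.05, 2.1, 0.0],
        "taxrate#1": [21.0, 21.0, 0.0],
        "quantity#2": [np.nan, 1.0, np.nan],
        "taxsubtotal#2": [np.nan, 1.8, np.nan],
        "taxrate#2": [np.nan, 9.0, np.nan],
    },
    index=[0, 2, 6],
)

df3 = pd.DataFrame(
    {
        "InvoiceLine1": invoice_line(dftaxitems1, 1),
        "InvoiceLine2": invoice_line(dftaxitems1, 2),
    }
)
print(df3)
```

Rows where only some fields are missing (such as quantity#1 in row 0) still contain the text nan inside the fragment; whether that is acceptable depends on the target XML schema.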
