Let's say I have the following df:
    quantity#1  taxsubtotal#1  taxrate#1  quantity#2  taxsubtotal#2  taxrate#2
--  ----------  -------------  ---------  ----------  -------------  ---------
 0         nan           1.05         21         nan            nan        nan
 2           1           2.1          21           1            1.8          9
 6           1           0             0         nan            nan        nan
13           1           0.9           9           1            1.8          9
21           1          23.4           9           1            2.7          9
I don't want the NaN values to be written to the columns of a new df that I build from it:
df3 = pd.DataFrame({
'InvoiceLine1':"""
<cbc:ID>1</cbc:ID>
<cbc:InvoicedQuantity unitCode="ZZ">"""+dftaxitems1['quantity#1'].astype(str)+"""</cbc:InvoicedQuantity>
<cbc:TaxAmount currencyID="EUR">"""+dftaxitems1['taxsubtotal#1'].astype(str)+"""</cbc:TaxAmount>
<cbc:Percent>"""+dftaxitems1['taxrate#1'].astype(str)+"""</cbc:Percent>""",
'InvoiceLine2':"""
<cbc:ID>2</cbc:ID>
<cbc:InvoicedQuantity unitCode="ZZ">"""+dftaxitems1['quantity#2'].astype(str)+"""</cbc:InvoicedQuantity>
<cbc:TaxAmount currencyID="EUR">"""+dftaxitems1['taxsubtotal#2'].astype(str)+"""</cbc:TaxAmount>
<cbc:Percent>"""+dftaxitems1['taxrate#2'].astype(str)+"""</cbc:Percent>""",
})
Checking the type of the NaN:
type(dftaxitems['quantity#2'][0])
numpy.float64
I get the following output:
InvoiceLine1 InvoiceLine2
0 \n <cbc:ID>1</cbc:ID>\n <cbc:InvoicedQua... \n <cbc:ID>2</cbc:ID>\n <cbc:InvoicedQua...
2 \n <cbc:ID>1</cbc:ID>\n <cbc:InvoicedQua... \n <cbc:ID>2</cbc:ID>\n <cbc:InvoicedQua...
6 \n <cbc:ID>1</cbc:ID>\n <cbc:InvoicedQua... \n <cbc:ID>2</cbc:ID>\n <cbc:InvoicedQua...
13 \n <cbc:ID>1</cbc:ID>\n <cbc:InvoicedQua... \n <cbc:ID>2</cbc:ID>\n <cbc:InvoicedQua...
21 \n <cbc:ID>1</cbc:ID>\n <cbc:InvoicedQua... \n <cbc:ID>2</cbc:ID>\n <cbc:InvoicedQua...
Desired output:
InvoiceLine1 InvoiceLine2
0 \n <cbc:ID>1</cbc:ID>\n <cbc:InvoicedQua...
2 \n <cbc:ID>1</cbc:ID>\n <cbc:InvoicedQua... \n <cbc:ID>2</cbc:ID>\n <cbc:InvoicedQua...
6 \n <cbc:ID>1</cbc:ID>\n <cbc:InvoicedQua...
13 \n <cbc:ID>1</cbc:ID>\n <cbc:InvoicedQua... \n <cbc:ID>2</cbc:ID>\n <cbc:InvoicedQua...
21 \n <cbc:ID>1</cbc:ID>\n <cbc:InvoicedQua... \n <cbc:ID>2</cbc:ID>\n <cbc:InvoicedQua...
df3.fillna('') did not work!
What could help here?
I've tried converting all the missing values to np.nan so that they can be reliably removed in the new df.
Please help!
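First convert the values to strings, then replace empty strings with missing values:
df = df.astype(str).replace('', np.nan)
Then remove the later .astype(str) calls, such as the one in
dftaxitems1['quantity#1'].astype(str)
so that the missing values are not converted back to text. Concatenating a string with NaN gives NaN, so those rows stay missing.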
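One caveat, as far as I can tell: if the columns hold real float NaN values (the numpy.float64 check in the question suggests they do), then in pandas versions before 3.0 .astype(str) turns them into the text 'nan', which replace('', np.nan) does not catch. A minimal sketch of one way around that, assuming real NaN values; the sample column is made up for illustration:

```python
import numpy as np
import pandas as pd

# Made-up column with one real float NaN, like quantity#2 in the question.
col = pd.Series([np.nan, 1.0, 1.0], name="quantity#2")

# Cast to text, then put NaN back wherever the original value was missing.
text = col.astype(str).where(col.notna())

# str + NaN is NaN, so the tag is only built for rows that have a value.
tag = '<cbc:InvoicedQuantity unitCode="ZZ">' + text + "</cbc:InvoicedQuantity>"

# Now there is an actual missing value for fillna to replace.
filled = tag.fillna("")
print(filled.tolist())
```

Because the mask comes from the original column (col.notna()), this does not depend on how a particular pandas version spells a missing value after the cast.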
Test:
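import numpy as np
import pandas as pd

# Sample column whose first value is an empty string
dftaxitems1 = pd.DataFrame({'quantity#1': ['', 1.0, 1.0, 1.0, 1.0]})
dftaxitems1 = dftaxitems1.astype(str).replace('', np.nan)
# No .astype(str) here, so string + NaN stays NaN
s = """<cbc:InvoicedQuantity unitCode="ZZ">"""+dftaxitems1['quantity#1']+"""</cbc:InvoicedQuantity>"""
print(s)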
0 NaN
1 <cbc:InvoicedQuantity unitCode="ZZ">1.0</cbc:I...
2 <cbc:InvoicedQuantity unitCode="ZZ">1.0</cbc:I...
3 <cbc:InvoicedQuantity unitCode="ZZ">1.0</cbc:I...
4 <cbc:InvoicedQuantity unitCode="ZZ">1.0</cbc:I...
Name: quantity#1, dtype: object
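Applied to the frame from the question, the idea could look roughly like the sketch below. This is not a definitive implementation: invoice_line is a hypothetical helper (not part of pandas), the data is typed in from the table in the question, and it assumes a line should be blanked only when all three of its fields are missing, since row 0 has no quantity#1 but still shows InvoiceLine1 in the desired output.

```python
import numpy as np
import pandas as pd

# Values typed in from the table in the question.
dftaxitems1 = pd.DataFrame(
    {
        "quantity#1": [np.nan, 1, 1, 1, 1],
        "taxsubtotal#1": [1.05, 2.1, 0, 0.9, 23.4],
        "taxrate#1": [21, 21, 0, 9, 9],
        "quantity#2": [np.nan, 1, np.nan, 1, 1],
        "taxsubtotal#2": [np.nan, 1.8, np.nan, 1.8, 2.7],
        "taxrate#2": [np.nan, 9, np.nan, 9, 9],
    },
    index=[0, 2, 6, 13, 21],
)


def invoice_line(df, n):
    """Hypothetical helper (not a pandas API): XML text for invoice line n."""
    cols = [f"quantity#{n}", f"taxsubtotal#{n}", f"taxrate#{n}"]
    present = df[cols].notna()
    # Cast to text, but use an empty string where the original cell was missing.
    text = df[cols].astype(str).where(present, "")
    line = (
        f"\n<cbc:ID>{n}</cbc:ID>"
        + '\n<cbc:InvoicedQuantity unitCode="ZZ">' + text[cols[0]] + "</cbc:InvoicedQuantity>"
        + '\n<cbc:TaxAmount currencyID="EUR">' + text[cols[1]] + "</cbc:TaxAmount>"
        + "\n<cbc:Percent>" + text[cols[2]] + "</cbc:Percent>"
    )
    # Assumption: drop the whole line only when all three fields are missing.
    return line.mask(~present.any(axis=1), "")


df3 = pd.DataFrame(
    {
        "InvoiceLine1": invoice_line(dftaxitems1, 1),
        "InvoiceLine2": invoice_line(dftaxitems1, 2),
    }
)
print(df3["InvoiceLine2"].tolist())
```

Rows 0 and 6 end up with an empty InvoiceLine2, while InvoiceLine1 is kept for every row. If a line should instead disappear as soon as any one field is missing, present.all(axis=1) could be used in place of present.any(axis=1).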