How to drop rows with NaN values so I can zip and range
I have code that expands the range of values between two columns. It works when there are no empty cells, but not when a row contains NaN. I tried df.isnull and dropna, and I always get the same problem.
import pandas as pd
import numpy as np
path = [('SC200', 100, 102),
('Unified', 210, 210),
('Clé',np.nan,np.nan),
('samsung', 155, 158),
]
df_l = pd.DataFrame(path, columns=['Désignation', 'First', 'Last'])
zipped_l = zip(df_l['Désignation'], df_l['First'], df_l['Last'])
df_l = pd.DataFrame([(k, y) for k, s, e in zipped_l for y in range(s, e+1) ], columns=['Désignation', 'KITCODE'])
print(df_l)
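For context, here is a minimal sketch of the likely cause, using a shortened, hypothetical version of the data above. A single NaN makes pandas store the whole column as float64, and range() accepts only integers, so even the rows without NaN fail:

```python
import numpy as np
import pandas as pd

# Shortened, illustrative copy of the data from the question
rows = [("SC200", 100, 102), ("Clé", np.nan, np.nan)]
df = pd.DataFrame(rows, columns=["Désignation", "First", "Last"])

# One NaN is enough to turn the integer column into float64 (100 -> 100.0)
first_dtype = str(df["First"].dtype)

# range() rejects floats, so the first (non-NaN) row already raises TypeError
try:
    list(range(df["First"][0], df["Last"][0] + 1))
    range_error = None
except TypeError as exc:
    range_error = type(exc).__name__

print(first_dtype, range_error)
```

This suggests that dropping the NaN rows is not enough on its own: the remaining values also need to be cast back to int before they are passed to range().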
Is this what you are trying to do?
import pandas as pd
import numpy as np
path = [('SC200', 100, 102),
('Unified', 210, 210),
('Clé',np.nan,np.nan),
('samsung', 155, 158),
]
df_l = pd.DataFrame(path, columns=['Désignation', 'First', 'Last'])
print (df_l)
def kitcd(d):
    # Build the inclusive list of codes from First to Last
    first = int(d.First)
    last = int(d.Last) + 1
    return [i for i in range(first, last)]
df_l['KITCODE'] = df_l.apply(lambda x: kitcd(x) if pd.notnull(x.First) else x.First, axis = 1)
df_l = df_l.explode('KITCODE')
print (df_l)
The output of this will be:
Original dataframe:
  Désignation  First   Last
0       SC200  100.0  102.0
1     Unified  210.0  210.0
2         Clé    NaN    NaN
3     samsung  155.0  158.0
Updated dataframe with KITCODE:
  Désignation  First   Last KITCODE
0       SC200  100.0  102.0     100
0       SC200  100.0  102.0     101
0       SC200  100.0  102.0     102
1     Unified  210.0  210.0     210
2         Clé    NaN    NaN     NaN
3     samsung  155.0  158.0     155
3     samsung  155.0  158.0     156
3     samsung  155.0  158.0     157
3     samsung  155.0  158.0     158
If you want to ignore the rows that have NaN, you can change the code to the following:
def kitcd(d):
    # Build the inclusive list of codes from First to Last
    first = int(d.First)
    last = int(d.Last) + 1
    return [i for i in range(first, last)]
df_l = df_l.dropna(axis=0, subset=['First', 'Last'])
df_l['KITCODE'] = df_l.apply(lambda x: kitcd(x), axis = 1)
df_l = df_l.explode('KITCODE')
print (df_l)
This removes the 'Clé' record from df_l so that the remaining rows can be processed as normal. The output is the same as before, except that the 'Clé' row is missing.
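If you would rather keep the zip and range structure from the question, the same dropna idea can be combined with a cast back to int. This is only a sketch under that assumption; the names clean and expanded are illustrative and do not come from the original code:

```python
import numpy as np
import pandas as pd

path = [
    ("SC200", 100, 102),
    ("Unified", 210, 210),
    ("Clé", np.nan, np.nan),
    ("samsung", 155, 158),
]
df_l = pd.DataFrame(path, columns=["Désignation", "First", "Last"])

# Drop rows where either bound is missing, then cast the bounds back to int,
# because the NaN values forced both columns to float64
clean = df_l.dropna(subset=["First", "Last"]).astype({"First": int, "Last": int})

# Same zip/range comprehension as in the question, now on integer bounds
expanded = pd.DataFrame(
    [
        (name, code)
        for name, first, last in zip(clean["Désignation"], clean["First"], clean["Last"])
        for code in range(first, last + 1)
    ],
    columns=["Désignation", "KITCODE"],
)
print(expanded)
```

Unlike the explode version, this produces only the Désignation and KITCODE columns, which matches the shape the question was building.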