Extract column names from Excel where row values are blank or NaN using Python
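An Excel sheet has multiple columns. For each row, I need to find the names of the columns whose value in that row is NaN or blank, and then write those column names into another column. If a row has no blank or NaN values, write "No Gaps" instead.
Input data: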
col1 col2 col3 col4 col5 col6 Result
AB BC CD EF GH IJ
AN AP AR AS AT
BP BQ BR BT
BZ BY BX BW
CP CQ CR CS NaN
CZ NaN CR CS NaN
Expected output:
Result
No Gaps
col3 is not available
col2, col5 not available
col3, col4 not available
col5, col6 not available
col1, col5, col6 not available
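The script below gives the correct output for rows of the dataframe that contain NaN values, but it does not take blank cells into account.
The script I have been using:
import numpy as np
# isna() flags NaN cells; dot() joins the names of the flagged columns per row
p = df[['col1','col2','col3','col4','col5']]
z = p.isna().dot(p.columns+",").str.rstrip(",")
df['Results'] = np.where(z.ne(''),z.add(" not available"),"No Gaps")
I also tried using:
z = p.eq('').dot(p.columns+",").str.rstrip(",")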
The idea is to replace empty strings with missing values before testing:
p = df[['col1','col2','col3','col4','col5']]
z = p.replace('', np.nan).isna().dot(p.columns+",").str.rstrip(",")
df['Results'] = np.where(z.ne(''),z.add(" not available"),"No Gaps")
print(df)
col1 col2 col3 col4 col5 col6 Results
0 AB BC CD EF GH IJ No Gaps
1 AN AP AR AS AT col3 not available
2 BP BQ BR BT col2,col5 not available
3 BZ BY BX BW col3,col4 not available
4 CP CQ CR CS NaN col5 not available
5 CZ NaN CR CS NaN col2,col5 not available
If the empty strings may contain whitespace, use:
p = df[['col1','col2','col3','col4','col5']]
z = p.replace(r'^\s*$', np.nan, regex=True).isna().dot(p.columns+",").str.rstrip(",")
df['Results'] = np.where(z.ne(''),z.add(" not available"),"No Gaps")
print(df)
col1 col2 col3 col4 col5 col6 Results
0 AB BC CD EF GH IJ No Gaps
1 AN AP AR AS AT col3 not available
2 BP BQ BR BT col2,col5 not available
3 BZ BY BX BW col3,col4 not available
4 CP CQ CR CS NaN col5 not available
5 CZ NaN CR CS NaN col2,col5 not available
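For reference, the same idea can be sketched as a self-contained script. This is only an illustration under stated assumptions: the helper name label_missing and the small sample frame are made up for the example (the sample is not the asker's sheet), and the column names are joined with an explicit row-wise join instead of the dot() trick. That is a swapped-in variant that should give the same labels.

```python
import numpy as np
import pandas as pd


def label_missing(df, columns):
    """Return one label per row naming the columns that are NaN or blank.

    `label_missing` is a hypothetical helper name, not part of pandas.
    """
    subset = df[columns]
    # Treat empty and whitespace-only strings as missing, then flag missing cells.
    mask = subset.replace(r"^\s*$", np.nan, regex=True).isna()
    # For each row, join the names of the flagged columns with commas.
    names = mask.apply(
        lambda row: ",".join(col for col, missing in row.items() if missing),
        axis=1,
    )
    # Rows with no flagged column get "No Gaps"; the rest get the suffix.
    labels = np.where(names.ne(""), names + " not available", "No Gaps")
    return pd.Series(labels, index=df.index)


# Made-up sample data (an assumption for illustration only).
df = pd.DataFrame(
    {
        "col1": ["AB", "AN", "BP", "CP"],
        "col2": ["BC", "AP", "", np.nan],
        "col3": ["CD", "", "BQ", "CR"],
        "col4": ["EF", "AR", "  ", "CS"],
    }
)
df["Results"] = label_missing(df, ["col1", "col2", "col3", "col4"])
print(df)
```

Passing the column list explicitly keeps any extra columns (such as an existing Result column) out of the check, which mirrors the df[[...]] selection used above.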