
Combining specific rows that have NaN in a different column in Python using pandas

I hope this question makes sense. I have a table of chemical names that I extracted from a PDF. I am trying to format it and am running into a problem.

Some of the chemical names are split across multiple rows, and I need each name in its own row. I did notice that the continuation rows of a split name have NaN in the first column.

EDIT: output after running dt.head(15).to_dict():

{'Unnamed: 0': {6: '1', 7: nan, 8: '2', 9: '3', 10: nan, 11: nan, 12: '4', 13: '5', 14: nan, 15: nan, 16: '6', 17: '7', 18: '8', 19: '9', 20: nan}, 'Phenolics': {6: 'Dihydroquercetin', 7: '7,30-dimethyl ether', 8: 'Artelin', 9: 'Esculin 7-', 10: 'methylether', 11: '(methylesculin)', 12: 'Esculin', 13: 'Scopoletin (7-', 14: 'hydroxy-6-', 15: 'methoxycoumarin)', 16: 'Axillarin', 17: 'Esculetin', 18: 'Isoscopoletin', 19: '6-Beta-D-glucosyl-7-', 20: 'methoxycoumarin'}}

Can anyone help me? Thank you!

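# Forward-fill the row number so each continuation row (NaN) joins the row above it
df["group"] = df["Unnamed: 0"].ffill()
# Concatenate the name fragments that share a group
df.groupby("group").agg({"Phenolics": "".join})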
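
For anyone who wants to try this without the original PDF, here is a minimal, self-contained sketch of the same forward-fill and group-by idea. The frame is rebuilt by hand from the first rows of the dict in the question. The `result` name and the `sort=False` argument (which keeps groups in order of first appearance) are my own additions, not part of the answer above.

```python
import numpy as np
import pandas as pd

# Sample rebuilt from rows 6-12 of the dict posted in the question
df = pd.DataFrame(
    {
        "Unnamed: 0": ["1", np.nan, "2", "3", np.nan, np.nan, "4"],
        "Phenolics": [
            "Dihydroquercetin",
            "7,30-dimethyl ether",
            "Artelin",
            "Esculin 7-",
            "methylether",
            "(methylesculin)",
            "Esculin",
        ],
    },
    index=range(6, 13),
)

# Continuation rows (NaN) inherit the number of the nearest numbered row above
df["group"] = df["Unnamed: 0"].ffill()

# One row per group; fragments are concatenated with no separator
result = df.groupby("group", sort=False).agg({"Phenolics": "".join})
print(result)
```

Note that joining with an empty string glues fragments together directly, so a name that was split at a space rather than at a hyphen loses that space.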

A one-line solution

df = df.ffill().groupby('Unnamed: 0')['Phenolics'].apply(' '.join).reset_index()
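
The two answers above differ only in the separator: "" glues every fragment together, while ' ' puts a space after every fragment, including ones that end in a hyphen. A possible middle ground is sketched below. It assumes that a fragment ending in "-" was broken mid-word by the PDF extraction and that any other break stands for a space. That rule is my guess about this data, and `join_fragments` and `names` are hypothetical names, not anything from pandas.

```python
import numpy as np
import pandas as pd


def join_fragments(parts):
    """Join name fragments, adding a space unless the text so far ends in '-'."""
    joined = ""
    for part in parts:
        part = part.strip()
        if joined and not joined.endswith("-"):
            joined += " "
        joined += part
    return joined


# A few rows taken from the dict posted in the question
df = pd.DataFrame(
    {
        "Unnamed: 0": ["1", np.nan, "5", np.nan, np.nan, "9", np.nan],
        "Phenolics": [
            "Dihydroquercetin",
            "7,30-dimethyl ether",
            "Scopoletin (7-",
            "hydroxy-6-",
            "methoxycoumarin)",
            "6-Beta-D-glucosyl-7-",
            "methoxycoumarin",
        ],
    }
)

# Same grouping idea as above: forward-fill the row number, then aggregate
df["group"] = df["Unnamed: 0"].ffill()
names = (
    df.groupby("group", sort=False)["Phenolics"]
    .agg(join_fragments)
    .reset_index()
)
print(names)
```

If some names legitimately contain a hyphen followed by a space, this rule will remove that space, so it is worth checking a sample of the output by eye.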
