How to iterate through CSV hierarchy tree in Python?

I'm trying to iterate through a hierarchy tree in a CSV file to do certain things with the items. The hierarchy is set up as follows: [image: Hierarchy tree]

Note that the actual items under the parent and child levels won't contain words like "parent" or "child", as in this example: [image: Example tree]

Now, I want to create two nested loops: the outer loop iterates through the parent levels and the inner loop iterates through the child levels, and the body of each loop does something with the information in the cells at its level. To clarify, each parent has a variable number of children, so parent 1 could have 4, parent 2 could have 2, parent 3 could have 8, and so on. Can anyone help me set these loops up to iterate the way I want?

I would use pandas.

import pandas as pd

# path_to_csv is a placeholder for the path to your CSV file
df = pd.read_csv(path_to_csv)

# Fill empty cells in the parent column with the preceding value
df['parent'] = df['parent'].ffill()

# Group cells with the same parent
df = df.groupby('parent').agg({'child': list})

print(df)
print(df.loc['Apple'])

This ends up with a table indexed by parent, with each parent's children grouped into a list in a single cell. You can then do whatever you want with each list, or convert the pandas DataFrame (a table) to whatever other structure (list, dict, ...) suits your case best.
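
If you want the two nested loops from the question, one possible sketch is to iterate over the groups directly instead of aggregating them. The sample data below is hypothetical and only stands in for the real file; the column names `parent` and `child` are the same assumption made in the code above, and the `visited` list is just an illustrative stand-in for whatever the loop bodies should do.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for the real CSV file. The parent
# cell is filled only on the first row of each group, as in the question.
csv_text = """parent,child
Apple,Fuji
,Gala
,Honeycrisp
Banana,Cavendish
,Plantain
"""

df = pd.read_csv(io.StringIO(csv_text))

# Forward-fill the empty parent cells so every row knows its parent
df['parent'] = df['parent'].ffill()

visited = []

# Outer loop: one iteration per parent (sort=False keeps the file order)
for parent, group in df.groupby('parent', sort=False):
    # Inner loop: one iteration per child of the current parent
    for child in group['child']:
        visited.append((parent, child))
        print(f"{parent} -> {child}")
```

Each parent can have any number of children, because the inner loop simply runs over however many rows ended up in that parent's group.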
