
How to convert a pandas dataframe into a list of multiple NamedTuple

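I am working on some code where I need to map multiple NamedTuples into a list. The code sample is below. My main question is about mapping the list of paired NamedTuples, PeopleName and PeopleAge; it is not clear to me how to do this. Could it be done in two steps: 1/ extract the whole row into a generic NamedTuple, then 2/ split that record into the separate NamedTuples PeopleName and PeopleAge?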

from typing import NamedTuple, List

import pandas as pd

data = [["tom", 10, "ab 11"], ["nick", 15, "ab 22"], ["juli", 14, "ab 11"]]
people = pd.DataFrame(data, columns=["Name", "Age", "PostalCode"])

PeopleName = NamedTuple("PeopleName", [("Name", str)])
PeopleAge = NamedTuple("PeopleAge", [("Age", int)])
PeoplePC = NamedTuple("PeoplePC", [("PostalCode", str)])

# The code below is not correct
Demography = NamedTuple(
    "Demography", [("names", List[(PeopleName, PeopleAge)]), ("postalcodes", PeoplePC)],
)


def to_nested_tuple(k, g):
    peoples = list(
        g["Name"].to_frame().itertuples(name="Person", index=False),
        # rec["Age"].to_frame().itertuples(name="PeopleAge", index=False),
    )
    return Demography(peoples, PeoplePC(k))


d = [to_nested_tuple(*item) for item in people.groupby("PostalCode")]

print(d)

The annotation List[(PeopleName, PeopleAge)] raises TypeError: Too many parameters for typing.List; actual 2, expected 1

A tuple holding 2 different types should itself be annotated with typing.Tuple:

List[Tuple[PeopleName, PeopleAge]]
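To see why the original annotation fails, here is a minimal sketch that reproduces the error and then builds the corrected type. The names `raised`, `Pair` and `ListOfPairs` are only illustrative, and the exact wording of the error message can differ between Python versions.

```python
from typing import List, NamedTuple, Tuple

PeopleName = NamedTuple("PeopleName", [("Name", str)])
PeopleAge = NamedTuple("PeopleAge", [("Age", int)])

# List accepts exactly one type parameter. Subscripting it with a
# 2-tuple is read as two parameters, so the subscription itself fails.
try:
    List[(PeopleName, PeopleAge)]
    raised = False
except TypeError as exc:
    raised = True
    print("TypeError:", exc)

# Wrapping the pair in Tuple[...] gives List a single element type.
Pair = Tuple[PeopleName, PeopleAge]  # illustrative alias
ListOfPairs = List[Pair]             # illustrative alias
print(ListOfPairs)
```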

However, when annotating parameters it is preferable to use an abstract collection type such as Sequence or Iterable:

Demography = NamedTuple(
    "Demography", [("names", Sequence[Tuple[PeopleName, PeopleAge]]), ("postalcodes", PeoplePC)],
)

Rather than applying to_nested_tuple to each group, I would build the list directly like this:

d = [Demography([(PeopleName(row['Name']), PeopleAge(row['Age'])) for _, row in group.iterrows()], PeoplePC(k))
     for k, group in people.groupby("PostalCode")] 

Now the result prints as:

[Demography(names=[(PeopleName(Name='tom'), PeopleAge(Age=10)), (PeopleName(Name='juli'), PeopleAge(Age=14))], postalcodes=PeoplePC(PostalCode='ab 11')),
 Demography(names=[(PeopleName(Name='nick'), PeopleAge(Age=15))], postalcodes=PeoplePC(PostalCode='ab 22'))]
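The two-step idea from the question (take each row as a generic named tuple, then split it into PeopleName and PeopleAge) can be sketched roughly as follows. This is a variant rather than the answer's own code: it swaps iterrows for itertuples, and `to_demography` is a hypothetical helper name introduced only for this example.

```python
from typing import NamedTuple, Sequence, Tuple

import pandas as pd

PeopleName = NamedTuple("PeopleName", [("Name", str)])
PeopleAge = NamedTuple("PeopleAge", [("Age", int)])
PeoplePC = NamedTuple("PeoplePC", [("PostalCode", str)])
Demography = NamedTuple(
    "Demography",
    [("names", Sequence[Tuple[PeopleName, PeopleAge]]), ("postalcodes", PeoplePC)],
)

data = [["tom", 10, "ab 11"], ["nick", 15, "ab 22"], ["juli", 14, "ab 11"]]
people = pd.DataFrame(data, columns=["Name", "Age", "PostalCode"])


def to_demography(postal_code, group):
    """Hypothetical helper: build one Demography record per postal code."""
    names = [
        # Step 1: itertuples yields each row as a generic named tuple.
        # Step 2: split that row into the two specific NamedTuples.
        (PeopleName(row.Name), PeopleAge(row.Age))
        for row in group.itertuples(index=False)
    ]
    return Demography(names, PeoplePC(postal_code))


# groupby sorts the group keys by default, so "ab 11" comes first.
d = [to_demography(k, g) for k, g in people.groupby("PostalCode")]
print(d)
```

Attribute access such as row.Name works here because the column names are valid Python identifiers; columns with spaces or other special characters would need a different lookup.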

Use list(df.itertuples()), where df is your DataFrame.
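As a small illustration of that suggestion, assuming the same sample data as in the question: by default itertuples includes the index as the first field, while `index=False` and `name=` (both used in the question's code) drop the index and rename the tuple class. Note that this gives one flat named tuple per row, not the nested Demography structure.

```python
import pandas as pd

data = [["tom", 10, "ab 11"], ["nick", 15, "ab 22"], ["juli", 14, "ab 11"]]
people = pd.DataFrame(data, columns=["Name", "Age", "PostalCode"])

# Default call: each row becomes a named tuple whose first field is Index.
rows_with_index = list(people.itertuples())
print(rows_with_index[0]._fields)

# Without the index, and with a custom class name for the tuples.
rows = list(people.itertuples(index=False, name="Person"))
first = rows[0]
print(first.Name, first.Age, first.PostalCode)
```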
