How to convert a pandas dataframe into a list of multiple NamedTuple

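I am working on code in which I need to map several NamedTuples into a list. A code sample is below. My main problem is mapping the list of paired NamedTuples, PeopleName and PeopleAge; it is not clear to me how to do this. Should this be done in two steps: 1/ extract the whole row into a generic NamedTuple, then 2/ split the record into the separate NamedTuples PeopleName and PeopleAge?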

from typing import NamedTuple, List

import pandas as pd

data = [["tom", 10, "ab 11"], ["nick", 15, "ab 22"], ["juli", 14, "ab 11"]]
people = pd.DataFrame(data, columns=["Name", "Age", "PostalCode"])

PeopleName = NamedTuple("PeopleName", [("Name", str)])
PeopleAge = NamedTuple("PeopleAge", [("Age", int)])
PeoplePC = NamedTuple("PeoplePC", [("PostalCode", str)])

# The code below is not correct
Demography = NamedTuple(
    "Demography", [("names", List[(PeopleName, PeopleAge)]), ("postalcodes", PeoplePC)],
)


def to_nested_tuple(k, g):
    peoples = list(
        g["Name"].to_frame().itertuples(name="Person", index=False),
        # rec["Age"].to_frame().itertuples(name="PeopleAge", index=False),
    )
    return Demography(peoples, PeoplePC(k))


d = [to_nested_tuple(*item) for item in people.groupby("PostalCode")]

print(d)

The annotation List[(PeopleName, PeopleAge)] raises TypeError: Too many parameters for typing.List; actual 2, expected 1

A tuple holding 2 different types should itself be annotated with typing.Tuple:

List[Tuple[PeopleName, PeopleAge]]
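To make the difference concrete, here is a minimal sketch using only the standard library. It reuses PeopleName and PeopleAge from the question; the names bad_annotation_failed and NamePairs are introduced here purely for illustration. The exact wording of the error message may vary between Python versions, so the sketch only checks the exception type.

```python
from typing import List, NamedTuple, Tuple

PeopleName = NamedTuple("PeopleName", [("Name", str)])
PeopleAge = NamedTuple("PeopleAge", [("Age", int)])

# List accepts exactly one type parameter, so subscripting it with a
# pair (PeopleName, PeopleAge) is treated as two parameters and fails.
try:
    List[(PeopleName, PeopleAge)]
    bad_annotation_failed = False
except TypeError:
    bad_annotation_failed = True

# Wrapping the pair in Tuple hands List a single parameter.
NamePairs = List[Tuple[PeopleName, PeopleAge]]

print(bad_annotation_failed)
print(NamePairs)
```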

However, when annotating parameters it is preferable to use an abstract collection type such as Sequence or Iterable:

from typing import NamedTuple, Sequence, Tuple

Demography = NamedTuple(
    "Demography", [("names", Sequence[Tuple[PeopleName, PeopleAge]]), ("postalcodes", PeoplePC)],
)
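For comparison, the same definitions can be written with the class-based NamedTuple syntax, which some readers find easier to scan. This is only a sketch of an equivalent spelling, not a change to the approach; as_list and as_tuple are hypothetical names used to show that the Sequence annotation is not enforced at runtime, so both a list and a tuple are accepted.

```python
from typing import NamedTuple, Sequence, Tuple


class PeopleName(NamedTuple):
    Name: str


class PeopleAge(NamedTuple):
    Age: int


class PeoplePC(NamedTuple):
    PostalCode: str


class Demography(NamedTuple):
    # One (name, age) pair per person sharing the postal code.
    names: Sequence[Tuple[PeopleName, PeopleAge]]
    postalcodes: PeoplePC


# Annotations are hints only: a list and a tuple both work here.
as_list = Demography([(PeopleName("tom"), PeopleAge(10))], PeoplePC("ab 11"))
as_tuple = Demography(((PeopleName("tom"), PeopleAge(10)),), PeoplePC("ab 11"))

print(as_list)
print(Demography._fields)
```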

Rather than applying to_nested_tuple to each group, I would build the result directly as follows:

d = [Demography([(PeopleName(row['Name']), PeopleAge(row['Age'])) for _, row in group.iterrows()], PeoplePC(k))
     for k, group in people.groupby("PostalCode")] 

The result now prints as:

[Demography(names=[(PeopleName(Name='tom'), PeopleAge(Age=10)), (PeopleName(Name='juli'), PeopleAge(Age=14))], postalcodes=PeoplePC(PostalCode='ab 11')),
 Demography(names=[(PeopleName(Name='nick'), PeopleAge(Age=15))], postalcodes=PeoplePC(PostalCode='ab 22'))]
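The two-step idea raised in the question (pull each row out as a generic namedtuple, then split it into PeopleName and PeopleAge) can be sketched with itertuples in place of iterrows. This is one possible variant, not the only way to do it. The int(...) cast is an assumption added here so that Age is stored as a plain Python int whatever type pandas hands back.

```python
from typing import NamedTuple, Sequence, Tuple

import pandas as pd

data = [["tom", 10, "ab 11"], ["nick", 15, "ab 22"], ["juli", 14, "ab 11"]]
people = pd.DataFrame(data, columns=["Name", "Age", "PostalCode"])

PeopleName = NamedTuple("PeopleName", [("Name", str)])
PeopleAge = NamedTuple("PeopleAge", [("Age", int)])
PeoplePC = NamedTuple("PeoplePC", [("PostalCode", str)])
Demography = NamedTuple(
    "Demography",
    [("names", Sequence[Tuple[PeopleName, PeopleAge]]), ("postalcodes", PeoplePC)],
)

d = [
    Demography(
        # Step 1: itertuples yields one generic namedtuple per row.
        # Step 2: split that row into the two specific NamedTuples.
        [
            (PeopleName(row.Name), PeopleAge(int(row.Age)))  # int() cast is an assumption
            for row in group.itertuples(index=False)
        ],
        PeoplePC(k),
    )
    for k, group in people.groupby("PostalCode")
]

print(d)
```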

Use list(df.itertuples()), where df is your dataframe.
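As a short illustration of that suggestion on the question's data: by default itertuples names the row type "Pandas" and includes the index as the first field, so this sketch passes index=False and name="Person" (the name the question's own code uses) to get rows with only the three columns. Note that this yields one flat namedtuple per row rather than the nested Demography structure.

```python
import pandas as pd

data = [["tom", 10, "ab 11"], ["nick", 15, "ab 22"], ["juli", 14, "ab 11"]]
people = pd.DataFrame(data, columns=["Name", "Age", "PostalCode"])

# One namedtuple per row; fields are taken from the column names.
rows = list(people.itertuples(index=False, name="Person"))

for row in rows:
    print(row.Name, row.Age, row.PostalCode)
```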
