A consistent answer seems to be to avoid iterating over rows while working with Pandas. I'd like to understand how I can do so in the following case.
from dataclasses import dataclass
from typing import List

import pandas as pd

@dataclass
class Person:
    id: int
    name: str
    age: int

persons_df = pd.DataFrame(data={'id': [1, 2, 3], 'name': ['A', 'B', 'C'], 'age': [32, 44, '86']})
persons_list: List[Person] = []  # populate this list with Person objects, created from the dataframe above
# my approach is to use itertuples()
for row in persons_df.itertuples():
    person = Person(row.id, row.name, int(row.age))  # type: ignore
    persons_list.append(person)
I'd like to find an option that avoids looping over the rows explicitly and, if possible, has some type safety built in (so that the mypy ignore comment is no longer needed).
Thanks!
I am not sure if that's what you are looking for, but maybe this helps:
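import pandas as pd
df = pd.DataFrame(data={'id': [1, 2, 3], 'name': ['A', 'B', 'C'], 'age': [32, 44, '86']})
class Person:
    def __init__(self, lst):
        # df.apply(..., axis=1) passes each row as a pandas Series,
        # so read the values by position with .iloc
        self.id = lst.iloc[0]
        self.name = lst.iloc[1]
        self.age = lst.iloc[2]
df.apply(Person, axis=1).tolist()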
out:
[<__main__.Person at 0x176eee70608>,
<__main__.Person at 0x176eee704c8>,
<__main__.Person at 0x176eee70388>]
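The default repr above only shows memory addresses, so it is hard to tell whether the objects actually carry the row values. As a quick check, here is a minimal sketch of the same idea that prints each object's attributes. The parameter name `row` and the variable name `people` are my own choices for illustration, not part of the answer above.

```python
import pandas as pd


class Person:
    def __init__(self, row):
        # Each row arrives as a pandas Series; .iloc reads it by position.
        self.id = row.iloc[0]
        self.name = row.iloc[1]
        self.age = row.iloc[2]


df = pd.DataFrame(data={'id': [1, 2, 3], 'name': ['A', 'B', 'C'], 'age': [32, 44, '86']})
people = df.apply(Person, axis=1).tolist()

# vars() exposes the instance attributes that the default repr hides.
for person in people:
    print(vars(person))
```

Note that the third object's age is still the string '86': nothing in this approach converts or validates the values.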
I am adding a new answer because the title of the question is "map dataframe rows to a list of dataclass objects", and this has not been addressed yet.
To return dataclasses, we can slightly improve @Andreas's answer, without requiring an additional constructor that receives a list. We just have to use Python's unpacking operators (`*` and `**`).
I see two ways of mapping:
df.apply(lambda row: MyDataClass(**row), axis=1)
df.apply(lambda row: MyDataClass(*row), axis=1)
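The two forms behave differently when the column order does not match the field order of the dataclass. Here is a small sketch of that difference, assuming a `Person` dataclass like the one in the question and a dataframe whose columns are deliberately shuffled (the variable names `by_name` and `by_position` are just for illustration):

```python
from dataclasses import dataclass

import pandas as pd


@dataclass
class Person:
    id: int
    name: str
    age: int


# Columns deliberately in a different order than the dataclass fields.
df = pd.DataFrame({'age': [32, 44], 'id': [1, 2], 'name': ['A', 'B']})

# **row matches columns to fields by name, so column order is irrelevant.
by_name = df.apply(lambda row: Person(**row), axis=1).tolist()

# *row fills fields by position, so here 'age' lands in the 'id' field.
by_position = df.apply(lambda row: Person(*row), axis=1).tolist()

print(by_name[0])
print(by_position[0])
```

So `**row` is the safer choice when the column names match the field names, while `*row` only works if the columns are already in field order.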
Example:
from dataclasses import dataclass

@dataclass
class Person:
    id: int
    name: str
    age: int

import pandas
df = pandas.DataFrame(data={
    'id': [1, 2, 3],
    'name': ['A', 'B', 'C'],
    'age': [32, 44, '86']
})
persons = df.apply(lambda row: Person(*row), axis=1)
persons = df[['age', 'id', 'name']].apply(lambda row: Person(**row), axis=1)
print(type(persons))
print(persons)
<class 'pandas.core.series.Series'>
0      Person(id=1, name='A', age=32)
1      Person(id=2, name='B', age=44)
2    Person(id=3, name='C', age='86')
dtype: object
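Two things stand out in that output: the result is a Series rather than a list, and the last age is still the string '86', because a dataclass does not enforce its annotations at runtime. One possible way to deal with both, as a sketch only: convert the column dtypes first with `astype`, then build a plain list from `to_dict('records')`. The name `typed_df` is my own. This does not let mypy check the column types, since they are only known at runtime, but `Person(**record)` needs no ignore comment and `astype` raises an error if a value cannot be converted.

```python
from dataclasses import dataclass
from typing import List

import pandas as pd


@dataclass
class Person:
    id: int
    name: str
    age: int


df = pd.DataFrame(data={'id': [1, 2, 3], 'name': ['A', 'B', 'C'], 'age': [32, 44, '86']})

# Normalise the numeric columns once, up front; the string '86' becomes 86.
typed_df = df.astype({'id': int, 'age': int})

# to_dict('records') yields one {column: value} dict per row,
# which can be unpacked by name into the dataclass.
persons: List[Person] = [Person(**record) for record in typed_df.to_dict('records')]

print(persons)
```

This still visits every row once, as any approach that produces one Python object per row must; it just moves the loop out of `iterrows()`/`apply`.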
WARNINGS: