
How to convert a flat list of items to a pandas DataFrame

I am currently trying to create a pandas DataFrame from results fetched from a database. The data is most efficiently retrieved from the DB in this form:

(
("First", datetime.date(2014,10,5), 1.1),
("First", datetime.date(2014,10,4), 1.2),
("First", datetime.date(2014,10,3), 1.3),
("First", datetime.date(2014,10,2), 1.4),
("Second", datetime.date(2014,10,5), 2.1),
("Second", datetime.date(2014,10,4), 2.2),
("Second", datetime.date(2014,10,3), 2.3),
("Second", datetime.date(2014,10,2), 2.4),
("Third", datetime.date(2014,10,5), 3.1),
("Third", datetime.date(2014,10,4), 3.2),
("Third", datetime.date(2014,10,3), 3.3),
("Third", datetime.date(2014,10,2), 3.4),
)

The goal is for the first value in each row to become the DataFrame column, the second value to become the index, and the third value to become the cell value. For example:

                          First     Second    Third
datetime.date(2014,10,5)  1.1       2.1       3.1
datetime.date(2014,10,4)  1.2       2.2       3.2
datetime.date(2014,10,3)  1.3       2.3       3.3
datetime.date(2014,10,2)  1.4       2.4       3.4

Any thoughts on a quick way to transform this data? I am new to pandas, and a bit stuck.

df.pivot can move the values of one column (here the first column) into the column labels, and the values of another column (here the dates) into the index:

import datetime as DT
import pandas as pd

data = [("First", DT.date(2014, 10, 5), 1.1),
        ("First", DT.date(2014, 10, 4), 1.2),
        ("First", DT.date(2014, 10, 3), 1.3),
        ("First", DT.date(2014, 10, 2), 1.4),
        ("Second", DT.date(2014, 10, 5), 2.1),
        ("Second", DT.date(2014, 10, 4), 2.2),
        ("Second", DT.date(2014, 10, 3), 2.3),
        ("Second", DT.date(2014, 10, 2), 2.4),
        ("Third", DT.date(2014, 10, 5), 3.1),
        ("Third", DT.date(2014, 10, 4), 3.2),
        ("Third", DT.date(2014, 10, 3), 3.3),
        ("Third", DT.date(2014, 10, 2), 3.4), ]

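# Name the three fields so that pivot can refer to them.
df = pd.DataFrame(data, columns=['cols', 'date', 'val'])
# Spread the 'cols' values across the columns and index the rows by 'date'.
df = df.pivot(columns='cols', index='date')
# Without a `values` argument the columns are a MultiIndex ('val', <name>);
# drop the outer 'val' level to leave plain column names.
df.columns = df.columns.droplevel(0)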

print(df)

yields

cols        First  Second  Third
date
2014-10-02    1.4     2.4    3.4
2014-10-03    1.3     2.3    3.3
2014-10-04    1.2     2.2    3.2
2014-10-05    1.1     2.1    3.1
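
Two variations on the approach above may be worth noting; this is a rough sketch rather than part of the original answer. Passing values='val' to pivot should make the droplevel step unnecessary, and sort_index(ascending=False) puts the newest date first, as in the layout the question asks for. If the database can return the same (name, date) pair more than once, pivot raises a ValueError, and pivot_table with an aggregation function is one way around that. The helper names rows_to_wide and rows_to_wide_agg are hypothetical, and averaging duplicates is only an assumption about what would be wanted.

```python
import datetime as DT
import pandas as pd

# A shortened version of the rows from the question: (column label, date, value).
data = [
    ("First", DT.date(2014, 10, 5), 1.1),
    ("First", DT.date(2014, 10, 4), 1.2),
    ("Second", DT.date(2014, 10, 5), 2.1),
    ("Second", DT.date(2014, 10, 4), 2.2),
]


def rows_to_wide(rows):
    """Pivot (label, date, value) rows into one column per label (hypothetical helper)."""
    df = pd.DataFrame(rows, columns=["cols", "date", "val"])
    # Naming the values column keeps the resulting columns a flat index.
    wide = df.pivot(index="date", columns="cols", values="val")
    # Newest date first, matching the layout requested in the question.
    return wide.sort_index(ascending=False)


def rows_to_wide_agg(rows, aggfunc="mean"):
    """Like rows_to_wide, but tolerates repeated (label, date) pairs.

    Averaging the duplicates is an assumption; pick whatever aggregation fits the data.
    """
    df = pd.DataFrame(rows, columns=["cols", "date", "val"])
    wide = df.pivot_table(index="date", columns="cols", values="val", aggfunc=aggfunc)
    return wide.sort_index(ascending=False)


wide = rows_to_wide(data)
print(wide)

# One extra row that repeats an existing (label, date) pair.
duplicated = data + [("First", DT.date(2014, 10, 5), 1.3)]
print(rows_to_wide_agg(duplicated))
```

The index here still holds plain datetime.date objects. Converting the 'date' column with pd.to_datetime before pivoting would give a DatetimeIndex instead, which is usually more convenient for date-based slicing and resampling.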

