I have three lists of tuples, and the first element of each tuple is a year, as shown below.
list1 = [
('2010', 1783675.0), ('2011', 1815815.0), ('2012', 1633258.0), ('2013', 1694062.0), ('2014', 1906527.0),
('2015', 1908661.0), ('2016', 2492979.0), ('2017', 2846997.0), ('2018', 2930313.0), ('2019', 2654724.0)
]
list2 = [
('2010', 302816.0), ('2011', 229549.0), ('2012', 323063.0), ('2013', 285066.0), ('2014', 282003.0),
('2015', 354500.0), ('2016', 275383.0), ('2017', 322074.0), ('2018', 366909.0), ('2019', 297942.0)
]
list3 = [
('2010', 149036.0), ('2011', 144112.0), ('2012', 173944.0), ('2013', 205724.0), ('2014', 214019.0),
('2015', 261462.0), ('2016', 260646.0), ('2017', 279267.0), ('2018', 288120.0), ('2019', 277106.0)
]
I want to create a pandas.DataFrame using those lists, setting the year as the row index:
          list1     list2     list3
2010  1783675.0  302816.0  149036.0
2011  1815815.0  229549.0  144112.0
2012  1633258.0  323063.0  173944.0
2013  1694062.0  285066.0  205724.0
2014  1906527.0  282003.0  214019.0
2015  1908661.0  354500.0  261462.0
2016  2492979.0  275383.0  260646.0
2017  2846997.0  322074.0  279267.0
2018  2930313.0  366909.0  288120.0
2019  2654724.0  297942.0  277106.0
You can create a new DataFrame for each list and combine them with the merge method, joining on the year column.
import pandas as pd
list1 = [('2010', 1783675.0), ('2011', 1815815.0), ('2012', 1633258.0), ('2013', 1694062.0),
('2014', 1906527.0), ('2015', 1908661.0), ('2016', 2492979.0), ('2017', 2846997.0), ('2018', 2930313.0),
('2019', 2654724.0)]
list2 = [('2010', 302816.0), ('2011', 229549.0), ('2012', 323063.0), ('2013', 285066.0),
('2014', 282003.0), ('2015', 354500.0), ('2016', 275383.0), ('2017', 322074.0), ('2018', 366909.0),
('2019', 297942.0)]
list3 = [('2010', 149036.0), ('2011', 144112.0), ('2012', 173944.0), ('2013', 205724.0),
('2014', 214019.0), ('2015', 261462.0), ('2016', 260646.0), ('2017', 279267.0), ('2018', 288120.0),
('2019', 277106.0)]
df = (pd.DataFrame(data=list1, columns=["year", "list1"])
      .merge(pd.DataFrame(data=list2, columns=["year", "list2"]), on="year")
      .merge(pd.DataFrame(data=list3, columns=["year", "list3"]), on="year")
      .set_index("year"))  # use the year as the row index, as the question asks
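With more than a handful of lists, the chained merge calls could be folded with functools.reduce instead of being written out by hand. This is only a minimal sketch and not part of the original answer: the named_lists dictionary and the shortened sample data (first three years only) are assumptions made for illustration.

```python
from functools import reduce

import pandas as pd

# Shortened sample data: the first three years of each list from the question.
list1 = [('2010', 1783675.0), ('2011', 1815815.0), ('2012', 1633258.0)]
list2 = [('2010', 302816.0), ('2011', 229549.0), ('2012', 323063.0)]
list3 = [('2010', 149036.0), ('2011', 144112.0), ('2012', 173944.0)]

# Hypothetical helper mapping: column name -> list of (year, value) tuples.
named_lists = {"list1": list1, "list2": list2, "list3": list3}

# One two-column DataFrame per list: "year" plus the value column.
frames = [pd.DataFrame(data, columns=["year", name])
          for name, data in named_lists.items()]

# Merge the frames pairwise on "year", then move "year" into the index.
merged = reduce(lambda left, right: left.merge(right, on="year"), frames)
merged = merged.set_index("year")
```

Note that merge defaults to an inner join, so a year missing from any one list would be dropped from the result; passing how="outer" keeps such years and fills the gaps with NaN.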
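Another option, in addition to the answers already provided: Python's defaultdict can simplify collecting the data into one dictionary before reading it into a DataFrame. The values for each year are appended in the order the lists are chained, so this relies on every year appearing in all three lists: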
from collections import defaultdict
from itertools import chain
import pandas as pd

# Chain the lists into one sequence of (year, value) pairs,
# then collect all values that share a year into one list:
d = defaultdict(list)
for k, v in chain(list1, list2, list3):
    d[k].append(v)

# Read the data into a pandas DataFrame, one row per year:
df = pd.DataFrame.from_dict(d, orient='index', columns=['list1', 'list2', 'list3'])
          list1     list2     list3
2010  1783675.0  302816.0  149036.0
2011  1815815.0  229549.0  144112.0
2012  1633258.0  323063.0  173944.0
2013  1694062.0  285066.0  205724.0
2014  1906527.0  282003.0  214019.0
2015  1908661.0  354500.0  261462.0
2016  2492979.0  275383.0  260646.0
2017  2846997.0  322074.0  279267.0
2018  2930313.0  366909.0  288120.0
2019  2654724.0  297942.0  277106.0
You can iterate over the lists and create a dictionary in the correct format, and then turn that into a DataFrame. Note that this assumes ordered lists, with the same years in each list.
import pandas as pd
list1 = [('2010', 1783675.0), ('2011', 1815815.0), ('2012', 1633258.0),
('2013', 1694062.0), ('2014', 1906527.0), ('2015', 1908661.0),
('2016', 2492979.0), ('2017', 2846997.0), ('2018', 2930313.0),
('2019', 2654724.0)]
list2 = [('2010', 302816.0), ('2011', 229549.0), ('2012', 323063.0),
('2013', 285066.0), ('2014', 282003.0), ('2015', 354500.0),
('2016', 275383.0), ('2017', 322074.0), ('2018', 366909.0),
('2019', 297942.0)]
list3 = [('2010', 149036.0), ('2011', 144112.0), ('2012', 173944.0),
('2013', 205724.0), ('2014', 214019.0), ('2015', 261462.0),
('2016', 260646.0), ('2017', 279267.0), ('2018', 288120.0),
('2019', 277106.0)]
df_dict = {}
years = [el[0] for el in list1]
df_dict["list1"] = [el[1] for el in list1]
df_dict["list2"] = [el[1] for el in list2]
df_dict["list3"] = [el[1] for el in list3]
df = pd.DataFrame(df_dict, index=years)
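Because this approach relies on the lists being ordered and containing the same years, it may be worth checking that assumption before building the DataFrame. The following is a minimal sketch, not part of the original answer: the lists dictionary, the shortened sample data, and the choice to raise ValueError are all illustrative assumptions.

```python
import pandas as pd

# Shortened sample data: the first three years of each list from the question.
list1 = [('2010', 1783675.0), ('2011', 1815815.0), ('2012', 1633258.0)]
list2 = [('2010', 302816.0), ('2011', 229549.0), ('2012', 323063.0)]
list3 = [('2010', 149036.0), ('2011', 144112.0), ('2012', 173944.0)]

# Hypothetical helper mapping: column name -> list of (year, value) tuples.
lists = {"list1": list1, "list2": list2, "list3": list3}

# Take the years from the first list as the reference ordering.
years = [year for year, _ in list1]

# Fail early if any list has different years or a different order.
for name, data in lists.items():
    if [year for year, _ in data] != years:
        raise ValueError(f"{name} does not match the years of list1")

# Safe to build the columns positionally now.
df = pd.DataFrame(
    {name: [value for _, value in data] for name, data in lists.items()},
    index=years,
)
```

If the check fails for real data, a label-based approach such as merge or concat is probably the safer choice, since those align on the year rather than on position.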
Another solution is to use pandas.concat on pandas.Series objects built in a for loop. The code is as follows:
series = []
for lst, name in [(list1, 'list1'), (list2, 'list2'), (list3, 'list3')]:
    # Each Series is indexed by year and named after its list.
    series.append(pd.Series({tup[0]: tup[1] for tup in lst}, name=name))
df = pd.concat(series, axis=1)
And the result looks like this:
>>> print(df)
          list1     list2     list3
2010  1783675.0  302816.0  149036.0
2011  1815815.0  229549.0  144112.0
2012  1633258.0  323063.0  173944.0
2013  1694062.0  285066.0  205724.0
2014  1906527.0  282003.0  214019.0
2015  1908661.0  354500.0  261462.0
2016  2492979.0  275383.0  260646.0
2017  2846997.0  322074.0  279267.0
2018  2930313.0  366909.0  288120.0
2019  2654724.0  297942.0  277106.0
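One property of this approach that may be useful: pd.concat with axis=1 aligns the Series on their index labels, so the lists do not need to be in the same order or cover the same years. The sketch below uses hypothetical shortened data in which list2 has no entry for 2011; it is an illustration under that assumption, not output from the original answer.

```python
import pandas as pd

# Hypothetical shortened data: list2 is deliberately missing the year 2011.
list1 = [('2010', 1783675.0), ('2011', 1815815.0), ('2012', 1633258.0)]
list2 = [('2010', 302816.0), ('2012', 323063.0)]

# dict() turns each list of (year, value) tuples into a year -> value mapping,
# which pd.Series then uses as index -> value.
series = [
    pd.Series(dict(list1), name='list1'),
    pd.Series(dict(list2), name='list2'),
]

# Columns are aligned by year; the missing 2011 value in list2 becomes NaN.
df = pd.concat(series, axis=1)
missing = df['list2'].isna()
```

Here missing is a boolean Series marking the years for which list2 had no value, which can be handy for spotting gaps before further processing.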