I am working in Python 2.7 with a data set that has yearly data as well as lifelong data. I have a dictionary that stores the lifelong data along with an inner DataFrame of yearly data, so it looks something like:
Bear1
{'color':'brown',
'grown_size':'7ft',
'stats': df1}
where the dataframe 'df1' is built like the following:
meals  children  territory
    4         5          8
    2         4          6
    5         2          7
I would like to get a rectangular dataframe in which each row holds a different year's data together with all of the lifelong stats, so this would become something like:
color  grown_size  meals  children  territory
brown         7ft      4         5          8
brown         7ft      2         4          6
brown         7ft      5         2          7
I assume that this would need something like the Series.repeat() method in pandas, although this has yet to work for me. What would be the fastest way of accomplishing this? There are many such bears, with varying ages.
EDIT: Unfortunately I found a problem with my question. The yearly data is already inside of a dataframe, not inside of a dictionary!
I have tried the following code for this:
pd.DataFrame.from_dict(bears['bear1'])
with 'bears['bear1']' being the dictionary posted above, but I am receiving the following message:
File "<stdin>", line 1, in <module>
File "/Users/masongardner/Library/Python/2.7/lib/python/site-packages/pandas/core/frame.py", line 226, in __init__
mgr = self._init_dict(data, index, columns, dtype=dtype)
File "/Users/masongardner/Library/Python/2.7/lib/python/site-packages/pandas/core/frame.py", line 363, in _init_dict
dtype=dtype)
File "/Users/masongardner/Library/Python/2.7/lib/python/site-packages/pandas/core/frame.py", line 5158, in _arrays_to_mgr
index = extract_index(arrays)
File "/Users/masongardner/Library/Python/2.7/lib/python/site-packages/pandas/core/frame.py", line 5197, in extract_index
ValueError: If using all scalar values, you must pass an index
Thanks!
Use from_dict:
In [20]:
d={'color':'brown',
'grown_size':'7ft',
'stats': {2007:[1,5,7,2],
2008:[5,3,4,5],
2009:[5,2,6,7]}
}
pd.DataFrame.from_dict(d)
Out[20]:
color grown_size stats
2007 brown 7ft [1, 5, 7, 2]
2008 brown 7ft [5, 3, 4, 5]
2009 brown 7ft [5, 2, 6, 7]
pd.DataFrame(d) will also work.
EDIT
Here is a simple way to get what you want for one bear.
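# recreating your data
from pandas import DataFrame

d = {'meals': [4, 2, 5], 'children': [5, 4, 2], 'territory': [8, 6, 7]}
bear1 = {'color': 'brown',
         'grown_size': '7ft',
         'stats': DataFrame(d)}

def bear_to_df(bear_dict):
    # copy so the bear's own 'stats' frame is not modified in place
    df = bear_dict['stats'].copy()
    for k, v in bear_dict.items():
        # every key except 'stats' holds a lifelong scalar;
        # assigning it to a column repeats it on every yearly row
        if k != 'stats':
            df[k] = v
    return df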
In [32]: bear_to_df(bear1)
Out[32]:
children meals territory color grown_size
0 5 4 8 brown 7ft
1 4 2 6 brown 7ft
2 2 5 7 brown 7ft
How many bears do you have? If you want to concatenate all your bears' data in the same DataFrame, use pandas.concat.
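As a rough sketch of the pandas.concat step, assuming your bears live in a dict of dicts keyed by name: the second bear, its values, and the extra 'bear' column below are made up for illustration and are not from your data. bear_to_df is repeated here (working on a copy) so the snippet runs on its own.

```python
import pandas as pd

# Hypothetical sample data: two bears with different numbers of yearly rows.
bears = {
    'bear1': {'color': 'brown', 'grown_size': '7ft',
              'stats': pd.DataFrame({'meals': [4, 2, 5],
                                     'children': [5, 4, 2],
                                     'territory': [8, 6, 7]})},
    'bear2': {'color': 'black', 'grown_size': '6ft',
              'stats': pd.DataFrame({'meals': [3, 6],
                                     'children': [1, 0],
                                     'territory': [5, 9]})},
}

def bear_to_df(bear_dict):
    # Work on a copy so the original 'stats' frame is left untouched.
    df = bear_dict['stats'].copy()
    for k, v in bear_dict.items():
        if k != 'stats':
            df[k] = v  # scalar is repeated on every yearly row
    return df

frames = []
for name, bear in bears.items():
    frame = bear_to_df(bear)
    frame['bear'] = name  # assumed extra column to tell the bears apart
    frames.append(frame)

# Stack all bears into one rectangular frame with a fresh 0..n-1 index.
all_bears = pd.concat(frames, ignore_index=True)
print(all_bears)
```

Building the list of frames first and calling concat once is generally cheaper than appending to a growing DataFrame inside the loop, which matters if you have many bears.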