I would like to fill a pandas DataFrame value by value.
I receive a series of dictionaries like:
LIST_OF_DICTS = [ {'name':'carlos','age':34}, {'name':'John', 'country':'USA'},{'name':'John', 'COLOR':'GREEN'},{'name':'Andrew', 'country':'USA','age':56}]
For that I loop over all the values of all the dicts
df = pd.DataFrame( columns = ['name' , 'age', 'city' , 'country'])
df.set_index('name')
I get an empty DataFrame. All fine.
When I try to add a single value I do as follows:
OPTION 1
df['carlos']['age']=34 # does not work because carlos is not part of the index yet
OPTION 2
df.loc['carlos','age']=34 # works, but then an empty column named 'name' appears
OPTION 3
df2 = pd.DataFrame( index=['name'],columns = [ 'age', 'city' , 'country'])
df2.loc['carlos','age']=34
# does not work: 'name' appears as a label in the index, not as the name of the index
EXPECTED RESULT:
# loop over the list of dicts and add values one by one
for d in LIST_OF_DICTS:
    myname = d['name'] # this is the index; all the dicts have a 'name' key
    for key in d.keys():
        df.loc[myname, key] = d[key]
The result should be a DataFrame that contains all the values of the dicts. The names in the index are unique, and no pair of values is repeated.
Some help?
@anky's answer is the one to go for.
LIST_OF_DICTS = [ {'name':'carlos','age':34}, {'name':'John', 'country':'USA'},{'name':'John', 'COLOR':'GREEN'},{'name':'Andrew', 'country':'USA','age':56}]
pd.DataFrame(LIST_OF_DICTS).set_index('name')
The nice thing here is that pandas fills in the DataFrame even if the dictionaries do not contain every key for every row. The missing values come out as NaN, so if you prefer empty strings, remember to do:
myDF = myDF.fillna('')
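As a minimal sketch of that behaviour, the snippet below builds the DataFrame from a shortened version of the list and counts the missing cells before and after `fillna('')`. The variable names `missing_before`, `filled` and `missing_after` are only illustrative and are not part of the original answer.

```python
import pandas as pd

# Shortened version of the list above; no dict carries every key.
LIST_OF_DICTS = [
    {'name': 'carlos', 'age': 34},
    {'name': 'John', 'country': 'USA'},
    {'name': 'Andrew', 'country': 'USA', 'age': 56},
]

# One row per dict; keys missing from a dict become NaN in that row.
myDF = pd.DataFrame(LIST_OF_DICTS).set_index('name')

# carlos has no country and John has no age, so two cells are NaN.
missing_before = int(myDF.isna().sum().sum())

# Replace every NaN with an empty string (returns a new DataFrame).
filled = myDF.fillna('')
missing_after = int(filled.isna().sum().sum())

print(missing_before, missing_after)
print(filled)
```

Keep in mind that putting empty strings into a numeric column such as age turns it into an object column, so if you still need to do arithmetic on it you may prefer to leave the NaN values in place.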
This is very convenient if, for instance, you are putting together data from heterogeneous API responses. You collect the data in dictionaries keyed by a particular "index" field, in this case name.
Warnings: a) If names are repeated (two dictionaries with the same name), you get one row for each of them, so the index values are no longer unique. b) If one of the dictionaries does not have a name, you get NaN in the index. You have to take both cases into account.
LIST_OF_DICTS = [ {'name':'carlos','age':34}, {'name':'John', 'country':'USA'},{'name':'John', 'COLOR':'GREEN'},{'name':'Andrew', 'country':'USA','age':56}, {'country':'Mexico','age':3456},{'name':'Andrew', 'country':'Germany','age':56}]
myDF= pd.DataFrame(LIST_OF_DICTS).set_index('name')
myDF
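If you need a unique index, one possible way to deal with both warnings is sketched below. It is not part of the original answer, and it assumes two things you may want to change: rows without a name are simply dropped, and when a name is repeated the first non-missing value of each column wins. The names `raw`, `named` and `merged` are hypothetical.

```python
import pandas as pd

LIST_OF_DICTS = [
    {'name': 'carlos', 'age': 34},
    {'name': 'John', 'country': 'USA'},
    {'name': 'John', 'COLOR': 'GREEN'},
    {'name': 'Andrew', 'country': 'USA', 'age': 56},
    {'country': 'Mexico', 'age': 3456},
    {'name': 'Andrew', 'country': 'Germany', 'age': 56},
]

raw = pd.DataFrame(LIST_OF_DICTS)

# Assumption: a dict without a name cannot be indexed, so drop its row.
named = raw.dropna(subset=['name'])

# Assumption: for repeated names, keep the first non-missing value per column.
# sort=False keeps the names in order of first appearance.
merged = named.groupby('name', sort=False).first()

print(merged)
```

With this choice the two John rows collapse into one row that carries both country and COLOR, and Andrew keeps USA because it appears before Germany. If the later value should win instead, `last()` can be used in place of `first()`.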