Currently, I have data in the following form:
[('ab', {'a': ['apple1'], 'b': ['ball1']}), ('cd', {'a': ['apple2'], 'b': ['ball2']})]
That is, its type is List[Tuple[Any, Dict[str, List]]].
The goal is to create a pandas data frame in the following form:
start  a       b
ab     apple1  ball1
cd     apple2  ball2
I have tried to do it the following way:
df = pd.DataFrame(columns=['start', 'a', 'b'])
for start, details in mylist:
    df = df.append({'start': start}, ignore_index=True)
    df = df.append({'a': details['a']}, ignore_index=True)
    df = df.append({'b': details['b']}, ignore_index=True)
I'm trying to figure out an optimized way to do this.
pd.DataFrame.from_dict
Pandas works well with a dictionary or a list of dictionaries. You have something in between. In this case, converting to a dictionary is trivial:
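import pandas as pd

L = [('ab', {'a': ['apple1'], 'b': ['ball1']}),
     ('cd', {'a': ['apple2'], 'b': ['ball2']})]
# dict(L) maps each label to its details dict; orient='index' turns the labels into the row index
res = pd.DataFrame.from_dict(dict(L), orient='index')
# each cell holds a one-element list, so take its first item
res = res.apply(lambda x: x.str[0])
print(res)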
         a      b
ab  apple1  ball1
cd  apple2  ball2
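If `start` should be an ordinary column, as in the question's target layout, one option is to flatten each tuple into a single row dict and build the frame in one call. This is only a sketch: it assumes every list holds exactly one element, and the name `rows` is illustrative. It also avoids `DataFrame.append`, which was removed in pandas 2.0.

```python
import pandas as pd

L = [('ab', {'a': ['apple1'], 'b': ['ball1']}),
     ('cd', {'a': ['apple2'], 'b': ['ball2']})]

# One flat dict per row: the tuple's first element becomes 'start',
# and each single-element list is unwrapped to its only item.
rows = [{'start': start, **{key: values[0] for key, values in details.items()}}
        for start, details in L]

# Build the DataFrame once instead of appending row by row.
df = pd.DataFrame(rows, columns=['start', 'a', 'b'])
print(df)
```

Building the list of rows first and constructing the DataFrame once is generally cheaper than growing a DataFrame inside a loop, because each row-wise append copies the whole frame.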
Like this:
import pandas as pd

form = [('ab', {'a': ['apple1'], 'b': ['ball1']}),
        ('cd', {'a': ['apple2'], 'b': ['ball2']})]
# separate 'start' from rest of data - inverse zip
start, data = zip(*form)
# create dataframe
df = pd.DataFrame(list(data))
# remove data from lists in each cell
# (applymap is deprecated since pandas 2.1 in favor of DataFrame.map)
df = df.applymap(lambda l: l[0])
# put the labels in as the first column
df.insert(loc=0, column='start', value=start)
print(df)
   start       a      b
0     ab  apple1  ball1
1     cd  apple2  ball2
Or, if you want start to be the index of the DataFrame:
# separate 'start' from rest of data - inverse zip
index, data = zip(*form)
# create dataframe
df = pd.DataFrame(list(data), index=index)
df.index.name = 'start'
# remove data from lists in each cell
df = df.applymap(lambda l: l[0])
print(df)
            a      b
start
ab     apple1  ball1
cd     apple2  ball2
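On recent pandas versions, `DataFrame.applymap` is deprecated (since 2.1) in favor of `DataFrame.map`. A version-tolerant variant of the index-based approach might look like the following sketch. The name `elementwise` is illustrative, and the code again assumes one element per list.

```python
import pandas as pd

form = [('ab', {'a': ['apple1'], 'b': ['ball1']}),
        ('cd', {'a': ['apple2'], 'b': ['ball2']})]

# Separate the labels from the detail dicts (inverse zip).
index, data = zip(*form)

# Name the index at construction time instead of assigning it afterwards.
df = pd.DataFrame(list(data), index=pd.Index(index, name='start'))

# Prefer DataFrame.map where it exists (pandas 2.1+); fall back to applymap otherwise.
elementwise = getattr(df, 'map', None) or df.applymap
df = elementwise(lambda cell: cell[0])
print(df)
```

Calling `df.reset_index()` afterwards turns `start` back into a regular column if that layout is preferred.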