
Pandas DataFrame - find element-wise sums of lists in a column

I have a list of dicts like this:

source = [{'TGA': [0, 1, 0, 0], 'AAC': [0, 0, 0, 1], 'GAA': [0, 0, 1, 0], 
           'GTG': [1, 0, 0, 0]},{'TGA': [0, 1, 0, 0], 'AAC': [0, 0, 0, 1], 
           'GAA': [0, 0, 1, 0], 'GTG': [1, 0, 0, 0]} ]

I need to sum the lists in each column element-wise:

pandas.DataFrame(source)
        AAC           GAA           GTG           TGA
  0  [0, 0, 0, 1]  [0, 0, 1, 0]  [1, 0, 0, 0]  [0, 1, 0, 0]
  1  [0, 0, 0, 1]  [0, 0, 1, 0]  [1, 0, 0, 0]  [0, 1, 0, 0]

The final result should be:

         AAC           GAA           GTG           TGA
    sum  [0, 0, 0, 2]  [0, 0, 2, 0]  [2, 0, 0, 0]  [0, 2, 0, 0]

How can I do this?

You can use this to sum a list of dicts of lists, without pandas:

source = [{'TGA': [0, 1, 0, 0], 'AAC': [0, 0, 0, 1], 'GAA': [0, 0, 1, 0],
           'GTG': [1, 0, 0, 0]},{'TGA': [0, 1, 0, 0], 'AAC': [0, 0, 0, 1],
           'GAA': [0, 0, 1, 0], 'GTG': [1, 0, 0, 0]} ]

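res = {}
for d in source:
    for key, value in d.items():
        if key not in res:
            # First time we see this key: start from a copy of its list
            res[key] = list(value)
        else:
            # Add the new list to the running total, position by position
            res[key] = [x + y for x, y in zip(res[key], value)]

print(res)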
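
As a more compact variant of the loop above, the same totals can be sketched with zip. This is only an illustration, and it assumes that every dict in source has the same keys and that all lists under a key have the same length; the name totals is introduced here and is not from the original answer.

```python
source = [{'TGA': [0, 1, 0, 0], 'AAC': [0, 0, 0, 1], 'GAA': [0, 0, 1, 0],
           'GTG': [1, 0, 0, 0]}, {'TGA': [0, 1, 0, 0], 'AAC': [0, 0, 0, 1],
           'GAA': [0, 0, 1, 0], 'GTG': [1, 0, 0, 0]}]

# Assumption: all dicts share the keys of the first one.
# zip(*lists) groups the values found at the same position across rows,
# so sum(column) is the element-wise total for that position.
totals = {
    key: [sum(column) for column in zip(*(d[key] for d in source))]
    for key in source[0]
}

print(totals)
```

If the dicts can have different keys, the explicit loop is the safer choice, because it only touches keys that are actually present.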

You can convert the entries to numpy arrays and then sum, since arrays add element-wise:

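>>> import numpy as np
>>> import pandas as pd

>>> df = pd.DataFrame(source)
>>> # applymap is deprecated since pandas 2.1; use df.map(np.array) there
>>> df.applymap(np.array).sum()
AAC    [0, 0, 0, 2]
GAA    [0, 0, 2, 0]
GTG    [2, 0, 0, 0]
TGA    [0, 2, 0, 0]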
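
To get the one-row "sum" frame the question asks for, one possible sketch stacks each column's lists into a 2-D array and sums down the rows. It assumes all lists in a column have the same length; the names summed and result are made up for this example.

```python
import numpy as np
import pandas as pd

source = [{'TGA': [0, 1, 0, 0], 'AAC': [0, 0, 0, 1], 'GAA': [0, 0, 1, 0],
           'GTG': [1, 0, 0, 0]}, {'TGA': [0, 1, 0, 0], 'AAC': [0, 0, 0, 1],
           'GAA': [0, 0, 1, 0], 'GTG': [1, 0, 0, 0]}]

df = pd.DataFrame(source)

# For each column, turn the cells into a list of lists, sum along axis 0
# (down the rows), and convert the result back to a plain Python list.
summed = {col: np.sum(df[col].tolist(), axis=0).tolist() for col in df.columns}

# Wrap the dict in a list so each summed list becomes one cell of a single row.
result = pd.DataFrame([summed], index=['sum'])

print(result)
```

This avoids applymap entirely, so it should not depend on which pandas version renamed that method.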
