How to conduct row-wise operations between two pandas dataframes with nested string represented lists

Question

I'm looking for a pandas-friendly way of conducting row-wise operations between two nested string-represented lists in two dataframes. Here is my incomplete attempt:

import pandas as pd
import ast

df1 = pd.DataFrame({'id': [0, 1, 2],
                   'nested_ls': ["[20, 15, 5]", "[8, 7, 0]", "[124, 23, 43]"]})

df2 = pd.DataFrame({'id': [0, 1, 2],
                   'nested_ls': ["[10, 3, 2]", "[14, 7, 0]", "[100, 3, 20]"]})

df3 = pd.DataFrame()

# This is along the lines of what I need, but it evaluates the whole
# Series rather than the nested list in each row
df3['nested_ls_diff'] = ast.literal_eval(df1['nested_ls']) - ast.literal_eval(df2['nested_ls'])

# Throws - ValueError: malformed node or string: 0

The desired output would be a dataframe that looks like this:

df3 = pd.DataFrame({'id': [0, 1, 2],
                   'nested_ls_diff': ["[10, 12, 3]", "[-6, 0, 0]", "[24, 20, 23]"]})

Answer 1

Your code, ast.literal_eval(df1['nested_ls']), passes the whole Series to literal_eval, which expects a single string (hence the ValueError). That is not what you want. Instead, you want to parse each cell:

# this gives you a series of lists
df1['nested_ls'].apply(ast.literal_eval)

or better:

# this gives you a numpy array
pd.eval(df1['nested_ls'])

So this would work for you (though not ideal):

df3 = pd.DataFrame()

df3['nested_ls_diff'] = list(pd.eval(df1['nested_ls']) - pd.eval(df2['nested_ls']))

Note that each cell in df3['nested_ls_diff'] is a list, not a string.
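If the string form shown in the question's desired output is needed, one possible sketch is to parse each cell with ast.literal_eval, subtract element by element, and convert each result back with str(). The data below is copied from the question; the intermediate names ls1 and ls2 are only illustrative.

```python
import ast

import pandas as pd

df1 = pd.DataFrame({'id': [0, 1, 2],
                    'nested_ls': ["[20, 15, 5]", "[8, 7, 0]", "[124, 23, 43]"]})
df2 = pd.DataFrame({'id': [0, 1, 2],
                    'nested_ls': ["[10, 3, 2]", "[14, 7, 0]", "[100, 3, 20]"]})

# Parse each cell individually into a Python list
ls1 = df1['nested_ls'].apply(ast.literal_eval)
ls2 = df2['nested_ls'].apply(ast.literal_eval)

df3 = pd.DataFrame({'id': df1['id']})

# Subtract element-wise per row, then turn each resulting list back into a string
df3['nested_ls_diff'] = [
    str([a - b for a, b in zip(x, y)])
    for x, y in zip(ls1, ls2)
]
```

Dropping the str() call leaves each cell as a list instead.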

Update: we can just use a list comprehension here for the general case:

df3['nested_ls_diff'] = [
    [a - b for a, b in zip(*xy)]
    for xy in zip(pd.eval(df1['nested_ls']), pd.eval(df2['nested_ls']))
]

Because the data has object dtype, this should perform comparably to the approach above.
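The snippets above pair rows by position. If the two frames might list their id values in a different order (an assumption, since the question's frames happen to be aligned), a rough sketch is to merge on id first and then subtract. The suffixes '_1' and '_2' and the reordered df2 are made up for illustration.

```python
import ast

import pandas as pd

df1 = pd.DataFrame({'id': [0, 1, 2],
                    'nested_ls': ["[20, 15, 5]", "[8, 7, 0]", "[124, 23, 43]"]})
# Same data as the question's df2, but with rows deliberately shuffled
df2 = pd.DataFrame({'id': [2, 0, 1],
                    'nested_ls': ["[100, 3, 20]", "[10, 3, 2]", "[14, 7, 0]"]})

# Pair rows by id rather than by position
merged = df1.merge(df2, on='id', suffixes=('_1', '_2'))

# Parse both strings in each row and subtract element-wise
merged['nested_ls_diff'] = [
    [a - b for a, b in zip(ast.literal_eval(x), ast.literal_eval(y))]
    for x, y in zip(merged['nested_ls_1'], merged['nested_ls_2'])
]

df3 = merged[['id', 'nested_ls_diff']]
```

With the default inner join, ids present in only one frame are dropped from the result.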
