Pandas AssertionError when applying a function that returns a tuple containing a list

Question

I am applying a function to a column of a Pandas DataFrame and returning a tuple, which I unpack into multiple DataFrame columns using zip(*...).

The returned tuple contains a list, which in turn contains one or more tuples.

When at least one of the nested lists contains a different number of tuples from the rest, everything works fine.

In the rare cases where the function returns nested lists that all contain the same number of tuples, an AssertionError: Shape of new values must be compatible with manager shape is raised.

I suspect Pandas sees the consistent nested list lengths and tries to unpack the list of tuples into separate columns.
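The suspicion can be illustrated with NumPy alone. This is a minimal sketch, assuming that Pandas 1.0.x hands the assigned values to NumPy for conversion (an assumption, not something confirmed here); the literal values in `uniform` are made up for illustration.

```python
import numpy as np

# Four rows, each holding a list with exactly one (index, value) tuple,
# mirroring the failing case where every row has types == 1.
uniform = [[(0, 3)], [(0, 1)], [(0, 4)], [(0, 2)]]

# Because every nested list has the same length, NumPy infers a regular
# 3-D shape instead of producing four separate list objects.
inferred = np.array(uniform)
print(inferred.shape)    # (4, 1, 2)

# Filling a pre-allocated object array one element at a time keeps each
# list intact as a single cell.
as_objects = np.empty(len(uniform), dtype=object)
for i, item in enumerate(uniform):
    as_objects[i] = item
print(as_objects.shape)  # (4,)
```

A 3-D block cannot fit into a single column of four rows, which would be consistent with the shape complaint in the error message.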

How can I force Pandas to always store the returned list as is, regardless of the conditions above?

(Python 3.7.4, Pandas 1.0.3)

Code that works:

import pandas as pd
import numpy as np

def simple_function(type_count):
    # Two independent random integers in [0, 5)
    calculated_value1 = np.random.randint(5)
    calculated_value2 = np.random.randint(5)
    # One (index, value) tuple per type, so the list length equals type_count
    types_list = [(x, calculated_value2) for x in range(type_count)]
    return calculated_value1, types_list

df = pd.DataFrame([{'name': 'Joe', 'types': 1},
                   {'name': 'Beth', 'types': 1},
                   {'name': 'John', 'types': 1},
                   {'name': 'Jill', 'types': 2},
                   ], columns=['name', 'types'])

df['calculated_result'], df['types_list'] = zip(*df['types'].apply(simple_function))

Code that raises AssertionError: Shape of new values must be compatible with manager shape (the only difference is that every row now has types equal to 1):

import pandas as pd
import numpy as np

def simple_function(type_count):
    # Two independent random integers in [0, 5)
    calculated_value1 = np.random.randint(5)
    calculated_value2 = np.random.randint(5)
    # One (index, value) tuple per type, so the list length equals type_count
    types_list = [(x, calculated_value2) for x in range(type_count)]
    return calculated_value1, types_list

df = pd.DataFrame([{'name': 'Joe', 'types': 1},
                   {'name': 'Beth', 'types': 1},
                   {'name': 'John', 'types': 1},
                   {'name': 'Jill', 'types': 1},
                   ], columns=['name', 'types'])

df['calculated_result'], df['types_list'] = zip(*df['types'].apply(simple_function))

Answer 1

Create a DataFrame from the list of results and assign it to both columns at once:

df[['calculated_result','types_list']] = pd.DataFrame(df['types'].apply(simple_function).tolist())
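One thing to watch with this approach: as far as I know, assigning a DataFrame to columns aligns rows by index label, and a DataFrame built from a plain list gets a fresh 0..n-1 index. The sketch below assumes a frame whose index is not the default range and passes `index=df.index` to keep the rows lined up. The deterministic `simple_function` and the index labels are hypothetical stand-ins for illustration, not the code from the question.

```python
import pandas as pd

def simple_function(type_count):
    # Hypothetical deterministic stand-in for the question's random function
    value = type_count * 10
    return value, [(x, value) for x in range(type_count)]

# Hypothetical frame with non-default index labels
df = pd.DataFrame({'name': ['Joe', 'Beth', 'John', 'Jill'],
                   'types': [1, 1, 1, 1]},
                  index=[10, 11, 12, 13])

# apply() returns a Series holding one (value, list) tuple per row
results = df['types'].apply(simple_function)

# Reuse df's index so the rows line up by label during assignment
expanded = pd.DataFrame(results.tolist(), index=df.index)
df[['calculated_result', 'types_list']] = expanded

print(df)
```

With the default RangeIndex used in the question, the extra `index=` argument makes no difference, so it is harmless to include.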

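You can get a similar result with a NumPy array. Passing dtype=object matters on recent NumPy releases (1.24 and later, as far as I know), which refuse to build an array from rows that mix a scalar and a list unless the dtype is given explicitly:

df['calculated_result'], df['types_list'] = np.array(df['types'].apply(simple_function).tolist(), dtype=object).T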
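If you would rather avoid the intermediate DataFrame or array, another option is to split the tuples yourself and wrap the lists in a Series, which stores each list as a single object cell whether or not the lengths match. This is a sketch under that assumption; the deterministic `simple_function` is again a hypothetical stand-in for the random one in the question.

```python
import pandas as pd

def simple_function(type_count):
    # Hypothetical deterministic stand-in for the question's random function
    value = type_count * 10
    return value, [(x, value) for x in range(type_count)]

df = pd.DataFrame({'name': ['Joe', 'Beth', 'John', 'Jill'],
                   'types': [1, 1, 1, 1]})

# One (value, list) tuple per row
results = df['types'].apply(simple_function)

# Scalars can be assigned as a plain list
df['calculated_result'] = [value for value, _ in results]

# Wrapping the lists in a Series keeps exactly one list per row
df['types_list'] = pd.Series([types for _, types in results], index=df.index)

print(df)
```

The Series is built with `index=df.index` so the assignment lines up by label even when the frame does not use the default index.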
