
Pandas AssertionError when applying function which returns tuple containing list

I am applying a function to a Pandas DataFrame and returning a tuple, which I cast into multiple DataFrame columns using zip(*).

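The returned tuple contains a list, which in turn contains one or more tuples.

If at least one of the nested lists contains a different number of tuples than the rest of the lists, everything works fine.

In the rare case where the function returns nested lists that all have the same tuple count, an AssertionError: Shape of new values must be compatible with manager shape is raised.

I suspect Pandas sees the consistent nested list length and tries to unpack the list of tuples into separate columns.

How can I force Pandas to always store the returned list as is, regardless of the condition above?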
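The suspicion above can be illustrated with NumPy alone. This is only a sketch of the general mechanism: whether Pandas 1.0.3 follows exactly this path internally is an assumption, and the sample values are made up for illustration. When every nested list has the same length, NumPy infers a regular multi-dimensional array, whereas filling a pre-allocated object array one element at a time keeps each list as a single item.

```python
import numpy as np

# A tuple of equal-length nested lists, like the 'types_list' values that
# zip(*...) produces when every row has types == 1 (values are made up).
uniform = ([(0, 3)], [(0, 1)], [(0, 4)], [(0, 2)])

# NumPy sees a regular structure and builds a 3-D array from it.
uniform_arr = np.asarray(uniform)
print(uniform_arr.shape)  # (4, 1, 2)

# Filling an object array element by element stores each list unchanged.
kept = np.empty(len(uniform), dtype=object)
for i, item in enumerate(uniform):
    kept[i] = item
print(kept.shape)  # (4,)
```

A value of shape (4, 1, 2) cannot be placed into a single column of four rows, which is consistent with the shape complaint in the error message.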


(Python 3.7.4, Pandas 1.0.3)

Code that works:

import pandas as pd
import numpy as np

def simple_function(type_count):
    calculated_value1 = np.random.randint(5)
    calculated_value2 = np.random.randint(5)
    types_list = [tuple((x, calculated_value2)) for x in range(0, type_count)]
    return calculated_value1, types_list
    
df = pd.DataFrame([{'name': 'Joe', 'types': 1},
                   {'name': 'Beth', 'types': 1},
                   {'name': 'John', 'types': 1},
                   {'name': 'Jill', 'types': 2},
                   ], columns=['name', 'types'])

df['calculated_result'], df['types_list'] = zip(*df['types'].apply(simple_function))

Code that raises AssertionError: Shape of new values must be compatible with manager shape:

import pandas as pd
import numpy as np

def simple_function(type_count):
    calculated_value1 = np.random.randint(5)
    calculated_value2 = np.random.randint(5)
    types_list = [tuple((x, calculated_value2)) for x in range(0, type_count)]
    return calculated_value1, types_list
    
df = pd.DataFrame([{'name': 'Joe', 'types': 1},
                   {'name': 'Beth', 'types': 1},
                   {'name': 'John', 'types': 1},
                   {'name': 'Jill', 'types': 1},
                   ], columns=['name', 'types'])

df['calculated_result'], df['types_list'] = zip(*df['types'].apply(simple_function))

Create a DataFrame from the list of results:

df[['calculated_result','types_list']] = pd.DataFrame(df['types'].apply(simple_function).tolist())

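You can get a similar result with an array. Passing dtype=object keeps each nested list as a single element; recent NumPy versions raise a ValueError for ragged input without it:

df['calculated_result'], df['types_list'] = np.array(df['types'].apply(simple_function).tolist(), dtype=object).T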
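Another option, not taken from the answers above and offered only as a sketch, is to wrap each unzipped column in a pd.Series before assigning it. A Series built from a list of lists stores one list per row, so the result should not depend on whether the nested lists happen to share a length. The intermediate names results, calc and lists are hypothetical.

```python
import numpy as np
import pandas as pd

def simple_function(type_count):
    # Same return shape as in the question: (scalar, list of tuples).
    calculated_value1 = np.random.randint(5)
    calculated_value2 = np.random.randint(5)
    types_list = [(x, calculated_value2) for x in range(type_count)]
    return calculated_value1, types_list

# Every row has the same 'types' count, the case that failed in the question.
df = pd.DataFrame({'name': ['Joe', 'Beth', 'John', 'Jill'],
                   'types': [1, 1, 1, 1]})

results = df['types'].apply(simple_function)  # Series of (int, list) tuples
calc, lists = zip(*results)                   # two tuples, one per target column

# Wrapping in pd.Series gives a 1-D column explicitly; each list stays one cell.
df['calculated_result'] = pd.Series(calc, index=df.index)
df['types_list'] = pd.Series(list(lists), index=df.index)

print(df)
```

Passing index=df.index keeps the new columns aligned with the existing rows even when the DataFrame does not use the default integer index.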
