
Create a new column in a dataframe based on values from other columns, using the apply function on multiple columns

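I am using the apply function to create a new column, ERROR_TV_TIC, in a dataframe based on the values of the existing columns [TV_TIC and ERRORS]. I am not sure what I am doing wrong. It works in some cases, while in others it does not and throws an error.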

DataFrame:

ERRORS|TV_TIC
|2.02101E+41
['Length of Underlying Symbol for Option Contract is exceeding allowed limits(10 chars)']|nan
['Future Option Indicator is missing']|nan
['Trade Id is missing', 'Future Option Indicator is missing']|nan
['Trade Id is missing', 'Future Option Indicator is missing']|nan

Code that works:

import numpy as np
import pandas as pd

def validate_tv_tic(trades):
    # With axis=1, `trades` is a single row of the dataframe
    tv_tiv_errors = list()
    if pd.isnull(trades['TV_TIC']):
        tv_tiv_errors.append("Initial validations passed still TV_TIC missing")
    if pd.notnull(trades['TV_TIC']) and len(trades['TV_TIC']) != 42:
        tv_tiv_errors.append("Initial validations passed and TV_TIC is also generated but length is != 42 chars")
    # Return the list of errors, or NaN when there are none
    return tv_tiv_errors if len(tv_tiv_errors) > 0 else np.nan

trades['ERROR_TV_TIC'] = trades.apply(validate_tv_tic, axis=1)

Code that does not work: the condition now involves two columns, and I made sure I am using "&" instead of "and".

def validate_tv_tic(trades):
    tv_tiv_errors = list()
    if pd.isnull(trades['ERRORS']) & pd.isnull(trades['TV_TIC']):
        tv_tiv_errors.append("Initial validations passed still TV_TIC missing")
    if pd.isnull(trades['ERRORS']) & pd.notnull(trades['TV_TIC']) & len(trades['TV_TIC']) != 42:
        tv_tiv_errors.append("Initial validations passed and TV_TIC is also generated but length is != 42 chars")
    return tv_tiv_errors if len(tv_tiv_errors) > 0 else np.nan

trades['ERROR_TV_TIC'] = trades.apply(validate_tv_tic, axis=1)

The error I get: ('The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()', 'occurred at index 3')

Error description when using "and": [Error screenshot 2]

Error description when using "&": [Error screenshot 2]

My hunch is that pd.isnull is causing the problem somewhere, but I am not sure.

There is nothing wrong with the code. The problem is with the data in the dataframe.

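The ERRORS column holds lists of strings, and the error is raised when a list with more than one item is present as the column value. For a list, pd.isnull returns one boolean per element, and an array with more than one element has no single truth value for the if statement to test. So I was getting the error at rows 3 and 4: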

ERRORS

['Length of Underlying Symbol for Option Contract is exceeding allowed limits(10 chars)']
['Future Option Indicator is missing']
['Trade Id is missing', 'Future Option Indicator is missing']
['Trade Id is missing', 'Future Option Indicator is missing']
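The behaviour can be reproduced outside the dataframe. The following is a minimal sketch, assuming pandas is installed; the variable names (one_item, two_items, mask_one, mask_two) are illustrative and not part of the original code:

```python
import pandas as pd

one_item = ['Future Option Indicator is missing']
two_items = ['Trade Id is missing', 'Future Option Indicator is missing']

# For a list, pd.isnull checks every element and returns a boolean array
# with one entry per element, not a single True/False.
mask_one = pd.isnull(one_item)
mask_two = pd.isnull(two_items)

# Asking for the truth value of the two-element array raises the same
# ValueError that apply() reported in the question.
try:
    bool(mask_two)
    raised = False
    message = ""
except ValueError as exc:
    raised = True
    message = str(exc)

print(mask_one.shape, mask_two.shape, raised)
print(message)
```

A one-element array can still be reduced to a single truth value, which would explain why the rows with a single error message did not fail.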

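After finding the root cause, I changed the list into a single string whose elements are joined by a non-comma separator, and that worked for me.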

Before:

return tv_tiv_errors if len(tv_tiv_errors) > 0 else np.nan

After:

return ' & '.join(errors) if len(errors) > 0 else np.nan

This produced the ERRORS column of my dataframe as follows:

ERRORS

Length of Underlying Symbol for Option Contract is exceeding allowed limits(10 chars)
Future Option Indicator is missing
Trade Id is missing & Future Option Indicator is missing
Trade Id is missing & Future Option Indicator is missing
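If the ERRORS column has to stay a list, one possible alternative is to test for a missing value only on scalars. This is a sketch under assumptions, not the fix used above: is_missing is a hypothetical helper, the sample dataframe is made up, and TV_TIC is assumed to be stored as a string. It also uses plain "and"/nested if instead of "&", because Python parses a & b & len(x) != 42 as (a & b & len(x)) != 42.

```python
import numpy as np
import pandas as pd


def is_missing(value):
    """Hypothetical helper: True only for scalar missing values (None, NaN)."""
    # Lists are not scalars, so they count as "present" and pd.isnull is
    # never asked to turn a multi-element array into one boolean.
    return bool(pd.api.types.is_scalar(value) and pd.isnull(value))


def validate_tv_tic(trade):
    tv_tic_errors = []
    # Only validate TV_TIC when the earlier validations left ERRORS empty.
    if is_missing(trade['ERRORS']):
        if is_missing(trade['TV_TIC']):
            tv_tic_errors.append("Initial validations passed still TV_TIC missing")
        elif len(trade['TV_TIC']) != 42:  # assumes TV_TIC is a string
            tv_tic_errors.append("Initial validations passed and TV_TIC is also generated but length is != 42 chars")
    return tv_tic_errors if tv_tic_errors else np.nan


# Made-up sample data covering each branch.
trades = pd.DataFrame({
    'ERRORS': [
        np.nan,
        ['Future Option Indicator is missing'],
        ['Trade Id is missing', 'Future Option Indicator is missing'],
        np.nan,
        np.nan,
    ],
    'TV_TIC': ['X' * 42, np.nan, np.nan, np.nan, 'TOO_SHORT'],
})

trades['ERROR_TV_TIC'] = trades.apply(validate_tv_tic, axis=1)
print(trades['ERROR_TV_TIC'])
```

With this shape, rows that already carry entries in ERRORS are skipped, and only rows with an empty ERRORS value are checked for a missing or wrongly sized TV_TIC.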

