Create new column based on values from other columns / apply a function of multiple columns, row-wise in Pandas
Create new column into dataframe based on values from other columns using apply function onto multiple columns
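I am using the apply function to create a new column, ERROR_TV_TIC, in a dataframe based on the values of the existing columns TV_TIC and ERRORS. I am not sure what I am doing wrong. It works in some cases, but in others it fails and throws an error.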
DataFrame:
ERRORS|TV_TIC
|2.02101E+41
['Length of Underlying Symbol for Option Contract is exceeding allowed limits(10 chars)']|nan
['Future Option Indicator is missing']|nan
['Trade Id is missing', 'Future Option Indicator is missing']|nan
['Trade Id is missing', 'Future Option Indicator is missing']|nan
Code that works:
def validate_tv_tic(trades):
    tv_tiv_errors = list()
    if pd.isnull(trades['TV_TIC']):
        tv_tiv_errors.append("Initial validations passed still TV_TIC missing")
    if pd.notnull(trades['TV_TIC']) and len(trades['TV_TIC']) != 42:
        tv_tiv_errors.append("Initial validations passed and TV_TIC is also generated but length is != 42 chars")
    return tv_tiv_errors if len(tv_tiv_errors) > 0 else np.nan

trades['ERROR_TV_TIC'] = trades.apply(validate_tv_tic, axis=1)
Code that does not work: the conditions now involve both columns, and I made sure to use "&" rather than "and".
def validate_tv_tic(trades):
    tv_tiv_errors = list()
    if pd.isnull(trades['ERRORS']) & pd.isnull(trades['TV_TIC']):
        tv_tiv_errors.append("Initial validations passed still TV_TIC missing")
    if pd.isnull(trades['ERRORS']) & pd.notnull(trades['TV_TIC']) & len(trades['TV_TIC']) != 42:
        tv_tiv_errors.append("Initial validations passed and TV_TIC is also generated but length is != 42 chars")
    return tv_tiv_errors if len(tv_tiv_errors) > 0 else np.nan

trades['ERROR_TV_TIC'] = trades.apply(validate_tv_tic, axis=1)
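The error I get: ('The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()', 'occurred at index 3')
My gut feeling is that pd.isnull is causing the problem somewhere, but I am not sure.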
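There is nothing wrong with the code. The problem is with the data in the dataframe.
The ERRORS column holds lists of strings, and the error is raised when a cell contains more than one item: pd.isnull applied to a list returns one boolean per element, and an array with more than one element has no single truth value for the if statement. That is why I got the error on rows 3 and 4: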
ERRORS
['Length of Underlying Symbol for Option Contract is exceeding allowed limits(10 chars)']
['Future Option Indicator is missing']
['Trade Id is missing', 'Future Option Indicator is missing']
['Trade Id is missing', 'Future Option Indicator is missing']
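The behaviour described above can be reproduced outside of apply. The following is a minimal sketch, assuming two sample cells copied from the ERRORS column; the variable names are only illustrative.

```python
import pandas as pd

# Two cell values taken from the ERRORS column shown above.
one_item = ['Future Option Indicator is missing']
two_items = ['Trade Id is missing', 'Future Option Indicator is missing']

# For a list, pd.isnull checks every element and returns a boolean array.
mask_one = pd.isnull(one_item)   # array holding a single value
mask_two = pd.isnull(two_items)  # array holding two values

# A one-element array can still be reduced to a single bool,
# which is why rows with a single error message did not fail.
single_ok = bool(mask_one)

# An array with two elements cannot, so `if mask_two:` raises ValueError.
try:
    bool(mask_two)
    ambiguous = False
except ValueError:
    ambiguous = True

print(mask_one.shape, mask_two.shape, single_ok, ambiguous)
```

This also suggests why swapping "and" for "&" makes no difference here: the failure comes from the if statement needing one truth value, not from how the conditions are combined.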
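After finding the root cause, I changed the list into a single string whose elements are joined by a non-comma separator, and that worked for me.
From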
return tv_tiv_errors if len(tv_tiv_errors) > 0 else np.nan
To
return ' & '.join(errors) if len(errors) > 0 else np.nan
This produced the ERRORS column of my dataframe as follows:
ERRORS
Length of Underlying Symbol for Option Contract is exceeding allowed limits(10 chars)
Future Option Indicator is missing
Trade Id is missing & Future Option Indicator is missing
Trade Id is missing & Future Option Indicator is missing
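If the ERRORS column has to keep its lists, another option is to test each cell explicitly instead of calling pd.isnull on it. The sketch below is only an illustration under stated assumptions: has_errors is a hypothetical helper that is not part of the original code, the sample rows are made up, and TV_TIC is converted with str() before measuring its length because the value may arrive as a number.

```python
import numpy as np
import pandas as pd

def has_errors(value):
    """Hypothetical helper: True if an ERRORS cell holds at least one message."""
    if isinstance(value, (list, tuple)):
        return len(value) > 0
    # Scalars: NaN/None mean "no errors"; anything else counts as an error.
    return bool(pd.notnull(value))

def validate_tv_tic(row):
    tv_tic_errors = []
    # Each check yields a plain Python bool, so ordinary if/elif is enough.
    if not has_errors(row['ERRORS']):
        if pd.isnull(row['TV_TIC']):
            tv_tic_errors.append("Initial validations passed still TV_TIC missing")
        elif len(str(row['TV_TIC'])) != 42:  # assumption: length of the text form
            tv_tic_errors.append("Initial validations passed and TV_TIC is also generated but length is != 42 chars")
    return ' & '.join(tv_tic_errors) if tv_tic_errors else np.nan

# Made-up sample rows that mix NaN, one-item and two-item ERRORS cells.
trades = pd.DataFrame({
    'ERRORS': pd.Series([
        np.nan,
        ['Future Option Indicator is missing'],
        ['Trade Id is missing', 'Future Option Indicator is missing'],
        np.nan,
        np.nan,
    ], dtype=object),
    'TV_TIC': pd.Series(['X' * 42, np.nan, np.nan, np.nan, 'short'], dtype=object),
})

trades['ERROR_TV_TIC'] = trades.apply(validate_tv_tic, axis=1)
print(trades['ERROR_TV_TIC'].tolist())
```

Rows that already carry entries in ERRORS are skipped, so only rows that passed the initial validations can receive a TV_TIC message.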