I'm trying to conditionally assign a value to a column using pandas assign.
I tried using pandas assign to create a new column that is labeled SV if the length given in the sv_length column is >= 50 and InDel if the length is < 50.
df3=df2.assign(InDel_SV='InDel' if df2.sv_length < 50 else 'SV')
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
Other examples use np.where. Why do I have to use numpy? Shouldn't this simple function be part of pandas?
https://chrisalbon.com/python/data_wrangling/pandas_create_column_using_conditional/
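The original line fails because df2.sv_length < 50 evaluates to a boolean Series, while Python's if/else expects a single True or False. The same syntax is supported if you evaluate it element by element through apply: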
df3 = df2.assign(
    InDel_SV=df2.sv_length.apply(lambda x: 'InDel' if x < 50 else 'SV'))
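As a minimal sketch of the difference, the snippet below reproduces the error and then runs the apply version. The contents of df2 here are made up for illustration; only the sv_length column name comes from the question.

```python
import pandas as pd

# Hypothetical stand-in for df2; the values are invented for this example.
df2 = pd.DataFrame({"sv_length": [10, 49, 50, 300]})

# The comparison yields one boolean per row, not a single True/False.
mask = df2.sv_length < 50

# Using that Series directly in if/else raises the ValueError from the question.
try:
    label = 'InDel' if mask else 'SV'
    raised = False
except ValueError:
    raised = True

# apply calls the lambda once per value, so x is a plain number here.
df3 = df2.assign(
    InDel_SV=df2.sv_length.apply(lambda x: 'InDel' if x < 50 else 'SV'))
print(df3)
```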
However, in the interest of performance, numpy is recommended, because apply is a slow convenience function that calls the lambda once per row. The pandaic way of doing this is with numpy.where:
import numpy as np
df3 = df2.assign(InDel_SV=np.where(df2.sv_length < 50, 'InDel', 'SV'))
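To check that the two approaches agree, here is a small self-contained sketch on invented data (the df2 values are an assumption, not taken from the question). np.where takes the whole boolean Series at once and picks 'InDel' or 'SV' for each position.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for df2; the values are invented for this example.
df2 = pd.DataFrame({"sv_length": [10, 49, 50, 300]})

# Vectorized: the condition is evaluated for all rows in one call.
via_where = df2.assign(
    InDel_SV=np.where(df2.sv_length < 50, 'InDel', 'SV'))

# Row by row: the lambda runs once per value.
via_apply = df2.assign(
    InDel_SV=df2.sv_length.apply(lambda x: 'InDel' if x < 50 else 'SV'))

# Both columns should hold the same labels.
same = bool((via_where.InDel_SV == via_apply.InDel_SV).all())
print(via_where)
print("identical labels:", same)
```

Neither call modifies df2; assign returns a new DataFrame each time.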