Create a new Column based on the latest column with a value on a dataframe - Pandas
I have a dataframe that looks like this:
orderID m1 m2 m3
1 2020-03-04 2020-03-04 NaT
2 2020-03-08 NaT NaT
I want to create a new column that shows the latest milestone (mn) that has a value for each order.
The output would look like this:
orderID m1 m2 m3 last_m_available
1 2020-03-04 2020-03-04 NaT m2
2 2020-03-08 NaT NaT m1
How would I do this in Python?
You can reverse the order of the columns, test for non-missing values, and use DataFrame.idxmax:
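import pandas as pd

# if orderID is not already the index
df = df.set_index('orderID')
df = df.apply(pd.to_datetime)
# reverse the columns so idxmax picks the last milestone that has a value
df['last_m_available'] = df.iloc[:, ::-1].notna().idxmax(axis=1)
print(df)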
m1 m2 m3 last_m_available
orderID
1 2020-03-04 2020-03-04 NaT m2
2 2020-03-08 NaT NaT m1
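To see why the reversal matters, here is a minimal self-contained sketch. The sample frame is rebuilt by hand from the table in the question, and the variable names `reversed_mask` and `last_available` are only illustrative. On a boolean frame, idxmax returns the first column label holding True, so reversing the columns first turns "first True" into "last milestone with a value".

```python
import pandas as pd

# Sample data rebuilt from the question; None becomes NaT after conversion.
df = pd.DataFrame(
    {
        "orderID": [1, 2],
        "m1": ["2020-03-04", "2020-03-08"],
        "m2": ["2020-03-04", None],
        "m3": [None, None],
    }
).set_index("orderID")
df = df.apply(pd.to_datetime)

# Columns are now ordered m3, m2, m1; True marks a milestone that has a date.
reversed_mask = df.iloc[:, ::-1].notna()

# For each row, take the first True in the reversed order,
# which is the right-most filled milestone in the original order.
last_available = reversed_mask.idxmax(axis=1)
print(last_available)
```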
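If some rows may contain only missing values:
import numpy as np

df = df.apply(pd.to_datetime)
mask = df.iloc[:, ::-1].notna()
# idxmax still returns a label for an all-missing row, so replace those rows with NaN
df['last_m_available'] = np.where(mask.any(axis=1), mask.idxmax(axis=1), np.nan)
print(df)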
m1 m2 m3 last_m_available
orderID
1 2020-03-04 2020-03-04 NaT m2
2 NaT NaT NaT NaN
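The same masking can probably be written without NumPy by using Series.where, which keeps a value where the condition is True and puts NaN elsewhere. This is a sketch under that assumption, not part of the original answer; the sample frame and the names `has_any` and `labels` are made up for illustration.

```python
import pandas as pd

# Order 2 has no milestone at all (hypothetical data matching the output above).
df = pd.DataFrame(
    {"m1": ["2020-03-04", None], "m2": ["2020-03-04", None], "m3": [None, None]},
    index=pd.Index([1, 2], name="orderID"),
).apply(pd.to_datetime)

mask = df.iloc[:, ::-1].notna()
has_any = mask.any(axis=1)  # False for rows where every milestone is missing

# idxmax still yields some label for an all-False row, so blank those rows out.
labels = mask.idxmax(axis=1).where(has_any)

df["last_m_available"] = labels
print(df)
```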
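You can use dropna() on each row to discard the empty milestones and take the last remaining column label. This assumes every row has at least one non-missing value.
# for each row, drop the missing milestones and keep the label of the last one left
df['last_m_available'] = df.apply(lambda row: row.dropna().index[-1], axis=1)
print(df)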
m1 m2 m3 last_m_available
orderID
1 2020-03-04 2020-03-04 NaN m2
2 2020-03-08 NaN NaN m1
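A related row-wise option is Series.last_valid_index, which returns the label of the last non-missing entry and None when there is none, so it also covers rows with no milestone. The sketch below is an alternative, not the answerer's code; order 3 and the `milestone_cols` list are hypothetical additions for illustration.

```python
import pandas as pd

# Orders 1 and 2 follow the question; order 3 (no milestones) is hypothetical.
df = pd.DataFrame(
    {
        "m1": ["2020-03-04", "2020-03-08", None],
        "m2": ["2020-03-04", None, None],
        "m3": [None, None, None],
    },
    index=pd.Index([1, 2, 3], name="orderID"),
).apply(pd.to_datetime)

milestone_cols = ["m1", "m2", "m3"]

# Row by row, ask for the label of the last non-missing milestone.
df["last_m_available"] = df[milestone_cols].apply(
    lambda row: row.last_valid_index(), axis=1
)
print(df)
```

Row-wise apply is slower than the vectorized idxmax approach on large frames, so this is mainly useful when readability matters more than speed.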