Create a new column based on the latest column with a value in a DataFrame - Pandas
I have a dataframe that looks like this:
orderID m1 m2 m3
1 2020-03-04 2020-03-04 NaT
2 2020-03-08 NaT NaT
I want to create a new column that shows the latest milestone (mn) available for each order.
The output would look like this:
orderID m1 m2 m3 last_m_available
1 2020-03-04 2020-03-04 NaT m2
2 2020-03-08 NaT NaT m1
How would I do this in Python?
You can reverse the order of the columns, test for non-missing values, and use DataFrame.idxmax:
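import pandas as pd

# if orderID is not already the index
df = df.set_index('orderID')
# make sure the milestone columns are datetimes
df = df.apply(pd.to_datetime)
# reverse the columns so idxmax finds the last non-missing milestone
df['last_m_available'] = df.iloc[:, ::-1].notna().idxmax(axis=1)
print(df)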
m1 m2 m3 last_m_available
orderID
1 2020-03-04 2020-03-04 NaT m2
2 2020-03-08 NaT NaT m1
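Why the column reversal works can be seen in a small self-contained sketch. The sample frame below is rebuilt by hand from the question, and the names `reversed_mask` and `last_available` are illustrative only. On a boolean frame, `idxmax(axis=1)` returns the label of the first `True` in each row, so after reversing the columns that first `True` is the rightmost non-missing milestone.

```python
import pandas as pd

# Sample data rebuilt from the question (assumption: None stands for a
# milestone that has not been reached yet).
df = pd.DataFrame(
    {
        "orderID": [1, 2],
        "m1": ["2020-03-04", "2020-03-08"],
        "m2": ["2020-03-04", None],
        "m3": [None, None],
    }
).set_index("orderID")
df = df.apply(pd.to_datetime)

# Reverse the column order so the rightmost milestone comes first,
# then mark which cells hold a value.
reversed_mask = df.iloc[:, ::-1].notna()

# idxmax picks the first True per row in the reversed order, which is
# the last non-missing milestone in the original order.
last_available = reversed_mask.idxmax(axis=1)
df["last_m_available"] = last_available
print(df)
```

Note that this sketch assumes every row has at least one non-missing milestone; a row with none would still receive a label.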
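If some rows may contain only missing values:
import numpy as np

df = df.apply(pd.to_datetime)
mask = df.iloc[:, ::-1].notna()
# keep the label only where the row has at least one non-missing value
df['last_m_available'] = np.where(mask.any(axis=1), mask.idxmax(axis=1), np.nan)
print(df)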
m1 m2 m3 last_m_available
orderID
1 2020-03-04 2020-03-04 NaT m2
2 NaT NaT NaT NaN
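The same guard can also be written without NumPy. The following is a minimal sketch, not part of the original answer, that uses `Series.where` to blank out the label for rows where the mask has no `True` at all. The sample frame and the name `result` are assumptions made for illustration.

```python
import pandas as pd

# Sample data in which order 2 has no milestone at all (assumption:
# rebuilt by hand to match the output shown above).
df = pd.DataFrame(
    {
        "orderID": [1, 2],
        "m1": ["2020-03-04", None],
        "m2": ["2020-03-04", None],
        "m3": [None, None],
    }
).set_index("orderID")
df = df.apply(pd.to_datetime)

mask = df.iloc[:, ::-1].notna()

# idxmax still returns a label for an all-False row, so keep the label
# only where the row has at least one non-missing value.
result = mask.idxmax(axis=1).where(mask.any(axis=1))
df["last_m_available"] = result
print(df)
```

`Series.where` keeps values where the condition is true and fills the rest with a missing value, which plays the same role as the `np.where` call.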
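You can drop the missing values from each row with Series.dropna() and take the last remaining column label.
# assumes every row has at least one non-missing value
df['last_m_available'] = df.apply(lambda row: row.dropna().index[-1], axis=1)
print(df)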
m1 m2 m3 last_m_available
orderID
1 2020-03-04 2020-03-04 NaN m2
2 2020-03-08 NaN NaN m1
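Another row-wise option, offered here as a sketch and not as part of the original answers, is `Series.last_valid_index`, which returns the label of the last non-missing entry in a row and `None` when the row has no valid entry. The third order in the sample frame is a hypothetical addition to show that case.

```python
import pandas as pd

# Sample data from the question plus a hypothetical order 3 that has
# no milestones, to show the empty-row behaviour.
df = pd.DataFrame(
    {
        "orderID": [1, 2, 3],
        "m1": ["2020-03-04", "2020-03-08", None],
        "m2": ["2020-03-04", None, None],
        "m3": [None, None, None],
    }
).set_index("orderID")

# For each row, take the label of the last non-missing value.
last_m = df.apply(lambda row: row.last_valid_index(), axis=1)
df["last_m_available"] = last_m
print(df)
```

Row-wise `apply` is usually slower than the vectorised `idxmax` approach on large frames, so this form mainly trades speed for readability.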