Want to find list index of element corresponding to pandas dataframe (np.where with .index())
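I want to find the index of a dictionary or list item that satisfies a condition and write it to a new column in a DataFrame.
I start with the following setup: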
import pandas as pd
import numpy as np
df = pd.DataFrame(data = {'col1': ['2018_08', '2008_02','2019_01','2017_04']})
dates = {0: ['2019-01-15 00:00:00', '2019_01', 1, 2019, 0],
-1: ['2018-12-15 00:00:00', '2018_12', 12, 2018, -1],
-2: ['2018-11-15 00:00:00', '2018_11', 11, 2018, -2],
-3: ['2018-10-15 00:00:00', '2018_10', 10, 2018, -3],
-4: ['2018-09-15 00:00:00', '2018_09', 9, 2018, -4],
-5: ['2018-08-15 00:00:00', '2018_08', 8, 2018, -5]}
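I want to check whether each value in column col1 of the DataFrame df is contained in the dictionary dates. If it is, return the key, or equivalently the last entry of the corresponding list in the dictionary. If it is not, return NaT or NaN. I tried: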
df['month_seq'] = np.where(df.col1.isin([dates[i][1] for i in range(0,-6,-1)]), '?' ,pd.NaT)
This identifies the correct entries, but it does not return the corresponding negative number. The output is:
col1 month_seq
0 2018_08 ?
1 2008_02 NaT
2 2019_01 ?
3 2017_04 NaT
I also tried:
[dates[i][1] for i in range(0,-6,-1)].index(df.col1)
This returns an error.
Thanks in advance for your help.
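A note on why the `.index(df.col1)` attempt fails: `list.index` compares each list element with the whole Series at once, and the resulting boolean Series has no single truth value, so pandas typically raises a `ValueError`. The sketch below reproduces this with a shortened, hypothetical `labels` list and shows a per-element lookup instead; the names `labels`, `error`, and `positions` are illustrative and not part of the original question.

```python
import pandas as pd

df = pd.DataFrame({'col1': ['2018_08', '2008_02']})
labels = ['2019_01', '2018_08']  # shortened stand-in for the list built from dates

try:
    # Each string is compared with the entire Series, not with one value.
    labels.index(df.col1)
    error = None
except ValueError as exc:
    error = exc
    print(type(exc).__name__)

# Looking up one value at a time works; None marks values that are absent.
positions = [labels.index(v) if v in labels else None for v in df.col1]
print(positions)
```

This gives list positions rather than dictionary keys, so it only illustrates the cause of the error.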
Use map with a dictionary created by a dictionary comprehension:
df = pd.DataFrame(data = {'col1': ['2018_08', '2008_02','2019_01','2017_04']})
dates = {0: ['2019-01-15 00:00:00', '2019_01', 1, 2019, 0],
-1: ['2018-12-15 00:00:00', '2018_12', 12, 2018, -1],
-2: ['2018-11-15 00:00:00', '2018_11', 11, 2018, -2],
-3: ['2018-10-15 00:00:00', '2018_10', 10, 2018, -3],
-4: ['2018-09-15 00:00:00', '2018_09', 9, 2018, -4],
-5: ['2018-08-15 00:00:00', '2018_08', 8, 2018, -5]}
d = {v[1]:k for k, v in dates.items()}
print (d)
{'2019_01': 0, '2018_12': -1, '2018_11': -2, '2018_10': -3, '2018_09': -4, '2018_08': -5}
df['new'] = df['col1'].map(d)
print (df)
col1 new
0 2018_08 -5.0
1 2008_02 NaN
2 2019_01 0.0
3 2017_04 NaN
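If you would rather keep the `np.where` structure from the question, the same reverse lookup can supply the value for matching rows. This is a minimal sketch, assuming a trimmed copy of `dates`; the names `lookup` and `mask` are introduced here for illustration only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'col1': ['2018_08', '2008_02', '2019_01', '2017_04']})

# Trimmed copy of the question's dates dictionary (two entries only).
dates = {0: ['2019-01-15 00:00:00', '2019_01', 1, 2019, 0],
         -5: ['2018-08-15 00:00:00', '2018_08', 8, 2018, -5]}

# Reverse lookup: period label -> dictionary key.
lookup = {v[1]: k for k, v in dates.items()}

# Rows whose label is known get the mapped key, all others get NaN.
mask = df['col1'].isin(list(lookup))
df['month_seq'] = np.where(mask, df['col1'].map(lookup), np.nan)
print(df)
```

Because `map` already returns NaN for missing labels, the `np.where` wrapper is redundant here; it is only useful if the fallback should be something other than NaN.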
You can use apply with a suitable function (locate in this case):
import pandas as pd
import numpy as np
df = pd.DataFrame(data = {'col1': ['2018_08', '2008_02','2019_01','2017_04']})
dates = {0: ['2019-01-15 00:00:00', '2019_01', 1, 2019, 0],
-1: ['2018-12-15 00:00:00', '2018_12', 12, 2018, -1],
-2: ['2018-11-15 00:00:00', '2018_11', 11, 2018, -2],
-3: ['2018-10-15 00:00:00', '2018_10', 10, 2018, -3],
-4: ['2018-09-15 00:00:00', '2018_09', 9, 2018, -4],
-5: ['2018-08-15 00:00:00', '2018_08', 8, 2018, -5]}
def locate(e, d=dates):
    # Return the key of the first list in d that contains e, otherwise NaN.
    for k, values in d.items():
        if e in values:
            return k
    return np.nan
result = df['col1'].apply(locate)
print(result)
Output:
0 -5.0
1 NaN
2 0.0
3 NaN
Name: col1, dtype: float64
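Both answers return floats (-5.0, 0.0) because NaN forces a float dtype. If integer keys are preferred, one option is pandas' nullable integer dtype. The sketch below assumes a shortened, hypothetical `lookup` dictionary in place of the full reverse mapping.

```python
import pandas as pd

# Shortened reverse lookup: period label -> key (hypothetical subset).
lookup = {'2019_01': 0, '2018_08': -5}

col = pd.Series(['2018_08', '2008_02', '2019_01', '2017_04'], name='col1')

# map yields floats with NaN; 'Int64' keeps integers and marks gaps as <NA>.
month_seq = col.map(lookup).astype('Int64')
print(month_seq)
```

Note the capital "I" in `'Int64'`; the lowercase `'int64'` dtype cannot hold missing values.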