Finding the list index of the element that matches a pandas DataFrame value (np.where with .index())

Question

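I want to find the index of a dictionary or list item that satisfies a condition and write it to a new column in a DataFrame.

I start with the following setup: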

import pandas as pd
import numpy as np
df = pd.DataFrame(data = {'col1': ['2018_08', '2008_02','2019_01','2017_04']})

dates = {0: ['2019-01-15 00:00:00', '2019_01', 1, 2019, 0],
         -1: ['2018-12-15 00:00:00', '2018_12', 12, 2018, -1],
         -2: ['2018-11-15 00:00:00', '2018_11', 11, 2018, -2],
         -3: ['2018-10-15 00:00:00', '2018_10', 10, 2018, -3],
         -4: ['2018-09-15 00:00:00', '2018_09', 9, 2018, -4],
         -5: ['2018-08-15 00:00:00', '2018_08', 8, 2018, -5]}

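I want to check whether each value in column col1 of the DataFrame df is contained in the dictionary dates. If it is, return the key (which is also the last entry of the corresponding list). If it is not, return NaT or NaN. I tried: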

df['month_seq'] = np.where(df.col1.isin([dates[i][1] for i in range(0,-6,-1)]), '?' ,pd.NaT)

This identifies the correct entries but does not return the corresponding negative number. The output is:

    col1    month_seq
0   2018_08     ?
1   2008_02     NaT
2   2019_01     ?
3   2017_04     NaT

I also tried

[dates[i][1] for i in range(0,-6,-1)].index(df.col1)

but this raises an error.

Thanks in advance for your help.

Answer 1

Use map with a dictionary created by a dict comprehension:

df = pd.DataFrame(data = {'col1': ['2018_08', '2008_02','2019_01','2017_04']})

dates = {0: ['2019-01-15 00:00:00', '2019_01', 1, 2019, 0],
         -1: ['2018-12-15 00:00:00', '2018_12', 12, 2018, -1],
         -2: ['2018-11-15 00:00:00', '2018_11', 11, 2018, -2],
         -3: ['2018-10-15 00:00:00', '2018_10', 10, 2018, -3],
         -4: ['2018-09-15 00:00:00', '2018_09', 9, 2018, -4],
         -5: ['2018-08-15 00:00:00', '2018_08', 8, 2018, -5]}

d = {v[1]:k for k, v in dates.items()}
print (d)
{'2019_01': 0, '2018_12': -1, '2018_11': -2, '2018_10': -3, '2018_09': -4, '2018_08': -5}

df['new'] = df['col1'].map(d)
print (df)
      col1  new
0  2018_08 -5.0
1  2008_02  NaN
2  2019_01  0.0
3  2017_04  NaN
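
The new column comes out as float because NaN cannot be stored in a plain integer column. If integer offsets are preferred, one option is the nullable Int64 dtype. This is only a sketch, assuming a pandas version that provides that dtype (0.24 or later); the lookup dictionary d is written out by hand here so the snippet stands alone.

```python
import pandas as pd

df = pd.DataFrame({'col1': ['2018_08', '2008_02', '2019_01', '2017_04']})

# Same lookup as `d` above: 'YYYY_MM' string -> key of the `dates` dict.
d = {'2019_01': 0, '2018_12': -1, '2018_11': -2,
     '2018_10': -3, '2018_09': -4, '2018_08': -5}

# map() yields floats with NaN for missing keys; casting to the nullable
# 'Int64' dtype keeps the matches as integers and the misses as <NA>.
df['new'] = df['col1'].map(d).astype('Int64')
print(df)
```

Unmatched rows then show as <NA> rather than NaN.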

Answer 2

You can use apply with a suitable function (locate in this case):

import pandas as pd
import numpy as np
df = pd.DataFrame(data = {'col1': ['2018_08', '2008_02','2019_01','2017_04']})

dates = {0: ['2019-01-15 00:00:00', '2019_01', 1, 2019, 0],
         -1: ['2018-12-15 00:00:00', '2018_12', 12, 2018, -1],
         -2: ['2018-11-15 00:00:00', '2018_11', 11, 2018, -2],
         -3: ['2018-10-15 00:00:00', '2018_10', 10, 2018, -3],
         -4: ['2018-09-15 00:00:00', '2018_09', 9, 2018, -4],
         -5: ['2018-08-15 00:00:00', '2018_08', 8, 2018, -5]}


def locate(e, d=dates):
    # Return the key of the first list in d that contains e, else NaN.
    for k, values in d.items():
        if e in values:
            return k
    return np.nan


result = df['col1'].apply(locate)
print(result)

Output

0   -5.0
1    NaN
2    0.0
3    NaN
Name: col1, dtype: float64
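
The same element-wise idea can also rescue the .index() attempt from the question. list.index() looks up a single value, whereas df.col1 is a whole Series, which is most likely why that call errors. Below is a minimal sketch that applies the lookup to one value at a time; the names labels and month_offset are made up for illustration, and it relies on the keys in dates running 0, -1, ..., -5 so that the key equals minus the list position.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'col1': ['2018_08', '2008_02', '2019_01', '2017_04']})

# The 'YYYY_MM' strings from `dates`, in key order 0, -1, ..., -5
# (what [dates[i][1] for i in range(0, -6, -1)] produces).
labels = ['2019_01', '2018_12', '2018_11', '2018_10', '2018_09', '2018_08']


def month_offset(value, labels=labels):
    """Return minus the list position of value, or NaN if it is absent."""
    if value in labels:
        return -labels.index(value)
    return np.nan


# apply() calls month_offset once per element, so .index() gets one string.
month_seq = df['col1'].apply(month_offset)
print(month_seq)
```

For larger frames the map approach in Answer 1 is likely the better fit, since it avoids scanning the list for every row.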
