TypeError: unhashable type: 'numpy.ndarray' when getting first occurrences

Question

I am trying to get the first occurrence of each unique value of chain_id in a pandas df. I am using the following code:

import pandas as pd
import re
df = pd.DataFrame(columns=["Sender", "Subject", "Body", "Datetime", "chain_id"])


first_occurrence_df = df[re.match(pd.unique(df.chain_id), df.chain_id),]

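But it returns the error: unhashable type: 'numpy.ndarray'. Yes, I understand this has something to do with the "shape" of the df. But I am completely new to coding and have no prior knowledge, so could anyone explain it in layman's terms? How can I fix this?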

The df has the following columns: "Sender", "Subject", "Body", "Datetime", "chain_id". All are strings except Datetime, which is in a date format. chain_id identifies the email chain.

Error message

TypeError                                 Traceback (most recent call last)
Input In [232], in <cell line: 2>()
      1 import re
----> 2 first = df[re.match(pd.unique(df.chain_id), df.chain_id),]

File C:\Anaconda3\envs\universal\lib\re.py:191, in match(pattern, string, flags)
    188 def match(pattern, string, flags=0):
    189     """Try to apply the pattern at the start of the string, returning
    190     a Match object, or None if no match was found."""
--> 191     return _compile(pattern, flags).match(string)

File C:\Anaconda3\envs\universal\lib\re.py:294, in _compile(pattern, flags)
    292     flags = flags.value
    293 try:
--> 294     return _cache[type(pattern), pattern, flags]
    295 except KeyError:
    296     pass

TypeError: unhashable type: 'numpy.ndarray'

Answer 1

So, to be clear, what you want is the first row in which each chain_id appears? You can use

first = df.drop_duplicates(subset=['chain_id'], keep='first')

Keeping the first is the default, but since it matters here, you might as well specify it.
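
As for the error itself, in plain terms: re.match expects a single string as its pattern, but pd.unique(df.chain_id) returns a NumPy array. The re module tries to look the pattern up in an internal cache (the _cache[...] line in the traceback), which only works for hashable values, and arrays are not hashable. Below is a minimal sketch of the drop_duplicates approach. The sample rows are made up for illustration, and sorting by Datetime first is my assumption that "first occurrence" means the earliest email in each chain; drop that step if row order already reflects what you want.

```python
import pandas as pd

# Hypothetical sample data; the values are invented for illustration only.
df = pd.DataFrame(
    {
        "Sender": ["a@example.com", "b@example.com", "a@example.com", "c@example.com"],
        "Subject": ["Hi", "Re: Hi", "Lunch", "Re: Lunch"],
        "Body": ["first", "second", "third", "fourth"],
        "Datetime": pd.to_datetime(
            ["2022-05-01", "2022-05-02", "2022-05-03", "2022-05-04"]
        ),
        "chain_id": ["c1", "c1", "c2", "c2"],
    }
)

# Assumption: "first" means earliest, so sort by Datetime before
# keeping the first row seen for each chain_id.
first = df.sort_values("Datetime").drop_duplicates(subset=["chain_id"], keep="first")

print(first[["chain_id", "Subject"]])
```

The result keeps one row per chain_id together with all the other columns, so no regular expressions are needed.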
