Python Pandas: checking values of one column against a column of another dataframe

I have two dataframes, as shown below:

df:

    Review Text                                         Noun                                                Thumbups   Rating
    I've been using this app for over a month. It ...   [app, month, job, track, ATV, replay, animatio...        2.0   4
    Would be nice to be able to import files from ...   [My, Tracks, app, phone, Google, Drive, import...        6.0   5
    When screen off it shows a straight line. Not ...   [screen, line, route]                                    1.0   3
    No Offline Maps! It used to have offline maps ...   [Offline, Maps, menu, option, video, exchange,...       20.0   1
    Great application. Designed with very well tho...   [application, application]                              20.0   5
    Great App. Nice and simple but accurate. Wish ...   [Great, App, Nice, Exported]                             0.0   5
    Does just what it says. Had a couple of questi...   [couple, service]                                        0.0   5
    Save For Offline - This does not work. The rou...   [Save, Offline, route, filesystem]                      12.0   1
    Since latest update app will not run. Subscrip...   [update, app, Subscription, March, application]          9.0   5
    Great app. Love it! And all the things it does...   [Great, app, Thank, work]                                1.0   5
    I have paid for subscription but keeps telling...   [subscription, trial, period]                            0.0   2
    Error: The route cannot be save for no locatio...   [Error, route, i, GPS]                                   0.0   2

df1:

Noun    Thumb_count
accuracy    1.0
almost      1.0
animation   2.0
antarctica  1.0
app         25.0
application 29.0
apps        1.0
atv         2.0
august      3.0
battery     1.0

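I want to check whether each value in the 'Noun' column of df1 is present in the 'Noun' column of df, then create a new column named 'Average' in df1 holding the mean of the 'Rating' column over the df rows in which that noun appears.

I started by comparing the two dataframe columns with the following code: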

df['Noun'].isin(set(df1['Noun']))

However, I get a TypeError and a SystemError. The errors are:

TypeError: unhashable type: 'list'
SystemError: <built-in method view of numpy.ndarray object at 0x7ff6313e3df0> returned a result with an error set

Can anyone help me see where I am going wrong?

A sample output would have been very helpful. In its absence, here is my attempt:

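# The next two lines apply only if df.Noun holds strings like "[app, month]"; skip them if it already holds lists.
df.Noun = df.Noun.str.strip('[]')  # strip the square brackets
df.Noun = df.Noun.str.split(',')  # make a list again
df = df.explode('Noun')  # one row per item in df.Noun
df.Noun = df.Noun.str.strip().str.lower()  # drop stray spaces; df1.Noun is lowercase
df = df[df.Noun.isin(df1.Noun)]  # keep rows whose noun appears in df1 (exact match, not substring)
df1['Average'] = df1.Noun.map(df.groupby('Noun')['Rating'].mean())  # mean rating per noun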
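
For reference, here is a self-contained sketch of the same idea on toy data. The TypeError most likely arises because isin has to hash each cell, and the cells of df['Noun'] are lists, which are not hashable; exploding the lists into one noun per row sidesteps that. The sample values, the lowercase normalisation, and the choice to count a noun only once per review are my assumptions, not something stated in the question.

```python
import pandas as pd

# Toy data modelled on the question; the values are illustrative only.
df = pd.DataFrame({
    "Noun": [
        ["app", "month", "ATV"],
        ["app", "phone"],
        ["screen", "line"],
        ["application", "application"],
    ],
    "Rating": [4, 5, 3, 5],
})
df1 = pd.DataFrame({"Noun": ["app", "application", "atv", "battery"]})

# One row per (review, noun); lowercase so "ATV" matches "atv" in df1.
exploded = df.explode("Noun")
exploded["Noun"] = exploded["Noun"].str.lower()

# Assumption: a noun repeated within one review should count once,
# so keep a single row per (original row index, noun) pair.
exploded = exploded.reset_index().drop_duplicates(["index", "Noun"])

# Mean rating per noun, aligned onto df1 by noun value.
mean_rating = exploded.groupby("Noun")["Rating"].mean()
df1["Average"] = df1["Noun"].map(mean_rating)

print(df1)
# Average is 4.5 for "app", 5.0 for "application", 4.0 for "atv",
# and NaN for "battery", which appears in no review.
```

Nouns in df1 that never occur in df come out as NaN; chain .fillna(...) on the mapped column if a default value suits you better.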

