I have two data frames that look like the following:
df:
Review Text Noun Thumbups Rating
I've been using this app for over a month. It ... [app, month, job, track, ATV, replay, animatio... 2.0 4
Would be nice to be able to import files from ... [My, Tracks, app, phone, Google, Drive, import... 6.0 5
When screen off it shows a straight line. Not ... [screen, line, route] 1.0 3
No Offline Maps! It used to have offline maps ... [Offline, Maps, menu, option, video, exchange,... 20.0 1
Great application. Designed with very well tho... [application, application] 20.0 5
Great App. Nice and simple but accurate. Wish ... [Great, App, Nice, Exported] 0.0 5
Does just what it says. Had a couple of questi... [couple, service] 0.0 5
Save For Offline - This does not work. The rou... [Save, Offline, route, filesystem] 12.0 1
Since latest update app will not run. Subscrip... [update, app, Subscription, March, application] 9.0 5
Great app. Love it! And all the things it does... [Great, app, Thank, work] 1.0 5
I have paid for subscription but keeps telling... [subscription, trial, period] 0.0 2
Error: The route cannot be save for no locatio... [Error, route, i, GPS] 0.0 2
df1:
Noun Thumb_count
accuracy 1.0
almost 1.0
animation 2.0
antarctica 1.0
app 25.0
application 29.0
apps 1.0
atv 2.0
august 3.0
battery 1.0
For each value in the 'Noun' column of df1, I want to check whether it appears in the 'Noun' column of df. I then want to add a new column named 'average' to df1 that holds the mean of the 'Rating' column over the rows of df that contain that noun.
I started by comparing the two columns using the following code:
df['Noun'].isin(set(df1['Noun']))
However, I got a TypeError and a SystemError:
TypeError: unhashable type: 'list'
SystemError: <built-in method view of numpy.ndarray object at 0x7ff6313e3df0> returned a result with an error set
Could anyone help me see where I am making the mistake?
A sample output would have been very useful. In its absence, here is my attempt:
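# If 'Noun' holds strings such as "[app, month]", run the next two lines; skip them if it already holds lists.
df['Noun'] = df['Noun'].str.strip('[]')  # strip the square brackets
df['Noun'] = df['Noun'].str.split(',')  # split back into a list
df = df.explode('Noun')  # one row per noun
df['Noun'] = df['Noun'].str.strip().str.lower()  # trim spaces; df1's nouns appear to be lowercase
matched = df[df['Noun'].isin(df1['Noun'])]  # exact membership check against df1
matched.groupby('Noun')['Rating'].mean()  # average rating per noun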
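For what it's worth, the TypeError in the question most likely comes from isin having to hash each element of df['Noun'], and Python lists are not hashable. Here is a minimal end-to-end sketch that also writes the result into df1 as an 'average' column. The two small frames are made-up stand-ins for the real data, and I am assuming that 'Noun' in df holds real lists and that matching should ignore case (df has "ATV", df1 has "atv").

```python
import pandas as pd

# Hypothetical miniature versions of df and df1; 'Noun' in df holds real lists.
df = pd.DataFrame({
    "Noun": [["app", "month", "ATV"], ["My", "app"],
             ["application", "application"], ["screen", "line"]],
    "Rating": [4, 5, 5, 3],
})
df1 = pd.DataFrame({
    "Noun": ["app", "application", "atv", "battery"],
    "Thumb_count": [25.0, 29.0, 2.0, 1.0],
})

# One row per (review, noun); lowercase so "ATV" matches "atv" in df1.
exploded = df.explode("Noun").reset_index(drop=True)
exploded["Noun"] = exploded["Noun"].str.strip().str.lower()

# Mean rating per noun, then look up each df1 noun; unmatched nouns get NaN.
avg_by_noun = exploded.groupby("Noun")["Rating"].mean()
df1["average"] = df1["Noun"].map(avg_by_noun)
print(df1)
```

Note that a review listing the same noun twice (like [application, application]) is counted twice in the mean here. If that is not wanted, dropping duplicate (review, noun) pairs before the groupby would be one way around it.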