I have a list of the top 100 search trends for a category on an e-commerce site, where number 1 is the phrase users search for most, for example:
Ranking Search phrase
1 mesa ratona moderna
2 mesa ratona
3 mesa de arrime
4 mesa ratona nordica
5 mesas ratonas
6 mesas ratonas modernas
Etc.
I would like to obtain a bag of words (ideally a word cloud) taking into account that the words in the phrase ranked 1 are more important than the words in the phrase ranked 100.
I found several alternatives such as CountVectorizer, TF-IDF and WordCloud, but none of them takes into account the relative importance of a search phrase being ranked 1 versus 50.
Thank you very much for your kind help!! All the best, Federico
Example of pd.DataFrame():
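import pandas as pd

# Phrases are listed in ranking order, most searched first
df = pd.DataFrame({
    'Search phrase': ['mesa ratona moderna', 'mesa ratona',
                      'mesa de arrime', 'mesa ratona nordica',
                      'mesas ratonas', 'mesas ratonas modernas']
})
df.index = df.index + 1  # rankings start at 1
df.index.name = 'Ranking'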
If you want to draw a word cloud that depends on the ranking, try this:
code
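import matplotlib.pyplot as plt
from wordcloud import WordCloud

# Weight each phrase by its reversed rank: the top phrase gets len(df),
# the last one gets 1 (a weight of 0 would make WordCloud skip the phrase).
df['count'] = list(range(len(df), 0, -1))
freq = df.set_index('Search phrase')['count'].to_dict()

wordcloud = WordCloud(background_color='white')
keyword = wordcloud.generate_from_frequencies(freq)
array = keyword.to_array()

plt.figure(figsize=(10, 10))
plt.imshow(array, interpolation='bilinear')
plt.axis('off')
plt.show()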
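The snippet above weights whole phrases. Since the question asks for a bag of words, here is a minimal sketch of a word-level variant: each word accumulates the rank weight of every phrase it appears in. The function name `rank_weighted_words`, the `stopwords` parameter and the linear weighting (rank 1 gets n, the last rank gets 1) are assumptions made for illustration, not something from the original post; another scheme such as 1/rank might suit your data better.

```python
from collections import Counter


def rank_weighted_words(phrases, stopwords=frozenset()):
    """Sum rank-based weights per word.

    `phrases` must be ordered from best rank (1) to worst.
    Hypothetical helper, written for this example only.
    """
    n = len(phrases)
    weights = Counter()
    for position, phrase in enumerate(phrases):
        # Assumed linear scheme: rank 1 -> n, rank 2 -> n - 1, ..., last -> 1
        weight = n - position
        for word in phrase.lower().split():
            if word not in stopwords:
                weights[word] += weight
    return dict(weights)


phrases = [
    'mesa ratona moderna', 'mesa ratona', 'mesa de arrime',
    'mesa ratona nordica', 'mesas ratonas', 'mesas ratonas modernas',
]
# 'de' is dropped here as an example stop word
word_freq = rank_weighted_words(phrases, stopwords={'de'})
print(sorted(word_freq.items(), key=lambda kv: -kv[1]))
```

The resulting dictionary can be passed to `WordCloud().generate_from_frequencies(word_freq)` in place of `freq`. Note that singular and plural forms ("mesa" / "mesas") are counted separately in this sketch; merging them would need an extra normalization step.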