Word Cloud in Python with sentence ranking

I have a list of the top 100 search trends for a category on an e-commerce site, where number 1 is the phrase most searched by users, for example:

Ranking        Search phrase
1          mesa ratona moderna
2                 mesa ratona 
3               mesa de arrime
4          mesa ratona nordica
5                mesas ratonas
6       mesas ratonas modernas
Etc.

I would like to obtain a bag of words (ideally a word cloud) taking into account that the words in the phrase ranked 1 are more important than the words in the phrase ranked 100.

I found several alternatives such as CountVectorizer, TF-IDF, and WordCloud, but none of them takes into account the relative importance of a search phrase being ranked 1 versus 50.

Thank you very much for your kind help!! All the best, Federico

Example of a pd.DataFrame():

import pandas as pd

df = pd.DataFrame({
    'Search phrase': ['mesa ratona moderna', 'mesa ratona',
                      'mesa de arrime', 'mesa ratona nordica',
                      'mesas ratonas', 'mesas ratonas modernas']
})
df.index.name = 'Ranking'

If you want to draw a word cloud in which the size of each phrase depends on its ranking, try this:

code

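import matplotlib.pyplot as plt
from wordcloud import WordCloud

# Weight each phrase by its position: the top-ranked phrase gets the
# largest weight and the last one gets 1, so no phrase ends up with weight 0.
df['count'] = list(range(len(df), 0, -1))
freq = df.set_index('Search phrase')['count'].to_dict()

wordcloud = WordCloud()
keyword = wordcloud.generate_from_frequencies(freq)
array = keyword.to_array()

plt.figure(figsize=(10, 10))
plt.imshow(array, interpolation='bilinear')
plt.axis('off')
plt.show()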

Result

[image: the generated word cloud]
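The code above sizes whole phrases. Since the question asks for a bag of words, a minimal sketch of a word-level variant follows. It assumes a simple linear weighting (a phrase at position i out of n contributes n - i to each of its words), which is only one possible choice and not something the original posts specify. The function name rank_weighted_word_counts is hypothetical.

```python
from collections import Counter

# Phrases in ranking order, most searched first (taken from the question).
phrases = [
    'mesa ratona moderna', 'mesa ratona',
    'mesa de arrime', 'mesa ratona nordica',
    'mesas ratonas', 'mesas ratonas modernas',
]

def rank_weighted_word_counts(phrases):
    """Sum, for every word, the weights of the phrases it appears in.

    Assumption: linear weights, so the first phrase weighs len(phrases)
    and the last phrase weighs 1.
    """
    n = len(phrases)
    weights = Counter()
    for position, phrase in enumerate(phrases):
        weight = n - position
        for word in phrase.split():
            weights[word] += weight
    return dict(weights)

word_freq = rank_weighted_word_counts(phrases)
print(word_freq)
```

The resulting dictionary can be passed to WordCloud().generate_from_frequencies(word_freq) in place of the phrase-level freq. Note that this sketch does not remove stop words such as "de" or merge singular and plural forms ("mesa" and "mesas"); those steps would need to be added separately if desired.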
