How do I suggest tags to the user based only on the title of a list?

The Problem:

I need to suggest tags to the user based only on the title (5-15 words) of a list they are about to create.

We have around 30 predetermined tags:

Gaming, Movies, TV shows, Documentaries, Books, Music, Art, History, People, Adventure, Sports, Cooking, Travel, Places, Food, Drinks, Fitness, DIY, Technology, Science, Cars, Bikes, Comedy, Shopping, Clothes, Fashion, Photography, Nature, etc.

So, for example, for a list with the title 'Most expensive fine-dine restaurants around the world', the suggested tags could be (Food, Places, Drinks, Travel).

It does not need to be super accurate; it just needs to work reasonably well. I am sure it will get better as more training data comes in from our users. I don't have any training data for supervised learning yet.

I find myself lost in the vast space of Machine Learning and Natural Language Processing. It would be very helpful if someone could suggest which methods/algorithms/libraries I should use for this specific task, and what background reading I should do first.

Thanks

You can use word2vec. Get a pretrained model and calculate a vector for each tag. Then calculate a vector for the new title. Compute the cosine similarity between the title vector and each tag vector. Suggest the tags whose similarity to the title is greater than some threshold.
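The steps above could be sketched roughly as follows. This is only a minimal illustration under several assumptions: the title and tag vectors are taken to be the average of their word vectors (the answer does not say how to combine words), tokenization is a plain whitespace split, the threshold of 0.5 is arbitrary, and `TOY_EMBEDDINGS` is a made-up 3-dimensional stand-in for a real pretrained model. The function names `average_vector`, `cosine_similarity`, and `suggest_tags` are hypothetical helpers, not part of any library.

```python
import math


def average_vector(words, embeddings):
    """Average the vectors of the words that have an embedding.

    Returns None when none of the words are in the vocabulary.
    """
    vectors = [embeddings[w] for w in words if w in embeddings]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def suggest_tags(title, tags, embeddings, threshold=0.5):
    """Return the tags whose similarity to the title exceeds the threshold,
    most similar first. The threshold value is an assumption to be tuned."""
    title_vec = average_vector(title.lower().split(), embeddings)
    if title_vec is None:
        return []
    scored = []
    for tag in tags:
        # Multi-word tags such as "TV shows" are averaged like titles.
        tag_vec = average_vector(tag.lower().split(), embeddings)
        if tag_vec is None:
            continue
        score = cosine_similarity(title_vec, tag_vec)
        if score > threshold:
            scored.append((tag, score))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [tag for tag, _ in scored]


# Made-up toy vectors, for illustration only. In practice these would come
# from a pretrained word2vec model mapping each word to its vector.
TOY_EMBEDDINGS = {
    "restaurants": [0.9, 0.1, 0.0],
    "food": [1.0, 0.0, 0.0],
    "world": [0.1, 0.9, 0.0],
    "travel": [0.0, 1.0, 0.0],
    "cars": [0.0, 0.0, 1.0],
}

suggested = suggest_tags(
    "Most expensive fine-dine restaurants around the world",
    ["Food", "Travel", "Cars"],
    TOY_EMBEDDINGS,
)
```

With real pretrained vectors the `embeddings` argument can be any object that supports `word in embeddings` and `embeddings[word]`. The threshold would likely need tuning by hand on a few sample titles, since no training data is available yet.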
