
Is any Machine Learning model appropriate for this dataset and desired output?

My dataset consists of video game titles from various websites, formatted in different ways. Here's my example:

"The Legend Of Zelda: Wind Waker, Nintendo"
"The Legend Of Zelda: The Wind Waker"
"The Legend Of Zelda: Wind Waker, Nintendo"
"The Legend Of Zelda: Wind Waker, Nintendo"
"Zelda: Wind Waker Hd Nintendo Wii U Game"
"The Legend Of Zelda: The Wind Waker"
"Legend Of Zelda: The Wind Waker Hd (nintendo Wii"
"The Legend Of Zelda: Wind Waker Of Game (nintendo"
"The Legend Of Zelda: The Wind Waker Nintendo Wii"
"Nintendo Wii U Game Zelda: Wind Waker Hd"
"The Legend Of Zelda: The Wind Waker Hd Wii U"
"The Legend Of Zelda: Wind Waker, Nintendo Pinterest"
"Zelda: Hd (nintendo Wii The"
"The Legend Of Zelda: The Wind Waker Hd Wii U Pinterest"
"The Legend Of Zelda: The Wind Waker Hd"
"Legend Of Zelda: Wind Waker Hd (nintendo Wii"
"The Legend Of Zelda: The Wind Waker Hd"
"The Legend Of Zelda: Wind Waker, Nintendo Wii U"
"The Legend Of Zelda Wind Hd"
"Zelda Wind Waker Hd"
"The Legend Of Zelda: Wind Waker, Nintendo Pinterest"
"The Legend Of Zelda Wind Waker Wii U Nintendo"
"Wii U The Legend Of Zelda: The Wind Waker Hd"
"Zelda: Wind Waker Hd"
"The Legend Of Zelda: The Wind Waker Hd Game Wii"
"The Legend Of Zelda: The Wind Waker Hd Nintendo Wii U"
"Zelda: Wind Waker Hd"
"The Legend Of Zelda The Wind Waker Hd Wii U"

The correct output for this data would be:

The Legend Of Zelda: The Wind Waker HD - Title

Wii U - Platform

Nintendo - Publisher

I can feed a model hundreds of these datasets, each paired with what I would expect as the correct output, and then hope that the model "learns" what the expected output might be for future datasets of titles.

Is this something that Machine Learning can do? What model should I use? I have never done anything with ML before, so I'm unsure if this is a good use case for it.

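As far as I can see from your question, the title, platform, and publisher (the outputs) are extracted from the raw data (the input), so you could use something similar to named entity recognition. You should look at the literature to learn more, but this is the most likely direction.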
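To make the named-entity-recognition framing a bit more concrete, here is a minimal sketch of how one raw listing plus its expected output could be turned into token-level training labels (the common BIO scheme). This is only an illustration of the data preparation step, not a trained model: the function names `tokenize` and `bio_tags` and the label names `TITLE`, `PLATFORM`, `PUBLISHER` are my own assumptions, and the matching is exact, so noisy listings such as "Zelda: Wind Waker Hd" would need some fuzzy alignment that is not shown here.

```python
import re


def tokenize(text):
    # Lowercase and keep only alphanumeric runs, so "Zelda:" and "zelda" match.
    return re.findall(r"[a-z0-9]+", text.lower())


def bio_tags(raw_title, entities):
    """Label each token of raw_title with a BIO tag.

    entities maps a hypothetical label (e.g. "PLATFORM") to the expected
    surface text (e.g. "Wii U"). Only exact token-sequence matches are
    tagged; everything else stays "O" (outside any entity).
    """
    tokens = tokenize(raw_title)
    tags = ["O"] * len(tokens)
    for label, surface in entities.items():
        target = tokenize(surface)
        n = len(target)
        if n == 0:
            continue
        for i in range(len(tokens) - n + 1):
            window_free = all(t == "O" for t in tags[i:i + n])
            if tokens[i:i + n] == target and window_free:
                tags[i] = "B-" + label          # first token of the entity
                for j in range(i + 1, i + n):
                    tags[j] = "I-" + label      # continuation tokens
                break                           # tag the first match only
    return list(zip(tokens, tags))


if __name__ == "__main__":
    expected = {
        "TITLE": "The Legend Of Zelda: The Wind Waker HD",
        "PLATFORM": "Wii U",
        "PUBLISHER": "Nintendo",
    }
    listing = "The Legend Of Zelda: The Wind Waker Hd Nintendo Wii U"
    for token, tag in bio_tags(listing, expected):
        print(token, tag)
```

Token/tag pairs like these are the usual input format for sequence-labeling models, so the hundreds of labeled datasets mentioned in the question could be converted this way before trying an off-the-shelf NER library.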
