Calculating the similarity between 2 sentences
I would like to calculate the similarity between 2 sentences, and I need a percentage value that says how well they match each other. Sentences like:
1. The red fox is moving on the hill.
2. The black fox is moving in the bill.
I was considering Levenshtein distance, but I am not sure about it because it is described as a way to find the similarity between "2 words". So can Levenshtein distance help me, or what other method can help me? I will be using JavaScript.
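For reference, Levenshtein distance is defined on any two strings, not only on single words, so it can be applied to whole sentences character by character. Below is a minimal sketch, assuming the percentage is taken as 1 minus the distance divided by the length of the longer string (one common normalization, not the only one). The function names `levenshtein` and `levenshteinSimilarity` are made up for illustration.

```javascript
// Levenshtein distance via dynamic programming, keeping only two rows.
function levenshtein(a, b) {
  // prev[j] = distance between the first i-1 chars of a and the first j chars of b
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i]; // distance from the first i chars of a to the empty string
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1,        // deletion
        curr[j - 1] + 1,    // insertion
        prev[j - 1] + cost  // substitution (or match)
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

// Assumed normalization: percentage of the longer string that did not need editing.
function levenshteinSimilarity(a, b) {
  const longest = Math.max(a.length, b.length);
  if (longest === 0) return 100; // two empty strings are treated as identical
  return (100 * (longest - levenshtein(a, b))) / longest;
}

console.log(
  levenshteinSimilarity(
    "The red fox is moving on the hill.",
    "The black fox is moving in the bill."
  )
);
```

Because this compares characters, "hill" and "bill" count as nearly identical, which may or may not be what you want for sentences.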
Try this solution: JS string diff.
Use the Jaccard index. You can find implementations in any language, including JavaScript (here is one, though I haven't tested it personally).
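To give a rough idea of what that involves, here is a minimal sketch of a word-level Jaccard index: the number of distinct words the two sentences share, divided by the number of distinct words in either. The tokenizer (lowercase, ASCII letters, digits, and apostrophes only) and the names `tokenize` and `jaccardSimilarity` are assumptions for illustration, not taken from any particular library.

```javascript
// Split a sentence into lowercase word tokens (assumed, simplistic tokenizer).
function tokenize(sentence) {
  return sentence.toLowerCase().match(/[a-z0-9']+/g) || [];
}

// Jaccard index over the sets of distinct words, returned as a percentage.
function jaccardSimilarity(a, b) {
  const setA = new Set(tokenize(a));
  const setB = new Set(tokenize(b));
  if (setA.size === 0 && setB.size === 0) return 100; // both empty: treat as identical

  let intersection = 0;
  for (const word of setA) {
    if (setB.has(word)) intersection++;
  }
  const union = setA.size + setB.size - intersection;
  return (100 * intersection) / union;
}

console.log(
  jaccardSimilarity(
    "The red fox is moving on the hill.",
    "The black fox is moving in the bill."
  )
); // 40: 4 shared words (the, fox, is, moving) out of 10 distinct words
```

Note that this ignores word order and repeated words, so it measures vocabulary overlap rather than how the sentences are phrased.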
This is what I would do, depending on how important this is. If this is medium to low priority, here is a simple algo.
But the context of why you want to do this is really important, i.e. the example you gave us could be for students learning English, etc. There are different algorithms I would use if I were trying to see whether crowdsourced users are describing the same paragraph vs. whether article topics are similar enough for a suggested-reading section.
A common method to compute the similarity of two sentences is cosine similarity. I don't know if an implementation exists in JavaScript. Cosine similarity looks at words rather than single letters. The web is full of explanations, for example here.
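In case it helps, a bare-bones version is short enough to write by hand. The sketch below assumes raw word counts as the vector components (no TF-IDF weighting or stemming) and a simplistic tokenizer; the names `termFrequencies` and `cosineSimilarity` are made up for illustration.

```javascript
// Count how often each lowercase word occurs in a sentence (assumed tokenizer).
function termFrequencies(sentence) {
  const counts = new Map();
  const words = sentence.toLowerCase().match(/[a-z0-9']+/g) || [];
  for (const word of words) {
    counts.set(word, (counts.get(word) || 0) + 1);
  }
  return counts;
}

// Cosine of the angle between the two word-count vectors, as a percentage.
function cosineSimilarity(a, b) {
  const freqA = termFrequencies(a);
  const freqB = termFrequencies(b);

  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (const [word, count] of freqA) {
    dot += count * (freqB.get(word) || 0);
    normA += count * count;
  }
  for (const count of freqB.values()) {
    normB += count * count;
  }
  if (normA === 0 || normB === 0) return 0; // a sentence with no words matches nothing
  return (100 * dot) / Math.sqrt(normA * normB);
}

console.log(
  cosineSimilarity(
    "The red fox is moving on the hill.",
    "The black fox is moving in the bill."
  )
); // roughly 70
```

Unlike a set-based measure, this takes repeated words into account ("the" appears twice in each sentence), but it still ignores word order.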