
What is the best practice for calculating the similarity between two pairs of X and y?

I have some values for each element. For example, element1: value1, value2. For each element, I need to calculate a 'score' for a given number of features. Imagine that one feature is defined as follows:

  • A high score for feature1 is given by a high value1 and a low value2.

So, if a high value1 (1) and a low value2 (0) correspond to a high score for 'feature1', what is the best practice for calculating the feature1 score when value1 and value2 take other values (for example, value1=0.7, value2=0.2)? I use Python, and I would prefer to use sklearn, but any solution that fits well is welcome.

  1. First, normalize your data. One type of normalization is to rescale value1 and value2 so that they fall within the range [0,1].
  2. Suppose the average two-value characterization of feature1, based on the normalized data, is (.7, .2). For any new pair (x,y), compute the distance between (x,y) and (.7,.2).
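The two steps above can be sketched with sklearn's MinMaxScaler. This is only an illustration: the `raw` array is made-up data (one row per element, columns value1 and value2), and taking the column-wise mean as the reference point is one possible reading of "average characterization".

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical raw data: one row per element, columns are (value1, value2).
raw = np.array([
    [10.0, 200.0],
    [25.0, 150.0],
    [40.0, 100.0],
])

# Step 1: rescale each column independently to the range [0, 1].
scaler = MinMaxScaler(feature_range=(0, 1))
normalized = scaler.fit_transform(raw)

# Step 2: use the column-wise mean of the normalized data as the
# reference point for the feature (an assumption for this sketch).
reference = normalized.mean(axis=0)
```

New pairs should be rescaled with the same fitted scaler (`scaler.transform`) before being compared to the reference, so that they live on the same scale.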

When computing distances in machine learning, the square root is often skipped: it does not change which points are closer, so the squared distance is used directly.

dist^2 = (x-.7)^2 + (y-.2)^2
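As a minimal sketch of that formula in Python, the helper below computes the squared distance with NumPy, and the last lines do the same for several pairs at once with sklearn's `euclidean_distances(..., squared=True)`. The function name `squared_distance` and the sample pairs are made up for illustration; the reference point (.7, .2) is the one assumed in the answer.

```python
import numpy as np
from sklearn.metrics.pairwise import euclidean_distances

def squared_distance(point, reference=(0.7, 0.2)):
    """Squared Euclidean distance between a (value1, value2) pair and the reference."""
    diff = np.asarray(point, dtype=float) - np.asarray(reference, dtype=float)
    return float(np.dot(diff, diff))

# Single pair: (0.9 - 0.7)^2 + (0.1 - 0.2)^2
d2 = squared_distance((0.9, 0.1))

# Several hypothetical pairs at once; result has one row per pair.
pairs = np.array([[0.9, 0.1], [0.7, 0.2], [0.0, 1.0]])
batch_d2 = euclidean_distances(pairs, np.array([[0.7, 0.2]]), squared=True)
```

A smaller value means the pair is closer to the reference, so it can be read as "more similar".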

You might also be interested in calculating the error of a pair (x,y) with respect to (.7,.2), and can look into categorical cross-entropy.
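A rough sketch of a cross-entropy style error follows. Note the substitution: categorical cross-entropy is normally defined for probability vectors that sum to 1, which (.7, .2) does not, so this example instead treats each value as an independent probability and averages the binary cross-entropy of the two components. The function name and the `eps` clipping constant are assumptions made for this illustration.

```python
import numpy as np

def mean_binary_cross_entropy(point, reference=(0.7, 0.2), eps=1e-12):
    """Average per-component binary cross-entropy of `point` against `reference`."""
    # Clip to avoid log(0) when a value is exactly 0 or 1.
    p = np.clip(np.asarray(point, dtype=float), eps, 1.0 - eps)
    t = np.asarray(reference, dtype=float)
    losses = -(t * np.log(p) + (1.0 - t) * np.log(1.0 - p))
    return float(losses.mean())

loss_at_reference = mean_binary_cross_entropy((0.7, 0.2))
loss_nearby = mean_binary_cross_entropy((0.9, 0.1))
loss_far = mean_binary_cross_entropy((0.0, 1.0))
```

The error is smallest when the pair equals the reference, but it is not zero there, so it is better suited to ranking pairs than to being read as an absolute score.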
