
What is the best way to compute a similarity matrix for a dataframe of binary vectors?

Asked 2020-03-27 11:07:55. Tags: python, binary, similarity, cosine-similarity

I have a dataframe of binary vectors of size m x n that contains some unfilled values, as in the example below:

col1 col2 col3 col4 col5
 V0    1         0    1
 V1    1    1         0
 V2    0    1    0    1
 V3         0    0

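I want to compute a similarity matrix on this dataframe so that I can get a similarity score between any two vectors.

What is the best way to do this?

Note: I tried replacing the NULL values with 2 and applying cosine similarity from the scipy library to the dataframe. The resulting matrix was not accurate/correct.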
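One likely reason the fill-with-2 attempt gives odd scores is that cosine similarity treats 2 as a real magnitude rather than as "unknown". The sketch below uses two made-up vectors (`a` and `b` are hypothetical, not taken from the data above) that agree on every position where both are filled, and compares the filled score with the score computed only on the shared positions.

```python
import numpy as np
from scipy.spatial.distance import cosine

# Hypothetical vectors: they agree wherever both have a value.
a = np.array([1.0, np.nan, 0.0, 1.0])
b = np.array([1.0, 1.0, np.nan, 1.0])

# Replace missing entries with 2, as described in the note above.
a_filled = np.where(np.isnan(a), 2.0, a)
b_filled = np.where(np.isnan(b), 2.0, b)

# scipy's cosine() returns a distance, so similarity is 1 - distance.
sim_filled = 1.0 - cosine(a_filled, b_filled)

# Same computation restricted to positions filled in both vectors.
both = ~np.isnan(a) & ~np.isnan(b)
sim_observed = 1.0 - cosine(a[both], b[both])

print(f"filled with 2: {sim_filled:.3f}")
print(f"observed only: {sim_observed:.3f}")
```

The filled score comes out well below the observed-only score even though the two vectors never disagree, because the placeholder 2 dominates the dot product and the norms.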

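1 Answer

You may want to use pdist or cdist (from scipy.spatial.distance) with a binary distance function such as dice, jaccard, or hamming (see the list of these functions at the end of this page).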
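A minimal sketch of that suggestion, under some assumptions: the V0..V3 labels are treated as the row index and the remaining four columns as the features, and the missing cells are handled by a custom metric, `masked_jaccard` (a hypothetical helper, not part of scipy), that compares only the positions filled in both vectors. Returning distance 0 when neither vector has a 1 in the shared positions, and NaN when there is no shared position at all, are conventions chosen here for illustration.

```python
import numpy as np
import pandas as pd
from scipy.spatial.distance import pdist, squareform

# The example data, with NaN standing in for the unfilled cells.
df = pd.DataFrame(
    [[1, np.nan, 0, 1],
     [1, 1, np.nan, 0],
     [0, 1, 0, 1],
     [np.nan, 0, 0, np.nan]],
    index=["V0", "V1", "V2", "V3"],
    columns=["col2", "col3", "col4", "col5"],
)

def masked_jaccard(u, v):
    """Jaccard distance over the positions filled in both vectors."""
    both = ~np.isnan(u) & ~np.isnan(v)
    if not both.any():
        return np.nan  # assumption: no overlap means the score is undefined
    a = u[both] > 0
    b = v[both] > 0
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 0.0  # assumption: two all-zero vectors count as identical
    return 1.0 - np.logical_and(a, b).sum() / union

# pdist accepts a callable metric and returns the condensed distance vector;
# squareform expands it into a full symmetric matrix with a zero diagonal.
dist = squareform(pdist(df.to_numpy(dtype=float), metric=masked_jaccard))
similarity = pd.DataFrame(1.0 - dist, index=df.index, columns=df.index)
print(similarity)
```

If the dataframe has no missing values, the built-in metric names can be passed directly, for example `pdist(X, metric="jaccard")` on a boolean array. A Python callable is much slower than the built-in metrics, so for large m it may be worth vectorizing the masked version.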

