简体繁体中英

Word/Sentence similarity. What is the best approach?

原文 2019-07-31 03:16:24 9 1 python/ nlp

I need to build an algorithm for product master data purposes and I'm not sure about the best NLP approach for this. The scenario is: - I have Product golden records; - I have many others Product catalogs that need to be harmonized; Example: - Product Golden Record: Coke and Coke Zero; - Products description that need to be hamonized: Coke 300ml, Coke Zero 300ml, Cke zero.

I need an algorithm that harmonize by similarity, since I have to consider typos and, sometimes, piece of a product in a sentence. Example: Coke zero JS MKT (JS and MKT are garbage, but the sentence is more similar to Coke Zero).

I've been testing some NLP for sentence similarity such as Bag of words as well as reading some other approaches such as Cosine Similarity and Levenshtein distance. However, I don't know what is the best option for my case.

Could you please help me to understand the best way to achieve what I need?

1 answers

I have found two great solutions, by using Cosine similarity and Levenshtein distance. Im my case, Cosine similarity worked better, because I easily found part of the brand name into the text, so getting a score of 100% of accuracy. Matrix replacing (Levenshtein) was also good, but I good some errors due to very similar words in the dataset.

What is the best approach to measure a similarity between texts in multiple languages in python?

Using word2vec to calculate sentence similarity

Using NearestNeighbors and word2vec to detect sentence similarity

Best way to get keyword similarity value from a sentence?

Best approach to solve Word Chain

How to improve word mover distance similarity in python and provide similarity score using weighted sentence

Best approach for semantic similarity in large documents using BERT or LSTM models

what is the best method to calculate text similarity?

暂无

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

Related Question sentence similarity using word embedding What is the best approach to measure a similarity between texts in multiple languages in python? Using word2vec to calculate sentence similarity Using NearestNeighbors and word2vec to detect sentence similarity Finding Similarity between 2 sentences using word2vec of sentence with python Best way to get keyword similarity value from a sentence? Best approach to solve Word Chain How to improve word mover distance similarity in python and provide similarity score using weighted sentence Best approach for semantic similarity in large documents using BERT or LSTM models what is the best method to calculate text similarity?

Related Tags

Word/Sentence similarity. What is the best approach?

Question

1 answers

solution1 1 ACCPTED 2020-03-12 00:29:10

solution1
1 ACCPTED 2020-03-12 00:29:10