Newest questions tagged [similarity]

Finding similarity in a pandas variable

I have a dataset with company names as follows: the problem is that some of those names refer to the exact same firm but are written differently (e ...

How would I make clusters from a Levenshtein similarity matrix?

I have a similarity matrix of words and would like to apply an algorithm that can put the words in clusters. Here's the example I have so far: Obv ...
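As a rough sketch of one possible approach to the question above (not necessarily what the asker ended up using): given a precomputed similarity matrix, words can be clustered by linking every pair whose similarity reaches a cut-off and taking the connected components. The function name `cluster_by_threshold`, the sample words, the matrix values, and the 0.5 threshold are all assumptions made for illustration.

```python
def cluster_by_threshold(words, sim, threshold):
    """Group words transitively: i and j share a cluster if sim[i][j] >= threshold."""
    parent = list(range(len(words)))  # union-find forest, one node per word

    def find(i):
        # Follow parent links to the root, halving the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(words)):
        for j in range(i + 1, len(words)):
            if sim[i][j] >= threshold:
                parent[find(i)] = find(j)  # merge the two components

    clusters = {}
    for i, word in enumerate(words):
        clusters.setdefault(find(i), []).append(word)
    return list(clusters.values())


# Hypothetical normalized similarities (1.0 = identical).
words = ["cat", "bat", "dog", "dot"]
sim = [
    [1.00, 0.67, 0.00, 0.00],
    [0.67, 1.00, 0.00, 0.00],
    [0.00, 0.00, 1.00, 0.67],
    [0.00, 0.00, 0.67, 1.00],
]
print(cluster_by_threshold(words, sim, threshold=0.5))
```

Because links are followed transitively, this behaves like single-linkage clustering cut at the threshold, so a chain of moderately similar words can end up in one cluster.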

How can I check similarity in meaning, not just shared words, between two texts with spaCy?

I'm trying to compare two different texts, one coming from a CV and the other from a job advertisement. After cleaning the texts, I'm trying to compa ...

"'float' object is not iterable", 'occurred at index 1'

I have the following dataset, and I have a custom function that compares the Mate, Bence, Raul and Marina columns against the 'company_name_Igni ...

How to find best string match out of multiple possibilities in a dataframe?

I have a DF that looks like this: What I am trying to do is to compare the option1 and option2 columns to the master columns separately and obtain ...

Iterating over 2 columns and comparing similarities in Python

I have a DF that looks like this: what I am trying to do is to iterate through the Account_Name_HGI and the company_name_Ignite columns and compare ...

pandas: group near similar string data

I am trying to use groupby on a column with str type of data that has near similar values and get a count of it: for example: I'd like to get a co ...
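A minimal sketch of one way to count near-identical strings, assuming a greedy pass with the standard library's `difflib.SequenceMatcher`. The helper name `group_similar`, the 0.8 threshold, and the sample values are illustrative assumptions; a pandas column could be passed in as a list.

```python
from difflib import SequenceMatcher


def group_similar(values, threshold=0.8):
    """Greedily count values under the first earlier value they resemble."""
    groups = {}  # representative string -> number of values assigned to it
    for value in values:
        for rep in groups:
            # Case-insensitive comparison; ratio() is 1.0 for identical strings.
            score = SequenceMatcher(None, value.lower(), rep.lower()).ratio()
            if score >= threshold:
                groups[rep] += 1
                break
        else:
            # No existing representative was close enough: start a new group.
            groups[value] = 1
    return groups


# Hypothetical data.
names = ["Apple Inc", "Apple Inc.", "Banana Ltd", "apple inc"]
print(group_similar(names))
```

The result depends on input order and on the threshold, so it is worth checking the groups by eye on a sample before trusting the counts.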

how to find percentage of similarity between two arrays

I have two data arrays x and y: I want to check the similarity between x and y in the program code. I've tried using SequenceMatcher() but I'm not ...
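For reference, a small sketch of how `SequenceMatcher` can report a percentage for two sequences. The helper name `percent_similarity` and the sample arrays are assumptions, since the question's data is not shown here.

```python
from difflib import SequenceMatcher


def percent_similarity(x, y):
    """Return SequenceMatcher's ratio for two sequences, scaled to 0-100."""
    # Elements must be hashable; autojunk=False turns off the heuristic that
    # ignores very frequent elements in long sequences.
    matcher = SequenceMatcher(None, x, y, autojunk=False)
    return 100.0 * matcher.ratio()


# Hypothetical arrays.
x = [1, 2, 3, 4, 5]
y = [1, 2, 3, 9, 5]
print(percent_similarity(x, y))
```

`ratio()` is defined as 2*M/T, where M is the number of matched elements and T is the combined length of both sequences, so it rewards elements that match in order rather than just shared values.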

ElasticSearch: more_like_this query

I have an index, "es_demo", in which I need to find documents similar to a given "_id". I don't think it is working, as the returned results have the same "_i ...

Cypher - match nodes with similar relations rank result based on number of identical relations

I have a graph where node/relations look like: All Entity- and Attribute-nodes have a property called id_obj to identify each node. Let's say I have a ...

Can I apply similarity models to .tif files in R?

Basically, I have two groups of .tif files (some for Spain and some other for California), and I want to statistically compare the climatic variables ...

Having trouble creating a loop to check if two columns of a matrix are similar (2D arrays, Java)

So I was practicing 2D arrays in Java and I have this exercise that asks me to create 2 functions, one that receives a matrix, a column number and a v ...

How to convert TS-SS result to similarity measure between 0 - 1?

I'm currently developing a question plugin for an LMS that auto-grades the answer based on the similarity between the answer and the answer key with cosi ...

Which type of algorithm is best suited to finding the molecule most similar to the actual drug, and how can I add weights to factors?

The data consists of some properties of drug candidate molecules (the last row is the actual drug) Mol= Molecule name, Su= Surface area, Vol= Volu ...

Change Typo Column Values with Right Word based on Columns in Other Dataframe

I have two dataframes: the first one is location, the other one is customer. I want to correct the misspelled words in the location column in the customer dataf ...

How can I calculate Jaccard index between a set of different keywords

Here is an example of the data I am dealing with : In the data, each abstract belongs to a specific author and the same author can have multiple ab ...
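As an illustration of the Jaccard index itself (intersection size divided by union size), here is a short sketch over keyword sets. The author labels and keywords are made up, and treating two empty sets as identical is an assumed convention.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index of two keyword collections: |A & B| / |A | B|."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 1.0  # assumed convention: two empty sets count as identical
    return len(a & b) / len(union)


# Hypothetical keywords per author.
keywords = {
    "author_1": {"similarity", "clustering", "text"},
    "author_2": {"similarity", "text", "graphs"},
    "author_3": {"chemistry"},
}

# Score every unordered pair of authors.
for left, right in combinations(keywords, 2):
    print(left, right, jaccard(keywords[left], keywords[right]))
```

When one author has several abstracts, their keywords could be merged into a single set first, or each abstract could be compared separately; which is appropriate depends on the analysis.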

Multithreading for similarity test in Python

Hello, I've been working on a huge CSV file which needs similarity tests done. There are 1.16 million rows, and to test similarity between rows it ta ...

I need to find a computationally efficient way of identifying and matching words in sentences. I know there are various string similarity packages whi ...

Change a tuple within a list of tuples

I am reading in data from multiple Excel files and writing them back to an aggregated Excel file. So I have this output, and it represents the relati ...

Find a string having highest partial match with other strings in a list

I have a list A with strings: ['assembly eye tow top', 'tow eye bolts', 'tow eye bolts need me'] I am trying to find a string strA that has the high ...