I'm working with a dataset (df) which contains a column called job, where people just enter their job position. The problem is that, because the data is typed ...
I want to take a list of customer names and compare them to an internal database to find a highly likely match and return a customer code. So I would re ...
I'm trying to create a loop for the grabl function in stringdist that repeats the following steps: pick a string pattern from a vector and insert ...
The documentation for aFind specifies a maxDist parameter you can use, but there is no maxDist parameter you can pass into aFind? https://cran.r-pro ...
I have example data as follows: So the following runs fine: But merging with a column with a different name is not allowed (note that the join i ...
The following data has the surprising result that it does not match. I was expecting the distance to be 5, but even at 7 I get no match. Only at 10 ...
I am having trouble matching character strings. Most of the difficulty centers on abbreviations. I have two character vectors. I am trying to match wor ...
I have a huge dataset that looks like this. To save some memory I want to calculate the pairwise distance but leave the upper diagonal of the matri ...
Hi, I am trying to match one string against other strings in a different data frame and get the nearest n matches based on score. Ex: from string_2 (df_2) column ...
I have thousands of DNA sequences that look like this. I need to extract every sequence between CTACG and CAGTC. However, many cases in thes ...
I have a very large dataset, which looks like this. I have two types of data frames: my reference data.frame and my experimental data.frame ...
I would like to do a left_join(df1, df2) based on fuzzy matches. My df1 is 100k rows big and my df2 is 25k rows big. Basically I would like to calcula ...
I am using the stringdist package in R. For several options, it uses maxDist. This option, however, counts the distance between A and a as one. Just ...
I have test data as follows. I am trying to find (near) matches for a vector of words, using stringdist as the actual database is large: I tried to ...
I'm working on string distance in multi-word strings, as in this toy data: I'd like to determine the (dis)similarity of each row compared to the ne ...
I am trying to use stringdist_join to merge two tables. I have built my 'by' variable as the concatenation of three variables which are named as such: ...
I have 2 data frames and need to compare df_1 to df_2, get similar strings from col_2 of df_2, and store the number of phrases matched in df_out ...
I have two values whose order is mismatched but which are ideally the same. When I calculate the string similarity, the score between them is far ...
I am trying to compare col_1 in the df_1 data frame with col_2 in the df_2 data frame to get the nearest top 3 matches with the lowest score (lowest score represents nearest ...
I have two data frames with department names similar to these ones: The variables "depto" are supposed to be the same but with some differences. I t ...