
Find matching records with least characters from Pattern - Oracle / Java

Original post 2011-11-08 17:22:03 5 2 java / oracle

The web application I am currently working on has file import logic. The logic:

1> reads the records from a file [Excel or txt],
2> shows a non-editable grid of all the imported records [records that do not exist in the database are marked as New, and existing records are marked as Update], and
3> dumps the records into the database.

Consider a file containing contacts in the following format (it mirrors the columns in the database, with First_Name and Last_Name as the primary key):

First_Name, Last_Name, AddressLine1, AddressLine2, City, State, Zipcode

The issue we are running into arises when the same entity is entered with different values in the file. For example, someone might type NY for New York while others would enter New York. The same applies to first or last names: John Myers and John Myer refer to the same person, but because the record does not match exactly, the system inserts a new record rather than updating the existing one.

For example, given this record from the file (please note that any resemblance to real names and addresses is purely coincidental :) ):

John, Myers, 44 Chestnut Hill, Apt 5, Indiana, Indiana, 11111

and the record in the database:

John, Myer, 80 Washington St, Apt 1, Chicago, IL, 3333

the system should have detected the record in the file as an existing record [because the last names Myers and Myer are nearly identical and the first name matches completely] and updated the address, but instead it inserts a new record.

How can I approach this so that I find all the existing records in the database that an imported record should match?

2 answers

It is a very difficult problem to solve. If you know the sources of your data, you could attempt to manually rectify the different combinations of data input.

Otherwise, you could try phonetic data cleaning solutions.
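To make the phonetic idea concrete, here is a minimal sketch of American Soundex in Java, one common phonetic key (Oracle also ships a built-in SOUNDEX function). The class and method names are hypothetical and chosen only for illustration. Names whose keys are equal would be treated as candidate matches.

```java
import java.util.Locale;

public class SoundexSketch {
    // Digit for each letter A..Z; '0' marks letters that are not coded
    // (vowels, H, W and Y).
    private static final String CODES = "01230120022455012623010202";

    // Returns the 4-character Soundex key, or "" if the input has no ASCII letters.
    static String soundex(String name) {
        StringBuilder out = new StringBuilder();
        char prev = '0';
        for (char raw : name.toUpperCase(Locale.ROOT).toCharArray()) {
            if (raw < 'A' || raw > 'Z') {
                continue; // ignore spaces, hyphens, digits, etc.
            }
            char code = CODES.charAt(raw - 'A');
            if (out.length() == 0) {
                out.append(raw); // the first letter is kept as-is
                prev = code;
                continue;
            }
            if (raw == 'H' || raw == 'W') {
                continue; // H and W do not separate letters with the same code
            }
            if (code != '0' && code != prev) {
                out.append(code);
            }
            prev = code; // a vowel resets prev, so a repeated code is kept
            if (out.length() == 4) {
                break;
            }
        }
        if (out.length() == 0) {
            return "";
        }
        while (out.length() < 4) {
            out.append('0'); // pad short keys with zeros
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(soundex("Smith") + " " + soundex("Smyth"));
        System.out.println(soundex("Myer") + " " + soundex("Myers"));
    }
}
```

Note the limitation: Smith and Smyth share the key S530, but Myer and Myers produce M600 and M620, so a phonetic key alone would not catch the example from the question. It would likely need to be combined with another fuzzy comparison.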

One solution I could think of is using regular expressions in Oracle to achieve the functionality to some extent.

For each column, I would generate a regular expression in which the characters from the midpoint of the string onward are optional. For example, for the name "Myers" in the file and "Myer" in the database, the following query would work:

SELECT Last_Name FROM Contacts WHERE (Last_Name IS NULL OR REGEXP_LIKE(Last_Name, '^Mye?r?s?$'))

I would consider this a partial solution: I would parse the input string and append the zero-or-one operator (?) to each character from the midpoint to the end of the string, hoping that the input string is not too badly mangled.
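The pattern generation described above could be sketched in Java roughly as follows. This is only an illustration under assumptions: the class and method names are hypothetical, the midpoint is taken as length / 2 rounded down, and regex metacharacters in the input are escaped with a backslash. The generated string would then be passed to REGEXP_LIKE as a bind parameter rather than concatenated into the SQL.

```java
import java.util.regex.Pattern;

public class OptionalSuffixPattern {
    // Characters that have a special meaning in a regular expression.
    private static final String META = ".^$*+?()[]{}|\\";

    // Builds an anchored pattern: the first half of the input is required,
    // every character from the midpoint onward is optional.
    static String build(String input) {
        int mid = input.length() / 2; // assumption: round down
        StringBuilder sb = new StringBuilder("^");
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (META.indexOf(c) >= 0) {
                sb.append('\\'); // escape metacharacters so they match literally
            }
            sb.append(c);
            if (i >= mid) {
                sb.append('?'); // zero-or-one operator
            }
        }
        return sb.append('$').toString();
    }

    public static void main(String[] args) {
        String pattern = build("Myers"); // value read from the import file
        System.out.println(pattern);
        // Local check with java.util.regex before sending the pattern to the database.
        System.out.println(Pattern.matches(pattern, "Myer"));
        System.out.println(Pattern.matches(pattern, "Myerson"));
    }
}
```

Because only trailing characters are made optional, the pattern built from the longer spelling ("Myers") matches the shorter one ("Myer"), but not the other way round. It is also quite loose: the same pattern matches "My".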

Hoping to find some feedback from others on SO for this "solution".



solution1
0 2011-11-08 17:57:33

solution2
0 ACCEPTED 2011-11-09 02:17:37
