To identify duplicates in a large list of personal records, I'm replacing all names with a CONCATENATE of Name and Date of Birth (DOB). Here is the sheet I'm referencing (DOBs!):
DOBs! -----------------------------------------
C D E F G I K
Name Years Start End # Yrs DOB NewNameDOB
Sally Adams 2014-2014 2014 2014 1 1968-1204 Sally Adams1968-1204
John Agnew 2014-2014 2014 2014 1 1979-0419 John Agnew1979-0419
Bob Anderson 2013-2014 2013 2014 2 1965-0402 Bob Anderson1965-0402
Antonio Andrews 2014-2014 2014 2014 1 1955-0716 Antonio Andrews1955-0716
Julie Assan 2012-2014 2012 2014 3 1978-0805 Julie Assan1978-0805
On the main sheet (Employees!), each person has a row of data for each active year. Work 14 years, you have 14 lines of data to track.
Employees! -----------------------------------------
C D **E** F G H I J...
Year Name NewNameDOB Dept
2013 Julie Assan Julie Assan1978-0805 East
1998 Mike Rogers Duplicate in Same Year Main
1999 Mike Rogers Duplicate in Same Year Main
2000 Mike Rogers Mike Rogers1969-0510 Main
2001 Mike Rogers Mike Rogers1969-0510 Main
As mentioned, I need to separate duplicate names from 10395 records (like Mike Rogers and Mike Rogers). Employees! column E will now identify the employees as Julie Assan1978-0805 and Julie Assan1980-0131 (for example).
Today we take my first step, using the years they worked in order to solve 99% of the duplicates. After this, only a few duplicate names will be left who worked at the same time as each other, which I'll have to handle manually.
If the Employees! sheet has a 2013 record for "Julie Assan," then the first step is to check DOBs to find any Julie Assans who worked in 2013. My new column E in Employees! will take the current 2013 record of Julie Assan, and find any matches in DOB! where C (name) matches Julie Assan, E <= 2013, and F >= 2013. Usually, there will be only one match, and it will tell me that is Julie Assan1978-0805. Sometimes, there will be two Mike Rogers who worked during the same year, and it should tell me "Duplicate in Same Year".
On the Employees! sheet column E , I've started with this...
=index(DOBs!$k$2:$k$10395,match($d3&$c3,DOBs!$c$2:$c$10395& ??? ,0)
Not sure where to go with this formula, whether that means adding "IFs" or something different.
edited to explain in great depth
=IF(COUNTIFS(DOBs!C$2:C$10395,Employees!D2,DOBs!E$2:E$10395,"<="&Employees!C2,DOBs!F$2:F$10395,">="&Employees!C2)>1,"Duplicate in Same Year",INDEX(DOBs!K$2:K$10395,MATCH(TRUE,IF(DOBs!C$2:C$10395=Employees!D2,IF(DOBs!E$2:E$10395<=Employees!C2,IF(DOBs!F$2:F$10395>=Employees!C2,TRUE))),0)))
Enter as an array formula by confirming with Ctrl+Shift+Enter, then autofill down. It first checks for duplicates using COUNTIFS, and returns "Duplicate in same year" if it is. If there are not duplicates, it uses INDEX/MATCH to find the NewNameDOB.
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.