
In Excel, remove duplicates from one column based on the values in another column, either through VBA or a combination of formulas/functions

I'm having trouble achieving this in an accurate and automated way. I've tried the approaches discussed here, here and here, but none of them work in my scenario.

I have a spreadsheet with thousands of rows of data. The data is organised as follows:

  • Column A contains IP addresses in General format
  • Column B contains Date/Time in the following Custom format ( d/mm/yyyy h:mm )
  • Column C contains duration in the following Custom format ( h:mm:ss )

This data contains a number of duplicates I need to remove, based on the IP address in Column A. However, the criterion I need is to remove whichever duplicates do not have the longest duration. To better explain my scenario, see the sample image below:

[Image: sample data]

I need a way to remove all duplicates of a particular IP address that do not contain the longest duration for that IP address. So, using the above example, row 3 would be deleted because its duration of 1 minute is shorter than the 36 minutes in row 4, which contains the same IP address.

Another example is that rows 5, 6 and 7 would also be removed, as all their durations are shorter than that of row 8, which has the same IP address but a longer duration. Of course, any rows already containing unique IP addresses would be left alone. The end result using my above sample would be as follows:

[Image: expected result]

Of course, in my sample above all the data was nicely sorted by IP address first and Duration second. In real life this isn't the case, but that's easy enough for me to do prior to any solution, if necessary.

The key thing is that in some cases an IP address may be duplicated once, in others it may be duplicated many times over. I just need to ensure that only the one with the longest duration remains. In the event that multiple instances of an IP address have the same longest duration, I want them all kept. That is, if an IP address is repeated ten times and its longest duration is an hour for two of those times, then both of them need to remain.

I'm happy with any solution for this, be it using formulas, functions or macros.

You can solve your task using a helper column (column D).

  1. Insert the following array formula into cell D2:

    =IF($C2=MAX(IF($A2=$A$2:$A$50,$C$2:$C$50,-1)),"Remain","Remove")

    where 50 is the last row of your table.

    Remember to press Ctrl+Shift+Enter to complete the array formula correctly.

  2. Copy/paste the formula to the other cells in column D.

  3. Apply a filter to column D for the value "Remove".

  4. Delete the filtered rows.
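
For anyone who wants to check the logic outside Excel, here is a minimal Python sketch of what the helper column computes. It is not part of the Excel solution itself; the `label_rows` function and the sample rows are made up for illustration, and durations are assumed to be already parsed into `timedelta` values.

```python
from datetime import timedelta

def label_rows(rows):
    """Mimic helper column D: label each (ip, duration) row "Remain" or "Remove"."""
    # First pass: find the longest duration seen for each IP address.
    longest = {}
    for ip, duration in rows:
        if ip not in longest or duration > longest[ip]:
            longest[ip] = duration
    # Second pass: a row remains only if its duration equals that maximum,
    # so ties for the longest duration are all kept.
    return ["Remain" if duration == longest[ip] else "Remove"
            for ip, duration in rows]

# Hypothetical sample data (not the asker's actual sheet).
rows = [
    ("10.0.0.1", timedelta(minutes=5)),   # unique IP
    ("10.0.0.2", timedelta(minutes=1)),
    ("10.0.0.2", timedelta(minutes=36)),
    ("10.0.0.3", timedelta(hours=1)),
    ("10.0.0.3", timedelta(minutes=10)),
    ("10.0.0.3", timedelta(hours=1)),     # tie for the longest duration
]

labels = label_rows(rows)
# Equivalent of filtering on "Remove" and deleting those rows.
kept = [row for row, label in zip(rows, labels) if label == "Remain"]
```

As with the formula, the rows do not need to be sorted first, and both one-hour rows for the last IP address are kept.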


