Remove duplicates in a 2D array

I want to remove duplicate rows from a 2D array. I tried the code below, but it is not working. Please help me.

Input:

1,ram,mech
1,ram,mech
2,gopi,csc
2,gopi,civil

Output should be:

1,ram,mech
2,gopi,csc
2,gopi,civil

Code:

package employee_dup;

import java.util.*;

public class Employee_dup {

    public static void main(String[] args)
    {
        boolean Switch = true;
        System.out.println("Name  ID  Dept ");
        String[][] employee_t = {{"1","ram","Mech"},{"1","siva","Mech"},{"1","gopi","Mech"},{"4","jenkat","Mech"},{"5","linda","Mech"},{"1","velu","Mech"}};
        int g = employee_t[0].length;
        String[][] array2 = new String[10][g];
        int rows = employee_t.length;
        Arrays.sort(employee_t, new sort(0));

        for(int i=0;i<employee_t.length;i++){  
            for(int j=0;j<employee_t[0].length;j++){  

                System.out.print(employee_t[i][j]+" ");  
            }  
            System.out.println();  
        } 

        List<String[]> l = new ArrayList<String[]>(Arrays.asList(employee_t));

        for(int k = 0 ;k < employee_t.length-1;k++)
        {
            if(employee_t[k][0] == employee_t[k+1][0])
            {
                System.out.println("same value is present");  
                l.remove(1);
                array2 = l.toArray(new String[][]{});
            }        
        }

        System.out.println("Name  ID  Dept ");
        for(int i=0;i<array2.length;i++){  
            for(int j=0;j<array2[0].length;j++){  

                System.out.print(array2[i][j]+" ");  
            }  
            System.out.println();  
        }
    }
}

class sort implements Comparator {
    int j;
    sort(int columnToSort) {
        this.j = columnToSort;
    }
    //overriding compare method
    public int compare(Object o1, Object o2) {
        String[] row1 = (String[]) o1;
        String[] row2 = (String[]) o2;
        //compare the columns to sort
        return row1[j].compareTo(row2[j]);
    }
}

First I sorted the array by the first column, then tried to remove duplicates by comparing the first-column and second-column elements. However, it does not remove the intended row; it removes other rows instead.

You may give this solution a try:

public static void main(String[] args) {
    String[][] employee_t = {
            {"1","ram","Mech"},
            {"1","ram","Mech"},
            {"1","siva","Mech"},
            {"1","siva","Mech"},
            {"1","gopi","Mech"},
            {"1","gopi","Mech"} };
    System.out.println("ID Name   Dept");
    Arrays.stream(employee_t)
          .map(Arrays::asList)
          .distinct()
          .forEach(row -> System.out.printf("%-3s%-7s%s\n", row.get(0), row.get(1), row.get(2)));
}

Output

ID Name   Dept
1  ram    Mech
1  siva   Mech
1  gopi   Mech

How it works: comparing arrays with equals relies on instance equality, not on comparing the contained elements. Converting each row of your 2D array into a List lets you compare lists instead, and List.equals takes equals of the contained elements into account.

The Java Stream API provides the method distinct, which relies on equals and removes all duplicates for you.

Based on your code. Maybe it is not the best solution, but it works.
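To make the difference concrete, here is a minimal sketch. The class name RowEquality and the helper distinctRows are made up for illustration and are not part of the answer. It shows that equals on arrays compares identity while Arrays.equals and List.equals compare contents, and how the distinct rows could be collected back into a String[][] instead of being printed.

```java
import java.util.Arrays;
import java.util.List;

class RowEquality {

    // Hypothetical helper: removes duplicate rows, keeping first occurrences in order.
    static String[][] distinctRows(String[][] rows) {
        return Arrays.stream(rows)
                .map(Arrays::asList)                      // List.equals/hashCode compare elements
                .distinct()
                .map(row -> row.toArray(new String[0]))   // back to String[]
                .toArray(String[][]::new);
    }

    public static void main(String[] args) {
        String[] a = {"1", "ram", "Mech"};
        String[] b = {"1", "ram", "Mech"};

        System.out.println(a.equals(b));                                // false: different instances
        System.out.println(Arrays.equals(a, b));                        // true: same contents
        System.out.println(Arrays.asList(a).equals(Arrays.asList(b)));  // true: same contents

        String[][] result = distinctRows(new String[][] {a, b, {"1", "siva", "Mech"}});
        System.out.println(Arrays.deepToString(result)); // [[1, ram, Mech], [1, siva, Mech]]
    }
}
```

Because distinct on an ordered stream keeps the first occurrence of each element, the original row order is preserved.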

public static void main(String[] args) {

    // I added duplicated rows
    String[][] inputArray = {
            { "1", "ram", "Mech" }, 
            { "1", "siva", "Mech" }, 
            { "1", "gopi", "Mech" }, 
            { "1", "gopi", "Mech" }, 
            { "4", "jenkat", "Mech" },
            { "5", "linda", "Mech" }, 
            { "1", "velu", "Mech" },
            { "1", "velu", "Mech" }
    };

    // I will add all rows in a Set as it doesn't store duplicate values
    Set<String> solutionSet = new LinkedHashSet<String>();

    // I get all rows, create a string and insert into Set
    for (int i = 0 ; i < inputArray.length ; i++) {
        String input = inputArray[i][0]+","+inputArray[i][1]+","+inputArray[i][2];
        solutionSet.add(input);
    }

    // You know the final size of the output array
    String[][] outputArray = new String[solutionSet.size()][3];

    // I get the results without duplicated values and reconvert it to your format
    int position = 0;
    for(String solution : solutionSet) {
        String[] solutionArray = solution.split(",");

        outputArray[position][0] = solutionArray[0];
        outputArray[position][1] = solutionArray[1];
        outputArray[position][2] = solutionArray[2];

        position++;
    }


    System.out.println("ID  Name  Dept ");
    for (int i = 0; i < outputArray.length; i++) {
        for (int j = 0; j < outputArray[0].length; j++) {

            System.out.print(outputArray[i][j] + " ");
        }
        System.out.println();
    }

}
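One caveat with joining the fields by "," and splitting them again: a field that itself contains a comma would be split into extra pieces. A possible variant, sketched here under the assumption that rows may contain arbitrary strings, stores each row as a List<String> in the LinkedHashSet instead. The class and method names are illustrative only.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class SetDedup {

    // Hypothetical helper: LinkedHashSet drops duplicates and keeps insertion order.
    static String[][] removeDuplicateRows(String[][] input) {
        Set<List<String>> seen = new LinkedHashSet<>();
        for (String[] row : input) {
            seen.add(Arrays.asList(row)); // lists are compared element by element
        }
        String[][] output = new String[seen.size()][];
        int position = 0;
        for (List<String> row : seen) {
            output[position++] = row.toArray(new String[0]);
        }
        return output;
    }

    public static void main(String[] args) {
        String[][] input = {
            {"1", "gopi", "Mech"},
            {"1", "gopi", "Mech"},
            {"2", "a,b", "Mech"}   // a comma inside a field survives intact
        };
        for (String[] row : removeDuplicateRows(input)) {
            System.out.println(String.join(" | ", row));
        }
    }
}
```

This also works for rows of different lengths, since no fixed column count is assumed.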

I have posted what I think is a readable and easy-to-maintain solution.

I decided to use distinct from Stream, which is part of Java 8.

Returns a stream consisting of the distinct elements (according to Object.equals(Object)) of this stream. - https://docs.oracle.com/javase/8/docs/api/java/util/stream/Stream.html#distinct--

Main.class

import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

class Main {
    public static void main(String[] args)
    {
        //Create a list of Employee objects
        List<Employee> employeeList = new ArrayList<Employee>();
        Employee e1 = new Employee(1, "ram", "mech");
        Employee e2 = new Employee(1, "ram", "mech");
        Employee e3 = new Employee(2, "gopi", "csc");
        Employee e4 = new Employee(2, "gopi", "civil");

        employeeList.add(e1);
        employeeList.add(e2);
        employeeList.add(e3);
        employeeList.add(e4);

        System.out.println("Before removing duplicates");
        employeeList.stream().forEach(System.out::println);

        //This is where all the magic happens.
        employeeList = employeeList.stream().distinct().collect(Collectors.toList());

        System.out.println("\nAfter removing duplicates");
        employeeList.stream().forEach(System.out::println);
    }
}

Output:

Before removing duplicates
Employee [valA=1, valB=ram, valC=mech]
Employee [valA=1, valB=ram, valC=mech]
Employee [valA=2, valB=gopi, valC=csc]
Employee [valA=2, valB=gopi, valC=civil]

After removing duplicates
Employee [valA=1, valB=ram, valC=mech]
Employee [valA=2, valB=gopi, valC=csc]
Employee [valA=2, valB=gopi, valC=civil]

Employee.class

//This is just a regular POJO class.
class Employee {

    int valA;
    String valB, valC;

    public Employee(int valA, String valB, String valC){
        this.valA = valA;
        this.valB = valB;
        this.valC = valC;
    }

    public Employee(Employee e) {
        this.valA = e.valA;
        this.valB = e.valB;
        this.valC = e.valC;
    }

    @Override
    public int hashCode() {
        final int prime = 31;
        int result = 1;
        result = prime * result + valA;
        result = prime * result + ((valB == null) ? 0 : valB.hashCode());
        result = prime * result + ((valC == null) ? 0 : valC.hashCode());
        return result;
    }
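    @Override
    public boolean equals(Object obj) {
        if (this == obj) {
            return true;
        }
        if (!(obj instanceof Employee)) {
            return false;
        }
        // Compare the fields themselves; equal hash codes alone do not imply equality.
        Employee other = (Employee) obj;
        return valA == other.valA
                && java.util.Objects.equals(valB, other.valB)
                && java.util.Objects.equals(valC, other.valC);
    }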

    @Override
    public String toString() {
        return "Employee [valA=" + valA + ", valB=" + valB + ", valC=" + valC + "]";
    }
}
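A note on equals() and hashCode() in a class like this: hashCode() may return the same value for different objects, so equal hash codes alone cannot prove that two employees are equal. The sketch below uses the well-known colliding strings "Aa" and "BB" to show that a field-by-field equals() keeps both entries under distinct(). The names Person, HashCollisionDemo and countDistinct are illustrative and not part of the answer.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

class HashCollisionDemo {

    // Illustrative value class, not the Employee from the answer.
    static final class Person {
        final int id;
        final String name;

        Person(int id, String name) {
            this.id = id;
            this.name = name;
        }

        @Override
        public int hashCode() {
            return Objects.hash(id, name);
        }

        @Override
        public boolean equals(Object obj) {
            if (this == obj) return true;
            if (!(obj instanceof Person)) return false;
            Person other = (Person) obj;
            return id == other.id && Objects.equals(name, other.name);
        }
    }

    // Counts the elements that remain after Stream.distinct().
    static int countDistinct(List<Person> people) {
        return people.stream().distinct().collect(Collectors.toList()).size();
    }

    public static void main(String[] args) {
        Person p1 = new Person(1, "Aa");
        Person p2 = new Person(1, "BB");
        System.out.println(p1.hashCode() == p2.hashCode()); // true: "Aa" and "BB" collide
        System.out.println(p1.equals(p2));                  // false: the names differ
        System.out.println(countDistinct(Arrays.asList(p1, p2, new Person(1, "Aa")))); // 2
    }
}
```

An equals() that only compared hash codes would treat p1 and p2 as the same person and silently drop one of them.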

A pre-Java 8 solution. It may not be the best way, but it is a quick solution that works.

String[][] records = {
            {"1","ram","Mech"},
            {"1","ram","Mech"},
            {"1","gopi","csc"},
            {"1","gopi","civil"} };

    List<String[]> distinctRecordsList = new ArrayList<String[]>();
    for(String[] record : records){
        boolean sameValue = false;
        for(String[] distinctRecord : distinctRecordsList){
            if(record.length != distinctRecord.length)
                throw new IllegalArgumentException("Can't compare the records");
            // Compare field by field, ignoring case
            sameValue = true;
            for(int k=0;k<record.length;k++){
                if(!record[k].equalsIgnoreCase(distinctRecord[k])){
                    sameValue = false;
                    break;
                }
            }
            // Stop at the first match so a later non-matching record cannot reset the flag
            if(sameValue)
                break;
        }
        if(!sameValue)
            distinctRecordsList.add(record);
    }
    // Passing a typed array to toArray avoids the Object[] copy and the casts
    String[][] distinctRecordsArray = distinctRecordsList.toArray(new String[0][]);

Unlike some of the other answers, I will try to explain what went wrong in your own code and how to fix it within that code. (I agree very much with kkflf that an Employee class would be a huge benefit: it is more object-oriented, it helps structure the code, and it gives a better overview of it.)

The issues I see in your code are:

  • You are not removing the correct element when you detect a duplicate, but always the element at index 1 (the second element, since indices count from 0). Removing the right one isn't trivial, though, because indices shift as you remove elements. The trick is to iterate backward, so that removing an element only shifts indices you have already finished with.
  • You are using == to compare the first elements of the subarrays. If you wanted to compare just the first element, you should use equals() for the comparison. However, I believe you want to compare the entire row, so that 2,gopi,csc and 2,gopi,civil are recognized as different and both preserved. Arrays.equals() can do the job.
  • You need to create array2 only after the loop. As your code stands, if no duplicates are detected, array2 is never filled from the list and keeps its initial 10 empty rows.

So your loop becomes:

    for (int k = employee_t.length - 1; k >= 1; k--)
    {
        if (Arrays.equals(employee_t[k], employee_t[k - 1]))
        {
            System.out.println("same value is present");  
            l.remove(k);
        }        
    }
    array2 = l.toArray(new String[][]{});

This gives you the output you asked for.

Further tips:

  • Your comparator only compares one field in the inner arrays, which is not enough to guarantee that identical rows come right after each other in the sorted array. You should compare all elements, and also require that the inner arrays have the same length.
  • Use generics: class Sort implements Comparator<String[]>, and you won't need the casts in compare().
  • According to Java naming conventions it should be class EmployeeDup, boolean doSwitch (since switch is a reserved word) and class Sort.
  • You are not using the variables Switch and rows; delete them.
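Putting the first two tips together, a generic comparator that compares every field could look like the sketch below. The names SortedDedup, RowComparator and sortAndRemoveDuplicates are assumptions for illustration. Ordering a shorter row before a longer one is just one possible choice, not something the question specifies.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class SortedDedup {

    // Compares rows field by field, so identical rows end up adjacent after sorting.
    static class RowComparator implements Comparator<String[]> {
        @Override
        public int compare(String[] row1, String[] row2) {
            int common = Math.min(row1.length, row2.length);
            for (int i = 0; i < common; i++) {
                int result = row1[i].compareTo(row2[i]);
                if (result != 0) {
                    return result;
                }
            }
            // Assumption: a shorter row sorts before a longer one with the same prefix.
            return Integer.compare(row1.length, row2.length);
        }
    }

    static String[][] sortAndRemoveDuplicates(String[][] rows) {
        String[][] sorted = rows.clone();            // leave the caller's outer array untouched
        Arrays.sort(sorted, new RowComparator());
        List<String[]> list = new ArrayList<>(Arrays.asList(sorted));
        // Iterate backward so removals only shift indices that were already visited.
        for (int k = sorted.length - 1; k >= 1; k--) {
            if (Arrays.equals(sorted[k], sorted[k - 1])) {
                list.remove(k);
            }
        }
        return list.toArray(new String[0][]);
    }

    public static void main(String[] args) {
        String[][] rows = {
            {"2", "gopi", "csc"}, {"1", "ram", "mech"},
            {"2", "gopi", "civil"}, {"1", "ram", "mech"}
        };
        // prints [[1, ram, mech], [2, gopi, civil], [2, gopi, csc]]
        System.out.println(Arrays.deepToString(sortAndRemoveDuplicates(rows)));
    }
}
```

Note that the result is in sorted order, not in the original input order.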

I have written a solution of my own. It may not be the best, but it works.

public static String[][] removeDuplicate(String[][] matrix) {
    String[][] newMatrix = new String[matrix.length][matrix[0].length];
    int newMatrixRow = 1;

    for (int i = 0; i < matrix[0].length; i++)
        newMatrix[0][i] = matrix[0][i];

    for (int j = 1; j < matrix.length; j++) {
        boolean duplicate = false;
        // Compare row j against every row already kept
        for (int i = 0; i < newMatrixRow && !duplicate; i++) {
            boolean same = true;
            // Start at column 0 so the whole row is compared
            for (int col = 0; col < matrix[j].length; col++) {
                if (!newMatrix[i][col].equals(matrix[j][col])) {
                    same = false;
                    break;
                }
            }
            duplicate = same;
        }

        if (!duplicate) {
            for (int i = 0; i < matrix[j].length; i++) {
                newMatrix[newMatrixRow][i] = matrix[j][i];
            }
            newMatrixRow++;
        }
    }

    // newMatrixRow is the number of rows kept; scanning for a null row
    // would run past the end of the array when nothing was removed
    String[][] finalMatrix = new String[newMatrixRow][newMatrix[0].length];
    for (int i = 0; i < finalMatrix.length; i++) {
        for (int j = 0; j < finalMatrix[i].length; j++)
            finalMatrix[i][j] = newMatrix[i][j];
    }
    
    return finalMatrix;
}

This method will return a matrix without any duplicate rows.
