Grouping elements from lists into sub lists without duplicates in Java

I am working on 'Grouping Anagrams'. Problem statement: Given an array of strings, group anagrams together.

I can group the anagrams, but I cannot skip the strings that have already been grouped. I want to avoid duplicates: an element may belong to only one group, but in my code an element can end up in multiple groups.

Here is my code:

    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.HashSet;
    import java.util.List;
    import java.util.Set;

    public class GroupAnagrams1 {

        public static void main(String[] args) {
            String[] input = {"eat", "tea", "tan", "ate", "nat", "bat"};
            List<List<String>> result = groupAnagrams(input);
            for (List<String> s : result) {
                System.out.println(" group: ");
                for (String x : s) {
                    System.out.println(x);
                }
            }
        }

        public static List<List<String>> groupAnagrams(String[] strs) {
            List<List<String>> result = new ArrayList<List<String>>();

            for (int i = 0; i < strs.length; i++) {
                Set<String> group = new HashSet<String>();
                for (int j = i + 1; j < strs.length; j++) {
                    if (areAnagrams(strs[i], strs[j])) {
                        group.add(strs[i]);
                        group.add(strs[j]);
                    }
                }

                if (group.size() > 0) {
                    List<String> aList = new ArrayList<String>(group);
                    result.add(aList);
                }
            }
            return result;
        }

Here is the method that checks whether two strings are anagrams:

        private static boolean areAnagrams(String str1, String str2) {
            // Strings of different lengths can never be anagrams.
            if (str1.length() != str2.length())
                return false;
            char[] a = str1.toCharArray();
            char[] b = str2.toCharArray();
            // Per-character counts; assumes every character code is below 256.
            int[] count1 = new int[256];
            int[] count2 = new int[256];
            for (int i = 0; i < a.length; i++) {
                count1[a[i]]++;
                count2[b[i]]++;
            }
            for (int k = 0; k < 256; k++) {
                if (count1[k] != count2[k])
                    return false;
            }
            return true;
        }
    }

Expected output:

 group: 
    tea
    ate
    eat
 group: 
    bat
 group: 
    tan
    nat

Actual output:

  group: 
     tea
     ate
     eat
  group: 
     tea
     ate
  group: 
     tan
     nat

The order in which the groups are displayed does not matter. The way they are displayed does not matter either.

Preference: Please feel free to submit solutions using HashMaps, but I prefer to see solutions without using HashMaps and using Java 8.

I would have taken a slightly different approach using streams:

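import java.util.AbstractMap.SimpleEntry;
import java.util.Arrays;
import java.util.List;
import java.util.Map.Entry;
import java.util.stream.Collectors;

public class Scratch {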
    public static void main(String[] args) {
        String[] input = { "eat", "tea", "tan", "ate", "nat", "bat" };

        List<List<String>> result = groupAnagrams(input);

        System.out.println(result);

    }

    private static List<List<String>> groupAnagrams(String[] input) {
        return Arrays.asList(input)
                     // create a list that wraps the array

                     .stream()
                     // stream that list

                     .map(Scratch::sortedToOriginalEntryFor)
                     // map each string we encounter to an entry containing
                     // its sorted characters to the original string

                     .collect(Collectors.groupingBy(Entry::getKey, Collectors.mapping(Entry::getValue, Collectors.toList())))
                     // create a map whose key is the sorted characters and whose
                     // value is a list of original strings that share the sorted
                     // characters: Map<String, List<String>>

                     .values()
                     // get all the values (the lists of grouped strings)

                     .stream()
                     // stream them

                     .collect(Collectors.toList());
                     // convert to a List<List<String>> per your req
    }

    // create an Entry whose key is a string of the sorted characters of original
    // and whose value is original
    private static Entry<String, String> sortedToOriginalEntryFor(String original) {
        char c[] = original.toCharArray();
        Arrays.sort(c);
        String sorted = new String(c);

        return new SimpleEntry<>(sorted, original);
    }
}

This yields:

[[eat, tea, ate], [bat], [tan, nat]]

If you want to eliminate repeated strings (e.g. if "bat" appears twice in your input), you can call toSet() instead of toList() in your Collectors.groupingBy call and change the return type accordingly.

I would also recommend using Java Streams for this. Since you would prefer not to, here is another solution:
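As a rough illustration of the toSet() variant, here is a minimal sketch. It is not the code above: the class name `DistinctAnagramGroups` and the helper `sortedKey` are made up for this example, and the sorted key is computed directly in groupingBy instead of going through an Entry.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical class name, used only for this sketch.
class DistinctAnagramGroups {

    // Sort the characters so that all anagrams share the same key.
    static String sortedKey(String s) {
        char[] chars = s.toCharArray();
        Arrays.sort(chars);
        return new String(chars);
    }

    // Groups anagrams; a repeated input string collapses because each group is a Set.
    static List<Set<String>> groupDistinctAnagrams(String[] input) {
        return new ArrayList<>(
                Arrays.stream(input)
                      .collect(Collectors.groupingBy(
                              DistinctAnagramGroups::sortedKey,
                              Collectors.toSet()))
                      .values());
    }

    public static void main(String[] args) {
        String[] input = { "eat", "tea", "bat", "ate", "bat" };
        // "bat" appears twice in the input but only once in its group.
        // The order of the groups and of the strings inside each set is not guaranteed.
        System.out.println(groupDistinctAnagrams(input));
    }
}
```

Note that the return type changes from List<List<String>> to List<Set<String>>, which is the "change the return type" part mentioned above.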

public static List<List<String>> groupAnagrams(String[] strs) {
    List<List<String>> result = new ArrayList<>();
    for (String str : strs) {
        boolean added = false;
        for (List<String> r : result) {
            if (areAnagrams(str, r.get(0))) {
                r.add(str);
                added = true;
                break;
            }
        }

        if (!added) {
            List<String> aList = new ArrayList<>();
            aList.add(str);
            result.add(aList);
        }
    }
    return result;
}

The problem in your solution is that the outer loop starts a new group at every index and only compares against the elements after it. When it reaches "tea" it therefore produces the incomplete group ["tea", "ate"], although both strings are already grouped, and "bat", which has no anagram partner, never gets a group of its own, so ["bat"] is missing.

My solution takes a different approach: for each word it checks whether a group already exists whose first word is an anagram of that word. If so, the word is added to that group; if not, a new group is created.

Since, as I said at the beginning, I would use Java Streams, here is my initial solution using a stream:
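The nested loops from the question can also be kept if each index is marked once it has been placed in a group. The following is only a sketch of that idea: the class name, the `used` array and the sorting-based `areAnagrams` are assumptions made for illustration and are not taken from the question's code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical class name, used only for this sketch.
class GroupAnagramsWithUsedFlags {

    // Two strings are anagrams if their sorted characters are identical.
    static boolean areAnagrams(String a, String b) {
        if (a.length() != b.length()) {
            return false;
        }
        char[] x = a.toCharArray();
        char[] y = b.toCharArray();
        Arrays.sort(x);
        Arrays.sort(y);
        return Arrays.equals(x, y);
    }

    static List<List<String>> groupAnagrams(String[] strs) {
        List<List<String>> result = new ArrayList<>();
        // used[i] is true once strs[i] has been placed in a group.
        boolean[] used = new boolean[strs.length];
        for (int i = 0; i < strs.length; i++) {
            if (used[i]) {
                continue; // already grouped, do not start a second group
            }
            // Every ungrouped string starts its own group, even without a partner.
            List<String> group = new ArrayList<>();
            group.add(strs[i]);
            used[i] = true;
            for (int j = i + 1; j < strs.length; j++) {
                if (!used[j] && areAnagrams(strs[i], strs[j])) {
                    group.add(strs[j]);
                    used[j] = true;
                }
            }
            result.add(group);
        }
        return result;
    }

    public static void main(String[] args) {
        String[] input = { "eat", "tea", "tan", "ate", "nat", "bat" };
        System.out.println(groupAnagrams(input));
        // prints [[eat, tea, ate], [tan, nat], [bat]]
    }
}
```

Because a group is created for every string that has not been used yet, "bat" gets its own group, and skipping used indices prevents the partial ["tea", "ate"] group.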

List<List<String>> result = new ArrayList<>(Arrays.stream(words)
        .collect(Collectors.groupingBy(w -> Stream.of(w.split("")).sorted().collect(Collectors.joining()))).values());

For more ways to generate the sorted string keys used to group the anagrams, you can look here.

Both of my solutions produce the same groups. The stream version prints them as shown below, while the loop version lists [tan, nat] before [bat]:
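To make the idea of a grouping key concrete, here is a small sketch of two possible key functions. Both names (`sortedKey`, `countKey`) and the class `AnagramKeys` are hypothetical, and `countKey` assumes the words contain only the lowercase letters a to z.

```java
import java.util.Arrays;

// Hypothetical helper class, used only for this sketch.
class AnagramKeys {

    // Key 1: the characters of the word in sorted order.
    static String sortedKey(String word) {
        char[] chars = word.toCharArray();
        Arrays.sort(chars);
        return new String(chars);
    }

    // Key 2: letter frequencies rendered as a string.
    // Assumption: the word contains only lowercase a-z; any other
    // character would cause an ArrayIndexOutOfBoundsException here.
    static String countKey(String word) {
        int[] counts = new int[26];
        for (char c : word.toCharArray()) {
            counts[c - 'a']++;
        }
        return Arrays.toString(counts);
    }

    public static void main(String[] args) {
        System.out.println(sortedKey("tea"));                        // prints aet
        System.out.println(countKey("tea").equals(countKey("ate"))); // prints true
        System.out.println(countKey("tan").equals(countKey("bat"))); // prints false
    }
}
```

Either function could be passed to Collectors.groupingBy as the classifier, since anagrams map to the same key and non-anagrams map to different keys.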

[[eat, tea, ate], [bat], [tan, nat]]
