Splitting a string in an array of strings of limited size

I have a string containing an arbitrary address, for example:

String s = "HN-13/1443 laal street near bharath dental lab near thana qutubsher near modern bakery saharanpur uttar pradesh 247001";

I want to split it into an array of strings under two conditions:

  • Each element of the array has a length less than or equal to 20.
  • No element ends awkwardly, that is, no word is cut in the middle.

For example, splitting every 20 characters would produce:

"H.N.-13/1443 laal st"
"reet near bharath de"
"ntal lab near thana"
"qutubsher near moder"
"n bakery saharanpur"

but the correct output would be:

"H.N.-13/1443 laal"
"street near bharath"
"dental lab near"
"thana qutubsher near"
"modern bakery"
"saharanpur"

Notice how each element of the string array has a length less than or equal to 20.

The first (incorrect) output above is what my code currently produces:

static String[] split(String s,int max){
    int total_lines = s.length () / 24;
    if (s.length () % 24 != 0) {
        total_lines++;
    }

    String[] ans = new String[total_lines];
    int count = 0;
    int j = 0;

    for (int i = 0; i < total_lines; i++) {
        for (j = 0; j < 20; j++) {
            if (ans[count] == null) {
                ans[count] = "";
            }

            if (count > 0) {
                if ((20 * count) + j < s.length()) {
                    ans[count] += s.charAt (20 * count + j);
                } else {
                    break;
                }
            } else {
                ans[count] += s.charAt (j);
            }
        }

        String a = "";

        a += ans[count].charAt (0);

        if (a.equals (" ")) {
            ans[i] = ans[i].substring (0, 0) + "" + ans[i].substring (1);
        }

        System.out.println (ans[i]);

        count++;
    }
    return ans;
}

public static void main (String[]args) {
    String add = "H.N.-13/1663 laal street near bharath dental lab near thana qutubsher near modern bakery";
    String city = "saharanpur";
    String state = "uttar pradesh";
    String zip = "247001";
    String s = add + " " + city + " " + state + " " + zip;
    String[] ans = split(s, 20);
}

The code is not very clear, but at first glance it builds each line character by character, which is why you get the output you see. Instead, go word by word: keep each word whole and overflow it to the next string when it does not fit. A more promising approach would be:

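import java.util.ArrayList;
import java.util.List;

static String[] splitString(String s, int max) {
    // Split on runs of whitespace; note the escaped backslash in the Java literal.
    String[] words = s.trim().split("\\s+");
    List<String> out = new ArrayList<>();
    int numWords = words.length;
    int i = 0;
    while (i < numWords) {
        StringBuilder sb = new StringBuilder();
        // Keep adding words while the line, including the joining space, stays within max.
        while (i < numWords) {
            int separator = (sb.length() == 0) ? 0 : 1; // 1 for the space between words
            if (sb.length() + separator + words[i].length() > max) {
                break;
            }
            if (separator == 1) {
                sb.append(' ');
            }
            sb.append(words[i]);
            i++;
        }
        // Caveat: a word longer than max never fits, so i would stop advancing here.
        out.add(sb.toString());
    }
    return out.toArray(new String[0]);
}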

Note: It works on your example input, but you may need to tweak it to handle cases such as a single word longer than 20 characters.
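One possible tweak for the long-word case mentioned in the note is to hard-split any word that exceeds the limit into limit-sized pieces. The following is only a sketch under that assumption; the class name AddressWrap and the method wrap are hypothetical names chosen for illustration, and cutting an oversized word at the limit is one policy among several (you might prefer to reject such input instead).

```java
import java.util.ArrayList;
import java.util.List;

class AddressWrap {
    // Hypothetical helper: greedy word wrap that hard-splits words longer than max.
    static List<String> wrap(String s, int max) {
        List<String> out = new ArrayList<>();
        StringBuilder line = new StringBuilder();
        for (String word : s.trim().split("\\s+")) {
            // Assumption: an oversized word is cut into max-sized pieces,
            // which guarantees that the loop always makes progress.
            while (word.length() > max) {
                if (line.length() > 0) {
                    out.add(line.toString());
                    line.setLength(0);
                }
                out.add(word.substring(0, max));
                word = word.substring(max);
            }
            if (word.isEmpty()) {
                continue; // only happens for empty input
            }
            int needed = (line.length() == 0 ? 0 : 1) + word.length();
            if (line.length() + needed > max) {
                // The word does not fit on the current line: flush and start a new one.
                out.add(line.toString());
                line.setLength(0);
            }
            if (line.length() > 0) {
                line.append(' ');
            }
            line.append(word);
        }
        if (line.length() > 0) {
            out.add(line.toString());
        }
        return out;
    }

    public static void main(String[] args) {
        String s = "H.N.-13/1443 laal street near bharath dental lab near thana "
                 + "qutubsher near modern bakery saharanpur uttar pradesh 247001";
        for (String part : wrap(s, 20)) {
            System.out.println("\"" + part + "\"");
        }
    }
}
```

A line is flushed only when the next word would push it past the limit, so a line of exactly 20 characters such as "thana qutubsher near" is kept whole.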

Find all occurrences of up to 20 chars starting with a non-space and ending with a word boundary, and collect them to a List:

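// Needs Java 9+ for Matcher.results(); imports: java.util.List, java.util.regex.*, java.util.stream.Collectors
List<String> parts = Pattern.compile("\\S.{1,19}\\b").matcher(s)
  .results()
  .map(MatchResult::group)
  .map(String::trim) // \b also matches just before a word, so a match can end with a space
  .collect(Collectors.toList());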

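Because `\b` also matches at the start of a word, a match of the pattern above can end on a space, and the pattern needs at least two characters per match. As a sketch of one possible variant, the version below ends each match on a non-space character and uses a negative lookahead `(?!\S)` to require whitespace or end of input after it. The class name RegexWrap and the method wrap are hypothetical, and the pattern assumes no word is longer than the limit: the leading part of an oversized word would be silently skipped.

```java
import java.util.List;
import java.util.regex.MatchResult;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class RegexWrap {
    // Hypothetical helper: each match starts and ends on a non-space character,
    // is at most max characters long, and is followed by whitespace or end of input.
    static List<String> wrap(String s, int max) {
        if (max < 2) {
            throw new IllegalArgumentException("max must be at least 2");
        }
        Pattern p = Pattern.compile("\\S(?:.{0," + (max - 2) + "}\\S)?(?!\\S)");
        return p.matcher(s)
                .results()               // Java 9+
                .map(MatchResult::group)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        String s = "H.N.-13/1443 laal street near bharath dental lab near thana "
                 + "qutubsher near modern bakery saharanpur uttar pradesh 247001";
        wrap(s, 20).forEach(part -> System.out.println("\"" + part + "\""));
    }
}
```

The greedy quantifier takes as many characters as the limit allows and then backtracks to the last position that ends a word, which is what keeps words intact.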
