
Java, writing my own split string method

I need to be able to write my own split string method so that input like

String[] test1 = mySplit("ab#cd#efg#", "#");
System.out.println(Arrays.toString(test1));

will print [ab, #, cd, #, efg, #] to the console. So far I've got it to split like that but my way leaves awkward spaces where 2 delimiters are in a row, or a delimiter is at the start of the input.

public static String[] mySplit(String str, String regex)
{
    String[] storeSplit = new String[str.length()];
    char compare1, compare2;
    int counter = 0;

    //Initializes all the String[] values to "" so that 'null' doesn't
    //appear when a char is concatenated onto an element.
    for(int i=0; i<str.length(); i++) {
        storeSplit[i] = "";
    }

    //Puts the str values into the split array and concatenates until
    //a delimiter is found, then it moves to the next array index.
    for(int i=0; i<str.length(); i++) {
        compare1 = str.charAt(i);
        compare2 = regex.charAt(0);

            if(!(compare1 == compare2)) {
                storeSplit[counter] += ""+str.charAt(i);
            } else {
                counter++;
                storeSplit[counter] = ""+str.charAt(i);
                counter++;
            }
    }
    return storeSplit;
}

When I use that method in my Test main, I get the output [ab, #, cd, #, efg, #, , , , ]. So I'm lost on how to fix the spacing of it all and I'll also need to be able to allow multiple delimiters which my code currently doesn't handle.

Also I know this code is really sloppy at the moment, just trying to lay down the concepts before the optimization.

The problem is straightforward: you have one offset walking through the string finding new matches (pos), and another marking the end of the last match you found (start).

public static String[] mySplit(String str, String regex)
{
    Vector<String> result = new Vector<String>(); // needs import java.util.Vector
    int start = 0;
    int pos = str.indexOf(regex);
    while (pos>=start) {
        if (pos>start) {
            result.add(str.substring(start,pos));
        }
        start = pos + regex.length();
        result.add(regex);
        pos = str.indexOf(regex,start); 
    }
    if (start<str.length()) {
        result.add(str.substring(start));
    }
    String[] array = result.toArray(new String[0]);
    return array;
}

This avoids extra looping and does no string concatenation at all, which is an important consideration. In Java versions before 7u6, substring copied no characters either: it only created small string objects pointing to the original character buffer. From 7u6 onward, each substring call copies its characters, so every character is copied once.

I think your problem is that you are allocating storeSplit[] with a length that is longer than you need. If you are allowed to use ArrayList, use that to accumulate your results (and use the ArrayList.toArray() method to get the final return value for your function).

If you can't use ArrayList, then you will need to truncate your array before returning it (your counter variable will be of use in determining the correct length). To do that, you will need to allocate an array of correct length, then use System.arraycopy to populate it. Simpler to use ArrayList, but I don't know the exact requirements of your assignment.
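The truncation described above could look something like the sketch below. It keeps the oversized array from the question and copies only the filled slots at the end. The class name TruncatedSplit and the StringBuilder are my own choices for illustration, and it assumes a non-empty, single-character delimiter, as in the question's code.

```java
import java.util.Arrays;

class TruncatedSplit {
    // Sketch only: assumes regex is non-empty and only its first character matters.
    static String[] mySplit(String str, String regex) {
        // Every token has at least one character, so str.length() slots always suffice.
        String[] storeSplit = new String[str.length()];
        char delimiter = regex.charAt(0);
        int counter = 0; // number of slots filled so far
        StringBuilder current = new StringBuilder(); // text since the last delimiter

        for (int i = 0; i < str.length(); i++) {
            char c = str.charAt(i);
            if (c != delimiter) {
                current.append(c);
            } else {
                // Store the pending text (if any), then the delimiter itself.
                if (current.length() > 0) {
                    storeSplit[counter++] = current.toString();
                    current.setLength(0);
                }
                storeSplit[counter++] = String.valueOf(c);
            }
        }
        if (current.length() > 0) {
            storeSplit[counter++] = current.toString();
        }

        // Truncate: copy only the filled slots into an array of the right length.
        String[] result = new String[counter];
        System.arraycopy(storeSplit, 0, result, 0, counter);
        return result;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(mySplit("ab#cd#efg#", "#"))); // [ab, #, cd, #, efg, #]
    }
}
```

Because empty pieces are never stored, a delimiter at the start of the input or two delimiters in a row no longer leave blank entries.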

As pointed out in the comments, the problem is that you are setting your array size to the length of the String. Instead, you want to set it to double the number of delimiters. Then, adjust accordingly:

  1. If the first character is a delimiter, subtract one,
  2. If the last character is not a delimiter, add one.

// Calculate number of delimiters in str; replace() treats regex as
// literal text, so characters like '.' or '|' are not read as a pattern
int delimiters = (str.length() - str.replace(regex, "").length()) / regex.length();
// Calculate array size
int arraySize = (delimiters * 2) + (str.startsWith(regex) ? -1 : 0);
arraySize = str.endsWith(regex) ? arraySize : arraySize + 1;
String[] storeSplit = new String[arraySize];
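To see how the sizing rule behaves, here is a small sketch that wraps it in a helper and prints the result for a few inputs. The helper name arraySize and the class name are made up for this example, and it assumes a non-empty delimiter that is matched literally. Note that the rule assumes delimiters are never adjacent: for "a##b" it reserves five slots although there are only four tokens.

```java
class ArraySizeSketch {
    // Hypothetical helper: applies the "double the delimiters, then adjust" rule.
    static int arraySize(String str, String delimiter) {
        // Count occurrences by measuring how much shorter str gets without them.
        int delimiters = (str.length() - str.replace(delimiter, "").length()) / delimiter.length();
        int size = delimiters * 2;
        if (str.startsWith(delimiter)) {
            size -= 1; // no text token before a leading delimiter
        }
        if (!str.endsWith(delimiter)) {
            size += 1; // one more text token after the last delimiter
        }
        return size;
    }

    public static void main(String[] args) {
        System.out.println(arraySize("ab#cd#efg#", "#")); // 6
        System.out.println(arraySize("#ab#cd", "#"));     // 4
        System.out.println(arraySize("a##b", "#"));       // 5, one more than the 4 tokens
    }
}
```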

It looks like the spacing problem you've got is because of your storeSplit array being a fixed length.

Let's say your input string is 5 characters long; your storeSplit array will have 5 'spaces' in there. That input string may only contain one delimiter; "ab#ef" for example, creating 3 sub-strings - "ab", "#" and "ef".

To avoid this, create a List instead:

List<String> storeSplit = new ArrayList<String>();

Then, rather than incrementing your counter and dropping your text in, add to the list:

storeSplit.add(""+str.charAt(i));

Instead of

storeSplit[counter] = ""+str.charAt(i);
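Putting that together, a full version of the method might look roughly like this. It is only a sketch: the class name ListSplit is made up, and I'm assuming that "multiple delimiters" means every character of the second argument counts as a delimiter, which may not be what your assignment intends.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ListSplit {
    // Sketch: each character of delimiters is treated as a delimiter (an assumption).
    static String[] mySplit(String str, String delimiters) {
        List<String> storeSplit = new ArrayList<String>();
        StringBuilder current = new StringBuilder(); // text collected since the last delimiter

        for (int i = 0; i < str.length(); i++) {
            char c = str.charAt(i);
            if (delimiters.indexOf(c) < 0) {
                // Ordinary character: keep building the current piece.
                current.append(c);
            } else {
                // Delimiter: store the pending piece (if any), then the delimiter.
                if (current.length() > 0) {
                    storeSplit.add(current.toString());
                    current.setLength(0);
                }
                storeSplit.add("" + c);
            }
        }
        if (current.length() > 0) {
            storeSplit.add(current.toString());
        }
        // The list grows as needed, so there are no unused slots to trim.
        return storeSplit.toArray(new String[0]);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(mySplit("ab#cd#efg#", "#")));  // [ab, #, cd, #, efg, #]
        System.out.println(Arrays.toString(mySplit("#ab##cd,ef", "#,"))); // [#, ab, #, #, cd, ,, ef]
    }
}
```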

Here is what I would do:

String[] test1 = "ab#cd#efg#".split("#");//splits the string on '#'
String result="";
for(String test:test1)//loops through the array
    result+="#"+test;//appends each member to the result string, putting the '#' in front of each one
System.out.println(result.substring(1));//prints out the string minus the first char, which is a '#'

I hope this helps.

Here is my code:

package demo;

public class demo8 {

static int count = 0;
static int first = 0;
static int j = 0;

public static void main(String[] args) {

    String s = "ABHINANDAN TEJKUMAR CHOUGULE";
    int size = 0;

    for (int k = 0; k < s.length(); k++) {
        if (s.charAt(k) == ' ') {
            size++;
        }

    }

    String[] last = new String[size + 1];

    for (int i = 0; i < s.length(); i++) {
        int temp = s.length();

        if (i == s.length() - 1) {
            last[j] = s.substring(first, i + 1);
        }

        if (s.charAt(i) == ' ') {
            last[j] = s.substring(first, i);
            j++;
            first = i + 1;

        }

    }
    for (String s1 : last) {
        System.out.println(s1);
    }
}
}

I tested my code; its output was attached to the original post.

I have used recursion to solve it.

static void splitMethod(String str, char splitChar, ArrayList<String> list) {
        String restOfTheStr = null;
        StringBuffer strBufWord = new StringBuffer();
        int pos = str.indexOf(splitChar);
        if(pos>=0) {
            for(int i = 0; i<pos; i++) {
                strBufWord.append(str.charAt(i));
            }
            String word = strBufWord.toString();
            list.add(word);
            restOfTheStr = str.substring(pos+1);//substring(pos) would include
            //the splitChar, so start at pos + 1 to skip it
            splitMethod(restOfTheStr, splitChar, list);
        }
        if(pos == -1) {
            list.add(str);
            return;
        }

    }

Use:

ArrayList<String> list= new ArrayList<String>();//in this list
    //the words will be stored
    String str = "My name is Somenath";
    splitMethod(str,' ', list );

Below is the method

public static List<String> split(String str, String demarcation) {
    ArrayList<String> words = new ArrayList<>();
    int startIndex = 0, endIndex;

    endIndex = str.indexOf(demarcation, startIndex);

    while (endIndex != -1) {
        String parts = str.substring(startIndex, endIndex);

        words.add(parts);

        // Skip the whole delimiter, not just one character (assumes it is non-empty)
        startIndex = endIndex + demarcation.length();
        endIndex = str.indexOf(demarcation, startIndex);

    }

    // For the last words
    String parts = str.substring(startIndex);

    words.add(parts);
    return words;
}

public List<String> split(String str, String regex) {
    int count = 0; // start index of the current piece
    int len = regex.length();
    List<String> result = new ArrayList<>();
    for (int i = 0; i < str.length(); i++) {
        // Compare the next len characters (fewer near the end) with the delimiter
        String temp = str.substring(i, Math.min(i + len, str.length()));
        if (temp.equals(regex)) {
            result.add(str.substring(count, i));
            count = i + len;
            // Resume after the delimiter so overlapping matches (e.g. "aa" in "aaa")
            // cannot produce a start index beyond the end index
            i = Math.max(i, count - 1);
        }
    }
    result.add(str.substring(count));
    return result;
}
