How do I remove leftover delimiters from a Scanner? (Java)

I admit, not the best title.

I'm having the following problem. I need to use my scanner to parse every word (without the delimiters) into separate strings.

Example: Poker; Blackjack; LasVegas, NewYork should become Poker Blackjack LasVegas NewYork

Now, for the first part, I would just use a delimiter like so: sc.useDelimiter("; ") which would work fine.

The second part is where I run into trouble. If I switch to sc.useDelimiter(", ") after I'm done with Blackjack, the next token still includes that first ; and a space, so the string comes out as "; LasVegas".

I tried working around it by resetting the delimiter and consuming the first token, which is a rather ugly way of solving it, but the string would still come out as " LasVegas" (with a leading space) instead of "LasVegas".

Would really appreciate some help.

There are a number of ways to deal with this, depending on your actual requirements [1]:

  1. Don't change the delimiter. The token after "Blackjack" will then be "LasVegas, NewYork". Create another scanner to parse that token. (Or use String::split.)
  2. Use a delimiter regex that will match either delimiter; e.g. "[;,]\\s*".
  3. Parse like this:

     String line = scanner.nextLine();
     String[] parts = line.split(";\\s*");
     String[] parts2 = parts[2].split(",\\s*");

    This is assuming that ; is a primary delimiter and , is a secondary delimiter.

  4. Change the input file syntax so that it uses only one delimiter character. (This assumes that you are free to do that, AND that an alternative syntax would "make more sense".)
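
As a rough illustration of option 1, here is a minimal sketch that keeps ; as the outer delimiter and re-scans each token with a second scanner for , . The class and method names (NestedScannerDemo, parse) are made up for this example, and it assumes the input looks like the single sample line from the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class NestedScannerDemo {
    // Splits on the primary delimiter (';' plus optional whitespace) first,
    // then re-scans each resulting token with the secondary delimiter (',').
    static List<String> parse(String input) {
        List<String> words = new ArrayList<>();
        try (Scanner outer = new Scanner(input)) {
            outer.useDelimiter(";\\s*");
            while (outer.hasNext()) {
                // Each outer token gets its own short-lived scanner.
                try (Scanner inner = new Scanner(outer.next())) {
                    inner.useDelimiter(",\\s*");
                    while (inner.hasNext()) {
                        words.add(inner.next());
                    }
                }
            }
        }
        return words;
    }

    public static void main(String[] args) {
        // prints [Poker, Blackjack, LasVegas, NewYork]
        System.out.println(parse("Poker; Blackjack; LasVegas, NewYork"));
    }
}
```

The outer scanner never changes its delimiter, so there is no leftover "; " to clean up; whether this nesting is worth it over a single combined delimiter depends on whether the two delimiters really mean different things in your file.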


[1] Obviously, we cannot infer the syntax of the file that you are trying to parse from a single line of input. Or, in general, from a single example input file.

Using a regular expression to match both types of punctuation, including any trailing whitespace, should do the trick.

sc.useDelimiter("[;,]\\s*");
                     ^^^^ Followed by 0 or more whitespace chars
                 ^^^^ Either of these
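
To try this out, here is a small self-contained sketch that applies the combined delimiter to the sample line from the question. The class and method names (CombinedDelimiterDemo, tokens) are invented for the example, and the input is assumed to be a single line with no trailing line break.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class CombinedDelimiterDemo {
    // Collects every token produced when ';' or ',' (plus any
    // following whitespace) is treated as the delimiter.
    static List<String> tokens(String input) {
        List<String> out = new ArrayList<>();
        try (Scanner sc = new Scanner(input)) {
            sc.useDelimiter("[;,]\\s*");
            while (sc.hasNext()) {
                out.add(sc.next());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [Poker, Blackjack, LasVegas, NewYork]
        System.out.println(tokens("Poker; Blackjack; LasVegas, NewYork"));
    }
}
```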

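With this delimiter, the last token on a line (NewYork in this case) is not separated from whatever follows it when there is no semicolon or comma after it: the scanner still returns it at the very end of the input, but a line break stays attached, so NewYork and the first word of the next line come back as one token. If these 4-tuples of games & cities come in this format (where no delimiter comes after the last token), then you can additionally match a newline character: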

sc.useDelimiter("\\n|[;,]\\s*");
                     ^^^^^^^^ semi/comma delimiters
                    ^ OR
                 ^^^ New-line character
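
To see what the extra newline alternative changes, the sketch below runs both delimiters over a two-line input. The helper name (tokens), the class name, and the two-line input itself (the question's sample line repeated, separated by "\n") are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class LineAwareDelimiterDemo {
    // Collects all tokens of the input for a given delimiter regex.
    static List<String> tokens(String input, String delimiterRegex) {
        List<String> out = new ArrayList<>();
        try (Scanner sc = new Scanner(input)) {
            sc.useDelimiter(delimiterRegex);
            while (sc.hasNext()) {
                out.add(sc.next());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Assumed input: the sample line twice, with a Unix line break between.
        String twoLines = "Poker; Blackjack; LasVegas, NewYork\n"
                        + "Poker; Blackjack; LasVegas, NewYork";

        // Without the newline alternative, "NewYork\nPoker" is a single token.
        System.out.println(tokens(twoLines, "[;,]\\s*").size());      // prints 7

        // With it, the line break acts as a delimiter as well.
        System.out.println(tokens(twoLines, "\\n|[;,]\\s*").size());  // prints 8
    }
}
```

Note that "\\n" only covers Unix-style line endings; with Windows "\r\n" endings a stray "\r" would remain at the end of the last token on each line. Since Java 8 the regex "\\R" matches any line break, so "\\R|[;,]\\s*" may be a safer choice if the line endings are not known in advance.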
