
Want to get separate tokens from a line scanned in from a file. Java

I'm doing a date program that reads in dates and displays date ranges and invalid dates. To start I am reading in a line at a time and formatting the line the way I wish. However, I also want to take the month, day, and year from the String line I read in into separate variables I can work with. The data I read in looks like:

    June 17, 1997
    July 23, 1997
    September 28, 1980
    September 31, 1980
    Mar. 2, 1980
    Apr. 2, 1980
    May 3, 1980
    Nov 25, 1989
    Dec 25, 1989
    Jan 3, 1973

A snippet of my code so far looks like

    Scanner in = null;
    try {
        in = new Scanner(new File("dates.txt"));
    } catch (FileNotFoundException exception) {
        System.err.println("failed to open dates.txt");
        System.exit(1);
    }
    while (in.hasNextLine()) {
        String line = in.nextLine();
        line = line.replace(".", "");
        line = line.replace(",", "");
    }

So my question is: how can I "scan" my line variable and separate it into different tokens/variables for month, day, and year? Or could I scan for string tokens initially, instead of scanning the whole line, reform them to what I want (getting rid of commas and periods), and then parse them into ints? And if that's possible, what's the operation to parse a String into an int?

You can use something like this:

    import java.io.File;
    import java.io.FileNotFoundException;
    import java.util.Scanner;
    import java.util.StringTokenizer;

    public class ParseFileName {
        public static void main(String[] args) {
            Scanner in = null;
            try {
                in = new Scanner(new File("dates.txt"));
            } catch (FileNotFoundException exception) {
                System.err.println("failed to open dates.txt");
                System.exit(1);
            }
            while (in.hasNextLine()) {
                String line = in.nextLine();
                // Strip punctuation so only whitespace separates the fields
                line = line.replace(".", "");
                line = line.replace(",", "");

                // Split on whitespace: month, day, year
                StringTokenizer st = new StringTokenizer(line);
                String strMonth = st.nextToken();
                String strDay = st.nextToken();
                String strYear = st.nextToken();

                // Integer.parseInt converts a numeric String to an int
                int day = Integer.parseInt(strDay);
                int year = Integer.parseInt(strYear);
                //...
            }
            in.close();
        }
    }

This assumes every line has exactly three tokens. If that is not guaranteed, validate before each nextToken() call, for example with the hasMoreTokens() method, because nextToken() throws NoSuchElementException when no tokens remain. You can also write a mapping method if you need integers for the months.
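A minimal sketch of that validation and month mapping might look like the following. The class and method names (DateLineParser, monthToInt, parseLine) are made up for illustration, and matching on the first three letters of the month is an assumption that fits the sample data, not a general rule. It uses countTokens() to check the whole line at once rather than calling hasMoreTokens() three times.

```java
import java.util.Locale;
import java.util.StringTokenizer;

class DateLineParser {
    // Three-letter prefixes in calendar order; index + 1 is the month number.
    private static final String[] MONTH_PREFIXES = {
        "jan", "feb", "mar", "apr", "may", "jun",
        "jul", "aug", "sep", "oct", "nov", "dec"
    };

    /** Returns 1-12 for a month name or abbreviation, or -1 if unrecognized. */
    static int monthToInt(String month) {
        String m = month.toLowerCase(Locale.ROOT);
        for (int i = 0; i < MONTH_PREFIXES.length; i++) {
            // Loose match: anything starting with "mar" counts as March.
            if (m.startsWith(MONTH_PREFIXES[i])) {
                return i + 1;
            }
        }
        return -1;
    }

    /** Returns {month, day, year}, or null if the line is not three usable tokens. */
    static int[] parseLine(String line) {
        String cleaned = line.replace(".", "").replace(",", "");
        StringTokenizer st = new StringTokenizer(cleaned);
        if (st.countTokens() != 3) {
            return null; // too few or too many fields
        }
        int month = monthToInt(st.nextToken());
        if (month == -1) {
            return null; // unknown month name
        }
        try {
            int day = Integer.parseInt(st.nextToken());
            int year = Integer.parseInt(st.nextToken());
            return new int[] {month, day, year};
        } catch (NumberFormatException e) {
            return null; // day or year was not a number
        }
    }

    public static void main(String[] args) {
        int[] parts = parseLine("Mar. 2, 1980");
        System.out.println(parts[0] + "/" + parts[1] + "/" + parts[2]); // prints 3/2/1980
    }
}
```

Note that this only splits and converts the fields; it does not check whether the day is valid for the month, so "September 31, 1980" still comes back as {9, 31, 1980}.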

There is more than one way to skin a cat:

You could use a regexp

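    // inputString is one line of the file, e.g. "Mar. 2, 1980"
    String[] strings = new String[3];

    // "\\.?" also accepts abbreviated months with a trailing period
    Pattern p = Pattern.compile("(\\w+)\\.? (\\d+), (\\d+)");
    Matcher m = p.matcher(inputString);
    if (m.matches()) {
        for (int i = 0; i < 3; i++) {
            strings[i] = m.group(i + 1); // groups are 1-based: month, day, year
        }
    }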

A handy tool for testing patterns like this is an online Java regex checker.
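For reference, here is a self-contained sketch of the regex approach that can be run against the sample data. The class name RegexDateSplitter and the exact pattern are illustrative assumptions: this variant tolerates an optional period after the month, an optional comma after the day, and surrounding whitespace, which goes slightly beyond what the sample lines strictly require.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexDateSplitter {
    // Month word, optional period, 1-2 digit day, optional comma, 4-digit year.
    private static final Pattern DATE_LINE =
        Pattern.compile("\\s*([A-Za-z]+)\\.?\\s+(\\d{1,2}),?\\s+(\\d{4})\\s*");

    /** Returns {month, day, year} as strings, or null if the line does not match. */
    static String[] split(String line) {
        Matcher m = DATE_LINE.matcher(line);
        if (!m.matches()) {
            return null;
        }
        return new String[] {m.group(1), m.group(2), m.group(3)};
    }

    public static void main(String[] args) {
        String[] parts = split("Mar. 2, 1980");
        // The captured groups are digits only, so parseInt cannot fail here.
        int day = Integer.parseInt(parts[1]);
        int year = Integer.parseInt(parts[2]);
        System.out.println(parts[0] + " " + day + " " + year); // prints Mar 2 1980
    }
}
```

Compiling the pattern once as a constant avoids recompiling it for every line of the file.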

You could take a different approach: instead of reading whole lines and then splitting them into fields, you could read the fields one by one using the appropriate Scanner methods:

public String next()

Finds and returns the next complete token from this scanner. A complete token is preceded and followed by input that matches the delimiter pattern. This method may block while waiting for input to scan, even if a previous invocation of hasNext() returned true.

public int nextInt(int radix)

Scans the next token of the input as an int. This method will throw InputMismatchException if the next token cannot be translated into a valid int value as described below. If the translation is successful, the scanner advances past the input that matched. If the next token matches the Integer regular expression defined above then the token is converted into an int value as if by removing all locale specific prefixes, group separators, and locale specific suffixes, then mapping non-ASCII digits into ASCII digits via Character.digit, prepending a negative sign (-) if the locale specific negative prefixes and suffixes were present, and passing the resulting string to Integer.parseInt with the specified radix.
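As a rough sketch of this token-by-token approach, the Scanner can be told to treat periods, commas, and whitespace as delimiters, so no cleanup of the line is needed. The class name TokenScanDemo and the helper readDates are invented for this example, and it reads from a String instead of dates.txt so it runs on its own. It uses the no-argument nextInt(), which parses in the scanner's default radix.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class TokenScanDemo {
    /** Reads month/day/year triples; stops at the first malformed triple. */
    static List<String> readDates(String text) {
        List<String> dates = new ArrayList<>();
        // Any run of periods, commas, and whitespace counts as one delimiter.
        try (Scanner in = new Scanner(text).useDelimiter("[.,\\s]+")) {
            while (in.hasNext()) {
                String month = in.next();
                if (!in.hasNextInt()) {
                    break; // day is missing or not a number
                }
                int day = in.nextInt();
                if (!in.hasNextInt()) {
                    break; // year is missing or not a number
                }
                int year = in.nextInt();
                dates.add(month + " " + day + " " + year);
            }
        }
        return dates;
    }

    public static void main(String[] args) {
        String sample = "June 17, 1997\nMar. 2, 1980\nNov 25, 1989\n";
        for (String date : readDates(sample)) {
            System.out.println(date);
        }
    }
}
```

The trade-off is that line boundaries are lost: a line with a missing field shifts every token after it, which is why the sketch stops at the first triple that does not fit.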

Or you could use SimpleDateFormat to handle the different formats you have:

    // Full month name, e.g. "September 28, 1980"
    SimpleDateFormat format1 = new SimpleDateFormat("MMMM d, yyyy");
    // Abbreviated month with a period, e.g. "Mar. 2, 1980"
    SimpleDateFormat format2 = new SimpleDateFormat("MMM. d, yyyy");
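One way to put these formats to work is to try each pattern in turn with lenient parsing switched off, so that an impossible date such as "September 31, 1980" is rejected instead of silently rolling over into October. The sketch below is only an illustration: the class name StrictDateParser, the helper names, and the third pattern (for abbreviations without a period, like "Nov 25, 1989") are assumptions based on the sample data, and Locale.ENGLISH is assumed for the month names.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.Locale;

class StrictDateParser {
    // Tried in order: full name, abbreviation with period, plain abbreviation.
    private static final String[] PATTERNS = {
        "MMMM d, yyyy", "MMM. d, yyyy", "MMM d, yyyy"
    };

    /** Returns the parsed date, or null if no pattern yields a valid date. */
    static Date parseOrNull(String text) {
        for (String pattern : PATTERNS) {
            SimpleDateFormat format = new SimpleDateFormat(pattern, Locale.ENGLISH);
            format.setLenient(false); // reject out-of-range days such as Sept. 31
            try {
                return format.parse(text.trim());
            } catch (ParseException e) {
                // Fall through and try the next pattern.
            }
        }
        return null;
    }

    /** Splits a date into {month (1-12), day, year}. */
    static int[] fields(Date date) {
        Calendar cal = Calendar.getInstance(Locale.ENGLISH);
        cal.setTime(date);
        return new int[] {
            cal.get(Calendar.MONTH) + 1, // Calendar months are 0-based
            cal.get(Calendar.DAY_OF_MONTH),
            cal.get(Calendar.YEAR)
        };
    }

    public static void main(String[] args) {
        String[] samples = {"June 17, 1997", "September 31, 1980", "Mar. 2, 1980"};
        for (String s : samples) {
            Date d = parseOrNull(s);
            if (d == null) {
                System.out.println(s + " -> invalid");
            } else {
                int[] f = fields(d);
                System.out.println(s + " -> " + f[0] + "/" + f[1] + "/" + f[2]);
            }
        }
    }
}
```

This gives both things the question asks for: separate month, day, and year values, and a way to flag invalid dates.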


 