
StringTokenizer use

I am trying to use StringTokenizer with a comma as the delimiter on lines read from a file. Each line consists of a zip code and a city (e.g. 01221, Washington, DC), but I don't want the second comma to act as a delimiter, since it is part of the city name. I have two methods, one that reads in the zip codes and one that reads in the city names, but I'm not sure how to use the tokenizer correctly to separate the zip code from the city. I want each method to return one array (one of zip codes, one of cities), so I don't want to combine them into a single method. I am also not sure whether the tokenizer belongs in my main method.

public static String[] getZipCodes (File zips, Scanner hi, int d)
{
    //creating array of zip codes with the length of the number of lines
    String[] zipCodes=new String[d];
    //Loops through each zip code to fill the array
    for (int i=0; i<d; i++)
    {
        zipCodes[i]=hi.next();
        System.out.println(zipCodes[i]);
    }
    return zipCodes;
}

public static String[] getCities (File zips, Scanner hi, int d)
{
    //creating array of cities
    String[] cities=new String[d];
    //fills array with city names, parallel to its zip code
    for (int i=0; i<d; i++)
    {
        hi.next();
        cities[i]=hi.next();
        hi.nextLine();
    }
    return cities;
}

Here is the main method:

public static void main(String[] args) throws IOException {
    File zips=new File("ZipCodesCity.txt");
    //Scanner to count number of lines in file
    Scanner in=new Scanner(zips);

    //Counting number of zip codes
    int codenum=0;
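    while (in.hasNextLine())
    {
        //consume the line; without this call hasNextLine() stays true and the loop never ends
        in.nextLine();
        codenum++;
    }
    in.close();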

    //making a second scanner to read in the zip codes
    Scanner hi= new Scanner(zips);

    //initializing array of zip codes
    String[]zipCodes=getZipCodes(zips, hi, codenum);

    //Tokenizer
    StringTokenizer wrd=new StringTokenizer(hi.nextLine(), ",");
}

One workaround is to split on a regular expression instead of on a bare comma. You only want to split on the first comma, since the second comma is part of the city name. Conveniently, the first comma always comes directly after a digit, because a zip code is always numeric, and the second comma never does, as long as no city name contains digits. So your delimiter is any comma that is preceded by a digit. Note that StringTokenizer itself treats its delimiter argument as a set of single characters, not as a pattern, so the regular expression has to be applied through String.split or Scanner.useDelimiter rather than passed to the tokenizer. The article linked from the original answer shows an example of regex-based tokenizing.
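As a rough illustration of the "comma preceded by a digit" idea, here is a minimal sketch that uses String.split with a lookbehind. The class name ZipLineSplit and the helper splitZipAndCity are made up for this example and are not part of the question's code; it also assumes every line has the form "zip, city".

```java
class ZipLineSplit {
    // Hypothetical helper: splits a line such as "01221, Washington, DC"
    // into a two-element array {zip, city}.
    static String[] splitZipAndCity(String line) {
        // (?<=\d) is a lookbehind, so the comma only matches when a digit precedes it.
        // \s* also swallows the blanks that follow that comma.
        // The limit of 2 guarantees at most one split, even for unexpected input.
        return line.trim().split("(?<=\\d),\\s*", 2);
    }

    public static void main(String[] args) {
        String[] parts = splitZipAndCity("01221, Washington, DC");
        System.out.println(parts[0]); // prints 01221
        System.out.println(parts[1]); // prints Washington, DC
    }
}
```

In getZipCodes and getCities you could then read each line with hi.nextLine() and keep element 0 or element 1 respectively, so the two arrays stay parallel.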

EDIT: More simply, you could take the substring consisting of the first 5 characters, since every zip code has length 5; the rest of the string, after the comma, is the city and state. This approach is much simpler.
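The substring idea might look roughly like the sketch below. The class ZipPrefixSplit, the constant ZIP_LENGTH and the helpers zipOf and cityOf are hypothetical names chosen for illustration, and the code assumes every line really does start with a 5-character zip code followed by a comma.

```java
class ZipPrefixSplit {
    // Assumption: every line starts with a zip code of exactly 5 characters.
    static final int ZIP_LENGTH = 5;

    // Returns the first 5 characters of the line, i.e. the zip code.
    static String zipOf(String line) {
        return line.substring(0, ZIP_LENGTH);
    }

    // Returns everything after the zip code, minus the separating comma and blanks.
    static String cityOf(String line) {
        String rest = line.substring(ZIP_LENGTH).trim();
        if (rest.startsWith(",")) {
            rest = rest.substring(1); // drop the comma between zip and city
        }
        return rest.trim();
    }

    public static void main(String[] args) {
        String line = "01221, Washington, DC";
        System.out.println(zipOf(line));  // prints 01221
        System.out.println(cityOf(line)); // prints Washington, DC
    }
}
```

This avoids regular expressions entirely, at the cost of breaking if a line ever contains a zip code of a different length (for example a ZIP+4 code).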
