How do I scan a text file for certain characters in Java?

I have to take this text file:

Ulric Schwartz ullamcorper@Quisque.ca Fringilla Donec PC urna convallis erat
Jesse Conrad Nunc@eunulla.edu Magna Praesent Interdum Incorporated et netus et
Ethan Eaton cursus@Nullam.co.uk Sed Consequat Auctor Institute posuere vulputate lacus
Griffin Stephenson habitant@mattis.com Purus Sapien Institute auctor non feugiat
Alan Howell lorem@penatibusetmagnis.com Mi Limited non sollicitudin a
Sawyer Stokes ornare@utmiDuis.com Ut Institute nibh Phasellus nulla
Nigel Sanford adipiscing@euerat.org Lacus Varius Corp Integer vitae nibh

and scan it for the email addresses, meaning an @ followed by at least three characters, a period, and at least two more characters. I understand how to scan the file:

while(fscan.hasNext())
{
    //scan for emails goes in here
}

but I'm not sure how to scan for the email. This is what I have:

import java.io.*;
import java.util.Scanner;

public class lab11_emena {

    public static void main(String[] args)
    {
        Scanner cscan = new Scanner(System.in);
        System.out.println("Please enter the file name.");
        String filename = " ";
        filename = cscan.nextLine();

        File inFile = new File(filename);

        if(!inFile.exists())
        {
            System.out.println("File " + filename + " does not exist.");
            System.exit(0);
        }

        Scanner fscan = new Scanner(inFile); // I am getting an error here? Saying inFile was thrown

        System.out.println("Opened file " + filename);
    }
}

You must use a Scanner to read the characters, then check each requirement in turn, starting with the @ character. If the current character is an @ (in Java, c == '@'), keep checking the remaining requirements. From the @, scan forwards and backwards until you reach the whitespace at either end of the email address; the characters between those two positions are the address.
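The forwards-and-backwards scan described above could be sketched roughly as follows. The class and method names (AtSignScan, findEmails, looksLikeEmail) are hypothetical, and the check that at least one character precedes the @ is an added assumption that the question does not state.

```java
import java.util.ArrayList;
import java.util.List;

class AtSignScan {
    // Collects every whitespace-delimited chunk that contains an '@' and
    // satisfies the rule from the question.
    static List<String> findEmails(String line) {
        List<String> found = new ArrayList<>();
        int at = line.indexOf('@');
        while (at >= 0) {
            // Walk backwards to the whitespace (or line start) before the address.
            int start = at;
            while (start > 0 && !Character.isWhitespace(line.charAt(start - 1))) {
                start--;
            }
            // Walk forwards to the whitespace (or line end) after the address.
            int end = at + 1;
            while (end < line.length() && !Character.isWhitespace(line.charAt(end))) {
                end++;
            }
            String candidate = line.substring(start, end);
            if (looksLikeEmail(candidate)) {
                found.add(candidate);
            }
            // Continue with the next '@' after this chunk, if any.
            at = line.indexOf('@', end);
        }
        return found;
    }

    // '@', then at least 3 characters, then a period, then at least 2 more.
    // Requiring at > 0 (something before the '@') is an assumption.
    static boolean looksLikeEmail(String s) {
        int at = s.indexOf('@');
        int dot = s.lastIndexOf('.');
        return at > 0 && dot - at - 1 >= 3 && s.length() - dot - 1 >= 2;
    }

    public static void main(String[] args) {
        String line = "Ulric Schwartz ullamcorper@Quisque.ca Fringilla Donec PC urna convallis erat";
        System.out.println(findEmails(line)); // prints [ullamcorper@Quisque.ca]
    }
}
```

To use this with the loop from the question, each line read with fscan.nextLine() could be passed to findEmails.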

I would first recommend using a delimiter between the different pieces of information (e.g. a comma).

Example: Ulric Schwartz, ullamcorper@Quisque.ca, Fringilla Donec, PC urna convallis erat

Now, if all your lines have the same number of "categories" of information (the pieces between the commas; the example above has 4), you can load each item into an array and then pull out items #2, 6, 10, etc.
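A minimal sketch of the fixed-column idea, assuming the file has been rewritten with commas as in the example and that the address is always the second field. The names CommaFields and emailColumn are hypothetical. It splits line by line instead of indexing one flat array, which gives the same items as #2, 6, 10 when every line has 4 fields.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class CommaFields {
    // Assumes every line carries the e-mail address as its second
    // comma-separated field; lines with fewer fields are skipped.
    static List<String> emailColumn(List<String> lines) {
        List<String> emails = new ArrayList<>();
        for (String line : lines) {
            String[] fields = line.split(",");
            if (fields.length > 1) {
                emails.add(fields[1].trim()); // drop the space after the comma
            }
        }
        return emails;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "Ulric Schwartz, ullamcorper@Quisque.ca, Fringilla Donec, PC urna convallis erat");
        System.out.println(emailColumn(lines));
    }
}
```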

If the number of categories varies, then you would have to do as D3sast3r stated: find the @, then scan forwards and backwards to the spaces.

Try something like this.

Scan the entire file into an ArrayList. A Scanner uses whitespace as its delimiter by default, and since a valid email address contains no whitespace, each address will arrive as a single token.

List<String> tokens = new ArrayList<>();
while(inputFile.hasNext()) {
    tokens.add(inputFile.next());
}

This puts every whitespace-separated token into an element of the list. So element 0 = Ulric, element 1 = Schwartz, etc. Now you can use a regex, as gtgaxiola suggested, to test each element of the list.

String email = "\\w+@\\w{3,}\\.\\w{2,}";

This is a regex string based on your requirements: "stuff", then an @ symbol, then at least 3 characters, then a period, then at least 2 more characters.

Now search the list with a for loop and an if statement:

Pattern emailPattern = Pattern.compile(email); // needs import java.util.regex.Pattern
for(int i = 0; i < tokens.size(); i++) {
    // find() looks for the regex anywhere in the token;
    // String.contains() would only look for the literal text of the regex
    if(emailPattern.matcher(tokens.get(i)).find()) {
        //do something with the email address
    }
}
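Put together, the token-and-regex approach might look like the sketch below. The names EmailTokens and findEmails are hypothetical, and taking the file name from the command line instead of prompting for it is a simplification. The sketch also handles the checked FileNotFoundException that the Scanner(File) constructor declares, which is probably what the compile error in the question is about.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
import java.util.regex.Pattern;

class EmailTokens {
    // Same regex as in the answer: word characters, '@', 3+ word characters,
    // a period, 2+ word characters.
    static final Pattern EMAIL = Pattern.compile("\\w+@\\w{3,}\\.\\w{2,}");

    // Reads whitespace-separated tokens and keeps those containing a match.
    static List<String> findEmails(Scanner in) {
        List<String> emails = new ArrayList<>();
        while (in.hasNext()) {
            String token = in.next();
            if (EMAIL.matcher(token).find()) {
                emails.add(token);
            }
        }
        return emails;
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("usage: java EmailTokens <file>");
            return;
        }
        // new Scanner(File) can throw FileNotFoundException, a checked
        // exception, so it has to be caught here or declared on main.
        try (Scanner fscan = new Scanner(new File(args[0]))) {
            for (String email : findEmails(fscan)) {
                System.out.println(email);
            }
        } catch (FileNotFoundException e) {
            System.err.println("File " + args[0] + " does not exist.");
        }
    }
}
```

The sketch uses find() and not String.matches() on purpose: matches() requires the whole token to fit the regex, and \w does not match a period, so an address such as cursus@Nullam.co.uk would be rejected.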
