
Using Multiple Java Regular Expressions

I am trying to extract an email address and replace it with a space using a pattern (EMAIL_PATTERN). When I run the following code with a full document passed in, no output is produced. The pattern only matches the entire region, so if I pass in only the email address, it is matched and replaced with a space. However, the purpose of the method is to find the email address within the text, so that no manual extraction is needed beforehand. After the email address in tempString has been replaced, I want to use the result for the next pattern. Should I combine the patterns I want to use in one method, or should they be placed in separate methods? Below is the code I have so far. I also have other patterns, but since my method is not working correctly I have not posted them yet.

private static final String EMAIL_PATTERN = "[_A-Za-z0-9-]+(\\.[_A-Za-z0-9-]+)*@[A-Za-z0-9]+(\\.[A-Za-z0-9]+)*(\\.[A-Za-z]{2,})";
public static void main(String[] args) {
    // Document takes an ID, student information (which includes email, address, phone, name), school, and text
    Document r = new Document("", "FirstName LastName, Address, example@email.com,    phoneNumber", "School", "experience", "text");
    personalEmailZone(r);
}
public static Document personalEmailZone(Document doc){
    //tempString is the personal information section of a resume
    String tempPI = doc.tempString();
    if(doc.tempString().matches(EMAIL_PATTERN) == true){
        //Pattern pattern = Pattern.compile("");
        Pattern pattern = Pattern.compile(EMAIL_PATTERN);
        Matcher matcher = pattern.matcher(tempPI);
        String emailTemp = "";
        if(matcher.find()){
            emailTemp = matcher.group();
            System.out.println(emailTemp);
            //PI.replace(emailTemp, "");
            System.out.println(emailTemp.replace(emailTemp, ""));
            tempPI = tempPI.replace(emailTemp, "");
            System.out.println(tempPI);
        }
    }
    return doc;
}

You can place your patterns in separate methods, each of which returns the modified string for the next pattern to use. For example:

String tempPI = doc.tempString();
tempPI = applyPattern1(tempPI);
tempPI = applyPattern2(tempPI);
tempPI = applyPattern3(tempPI);
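To make the chaining concrete, here is a minimal sketch of what two such steps might look like. The method names removeEmails and collapseSpaces and the second pattern (runs of spaces) are hypothetical, chosen only for illustration; the email pattern is the one from the question.

```java
import java.util.regex.Pattern;

class PatternChain {
    // Same email pattern as in the question, compiled once.
    private static final Pattern EMAIL = Pattern.compile(
        "[_A-Za-z0-9-]+(\\.[_A-Za-z0-9-]+)*@[A-Za-z0-9]+(\\.[A-Za-z0-9]+)*(\\.[A-Za-z]{2,})");

    // Hypothetical second pattern (an assumption): two or more consecutive spaces.
    private static final Pattern EXTRA_SPACES = Pattern.compile(" {2,}");

    // Each step takes the current text and returns the modified text.
    static String removeEmails(String text) {
        return EMAIL.matcher(text).replaceAll("");
    }

    static String collapseSpaces(String text) {
        return EXTRA_SPACES.matcher(text).replaceAll(" ");
    }

    public static void main(String[] args) {
        String tempPI = "FirstName LastName, Address, example@email.com,    phoneNumber";
        tempPI = removeEmails(tempPI);   // first pattern
        tempPI = collapseSpaces(tempPI); // second pattern works on the first one's result
        System.out.println(tempPI);
    }
}
```

Keeping one pattern per method means each step can be tested on its own, and the order of the steps stays visible at the call site.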

Your code doesn't show any output because of the doc.tempString().matches(EMAIL_PATTERN) == true check. It is probably not needed there, since it requires the entire string to be an email address.
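The difference between the two checks can be seen in a small sketch. The helper names isOnlyEmail and containsEmail are made up for this example; String#matches and Matcher#find are the standard library calls being compared.

```java
import java.util.regex.Pattern;

class MatchesVsFind {
    // Same email pattern as in the question.
    static final String EMAIL_PATTERN =
        "[_A-Za-z0-9-]+(\\.[_A-Za-z0-9-]+)*@[A-Za-z0-9]+(\\.[A-Za-z0-9]+)*(\\.[A-Za-z]{2,})";

    // String#matches succeeds only when the WHOLE input matches the pattern.
    static boolean isOnlyEmail(String input) {
        return input.matches(EMAIL_PATTERN);
    }

    // Matcher#find succeeds when the pattern occurs ANYWHERE in the input.
    static boolean containsEmail(String input) {
        return Pattern.compile(EMAIL_PATTERN).matcher(input).find();
    }

    public static void main(String[] args) {
        String line = "FirstName LastName, Address, example@email.com, phoneNumber";
        System.out.println(isOnlyEmail(line));                // false
        System.out.println(containsEmail(line));              // true
        System.out.println(isOnlyEmail("example@email.com")); // true
    }
}
```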

You have several problems:

public static Document personalEmailZone(Document doc){
    //tempString is the personal information section of a resume
    String tempPI = doc.tempString();
    if(doc.tempString().matches(EMAIL_PATTERN) == true){

The above statement attempts to match the entire document against the email address pattern. This will not match unless doc.tempString() contains ONLY a single email address and nothing else.

        //Pattern pattern = Pattern.compile("");
        Pattern pattern = Pattern.compile(EMAIL_PATTERN);
        Matcher matcher = pattern.matcher(tempPI);
        String emailTemp = "";
        if(matcher.find()){
            emailTemp = matcher.group();
            System.out.println(emailTemp);
            //PI.replace(emailTemp, "");
            System.out.println(emailTemp.replace(emailTemp, ""));

Not sure what the above is for. If your code ever reached this point, it would always print an empty line.

            tempPI = tempPI.replace(emailTemp, "");
            System.out.println(tempPI);
        }

Since there's no loop, you will have removed only the first email address that was found (String#replace removes every literal occurrence of that one address, but any other addresses are left in place). If you're expecting to replace ALL email addresses, you need to loop over the input.

    }
    return doc;

At this point you haven't actually modified doc, so you're returning the document in its original form, with email addresses included.

}
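Putting the points above together, one possible shape for a corrected version is sketched below. It drops the matches() guard, loops with find(), and returns the modified text so the caller can store it back on the document. The class and method names are hypothetical, and writing the result back is left out because the Document class is not shown in the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EmailZone {
    // Same email pattern as in the question, compiled once.
    static final Pattern EMAIL = Pattern.compile(
        "[_A-Za-z0-9-]+(\\.[_A-Za-z0-9-]+)*@[A-Za-z0-9]+(\\.[A-Za-z0-9]+)*(\\.[A-Za-z]{2,})");

    // Collects every address found in the text, in order of appearance.
    static List<String> findEmails(String text) {
        List<String> emails = new ArrayList<>();
        Matcher matcher = EMAIL.matcher(text);
        while (matcher.find()) {
            emails.add(matcher.group());
        }
        return emails;
    }

    // Returns a copy of the text with every address replaced by a single space.
    static String stripEmails(String text) {
        StringBuilder out = new StringBuilder();
        Matcher matcher = EMAIL.matcher(text);
        int last = 0;
        while (matcher.find()) {
            // Copy the text before this match, then a space in place of the address.
            out.append(text, last, matcher.start()).append(' ');
            last = matcher.end();
        }
        return out.append(text, last, text.length()).toString();
    }

    public static void main(String[] args) {
        String tempPI = "FirstName LastName, Address, example@email.com,    phoneNumber";
        System.out.println(findEmails(tempPI));
        System.out.println(stripEmails(tempPI));
    }
}
```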

Look at the Javadoc for String#replaceAll(String regex, String replacement).
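As a rough illustration of that suggestion, the whole find-and-replace loop can be reduced to a single call. The wrapper method removeEmails is a hypothetical name; the pattern is the one from the question, and the replacement is a single space as the question asks.

```java
class ReplaceAllDemo {
    // Same email pattern as in the question.
    static final String EMAIL_PATTERN =
        "[_A-Za-z0-9-]+(\\.[_A-Za-z0-9-]+)*@[A-Za-z0-9]+(\\.[A-Za-z0-9]+)*(\\.[A-Za-z]{2,})";

    static String removeEmails(String text) {
        // replaceAll treats its first argument as a regex and replaces every match,
        // so no explicit loop and no matches() guard are needed.
        return text.replaceAll(EMAIL_PATTERN, " ");
    }

    public static void main(String[] args) {
        String tempPI = "Contact: one@example.com or two@example.org today";
        System.out.println(removeEmails(tempPI));
    }
}
```

If the same pattern is applied to many strings, compiling it once with Pattern.compile and calling matcher(text).replaceAll(" ") avoids recompiling the regex on every call.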
