
String Tokenizer/Regex to find email address/IP Address in a file

I have a document whose lines contain email addresses and IP addresses. I need to split the document so that each IP address, email address, or word in the file is stored as an element of an array.

Is there a way to use a regex or String Tokenizer to find the email and IP addresses? I know how a regex or String Tokenizer can be used to separate the words in a sentence line by line. I'm just not sure how to find email and IP addresses, because the file may contain illegal characters like @ \\ // which should not be included in the array.

For example, my document contains:

You can contact test@test.com, the address is 192.168.1.1.

My array should contain:

You

can

contact

test@test.com

the

address

is

192.168.1.1

Here is a regexr with some examples and a regex that should work for you.

The regex is (the email portion is copied from here; I'm also not positive it copied and pasted correctly):

(([^<>()\[\]\.,;:\s@\"]+(\.[^<>()\[\]\.,;:\s@\"]+)*)|(\".+\"))@(([^<>()\[\]\.,;:\s@\"]+\.)+[^<>()\[\]\.,;:\s@\"]{2,})|\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}

The regex for an email address is:

[\w!#$%&'*+/=?^_`{|}~-]+(?:\.[\w!#$%&'*+/=?^_`{|}~-]+)*@(?:[\w](?:[\w-]*[\w])?\.)+[\w](?:[\w-]*[\w])?
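As a rough sketch of how this email pattern could be used from Java: the snippet below compiles it and collects every match. The class name `EmailFinder` and the method `findEmails` are made up for illustration. The one thing to watch is that every backslash has to be doubled inside a Java string literal.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class EmailFinder {
    // The email pattern quoted above, with each backslash doubled for a Java string literal.
    private static final Pattern EMAIL = Pattern.compile(
            "[\\w!#$%&'*+/=?^_`{|}~-]+(?:\\.[\\w!#$%&'*+/=?^_`{|}~-]+)*"
            + "@(?:[\\w](?:[\\w-]*[\\w])?\\.)+[\\w](?:[\\w-]*[\\w])?");

    // Hypothetical helper: returns every substring of text that the pattern matches.
    static List<String> findEmails(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = EMAIL.matcher(text);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(findEmails(
                "You can contact test@test.com, the address is 192.168.1.1."));
    }
}
```

On the sample line from the question this should pick up `test@test.com` without the trailing comma, because a comma is not in any of the character classes.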

And the regex for an IP address is:

((?:(?:25[0-5]|2[0-4]\d|((1\d{2})|([1-9]?\d)))\.){3}(?:25[0-5]|2[0-4]\d|((1\d{2})|([1-9]?\d))))
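One caveat when this IP pattern is used with `find()` rather than `matches()`: it has no boundaries, so it can match inside a longer run of digits (for example `99.168.1.1` inside `999.168.1.1`). The sketch below keeps the same 0-255 octet alternation and adds digit lookarounds to guard against that. The lookarounds, the class name `IpFinder`, and the method `findIps` are my additions, not part of the pattern above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class IpFinder {
    // One octet in the range 0-255, the same alternation as in the pattern above.
    private static final String OCTET = "(?:25[0-5]|2[0-4]\\d|1\\d{2}|[1-9]?\\d)";

    // Assumption: the (?<!\d) and (?!\d) lookarounds are an addition so that
    // a match cannot start or end in the middle of a longer number.
    private static final Pattern IP = Pattern.compile(
            "(?<!\\d)(?:" + OCTET + "\\.){3}" + OCTET + "(?!\\d)");

    // Hypothetical helper: returns every IPv4-looking substring of text.
    static List<String> findIps(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = IP.matcher(text);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(findIps("the address is 192.168.1.1."));
        System.out.println(findIps("999.168.1.1 and 192.168.1.256"));
    }
}
```

The lookarounds only test for digits, not dots, so an address followed by a sentence-ending period (as in the question's example) is still found.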

In my opinion, you can use java.util.regex.Matcher and call matcher.group(0), like this:

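 import java.util.ArrayList;
 import java.util.List;
 import java.util.regex.Matcher;
 import java.util.regex.Pattern;

 Pattern p = Pattern.compile("<your regex here>");
 Matcher m = p.matcher(str);      // str holds the text to scan
 List<String> strs = new ArrayList<>();
 while (m.find())                 // advance to each successive match
     strs.add(m.group(0));        // group(0) is the entire matched text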

These may work fine, but I haven't tested them yet.
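Putting the pieces together, one possible way to get the array from the question is to match "email OR IP OR word" in a single pattern and let everything else (stray `@`, `\\`, `//`, punctuation) fall through unmatched. This is only a sketch: the class name `AddressTokenizer`, the method `tokenize`, and the definition of a word as a run of ASCII letters and digits are my assumptions. The order of the alternatives matters, since the email and IP branches have to be tried before the plain-word branch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class AddressTokenizer {
    // Email pattern from the answer above (backslashes doubled for Java).
    private static final String EMAIL =
            "[\\w!#$%&'*+/=?^_`{|}~-]+(?:\\.[\\w!#$%&'*+/=?^_`{|}~-]+)*"
            + "@(?:[\\w](?:[\\w-]*[\\w])?\\.)+[\\w](?:[\\w-]*[\\w])?";

    // IPv4 pattern built from the same 0-255 octet alternation as above.
    private static final String OCTET = "(?:25[0-5]|2[0-4]\\d|1\\d{2}|[1-9]?\\d)";
    private static final String IP = "(?:" + OCTET + "\\.){3}" + OCTET;

    // Assumption: a "word" is a run of ASCII letters and digits.
    private static final String WORD = "[A-Za-z0-9]+";

    // Alternatives are tried left to right at each position, so the more
    // specific email and IP branches come before the plain-word branch.
    private static final Pattern TOKEN = Pattern.compile(EMAIL + "|" + IP + "|" + WORD);

    // Hypothetical helper: returns the emails, IPs, and words found in text, in order.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(text);
        while (m.find()) {
            tokens.add(m.group(0));
        }
        return tokens;
    }

    public static void main(String[] args) {
        String line = "You can contact test@test.com, the address is 192.168.1.1.";
        for (String token : tokenize(line)) {
            System.out.println(token);
        }
    }
}
```

For a whole file, the same method could be called once per line and the results appended to one list.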
