Ruby on Rails regular expression
I'm new to regular expressions and I'm finding this very difficult to solve.
I have the following string:
"inforun 7970 12423 99 10:03 ? 00:09:03 abcd -PR -gmh domain.den.abc.com -gmp 6020 -guid 9c06cc02-b1c8-41cf-93e6-1d795e9fff62 -rst 180 -s FOLDER_NAME:wkf_workflow.s_session -something Session task instance [session]"
I have to extract the time (10:03), the 'domain' in domain.den.abc.com, the FOLDER_NAME, the 'workflow' in 'wkf_workflow', and the 'session' in 's_session'.
The time, domain, folder name, workflow and session change for every string, and there are many strings like this, all concatenated into a single string.
Here are the patterns that are common to every string.
The 'abcd -PR -gmh' part is common and can help in finding the time, which is just before the '?'.
The '-s' and ':wkf' parts are common to all the strings, and the folder name is right between these two.
The workflow is between 'wkf_' and '.s_'.
The session is between 's_' and the immediately following '-'.
I need the time, domain, folder name, workflow and session all in separate strings.
I started practicing regular expressions only a few days ago, and it takes me a long time even to comprehend something like "\\A[\\w+\\-.]+@[a-z\\d\\-.]+\\.[a-z]+\\z", which is given here.
Thank you for your help.
Regex (time $1, domain $2, folder name $3, workflow $4, session $5):
(?<=\s)(\d{2}:\d{2})(?=\s).*?(?<=\s)((?:[a-zA-Z\d]+(?:\-[a-zA-Z\d]+)*\.)+[a-zA-Z]{2,4})(?=\s).*?(?<=\s)([a-zA-Z\d_]+):wkf_([a-zA-Z\d]+)\.s_([a-zA-Z\d]+)(?=\s)
Ruby:
text = "inforun 7970 12423 99 10:03 ? 00:09:03 abcd -PR -gmh domain.den.abc.com -gmp 6020 -guid 9c06cc02-b1c8-41cf-93e6-1d795e9fff62 -rst 180 -s FOLDER_NAME:wkf_workflow.s_session -something Session task instance [session]"
text =~ /(?<=\s)(\d{2}:\d{2})(?=\s).*?(?<=\s)((?:[a-zA-Z\d]+(?:\-[a-zA-Z\d]+)*\.)+[a-zA-Z]{2,4})(?=\s).*?(?<=\s)([a-zA-Z\d_]+):wkf_([a-zA-Z\d]+)\.s_([a-zA-Z\d]+)(?=\s)/
puts $~.captures
Output:
10:03
domain.den.abc.com
FOLDER_NAME
workflow
session
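Since the question mentions many such strings joined together, the same pattern can be applied repeatedly with String#scan. The following is only a sketch: it assumes the records are separated by newlines, the method name extract_records is hypothetical, and the second record contains made-up values.

```ruby
# Pattern from the answer above, unchanged.
RECORD = /(?<=\s)(\d{2}:\d{2})(?=\s).*?(?<=\s)((?:[a-zA-Z\d]+(?:\-[a-zA-Z\d]+)*\.)+[a-zA-Z]{2,4})(?=\s).*?(?<=\s)([a-zA-Z\d_]+):wkf_([a-zA-Z\d]+)\.s_([a-zA-Z\d]+)(?=\s)/

# Hypothetical helper: returns one array of five captures per record found.
def extract_records(log)
  log.scan(RECORD)
end

# Assumption: records are newline-separated; the second line is invented data.
log = [
  "inforun 7970 12423 99 10:03 ? 00:09:03 abcd -PR -gmh domain.den.abc.com -gmp 6020 -guid 9c06cc02-b1c8-41cf-93e6-1d795e9fff62 -rst 180 -s FOLDER_NAME:wkf_workflow.s_session -something Session task instance [session]",
  "inforun 8001 12423 99 11:15 ? 00:01:10 abcd -PR -gmh host2.den.abc.com -gmp 6020 -guid 00000000-0000-0000-0000-000000000000 -rst 180 -s OTHER_FOLDER:wkf_load.s_extract -something Session task instance [extract]"
].join("\n")

extract_records(log).each do |time, domain, folder, workflow, session|
  puts [time, domain, folder, workflow, session].join(" | ")
end
```

Because `.` does not match a newline by default in Ruby, each match stays within a single record under this assumption.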
Here's a regex you could use. I'm not familiar enough with Ruby/RoR to help there, but presuming you actually want to use a regex for it, this regex should get you everything in one go:
^.* (\d\d?:\d\d) \? .*? -gmh (.*?)\..*? -s (.*?):wkf_(.*?)\.s_(.*?) .*$
http://regexr.com?31da7 should show the capturing groups and their contents:
$1     $2      $3           $4        $5
10:03  domain  FOLDER_NAME  workflow  session
It presumes that the time is immediately before the question mark and is formatted as digit, optional digit, colon, digit, digit; that the domain immediately follows '-gmh '; that the folder name follows the -s and precedes the :wkf_; that the workflow follows the :wkf_; and that the session comes after the .s_.
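For completeness, here is one way the pattern above might be used from Ruby. This is a sketch rather than part of the original answer: the constant LINE_PATTERN and the method extract_fields are hypothetical names, and the method simply returns nil when a line does not match.

```ruby
# The regex from this answer, written as a Ruby literal.
LINE_PATTERN = /^.* (\d\d?:\d\d) \? .*? -gmh (.*?)\..*? -s (.*?):wkf_(.*?)\.s_(.*?) .*$/

# Hypothetical helper: returns [time, domain, folder_name, workflow, session],
# or nil if the line does not have the expected shape.
def extract_fields(line)
  m = LINE_PATTERN.match(line)
  m && m.captures
end

text = "inforun 7970 12423 99 10:03 ? 00:09:03 abcd -PR -gmh domain.den.abc.com -gmp 6020 -guid 9c06cc02-b1c8-41cf-93e6-1d795e9fff62 -rst 180 -s FOLDER_NAME:wkf_workflow.s_session -something Session task instance [session]"

fields = extract_fields(text)
if fields
  time, domain, folder_name, workflow, session = fields
  puts time, domain, folder_name, workflow, session
else
  puts "no match"
end
```

Note that in Ruby `^` and `$` match at the start and end of each line, not only of the whole string, so on multi-line input this pattern matches one line at a time.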
Assuming you're using Ruby 1.9, here's a starting point:
/(?<time>\d{2}:\d{2}:\d{2}) abcd -PR -gmh (?<domain>[a-zA-Z]*)/i =~ s
/-s (?<folder_name>\w*):wkf_(?<workflow>\w*)\.s_(?<session>\w*)/i =~ s
After running these two lines, you should have:
1.9.3p125 :023 > time
=> "00:09:03"
1.9.3p125 :024 > domain
=> "domain"
1.9.3p125 :025 > folder_name
=> "FOLDER_NAME"
1.9.3p125 :026 > workflow
=> "workflow"
1.9.3p125 :027 > session
=> "session"
You still need to define what characters are allowed for each case and add error handling too.
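As a possible next step, the two patterns can be folded into one and wrapped in a method that fails loudly on unexpected input. Note that the first pattern above captures the hh:mm:ss value before "abcd" (00:09:03), whereas the question asks for the value before the "?" (10:03); the sketch below anchors on the "?" instead. The names FIELDS and parse_line are hypothetical, and the character classes are assumptions about what each field may contain.

```ruby
# Hypothetical combined pattern; character classes are assumptions about the data.
FIELDS = /
  (?<time>\d{1,2}:\d{2})\s\?\s            # start time, just before the question mark
  .*?\s-gmh\s(?<domain>[^.\s]+)           # first label of the host name after -gmh
  .*?\s-s\s(?<folder_name>\w+)            # folder name, ends at the colon
  :wkf_(?<workflow>\w+)\.s_(?<session>\w+)
/x

# Returns a hash of field name => value, or raises if the line does not match.
def parse_line(line)
  m = FIELDS.match(line)
  raise ArgumentError, "unrecognized line: #{line[0, 40]}" if m.nil?
  m.names.zip(m.captures).to_h
end

s = "inforun 7970 12423 99 10:03 ? 00:09:03 abcd -PR -gmh domain.den.abc.com -gmp 6020 -guid 9c06cc02-b1c8-41cf-93e6-1d795e9fff62 -rst 180 -s FOLDER_NAME:wkf_workflow.s_session -something Session task instance [session]"

parse_line(s).each { |name, value| puts "#{name}: #{value}" }
```

Raising on a non-match is only one option; returning nil and letting the caller skip the line would work equally well if malformed records are expected.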