
Ruby on Rails regular expression

I'm new to regular expressions and I'm finding this very difficult to solve:

I have the following string:

"inforun 7970 12423 99 10:03 ? 00:09:03 abcd -PR -gmh domain.den.abc.com -gmp 6020 -guid 9c06cc02-b1c8-41cf-93e6-1d795e9fff62 -rst 180 -s FOLDER_NAME:wkf_workflow.s_session -something Session task instance [session]"

I have to extract the time, which is 10:03; the 'domain' in domain.den.abc.com; the FOLDER_NAME; the 'workflow' in 'wkf_workflow'; and the 'session' in 's_session'. The time, domain, folder name, workflow and session change from string to string, and there are a bunch of strings like this, all joined together as a single string.

Here are the patterns that are common to every string. 'abcd -PR -gmh' is common and can help in finding the time, which is just before the '?'. '-s' and ':wkf' are common to all the strings, and the folder name is right between the two. The workflow is between 'wkf_' and '.s_'. The session is between 's_' and the immediately following '-'.

I need the time, domain, folder name, workflow and session all in separate strings.

I only started practicing regular expressions a few days ago, and it takes me a long time even to comprehend something like "\A[\w+\-.]+@[a-z\d\-.]+\.[a-z]+\z", which is given here.

Thank you for your help.

Regex (time $1, domain $2, folder name $3, workflow $4, session $5):

(?<=\s)(\d{2}:\d{2})(?=\s).*?(?<=\s)((?:[a-zA-Z\d]+(?:\-[a-zA-Z\d]+)*\.)+[a-zA-Z]{2,4})(?=\s).*?(?<=\s)([a-zA-Z\d_]+):wkf_([a-zA-Z\d]+)\.s_([a-zA-Z\d]+)(?=\s)

Ruby:

text = "inforun 7970 12423 99 10:03 ? 00:09:03 abcd -PR -gmh domain.den.abc.com -gmp 6020 -guid 9c06cc02-b1c8-41cf-93e6-1d795e9fff62 -rst 180 -s FOLDER_NAME:wkf_workflow.s_session -something Session task instance [session]"
text =~ /(?<=\s)(\d{2}:\d{2})(?=\s).*?(?<=\s)((?:[a-zA-Z\d]+(?:\-[a-zA-Z\d]+)*\.)+[a-zA-Z]{2,4})(?=\s).*?(?<=\s)([a-zA-Z\d_]+):wkf_([a-zA-Z\d]+)\.s_([a-zA-Z\d]+)(?=\s)/
puts $~.captures

Output:

10:03
domain.den.abc.com
FOLDER_NAME
workflow
session

See and test the code here.
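The question mentions that many such strings arrive joined together as a single string. A minimal sketch of handling that with `String#scan`, assuming the records are separated by newlines (the question does not say how they are joined). The second sample line is made up for illustration, and the domain is cut down to its first label because the question asks only for 'domain':

```ruby
# Same pattern as in the answer above, stored in a constant for reuse.
PATTERN = /(?<=\s)(\d{2}:\d{2})(?=\s).*?(?<=\s)((?:[a-zA-Z\d]+(?:\-[a-zA-Z\d]+)*\.)+[a-zA-Z]{2,4})(?=\s).*?(?<=\s)([a-zA-Z\d_]+):wkf_([a-zA-Z\d]+)\.s_([a-zA-Z\d]+)(?=\s)/

# The first line is the sample from the question. The second line is a
# hypothetical record with the same layout, added only to show several matches.
lines = [
  "inforun 7970 12423 99 10:03 ? 00:09:03 abcd -PR -gmh domain.den.abc.com -gmp 6020 -guid 9c06cc02-b1c8-41cf-93e6-1d795e9fff62 -rst 180 -s FOLDER_NAME:wkf_workflow.s_session -something Session task instance [session]",
  "inforun 8001 12423 99 11:45 ? 00:01:10 abcd -PR -gmh other.den.abc.com -gmp 6020 -guid 00000000-0000-0000-0000-000000000000 -rst 180 -s FOLDER2:wkf_load.s_extract -something Session task instance [extract]"
]
text = lines.join("\n") # assumption: one record per line

# scan returns one array of captures per match; turn each into a hash.
records = text.scan(PATTERN).map do |time, domain, folder, workflow, session|
  {
    time: time,
    domain: domain.split(".").first, # keep only the first label of the host name
    folder_name: folder,
    workflow: workflow,
    session: session
  }
end

records.each { |record| puts record.inspect }
```

Because `.` does not match a newline without the `/m` flag, each match stays within a single line, so fields from neighbouring records are not mixed.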

Here's a regex you could use. I'm not familiar enough with Ruby/RoR to help there, but presuming you actually want to use a regex for it, this regex should get you everything in one go:

^.* (\d\d?:\d\d) \? .*? -gmh (.*?)\..*? -s (.*?):wkf_(.*?)\.s_(.*?) .*$

http://regexr.com?31da7 should show the capturing groups and their contents:

$1    $2     $3          $4       $5
10:03 domain FOLDER_NAME workflow session

It presumes that the time is immediately before the question mark and is formatted as digit, optional digit, colon, digit, digit; that the domain immediately follows '-gmh '; that the folder name follows the -s and precedes the :wkf_; that the workflow follows the :wkf_; and that the session comes after the .s_
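For the Ruby side, this regex can be dropped into a regex literal as-is. A minimal sketch, where the constant `LINE` and the helper `extract_fields` are names made up for this example:

```ruby
# The single-pass regex from above, unchanged.
LINE = /^.* (\d\d?:\d\d) \? .*? -gmh (.*?)\..*? -s (.*?):wkf_(.*?)\.s_(.*?) .*$/

# Hypothetical helper: returns [time, domain, folder_name, workflow, session],
# or nil when the line does not have the expected layout.
def extract_fields(line)
  match = LINE.match(line)
  match && match.captures
end

text = "inforun 7970 12423 99 10:03 ? 00:09:03 abcd -PR -gmh domain.den.abc.com -gmp 6020 -guid 9c06cc02-b1c8-41cf-93e6-1d795e9fff62 -rst 180 -s FOLDER_NAME:wkf_workflow.s_session -something Session task instance [session]"

p extract_fields(text)
# => ["10:03", "domain", "FOLDER_NAME", "workflow", "session"]
```

Returning nil for a non-matching line lets the caller decide whether to skip the record or report it.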

Assuming you're using Ruby 1.9, here's a starting point:

/(?<time>\d{2}:\d{2}:\d{2}) abcd -PR -gmh (?<domain>[a-zA-Z]*)/i =~ s
/-s (?<folder_name>\w*):wkf_(?<workflow>\w*)\.s_(?<session>\w*)/i =~ s

After running these two lines, you should have:

1.9.3p125 :023 > time
=> "00:09:03" 
1.9.3p125 :024 > domain
 => "domain" 
1.9.3p125 :025 > folder_name
 => "FOLDER_NAME" 
1.9.3p125 :026 > workflow
 => "workflow" 
1.9.3p125 :027 > session
 => "session" 

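You still need to define which characters are allowed in each field and add error handling too. Note also that the first pattern captures 00:09:03, the field immediately before abcd, whereas the question asks for the 10:03 that comes before the '?'.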
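One possible way to fill in those gaps, as a sketch rather than a definitive version: anchor the time on the '?' that follows it, use `+` instead of `*` so that empty fields are rejected, and return nil when either half fails to match. The names `HEAD`, `TAIL` and `parse_record` are made up for this example, the allowed characters are guesses based on the single sample string, and `MatchData#named_captures` needs Ruby 2.4 or later:

```ruby
# Time is the HH:MM field right before the "?"; domain is the first label after -gmh.
HEAD = /(?<time>\d{1,2}:\d{2}) \? .*? abcd -PR -gmh (?<domain>[a-zA-Z\d-]+)/
# Folder, workflow and session are assumed to consist of word characters only.
TAIL = /-s (?<folder_name>\w+):wkf_(?<workflow>\w+)\.s_(?<session>\w+)/

# Hypothetical helper: returns a hash of the five fields, or nil if the
# string does not contain both parts.
def parse_record(s)
  head = HEAD.match(s)
  tail = TAIL.match(s)
  return nil unless head && tail # a partial match counts as a failure

  head.named_captures.merge(tail.named_captures)
end

s = "inforun 7970 12423 99 10:03 ? 00:09:03 abcd -PR -gmh domain.den.abc.com -gmp 6020 -guid 9c06cc02-b1c8-41cf-93e6-1d795e9fff62 -rst 180 -s FOLDER_NAME:wkf_workflow.s_session -something Session task instance [session]"

p parse_record(s)
```

Calling `match` on the pattern instead of using `=~` keeps the results in one hash rather than in separate local variables, which makes the nil check straightforward.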
