
Ruby .split (various delimiters): keep delimiters in the match array

I'm dealing with an exported .csv file that contains a notes field. The exported notes look like this:

"AC) Dianne - # or code dialed is incorrect. AC) dianne - # or code dialed is incorrect. WDB) Dianne - Wrong Number. AC) Dianne - # or code dialed is incorrect."

Each entry in this field consists of the user who left the note and the note itself.

The delimiters in the files are "AC)" or "WDB)".

I need to write these as:

AC) Dianne - # or code dialed is incorrect.
AC) dianne - # or code dialed is incorrect.
WDB) Dianne - Wrong Number.
AC) Dianne - # or code dialed is incorrect.

Using a regular expression and Ruby's .split, I can output the text following each delimiter, but I lose the user who captured the note.

Ruby

# Splitting on the delimiters themselves discards them from the result.
notes.split(/AC\)|WDB\)/).each do |n|
  puts n
end

Output

Dianne - # or code dialed is incorrect.
dianne - # or code dialed is incorrect.
Dianne - Wrong Number.
Dianne - # or code dialed is incorrect.

With the code above, I can't tell which user (AC, WDB) left each individual note.

I'm not sure whether I need to switch to .scan, alter the regex (i.e. include a lookbehind), or something else. Does anyone have any idea how I can capture both the user and the text so the output looks like this?

Output

AC) Dianne - # or code dialed is incorrect.
AC) dianne - # or code dialed is incorrect.
WDB) Dianne - Wrong Number.
AC) Dianne - # or code dialed is incorrect.

Just split the input on the lookahead below:

(?=AC\)|WDB\))

Lookarounds are zero-width assertions. A lookahead doesn't consume any characters; it only checks a condition at the current position. So .split cuts just before each delimiter, and the delimiter stays at the start of the piece that follows.

Code:

> "AC) Dianne - # or code dialed is incorrect. AC) dianne - # or code dialed is incorrect. WDB) Dianne - Wrong Number. AC) Dianne - # or code dialed is incorrect.".split(/(?=AC\)|WDB\))/)
=> ["AC) Dianne - # or code dialed is incorrect. ", "AC) dianne - # or code dialed is incorrect. ", "WDB) Dianne - Wrong Number. ", "AC) Dianne - # or code dialed is incorrect."]
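Note that most pieces in the result above keep a trailing space. A minimal sketch of how the split could be turned into the requested one-note-per-line output, assuming the field is held in a variable named notes (as in the question) and that stripping surrounding whitespace is acceptable:

```ruby
# Sample note field from the question.
notes = "AC) Dianne - # or code dialed is incorrect. " \
        "AC) dianne - # or code dialed is incorrect. " \
        "WDB) Dianne - Wrong Number. " \
        "AC) Dianne - # or code dialed is incorrect."

# The lookahead matches the empty position just before each delimiter,
# so split cuts there and the delimiter stays with the text that follows.
# strip removes the space left at the end of each piece.
entries = notes.split(/(?=AC\)|WDB\))/).map(&:strip)

entries.each { |entry| puts entry }
```

The entries array name is only illustrative. If more user codes than AC and WDB can appear, the alternation in the regex would need to be extended accordingly.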

You can try this:

notes.gsub!(/\s+(AC\)|WDB\))/, "\n\\1")
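One detail to watch with a substitution like this: in Ruby, a single-quoted '\n' is a literal backslash followed by n, not a newline, so the replacement string needs double quotes to produce real line breaks. A small sketch under that assumption, using the non-mutating gsub (my choice here, since gsub! returns nil when nothing is replaced) and a hypothetical variable named formatted:

```ruby
# Shortened sample in the same shape as the note field from the question.
notes = "AC) Dianne - # or code dialed is incorrect. " \
        "WDB) Dianne - Wrong Number. " \
        "AC) Dianne - # or code dialed is incorrect."

# Replace the whitespace before each delimiter with a newline.
# In double quotes, "\n" is a real newline and "\\1" refers to the
# first capture group, i.e. the delimiter that was matched.
formatted = notes.gsub(/\s+(AC\)|WDB\))/, "\n\\1")

puts formatted

# Split into lines for further processing if needed.
lines = formatted.lines.map(&:chomp)
```

The first delimiter is not preceded by whitespace, so no leading blank line is produced.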

