Regex: keep same pattern found multiple times in same line and replace line by appending single pattern in front
Is it possible with Notepad++ (or maybe from a Linux bash shell) to create multiple lines from a pattern, one new line for each time the pattern is found, and also prepend a single found pattern to each newly created line?
The multi pattern is val=[0-9]+
The single pattern is id=[a-zA-Z0-9]+
Example:
Input lines:
id=af2477,val=333,val=777
id=af3456,val=222,val=444,val=678
id=af3327,val=3234,val=123,val=701
Output lines:
id=af2477,val=333
id=af2477,val=777
id=af3456,val=222
id=af3456,val=444
id=af3456,val=678
id=af3327,val=3234
id=af3327,val=123
id=af3327,val=701
I have tried with 2 subgroups, but it won't work. It only replaces the second group once:
find what: (id=[a-zA-Z0-9]+,)(val=[0-9]+,)*
replace: \n\1,\2
UPDATE: Both answers, from Toto and Wiktor Stribiżew, seem to do the job. I haven't tested them yet. I would still like to see how this can work with Notepad++ (even if multiple steps are needed).
Since you also consider using Linux tools for this, an awk solution looks much more viable:
awk 'BEGIN{FS=OFS=","} /^id=[a-zA-Z0-9]+(,val=[0-9]+)*$/{
for(i=2; i<=NF; i++) {
print $1,$i
}; next;
}{print $0}' file > outfile
See the online demo.
Here, any line that matches ^id=[a-zA-Z0-9]+(,val=[0-9]+)*$ (i.e. matches the format of the lines you need to expand) is split the way you need by the loop for(i=2; i<=NF; i++) {print $1,$i}, after which next skips to the next input line. Otherwise, the line is written as is (print $0).
The BEGIN{FS=OFS=","} part sets the input and output field separators to a comma.
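For readers who prefer to follow the same logic outside awk, here is a minimal Python sketch of the steps described above. It is only an illustration, not part of the original answer; the names LINE_RE, expand_line and expand_text are made up for this example.

```python
import re

# Same shape the awk condition checks: one id field followed by val fields.
LINE_RE = re.compile(r"^id=[a-zA-Z0-9]+(,val=[0-9]+)*$")


def expand_line(line):
    """Return one 'id,val' string per val field.

    Lines that do not match the expected format are returned unchanged,
    mirroring the awk fallback {print $0}. As in the awk command, a matching
    line that has an id but no val fields produces no output at all.
    """
    if not LINE_RE.match(line):
        return [line]
    fields = line.split(",")  # plays the role of FS=","
    return [fields[0] + "," + val for val in fields[1:]]  # OFS=","


def expand_text(text):
    """Expand every line of a multi-line string."""
    out = []
    for line in text.splitlines():
        out.extend(expand_line(line))
    return "\n".join(out)


if __name__ == "__main__":
    sample = "id=af2477,val=333,val=777\nid=af3456,val=222,val=444,val=678"
    print(expand_text(sample))
```

The split on the comma does the real work here; the regular expression is only used to decide whether a line should be expanded at all.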
This perl one-liner does the job (output on STDOUT):
perl -anE '($id,$vals)=/(id=\w+),(.+)$/;say "$id,$_" for split/,/,$vals' file
id=af2477,val=333
id=af2477,val=777
id=af3456,val=222
id=af3456,val=444
id=af3456,val=678
id=af3327,val=3234
id=af3327,val=123
id=af3327,val=701
Explanation:
($id,$vals)=/(id=\w+),(.+)$/; # explode id and values for each line in input file
say "$id,$_" for split/,/,$vals # print id and each value
You can redirect the output to another file:
perl -anE '($id,$vals)=/(id=\w+),(.+)$/;say "$id,$_" for split/,/,$vals' file > outputfile
Or do the change in-place:
perl -i -anE '($id,$vals)=/(id=\w+),(.+)$/;say "$id,$_" for split/,/,$vals' file
It is possible, yet very complex, to do that with one regular expression, for which you would have to use (?R) and conditional statements.
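Part of the reason a single plain substitution falls short is that a repeated capturing group keeps only its last iteration, so the replacement never sees the earlier val entries. The sketch below illustrates this with Python's re module, which is an assumption made for demonstration; Notepad++ uses a different regex engine. The pattern is a slightly adjusted version of the one tried in the question, with the comma moved in front of each val.

```python
import re

line = "id=af2477,val=333,val=777"

# Adjusted from the question's attempt: the comma is part of each val group.
pattern = re.compile(r"(id=[a-zA-Z0-9]+)(,val=[0-9]+)*")

match = pattern.match(line)
# Group 2 was matched twice, but only the final repetition is retained.
last_val = match.group(2)

# A single substitution therefore emits one val per line, not all of them.
result = pattern.sub(r"\1\2", line)

if __name__ == "__main__":
    print(last_val)
    print(result)
```

Here last_val holds only ,val=777, and the substitution yields id=af2477,val=777, which is why the earlier val=333 is lost.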
With multiple steps, it would be pretty simple. For instance, you can do a find and replace sized for the maximum number of val entries that the longest line might have. Say 4 is the largest number of val entries; then we'll have four of (,val=[^\r\n,]*) in our initial expression:
^(id=[^\r\n,]*)(,val=[^\r\n,]*)(,val=[^\r\n,]*)(,val=[^\r\n,]*)(,val=[^\r\n,]*)$
and replace that with four lines:
$1$2\n$1$3\n$1$4\n$1$5
---- ---- ---- ----
For any additional step, we can simply remove one val and one line from the end of the initial expression and replacement. For example, our expression would look like
^(id=[^\r\n,]*)(,val=[^\r\n,]*)(,val=[^\r\n,]*)(,val=[^\r\n,]*)$
in the second step, for which we'd replace it with:
$1$2\n$1$3\n$1$4
---- ---- ----
In the third and final step, our expression has two vals,
^(id=[^\r\n,]*)(,val=[^\r\n,]*)(,val=[^\r\n,]*)$
and our replacement will have two lines:
$1$2\n$1$3
---- ----
For the example in the question, where no line has more than three vals, only two steps are required: the second and third expressions above would likely work just fine.
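The multi-step procedure can be checked outside the editor with a small simulation. The following Python sketch is an assumption-laden illustration, not a Notepad++ feature: it uses Python's re module with \1-style backreferences in place of $1, assumes \n line endings, and the helper names step and expand are invented for this example.

```python
import re

ID = r"(id=[^\r\n,]*)"
VAL = r"(,val=[^\r\n,]*)"


def step(text, n_vals):
    """Apply one find/replace step to lines that have exactly n_vals val fields."""
    pattern = re.compile("^" + ID + VAL * n_vals + "$", re.MULTILINE)
    # Builds the replacement, e.g. \1\2<newline>\1\3 when n_vals == 2.
    replacement = "\n".join("\\1\\%d" % k for k in range(2, n_vals + 2))
    return pattern.sub(replacement, text)


def expand(text, max_vals=4):
    """Run the steps from the longest expression down to the two-val one."""
    for n in range(max_vals, 1, -1):
        text = step(text, n)
    return text


if __name__ == "__main__":
    sample = (
        "id=af2477,val=333,val=777\n"
        "id=af3456,val=222,val=444,val=678\n"
        "id=af3327,val=3234,val=123,val=701"
    )
    print(expand(sample))
```

Lines produced by an earlier step contain a single val, so later steps leave them untouched; each anchored expression only matches lines with exactly its own number of val entries.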