In lines starting with a specific word followed by words separated by semicolons, replace the semicolons with commas and wrap the words in double quotes
I'm trying to change certain lines in my file using Notepad++. I have very little knowledge of regular expressions, so I'm asking for help. Any kind of help is appreciated.
Find all the lines that look like
See ABC'D EFG;IJKL;FOO;BAR;XXXX and so on.
That is, lines that:
Start with "See"
Contain words that can have special characters:
a) space
b) ' (apostrophe)
c) , (comma)
d) - (hyphen)
End with a full stop .
And replace those lines with:
See:["ABC'D EFG","IJKL","FOO","BAR",....]
find what: See ([A-Z'\-, ]+)\;([A-Z'\-, ]+)\.
replace with: See:["\1", "\2"]
See https://regex101.com/r/bfJkN6/3
I also tested it in my Notepad++ and got See:["ABC'D EFG", "IJKL"]
I updated the regex to catch multiple hits at https://regex101.com/r/bfJkN6/5
See ((([A-Z'\-, ]+)\;)+)([A-Z'\-, ]+)\.
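For anyone who would rather script the change than run it in Notepad++, the same idea can be sketched in Python. This is only an illustration under assumptions, not the method used above: the name convert_line is hypothetical, and rather than repeating a capture group (a repeated group keeps only its last repetition) it captures the whole semicolon-separated run and splits it.

```python
import re

# Whole line: "See ", then items made of uppercase letters, apostrophes,
# commas, hyphens and spaces, separated by semicolons, then a full stop.
LINE_RE = re.compile(r"^See ([A-Z',\- ]+(?:;[A-Z',\- ]+)*)\.$")


def convert_line(line: str) -> str:
    """Return the rewritten line, or the line unchanged if it does not match."""
    match = LINE_RE.match(line)
    if match is None:
        return line
    # Split the captured run on semicolons and quote each item.
    words = match.group(1).split(";")
    quoted = ",".join(f'"{word}"' for word in words)
    return f"See:[{quoted}]"


print(convert_line("See ABC'D EFG;IJKL;FOO;BAR."))
# prints: See:["ABC'D EFG","IJKL","FOO","BAR"]
```

Lines that do not fit the pattern are returned untouched, so the function can be applied to every line of a file.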
Use \W, which matches any non-word character.
Example: https://regex101.com/r/lFANF0/4
Find: See\s([A-Z' ]+)\W(\w+)\.
Replace with: See:["$1","$2"]
1st Group (\w+\'\w+\s+)
\w+ matches any word character (equal to [a-zA-Z0-9_])
+ matches between one and unlimited times
\s+ matches any whitespace character (equal to [\r\n\t\f\v ])
2nd Group (\w+\W*\w+)
\W* matches any non-word character (equal to [^a-zA-Z0-9_])
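As a quick check of what this find/replace pair does, here is a minimal Python sketch. It assumes the range in the character class is meant to be A-Z, and the function name replace_two_items is hypothetical. Python's re module writes backreferences as \1 and \2 where Notepad++ accepts $1 and $2.

```python
import re

# The find pattern from this answer, with the range written as A-Z.
PATTERN = re.compile(r"See\s([A-Z' ]+)\W(\w+)\.")
# Replacement template: \1 and \2 are the two captured items.
REPLACEMENT = r'See:["\1","\2"]'


def replace_two_items(text: str) -> str:
    """Apply the find/replace pair to every match in text."""
    return PATTERN.sub(REPLACEMENT, text)


print(replace_two_items("See ABC'D EFG;IJKL."))
# prints: See:["ABC'D EFG","IJKL"]

# A line with three items is left unchanged, because the pattern allows
# only one separator between the first item and the final full stop.
print(replace_two_items("See ABC'D EFG;IJKL;FOO."))
# prints: See ABC'D EFG;IJKL;FOO.
```

The second call shows the limit of this pattern: it handles exactly two items per line, so a variable number of semicolons needs a different approach.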
Let's say the number of semicolons is variable. You need to proceed in two passes.
Use Replace All for the two passes:
find: ^See \K([A-Z ,;'-]+)\.
replace: ["$1"]
and then:
find: (?:\G(?!^)|^See \["(?=[^"]*"]))[^";]*\K;
replace: ", "
The first pass is easy to understand: it finds the corresponding lines, removes the final dot, and encloses the part made of uppercase letters, commas, spaces, semicolons, apostrophes and hyphens in double quotes and square brackets.
The second pass needs to replace only the semicolons inside the quotes and square brackets, on lines that start with See. To do that, I used the second branch ^See \["(?=[^"]*"]) to reach the interesting lines, and the \G anchor in the first branch to ensure that the next matches are contiguous to the first. Since [^";]* excludes the double quote, once the last semicolon is reached, the first branch can no longer succeed and the contiguity is broken.
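The two passes can also be approximated in Python. This is a sketch under assumptions rather than a port of the patterns above: Python's standard re module supports neither \K nor \G, so a capture group stands in for \K in the first pass, and a replacement function stands in for the \G chain in the second. The name two_pass is hypothetical.

```python
import re

# Pass 1: wrap everything between "See " and the final "." in ["..."].
# The group around "See " plays the role of \K: it is put back unchanged.
PASS1 = re.compile(r"^(See )([A-Z ,;'-]+)\.", re.MULTILINE)

# Pass 2: locate the quoted, bracketed part on lines that start with "See ".
PASS2 = re.compile(r'^See \["([^"]*)"\]', re.MULTILINE)


def two_pass(text: str) -> str:
    """Run both passes over text and return the result."""
    wrapped = PASS1.sub(r'\1["\2"]', text)
    # Replace each semicolon inside the quotes with '", "'.
    return PASS2.sub(
        lambda m: 'See ["' + m.group(1).replace(";", '", "') + '"]',
        wrapped,
    )


print(two_pass("See ABC'D EFG;IJKL;FOO;BAR."))
# prints: See ["ABC'D EFG", "IJKL", "FOO", "BAR"]
```

As with the two find/replace pairs above, the result begins with See [ and not See:[ ; if the colon is wanted, it can be added to the replacement of the first pass and to the pattern of the second.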