Regular expression for XML element with arbitrary attribute value
I'm not very comfortable with regex.
I have a text file with a lot of data in different formats. I want to keep only strings of this kind:
<data name=\"myProptertyValue\" xml:space=\"preserve\">
Only the value of the name attribute can change.
So I came up with a regex like this:
<data name=\\\\\\"(.)\\\\\\" xml:space=\\\\\\"preserve\\\\\\">
but it's not working.
Any tips?
Try this:
<data name=\\".*?\\" xml:space=\\"preserve\\">
There is no need to add \\ before the ".
Your (.) will capture only a single character; add a quantifier like + ("one or more"):
/<data name=\\"(.+)\\" xml:space=\\"preserve\\">/
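As a quick check of the difference between (.) and (.+), here is a minimal JavaScript sketch. It assumes the file contains plain double quotes with no literal backslashes in front of them, so the quotes are written unescaped; the variable names are only illustrative.

```javascript
// Assumption: the input holds plain double quotes (no literal backslashes).
const line = '<data name="myProptertyValue" xml:space="preserve">';

// (.) matches exactly one character, so a multi-character name cannot match.
const single = /<data name="(.)" xml:space="preserve">/;

// (.+) matches one or more characters, so the whole name is captured.
const oneOrMore = /<data name="(.+)" xml:space="preserve">/;

const singleMatch = line.match(single);
const oneOrMoreMatch = line.match(oneOrMore);

console.log(singleMatch);                          // null
console.log(oneOrMoreMatch && oneOrMoreMatch[1]);  // myProptertyValue
```

If your data really does contain backslashes before the quotes, the pattern would need to match them explicitly.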
Depending on what exactly your input is (element by element or an entire document) and on what you want to achieve (removing/replacing/testing/capturing), you should make the regex global by adding the g flag, so that it is applied more than once. Also, you should make the + quantifier lazy by adding a ? to it. That makes it non-greedy, because you want capturing to stop at the closing quote of the attribute (similar to matching everything except a quotation mark: [^"]). It will then look like this:
/<data name=\\"(.+?)\\" xml:space=\\"preserve\\">/g
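To illustrate why the lazy quantifier matters, the sketch below runs the greedy, lazy, and [^"] variants against a line holding two elements. The sample text and helper names are made up for the example, and it again assumes plain double quotes in the input.

```javascript
// Hypothetical input: two elements on the same line.
const text =
  '<data name="first" xml:space="preserve"><data name="second" xml:space="preserve">';

const greedy = /<data name="(.+)" xml:space="preserve">/g;
const lazy = /<data name="(.+?)" xml:space="preserve">/g;
const negated = /<data name="([^"]+)" xml:space="preserve">/g;

// Collect the first capture group of every match (matchAll requires the g flag).
const names = (re) => Array.from(text.matchAll(re), (m) => m[1]);

const greedyNames = names(greedy);
const lazyNames = names(lazy);
const negatedNames = names(negated);

console.log(greedyNames);  // [ 'first" xml:space="preserve"><data name="second' ]
console.log(lazyNames);    // [ 'first', 'second' ]
console.log(negatedNames); // [ 'first', 'second' ]
```

The greedy version swallows everything up to the last closing quote on the line, while the lazy and negated-class versions stop at the first one.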
<data name=\\"(.+)\\" xml:space=\\"preserve\\">
It will capture whatever is inside the name attribute.
If you're having trouble with regex, sites like https://regex101.com/ and http://regexr.com/ can help you build and test your expressions.
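To tie this back to the goal in the question (keeping only these strings from a mixed file), here is a rough JavaScript sketch. The sample content is hypothetical, since the real file is not shown, and it assumes plain double quotes in the data.

```javascript
// Hypothetical sample input; the real file contents are not shown in the question.
const fileText = [
  'some unrelated line',
  '<data name="title" xml:space="preserve">',
  '<value>Hello</value>',
  '<data name="subtitle" xml:space="preserve">',
].join('\n');

// [^"]+ stops at the closing quote of the name attribute.
const pattern = /<data name="[^"]+" xml:space="preserve">/g;

// With the g flag, match() returns every full match, or null when there is none.
const kept = fileText.match(pattern) || [];

console.log(kept.join('\n'));
```

In practice the text would come from reading the file rather than from an inline array.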