Parsing and rewriting XHTML in one step?

I need to take this input:

<Person>
  <name>
    <first>John</first>
    <last>Galt</last>
  </name>
</Person>

And regex my way to this output:

<div>&lt;Person&gt;
  <div>&lt;name&gt;
    <div>&lt;first&gt;John&lt;/first&gt;</div>
    <div>&lt;last&gt;Galt&lt;/last&gt;</div>
  &lt;/name&gt;</div>
&lt;/Person&gt;</div>

I have a solution that *works:

var output = input.replace(/([<])\/([a-zA-Z][A-Z0-9]*)([^>]*)([>])/g, "&lt;/$2$3&gt;</div>");
    output = output.replace(/([<])([a-zA-Z][A-Z0-9]*)([^>]*)([>])/g, "<div>&lt;$2$3&gt;");

But I feel like it's a little inefficient and was wondering if a regex savant could help me clean it up a little, ideally into one step? My problem was that my regex couldn't handle nested elements (when I tried to do it all in one step). Thanks!

**EDIT: Good catch, racraman**

To inject <div> and </div> you could've used empty-group matching:

input.replace(/(<(\/)[^>\/]*>)|(<[^>\/]*>)/g,"$1<$2div>$3");

This would've produced:

<div><Person>
  <div><name>
    <div><first>John</first></div>
    <div><last>Galt</last></div>
  </name></div>
</Person></div>
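
The trick relies on how JavaScript expands groups that took no part in the match: in a replacement string, `$n` for such a group becomes an empty string. A minimal sketch of that behaviour, where `wrapTags` is a hypothetical helper name and the sample string is taken from the question:

```javascript
// Hypothetical helper wrapping the one-step replace shown above.
function wrapTags(input) {
  // Closing tag: $1 = whole tag, $2 = "/", $3 = "" (group did not participate).
  // Opening tag: $1 = "",        $2 = "",  $3 = whole tag.
  return input.replace(/(<(\/)[^>\/]*>)|(<[^>\/]*>)/g, "$1<$2div>$3");
}

console.log(wrapTags("<first>John</first>"));
// prints: <div><first>John</first></div>
```

So the same replacement string yields `<div>` before an opening tag and `</div>` after a closing one, depending on which alternative matched.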

But you're also asking to replace < and > with &lt; and &gt; respectively, and known regexp engines don't support such group-content transformations within the same step. That is, in a replacement string you're limited to using either portions of matched groups or quite primitive (uppercase/lowercase) transformations of those.
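
That limit applies to replacement strings. In JavaScript specifically, `String.prototype.replace` also accepts a function as its second argument, which allows arbitrary transformation of each match in a single pass. This is not part of the approach described here, only a sketch of an alternative; `escapeAndWrap` is a hypothetical name, and the pattern assumes, like the ones above, that tags contain no `/` other than the leading one of a closing tag (so no self-closing tags):

```javascript
// Hypothetical helper: one pass, using a replacer function instead of a
// replacement string so the tag can be escaped and wrapped at the same time.
function escapeAndWrap(input) {
  return input.replace(/<(\/?)([^>\/]*)>/g, function (match, slash, body) {
    var escaped = "&lt;" + slash + body + "&gt;";
    // Closing tags get "</div>" appended; opening tags get "<div>" prepended.
    return slash ? escaped + "</div>" : "<div>" + escaped;
  });
}

console.log(escapeAndWrap("<first>John</first>"));
// prints: <div>&lt;first&gt;John&lt;/first&gt;</div>
```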

So I would've either simplified yours:

var output = input.replace(/<\/([^>]*)>/g, "&lt;/$1&gt;</div>");
    output = output.replace(/<([^>\/]*)>/g, "<div>&lt;$1&gt;");

or would've used the empty-groups approach:

var output = input.
  replace(/<((\/)([^>\/]*)|([^>\/]*))>/g, "&lt;$2$3&gt;<$2div>&lt;$4&gt;").
  replace(/&lt;&gt;/g, "");
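
As a quick check, both variants can be run against the input from the question and compared with the desired output. This is only a sketch: `twoPass`, `emptyGroups`, `input` and `expected` are names introduced here for illustration, and the two-pass version keeps the closing slash in its replacement so that the result matches the target output.

```javascript
var input = [
  "<Person>",
  "  <name>",
  "    <first>John</first>",
  "    <last>Galt</last>",
  "  </name>",
  "</Person>"
].join("\n");

var expected = [
  "<div>&lt;Person&gt;",
  "  <div>&lt;name&gt;",
  "    <div>&lt;first&gt;John&lt;/first&gt;</div>",
  "    <div>&lt;last&gt;Galt&lt;/last&gt;</div>",
  "  &lt;/name&gt;</div>",
  "&lt;/Person&gt;</div>"
].join("\n");

// Variant 1: closing tags first, then opening tags. The second pattern
// excludes "/" so it cannot touch the "</div>" added by the first pass.
function twoPass(s) {
  return s.replace(/<\/([^>]*)>/g, "&lt;/$1&gt;</div>")
          .replace(/<([^>\/]*)>/g, "<div>&lt;$1&gt;");
}

// Variant 2: one pattern for both tag kinds; the unused half of the
// replacement collapses to "&lt;&gt;", which the second pass removes.
function emptyGroups(s) {
  return s.replace(/<((\/)([^>\/]*)|([^>\/]*))>/g, "&lt;$2$3&gt;<$2div>&lt;$4&gt;")
          .replace(/&lt;&gt;/g, "");
}

console.log(twoPass(input) === expected);     // true
console.log(emptyGroups(input) === expected); // true
```

Note that the empty-groups variant still needs a second `replace` for cleanup, so neither version is a true single step.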
