
Remove string between HTML tags with TRegEx

I am building, in code, a report that is sent by email through Outlook in HTML format.

To do that, I first load an HTML template into which I can insert all the dynamic parts using predefined placeholders such as [CustomerName].

<p>You will find below reports for customer [CustomerName] dated [ReportdDate]</p>

<tag-1>
<h3>TableTitleA</h3>
<table>
  <thead id="t01">
    <tr>
        <th align='center' width='80'>Order Nr</th>
        <th align='left' width='400'>Date</th> 
        <th align='left' width='200'>Info</th> 
        <th align='center' width='200'>Site Name</th> 
    </tr>
  </thead>
  <tbody>
    [TableA]
  </tbody>
</table>
</tag-1>

<tag-2>
<h3>TableTitleB</h3>
<table>
  <thead id="t01">
    <tr>
        <th align='center' width='80'>Order Nr</th>
        <th align='left' width='100'>Date</th> 
        <th align='left' width='400'>Info</th> 
        <th align='left' width='200'>Site Name</th> 
    </tr>
  </thead>
  <tbody>
    [TableB]
  </tbody>
</table>
</tag-2>

<p>Best regards</p>

This template is ready to receive two HTML tables: [TableA] and [TableB].

But sometimes a table has no data, so I want to remove its complete HTML section. To achieve this, I have wrapped each section in fake tags:
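The placeholder substitution described above can be sketched with StringReplace. This is only a minimal illustration under assumptions, not the asker's actual code: the helper name FillPlaceholder, the shortened template string, and the sample values are all hypothetical.

```delphi
program FillTemplateDemo;

{$APPTYPE CONSOLE}

uses
  System.SysUtils;

// Hypothetical helper: replaces every occurrence of [Name] in Template with Value.
function FillPlaceholder(const Template, Name, Value: string): string;
begin
  Result := StringReplace(Template, '[' + Name + ']', Value, [rfReplaceAll]);
end;

var
  Html: string;
begin
  // Shortened stand-in for the real template (assumption for the example).
  Html := '<p>Reports for customer [CustomerName]</p>' + sLineBreak +
          '<tbody>[TableA]</tbody>';

  Html := FillPlaceholder(Html, 'CustomerName', 'ACME');
  Html := FillPlaceholder(Html, 'TableA', '<tr><td>1001</td></tr>');

  Writeln(Html);
end.
```

StringReplace does a plain text match, so the square brackets need no escaping here.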

<tag-1></tag-1> and <tag-2></tag-2>

I then remove the complete section, including the two fake tags, using TRegEx. This works just fine here:

https://regex101.com/r/5OFlyC/1

But with this code in Delphi, it doesn't work as expected:

TRegEx.Replace(MessageBody.Text, '<tag-1>.*?</tag-1>', '');

Could you tell me what's wrong here?

My problem is fixed. Thanks to all of you.

Just use the roSingleLine option, which lets the dot match line breaks as well:

MessageBody.Text := TRegEx.Replace(MessageBody.Text, '<tag-1>.*?</tag-1>', '', [roSingleLine]);
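To see the effect of roSingleLine in isolation, here is a small console sketch. It assumes a shortened template and introduces a hypothetical helper, RemoveSection, which is not part of the original code; the tag name is assumed to contain no regex metacharacters.

```delphi
program RemoveSectionDemo;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.RegularExpressions;

// Hypothetical helper: removes <TagName>...</TagName>, tags included.
// Assumes TagName contains no regex metacharacters.
function RemoveSection(const Html, TagName: string): string;
begin
  Result := TRegEx.Replace(Html,
    '<' + TagName + '>.*?</' + TagName + '>', '', [roSingleLine]);
end;

var
  Html: string;
begin
  // Shortened stand-in for the real template (assumption for the example).
  Html := '<p>Intro</p>' + sLineBreak +
          '<tag-1>' + sLineBreak +
          '<h3>TableTitleA</h3>' + sLineBreak +
          '</tag-1>' + sLineBreak +
          '<p>Best regards</p>';

  // Without roSingleLine the dot does not cross line breaks,
  // so the multi-line section is not matched and the text stays unchanged.
  Writeln('Unchanged without roSingleLine: ',
    TRegEx.Replace(Html, '<tag-1>.*?</tag-1>', '') = Html);

  // With roSingleLine the whole section, including both tags, is removed.
  Writeln(RemoveSection(Html, 'tag-1'));
end.
```

In the report code, such a helper would only be called for a section whose table has no data.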

Alternatively, first replace every CR LF in the string with a placeholder, then apply the expression (here with < and > escaped), and finally restore the line breaks:

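  // Temporarily replace line breaks so that "." can match across the whole section
  S := StringReplace(MessageBody.Text, #13#10, '<br>', [rfReplaceAll]);
  S := TRegEx.Replace(S, '(\<tag-1\>.*?\<\/tag-1\>)', '');
  // Restore the line breaks
  MessageBody.Text := StringReplace(S, '<br>', #13#10, [rfReplaceAll]);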
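One caveat with this round trip is that any literal <br> already present in the template would come back as a line break. A minimal sketch of the same idea with a different placeholder follows; it assumes the control character #1 never occurs in the template and that line breaks are CR LF. The names LineMarker and RemoveTag1Section are hypothetical.

```delphi
program PlaceholderRoundTrip;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.RegularExpressions;

const
  // Assumption: this control character never occurs in the template.
  LineMarker = #1;

// Hypothetical helper: removes the <tag-1> section without roSingleLine
// by hiding the line breaks from the regex and restoring them afterwards.
function RemoveTag1Section(const Html: string): string;
var
  S: string;
begin
  S := StringReplace(Html, #13#10, LineMarker, [rfReplaceAll]);
  S := TRegEx.Replace(S, '<tag-1>.*?</tag-1>', '');
  Result := StringReplace(S, LineMarker, #13#10, [rfReplaceAll]);
end;

begin
  Writeln(RemoveTag1Section(
    '<p>Intro<br>line</p>'#13#10 +
    '<tag-1>'#13#10'<h3>TableTitleA</h3>'#13#10'</tag-1>'#13#10 +
    '<p>Best regards</p>'));
end.
```

The roSingleLine option avoids the round trip altogether, so this variant is mainly of interest when that option cannot be used.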
