How to use Regexp in C# to replace any occurrence of an <img> HTML tag with its ALT parameter within the given string

I have a string containing many HTML tags of exactly this construction:

Hello <img src="./images/foo.gif" alt="bar"> World!
<img src="./images/foo2.gif" alt="bar2">

The string containing these tags is autogenerated by an external tool, and every tag is guaranteed to have exactly this form.

Now I wish to replace every occurrence of such a tag within the given string with its "alt" parameter, so for the above sample the result should be:

Hello bar World!
bar2

I am using C# and the .NET Framework.

Using Expresso, I've devised the following expression:

<img src="[^"]*" alt="([^"]*)">

This obviously requires the generated tag to match your example almost exactly, so any change in whatever generates the tags will cause it to break. For that reason, I'd advise you to consider using something like the HtmlAgilityPack instead of regex to solve this problem.

Here's the code that Expresso generated, along with a number of usage examples.

//  using System.Text.RegularExpressions;

/// <summary>
///  Regular expression built for C# on: Mon, Nov 22, 2010, 03:51:18 PM
///  Using Expresso Version: 3.0.2766, http://www.ultrapico.com
///  
///  A description of the regular expression:
///  
///  <img src="
///      <img
///      Space
///      src="
///  Any character that is NOT in this class: ["], any number of repetitions
///  " alt="
///      "
///      Space
///      alt="
///  [1]: A numbered capture group. [[^"]*]
///      Any character that is NOT in this class: ["], any number of repetitions
///  ">
///      ">
///  
///
/// </summary>
public static Regex regex = new Regex(
      "<img src=\"[^\"]*\" alt=\"([^\"]*)\">",
      RegexOptions.Compiled
    );


// This is the replacement string
public static string regexReplace = "$1";


//// Replace the matched text in the InputText using the replacement pattern
// string result = regex.Replace(InputText,regexReplace);

//// Split the InputText wherever the regex matches
// string[] results = regex.Split(InputText);

//// Capture the first Match, if any, in the InputText
// Match m = regex.Match(InputText);

//// Capture all Matches in the InputText
// MatchCollection ms = regex.Matches(InputText);

//// Test to see if there is a match in the InputText
// bool IsMatch = regex.IsMatch(InputText);

//// Get the names of all the named and numbered capture groups
// string[] GroupNames = regex.GetGroupNames();

//// Get the numbers of all the named and numbered capture groups
// int[] GroupNumbers = regex.GetGroupNumbers();
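
For completeness, here is a minimal sketch of how the expression and the "$1" replacement could be wired into a small, self-contained program. The class name ImgAltReplacer and the method name ReplaceImgWithAlt are made up for illustration; only Regex.Replace and the pattern itself come from the code above. It assumes the tags always have exactly the src-then-alt form shown in the question.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class; the name is an assumption for this example.
public static class ImgAltReplacer
{
    // Same pattern as above: group 1 captures the text of the alt attribute.
    private static readonly Regex ImgTag = new Regex(
        "<img src=\"[^\"]*\" alt=\"([^\"]*)\">",
        RegexOptions.Compiled);

    // Replaces every matching <img> tag with the contents of its alt attribute.
    public static string ReplaceImgWithAlt(string input)
    {
        // "$1" substitutes the first capture group for the whole match.
        return ImgTag.Replace(input, "$1");
    }

    public static void Main()
    {
        // The sample input from the question.
        string input =
            "Hello <img src=\"./images/foo.gif\" alt=\"bar\"> World!\n" +
            "<img src=\"./images/foo2.gif\" alt=\"bar2\">";

        Console.WriteLine(ReplaceImgWithAlt(input));
    }
}
```

With the sample input from the question, this should print "Hello bar World!" followed by "bar2" on the next line. Tags that deviate from the expected form (different attribute order, single quotes, extra attributes) are left untouched, which is the fragility mentioned above.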
