
Matching a URL Encoded e-mail address in C#

I did some searching but couldn't figure out why my solution isn't working. Basically, I need to take a string (which is HTML code), parse it, and look for mailto links (which I then want to replace as part of an obfuscation). Here is what I have so far:

    string text = "<p>Some Person<br /> Person's Position<br />p. 123-456-7890<br /> e. <a  title=\"Email Some Person\" target=\"_blank\" href=\"mailto:someperson%40domain.com\">someperson@domain.com</a></p>";
    text = Server.UrlDecode(text);
    string safeEmails = Regex.Replace(text, "(<a href=\"mailto:)(.*?)(%40)(.*?)(\">)(.*?)(</a>)", "<a class=\"mailme\" href=\"$2*$4\">$6</a>");
    Response.Write( Server.HtmlDecode(safeEmails));

The text comes out of a WYSIWYG text editor (Telerik RadEditor, for those familiar), and for all intents and purposes I cannot control its output.

Basically, I need to find and replace any:

    <a href="mailto:someone%40domain.com">someone@domain.com</a>

With:

    <a class="mailme" href="someone@domain.com">someone@domain.com</a>

Some background: I am attempting to create a mailto link that will avoid detection by harvesters. The problem is that I receive a string with the e-mail as a standard mailto link. I cannot control the incoming string, so the mailto will always be an unprotected mailto. My objective is to find all of them, obfuscate them, and then use JavaScript to "fix" the links so that human visitors can easily use them. I am open to new approaches as well as modifications to the above code.

You could use a regex or the HTML Agility Pack to find and obfuscate all your mailto links. If you want good obfuscation, try reading "ten methods to obfuscate e-mail addresses compared".

EDIT: sorry, from the first version of your question I didn't realize you had a problem getting your regex to work. Since you're using a WYSIWYG text editor, I think the HTML that comes out of it should be pretty "regular", so you may be fine using a regex. You can try changing your Replace line like this:

    string safeEmails = Regex.Replace(text, "href=\"mailto:.*\">(.*)</a>", "class=\"mailme\" href=\"$1\">$1</a>");
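For comparison, here is a slightly more defensive sketch of the same regex idea. It rests on a few assumptions that are not in the question: the class name `MailtoObfuscator`, the method `Rewrite`, and the `separator` parameter are made up for illustration, and the editor is assumed to always emit double-quoted attributes with no `>` inside attribute values. The pattern allows other attributes (such as `title` and `target` in the sample string) before or after `href`, accepts either `%40` or `@` in the address, and uses lazy quantifiers so that several links in one string are handled separately. It also skips the `Server.UrlDecode` call, since decoding first turns `%40` into `@`, after which a pattern that looks for a literal `%40` can no longer match.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, named for illustration only.
public static class MailtoObfuscator
{
    // <a ...> with a mailto: href; other attributes may appear before or after href.
    // The address separator may be URL-encoded (%40) or a plain @.
    private static readonly Regex MailtoLink = new Regex(
        @"<a\b(?<before>[^>]*?)\s+href=""mailto:(?<user>[^""@%]+)(?:%40|@)(?<domain>[^""]+)""(?<after>[^>]*)>(?<label>.*?)</a>",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // Rewrites every mailto link: adds class="mailme", drops the "mailto:" prefix,
    // and joins user and domain with the given separator ("@" reproduces the
    // target markup in the question; "*" mirrors the question's "$2*$4" attempt).
    public static string Rewrite(string html, string separator = "@")
    {
        return MailtoLink.Replace(html, m =>
            "<a class=\"mailme\""
            + m.Groups["before"].Value          // attributes that preceded href
            + m.Groups["after"].Value           // attributes that followed href
            + " href=\"" + m.Groups["user"].Value + separator + m.Groups["domain"].Value + "\">"
            + m.Groups["label"].Value           // link text is left untouched
            + "</a>");
    }
}
```

Calling `MailtoObfuscator.Rewrite(text)` on the raw editor output would replace the `Server.UrlDecode` and `Regex.Replace` lines in the question. Note that the visible link text is left as it is, so the address still appears in plain text inside the anchor unless it is obfuscated separately.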
