
How to extract a specific text from a string when the desired text changes periodically

I have an HTML document which contains this text somewhere in it:

function deleteFolder() {
        var mailbox = "CN=John Urban,OU=Sect-1,DC=TestServer ,DC=acme,DC=com";
        var path = "/Inbox/";

//string of interest: "CN=John Urban,OU=Sect-1,DC=TestServer ,DC=acme,DC=com"

I just want to extract this text and store it in a variable in C#. My problem is that the string of interest changes slightly each time the page is loaded, for example:

  • "CN=John Urban,OU=Sect-1,DC=TestServer ,DC=acme,DC=com"
  • "CN=Jane Doe,OU=Sect-1,DC=TestServer ,DC=acme,DC=com"
  • etc.

How do I extract that ever-changing string without regular expressions?

Is it always a function deleteFolder() whose first line is var mailbox = "somestring"? And are you interested in somestring?

Based on the requirements you gave us, you could just search the string containing the HTML for var mailbox = " and then for the next ", and take all the text between these two occurrences.

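const string prefix = "var mailbox = \"";               // text immediately before the value
var htmlstring = "...";                                 // the HTML document as a string
var i1 = htmlstring.IndexOf(prefix);                    // -1 if the prefix is missing
var start = i1 + prefix.Length;                         // first character of the value
var i2 = i1 >= 0 ? htmlstring.IndexOf('"', start) : -1; // closing quote of the value
var result = i2 >= 0 ? htmlstring.Substring(start, i2 - start) : "not found";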

VERY, VERY ugly and not maintainable, but without more information I can't do any better. A Regex would be much nicer, however!
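
For comparison, here is a minimal sketch of what the Regex variant might look like if the "no regular expression" constraint were relaxed. It assumes the page always contains an assignment of the form var mailbox = "..." and that the value itself never contains a double quote. The names MailboxExtractor and ExtractMailbox are made up for this illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class MailboxExtractor
{
    // Matches: var mailbox = "<any run of characters other than a double quote>"
    // Whitespace around the '=' is allowed to vary (an assumption about the page).
    private static readonly Regex MailboxPattern =
        new Regex("var\\s+mailbox\\s*=\\s*\"([^\"]*)\"");

    // Returns the quoted value, or null when no such assignment is found.
    public static string ExtractMailbox(string html)
    {
        Match match = MailboxPattern.Match(html);
        return match.Success ? match.Groups[1].Value : null;
    }
}
```

Calling MailboxExtractor.ExtractMailbox(htmlstring) on the snippet from the question should yield the whole CN=...,DC=com value. Unlike the IndexOf version, it keeps working if the spacing around the equals sign changes.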
