Extracting only tags from an HTML text file
I'm working on a steganography method which hides text within HTML tags, for example this tag: <heEAd>
I have to extract every character within the tag and then analyze the case of each letter: if it is a capital, the bit is set to 1, otherwise 0. I also want to check the end to see whether it has the matching closing /head tag.
// requires: using System.Net;
WebClient client = new WebClient();
String htmlCode = client.DownloadString("url");
String Tags = "";
for (int i = 0; i < htmlCode.Length; i++) {
    if (htmlCode[i] == '<') {
        // the same character cannot also be '>', so this branch is never taken
        if (htmlCode[i] == '>')
            continue;
        else {
            Tags += htmlCode[i];
        }
    }
}
That logic is terrible, but how do I use IndexOf and LastIndexOf to get the desired substring? I tried to use them, but I'm just missing something due to my lack of knowledge about C#.
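For reference, here is a minimal sketch of the IndexOf idea the question asks about. It is only an illustration: the class name TagScanner and the method ExtractTags are made up for this example, and it assumes that every '<' starts a tag and the next '>' ends it, which does not hold for comments, scripts, or attribute values containing '>'.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the original post.
public static class TagScanner
{
    // Returns the text between each '<' and the next '>',
    // e.g. "heEAd" for <heEAd> and "/heEAd" for </heEAd>.
    public static List<string> ExtractTags(string html)
    {
        var tags = new List<string>();
        int pos = 0;
        while (pos < html.Length)
        {
            int open = html.IndexOf('<', pos);
            if (open < 0) break;                     // no more tags
            int close = html.IndexOf('>', open + 1);
            if (close < 0) break;                    // unterminated tag, stop scanning
            tags.Add(html.Substring(open + 1, close - open - 1));
            pos = close + 1;                         // continue after this tag
        }
        return tags;
    }
}
```

Calling TagScanner.ExtractTags(htmlCode) on the downloaded string would give one entry per tag, including closing tags (which start with '/'). LastIndexOf is not needed here, because each search simply starts where the previous tag ended.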
I think you need to use a regex.
I tried to do this once with Substring and it was a lot of work. Later I decided to use a regex and it was easier than the first approach.
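// requires: using System.Text.RegularExpressions;
// IgnoreCase so mixed-case tags such as <heEAd> match; Singleline so '.' also matches newlines.
var regex = new Regex(@"(?<=<head>).*?(?=</head>)", RegexOptions.IgnoreCase | RegexOptions.Singleline);
return regex.Matches(strInput);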
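The lookaround pattern above returns what sits between the head tags rather than the tag names themselves. To get the bits the question describes, something along these lines might work. This is a rough sketch under a few assumptions that the question does not confirm: only opening tags carry bits, non-letter characters in a tag name are skipped, and the closing tag is compared case-insensitively. The names TagCaseDecoder, NameToBits, Decode and HasMatchingClose are made up for the example.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answer.
public static class TagCaseDecoder
{
    // Captures the optional leading '/' and the tag name of each tag.
    private static readonly Regex TagPattern =
        new Regex(@"<(?<slash>/?)(?<name>[A-Za-z][A-Za-z0-9]*)[^>]*>");

    // One bit per letter: uppercase -> '1', lowercase -> '0'. Digits are skipped.
    public static string NameToBits(string name)
    {
        var bits = new StringBuilder();
        foreach (char c in name)
        {
            if (char.IsLetter(c)) bits.Append(char.IsUpper(c) ? '1' : '0');
        }
        return bits.ToString();
    }

    // Assumption: only opening tags (no leading '/') contribute bits.
    public static string Decode(string html)
    {
        var bits = new StringBuilder();
        foreach (Match m in TagPattern.Matches(html))
        {
            if (m.Groups["slash"].Value == "")
                bits.Append(NameToBits(m.Groups["name"].Value));
        }
        return bits.ToString();
    }

    // True if a closing tag with the same name (ignoring case) appears in html.
    public static bool HasMatchingClose(string html, string name)
    {
        return html.IndexOf("</" + name + ">", StringComparison.OrdinalIgnoreCase) >= 0;
    }
}
```

For the example tag, TagCaseDecoder.NameToBits("heEAd") gives "00110". Like any regex over HTML, this can be fooled by tags inside comments or scripts, so a real HTML parser may be the safer choice for arbitrary pages.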