
Remove words from string C#

I am working on an ASP.NET 4.0 web application. Its main goal is to go to the URL in the MyURL variable, read the page from top to bottom, search for all lines that start with "description", and keep only those lines while removing all HTML tags. What I want to do next is remove the "description" text from the results afterwards, so that only my device names are left. How would I do this?

    protected void parseButton_Click(object sender, EventArgs e)
    {
        MyURL = deviceCombo.Text;
        WebRequest objRequest = HttpWebRequest.Create(MyURL);
        objRequest.Credentials = CredentialCache.DefaultCredentials;
        using (StreamReader objReader = new StreamReader(objRequest.GetResponse().GetResponseStream()))
        {
            originalText.Text = objReader.ReadToEnd();
        }

        //Read all lines of file
        String[] crString = { "<BR>&nbsp;" };
        String[] aLines = originalText.Text.Split(crString, StringSplitOptions.RemoveEmptyEntries);

        String noHtml = String.Empty;

        for (int x = 0; x < aLines.Length; x++)
        {
            if (aLines[x].Contains(filterCombo.SelectedValue))
            {
                noHtml += (RemoveHTML(aLines[x]) + "\r\n");

            }
        }
        //Print results to textbox
        resultsBox.Text = String.Join(Environment.NewLine, noHtml);
    }
    public static string RemoveHTML(string text)
    {
        text = text.Replace("&nbsp;", " ").Replace("<br>", "\n");
        var oRegEx = new System.Text.RegularExpressions.Regex("<[^>]+>");
        return oRegEx.Replace(text, string.Empty);
    }

OK, so I figured out how to remove the words through one of my existing functions:

public static string RemoveHTML(string text)
{
    text = text.Replace("&nbsp;", " ")
        .Replace("<br>", "\n")
        .Replace("description", "")
        .Replace("INFRA:CORE:", "")
        .Replace("RESERVED", "")
        .Replace(":", "")
        .Replace(";", "")
        .Replace("-0/3/0", "");
    var oRegEx = new System.Text.RegularExpressions.Regex("<[^>]+>");
    return oRegEx.Replace(text, string.Empty);
}

public static void Main(String[] args)
{
    string str = "He is driving a red car.";

    Console.WriteLine(str.Replace("red", "").Replace("  ", " "));
}   

Output: He is driving a car.

Note: The first argument of the second Replace call is a double space, which gets collapsed into a single space.

Link: https://i.stack.imgur.com/rbluf.png
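A plain Replace("red", "") also strips "red" from inside longer words such as "hundred". If that matters, one possible alternative is a regular expression with word boundaries. The sketch below is only an illustration, not part of the answer above; the class name WordRemover and the method RemoveWord are hypothetical names chosen for this example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WordRemover
{
    // Hypothetical helper: removes every whole-word occurrence of `word`
    // and collapses the leftover runs of spaces into a single space.
    public static string RemoveWord(string input, string word)
    {
        // \b anchors the match to word boundaries, so "red" does not match inside "hundred".
        string pattern = @"\b" + Regex.Escape(word) + @"\b";
        string withoutWord = Regex.Replace(input, pattern, string.Empty);

        // Collapse any run of two or more spaces left behind by the removal.
        return Regex.Replace(withoutWord, " {2,}", " ").Trim();
    }

    public static void Main()
    {
        Console.WriteLine(RemoveWord("He is driving a red car.", "red")); // He is driving a car.
        Console.WriteLine(RemoveWord("A hundred red cars.", "red"));      // A hundred cars.
    }
}
```

This assumes the word to remove starts and ends with a letter, digit, or underscore; \b does not behave as a boundary next to punctuation-only tokens.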

Try this. It will remove all occurrences of the word you want to remove.

Adapted from Code Project

string value = "ABC - UPDATED";
int index = value.IndexOf(" - UPDATED");
if (index != -1)
{
    value = value.Remove(index);
}

This leaves value as "ABC", without the " - UPDATED" part.
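Note that value.Remove(index) with a single argument drops everything from index to the end of the string. If text can follow the marker and should be kept, the two-argument overload Remove(startIndex, count) may be a better fit. Here is a minimal sketch; SubstringRemover and RemoveFirst are hypothetical names used only for this illustration, and the input string is made up.

```csharp
using System;

public static class SubstringRemover
{
    // Hypothetical helper: removes the first occurrence of `marker` from `value`,
    // keeping the text on both sides of it.
    public static string RemoveFirst(string value, string marker)
    {
        int index = value.IndexOf(marker, StringComparison.Ordinal);
        if (index == -1)
        {
            return value; // marker not present: nothing to remove
        }

        // Remove(startIndex, count) deletes only `count` characters, whereas
        // Remove(startIndex) deletes everything up to the end of the string.
        return value.Remove(index, marker.Length);
    }

    public static void Main()
    {
        Console.WriteLine(RemoveFirst("ABC - UPDATED (v2)", " - UPDATED")); // ABC (v2)
    }
}
```

Only the first occurrence is removed here; to strip every occurrence, string.Replace(marker, "") is the simpler choice.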

Try something like this, using LINQ:

List<string> lines = new List<string>{
"Hello world",
"Description: foo",
"Garbage:baz",
"description purple"};

 //now add all your lines from your html doc.
 if (aLines[x].Contains(filterCombo.SelectedValue))
 {
       lines.Add(RemoveHTML(aLines[x]) + "\r\n");
 }

var myDescriptions = lines.Where(x => x.ToLower().StartsWith("description"))
                          .Select(x => x.ToLower().Replace("description", string.Empty)
                                        .Trim());

// you now have ": foo" and "purple", plus anything else that matched.

You may have to adjust for colons, etc.
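One way the colon adjustment could look is sketched below. It is an assumption about what "adjusting" means here, not the answerer's code: it matches the prefix case-insensitively, keeps the original casing of the remaining text (which matters for device names), and strips leading colons and spaces. DescriptionFilter and ExtractDescriptions are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DescriptionFilter
{
    private const string Prefix = "description";

    // Hypothetical helper: keeps lines that start with "description" (any casing)
    // and returns the text after the prefix, without leading colons or spaces.
    public static List<string> ExtractDescriptions(IEnumerable<string> lines)
    {
        return lines
            .Select(line => line.Trim())
            .Where(line => line.StartsWith(Prefix, StringComparison.OrdinalIgnoreCase))
            .Select(line => line.Substring(Prefix.Length).TrimStart(':', ' ').TrimEnd())
            .ToList();
    }

    public static void Main()
    {
        var lines = new List<string>
        {
            "Hello world",
            "Description: foo",
            "Garbage:baz",
            "description purple"
        };

        // Prints "foo" and then "purple".
        foreach (string name in ExtractDescriptions(lines))
        {
            Console.WriteLine(name);
        }
    }
}
```

Using Substring on the original line instead of ToLower().Replace(...) avoids lowercasing the result and avoids removing "description" when it appears again later in the same line.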

void Main()
{
    string test = "<html>wowzers description: none <div>description:a1fj391</div></html>";
    IEnumerable<string> results = getDescriptions(test);
    foreach (string result in results)
    {
        Console.WriteLine(result);  
    }

    //result: none
    //        a1fj391
}

static Regex MyRegex = new Regex(
      "description:\\s*(?<value>[\\d\\w]+)",
    RegexOptions.Compiled);

IEnumerable<string> getDescriptions(string html)
{
    foreach(Match match in MyRegex.Matches(html))
    {
        yield return match.Groups["value"].Value;
    }
}

