
C# extract a word from a string

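Let me start by saying: I'm still not great at programming, but it's a lot of fun! I'm working on a Siri-like program and am trying to implement a Wikipedia function. To do that, I ask a question such as: tell me about Superman

I need to extract Superman, or any other arbitrary word, from the string. That isn't hard, but the real problem starts when someone asks: Can you tell me about Superman? I still want to extract the word Superman.

Here is an example of what I tried before: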

if ((c.Contains("tell me about")) || (c.Contains("Tell me about")))
{
    string query = c;
    // Problem: this splits on every 't', so subjects that contain the
    // letter t (like "artificial intelligence") get cut off.
    var part = query.Split('t').Last();

    string url = ("http://lookup.dbpedia.org/api/search.asmx/KeywordSearch?QueryString=" + part + "&MaxHits=1");

    XmlReader reader = XmlReader.Create(url);
    while (reader.Read())
        switch (reader.Name.ToString())
        {
            case "Description":
                sp(reader.ReadString());
                break;

        }
}
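
To make the limitation noted in the comment concrete, here is a small sketch that isolates just the Split('t').Last() step. The class and method names (SplitDemo, LastPartAfterT) are made up for illustration and are not part of the original program.

```csharp
using System;
using System.Linq;

public static class SplitDemo
{
    // Mirrors the first attempt: keep whatever follows the last 't'.
    public static string LastPartAfterT(string query)
    {
        return query.Split('t').Last();
    }

    public static void Main()
    {
        // Brackets make leading whitespace visible.
        Console.WriteLine("[" + LastPartAfterT("tell me about superman") + "]");
        // prints "[ superman]"
        Console.WriteLine("[" + LastPartAfterT("tell me about artificial intelligence") + "]");
        // prints "[elligence]"
    }
}
```

The first query only works because "superman" happens to contain no 't', and even then the result keeps a leading space.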

I was almost able to solve the problem; this solution seems to work about 80% of the time. Still, it's a step in the right direction.

if ((c.Contains("tell me about")) || (c.Contains("Tell me about")))
{
    string query = c;
    // Split on "about "; note that every resulting part is looked up,
    // including the text that comes before "about ".
    string[] lines = Regex.Split(query, "about ");
    foreach (string line in lines)
    {
        string url = ("http://lookup.dbpedia.org/api/search.asmx/KeywordSearch?QueryString=" + line + "&MaxHits=1");

        XmlReader reader = XmlReader.Create(url);
        while (reader.Read())
            switch (reader.Name.ToString())
            {
                case "Description":
                    sp(reader.ReadString());
                    break;

            }
    }
}

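Is there a better/easier way?

As suggested in the comments, if this is for any kind of production application, your best option is to use an existing library.

Still, it can be a fun exercise to do yourself.

I'd say there are many more ways to ask about Superman: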

"what do you know about Superman"
"let's talk about Superman"
"who is Superman"

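And many more.

All of these questions are built from some auxiliary words ("what", "who", "a", "about") plus the actual word describing the subject of the question: "Superman". A simplified approach is to eliminate all the auxiliaries and take whatever is left.

To quickly build a simple list of question words and question phrases, I used an English grammar website. I took the sample phrases and removed the subjects of the questions. That gave me a list of 50-60 auxiliary words.

Now all I have to do is take a sentence and remove every word that appears in the auxiliary list. The code follows: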

using System;
using System.Collections.Generic;
using System.Linq;

class Program
{
    // All the words collected from the sample question phrases.
    private static string auxStr = @"Who is the Who are Who is that there Where is the Where do you Where are my 
        When do the When is his When are we Why do we Why are they always Why does he What is What is her What is the Which 
        drink did you Which Which is How do you How does he know the answer How can I learn many much often far tell say 
        explain answer for from with about on me he his him her hers your yours they their theirs";

    // Separators used for splitting; line breaks are included because
    // auxStr is a verbatim string that spans several lines.
    private static readonly char[] separators = { ' ', '\t', '\r', '\n' };

    private static List<string> aux = new List<string>();

    static void Main(string[] args)
    {
        // Build a list of distinct, lower-cased auxiliary words.
        aux = auxStr.ToLower()
            .Split(separators, StringSplitOptions.RemoveEmptyEntries)
            .Distinct()
            .ToList();

        // Test the method to get a subject.
        var subject = GetSubject("Do you know where is Poland", aux);

        foreach (var s in subject)
        {
            Console.WriteLine(s);
        }

        Console.ReadLine();
    }

    private static List<string> GetSubject(string question, List<string> auxiliaries)
    {
        // Convert the question to a list of lower-cased words.
        var listQuestion = question.ToLower()
            .Split(separators, StringSplitOptions.RemoveEmptyEntries)
            .Distinct()
            .ToList();

        // Remove from the question all the words
        // that are in the list of auxiliary words.
        var notAux = listQuestion.Where(w => !auxiliaries.Contains(w)).ToList();

        return notAux;
    }
}

It is simple, yet with little effort it narrows a question down to its likely subject.

I finally found the answer:
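
As a rough illustration of the same filtering idea applied to the sample questions listed earlier, the sketch below also trims trailing punctuation and keeps the subject's original casing. Everything in it is an assumption made for the example: the names SubjectExtractor and ExtractSubject are hypothetical, and the auxiliary list is a small hand-picked subset rather than the full 50-60 word list.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SubjectExtractor
{
    // Illustrative subset only (an assumption); a real list would be longer.
    // The comparer makes lookups case-insensitive, so the question
    // does not have to be lower-cased first.
    private static readonly HashSet<string> Auxiliaries =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase)
        {
            "what", "who", "is", "do", "you", "know", "can",
            "tell", "me", "about", "let's", "talk", "the", "a"
        };

    public static string ExtractSubject(string question)
    {
        var words = question
            .Split(new[] { ' ', '\t', '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries)
            // Strip punctuation stuck to a word, e.g. "Superman?".
            .Select(w => w.Trim('?', '!', '.', ',', '"'))
            .Where(w => w.Length > 0 && !Auxiliaries.Contains(w));

        // Join the remaining words so multi-word subjects stay together.
        return string.Join(" ", words);
    }

    public static void Main()
    {
        Console.WriteLine(ExtractSubject("what do you know about Superman"));       // Superman
        Console.WriteLine(ExtractSubject("let's talk about Superman"));             // Superman
        Console.WriteLine(ExtractSubject("Can you tell me about Superman?"));       // Superman
        Console.WriteLine(ExtractSubject("tell me about artificial intelligence")); // artificial intelligence
    }
}
```

One known weakness of this approach: a subject that itself consists of auxiliary words (the band "The Who", for instance) is filtered away entirely.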

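if ((c.Contains("tell me about")) || (c.Contains("Tell me about")))
{
    string query = c;
    // Keep only the text after the last "about ".
    string[] lines = Regex.Split(query, "about ");
    string finalquery = lines[lines.Length - 1];

    // Escape the query so spaces and special characters survive in the URL.
    string url = ("http://lookup.dbpedia.org/api/search.asmx/KeywordSearch?QueryString=" + Uri.EscapeDataString(finalquery) + "&MaxHits=1");

    // Dispose the reader when done.
    using (XmlReader reader = XmlReader.Create(url))
    {
        while (reader.Read())
        {
            switch (reader.Name)
            {
                case "Description":
                    sp(reader.ReadString()); // sp(...) is defined elsewhere in my program
                    break;
            }
        }
    }
}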

Now it works 100% of the time! If anyone knows a better way, I'd be happy to hear it.
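
One possible refinement, offered only as a sketch: capture the text after "about" with a case-insensitive regular expression, so that "Tell me about", "tell me About" and a trailing question mark are all handled in one place. The names QueryParser, ExtractTopic and BuildLookupUrl are hypothetical. The URL is the one used in the question, and the sketch only builds the string; it makes no request and assumes nothing about whether that endpoint is still available.

```csharp
using System;
using System.Text.RegularExpressions;

public static class QueryParser
{
    // Captures everything after the first whole word "about",
    // ignoring case and any trailing '?', '.', '!' or whitespace.
    private static readonly Regex AboutPattern =
        new Regex(@"\babout\s+(.+?)[?.!]*\s*$", RegexOptions.IgnoreCase);

    // Returns the topic, or null when the input has no "about ..." part.
    public static string ExtractTopic(string input)
    {
        Match m = AboutPattern.Match(input);
        return m.Success ? m.Groups[1].Value : null;
    }

    // Builds the lookup URL from the question, escaping the topic
    // so that spaces and special characters are encoded.
    public static string BuildLookupUrl(string topic)
    {
        return "http://lookup.dbpedia.org/api/search.asmx/KeywordSearch?QueryString="
            + Uri.EscapeDataString(topic) + "&MaxHits=1";
    }

    public static void Main()
    {
        string topic = ExtractTopic("Can you tell me about artificial intelligence?");
        Console.WriteLine(topic);                 // artificial intelligence
        Console.WriteLine(BuildLookupUrl(topic)); // ...QueryString=artificial%20intelligence&MaxHits=1
    }
}
```

Unlike taking the part after the last "about ", this keeps everything after the first "about", so "tell me about a book about cats" yields "a book about cats". Which behavior is preferable depends on the kind of questions you expect.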

