
Building a C++ translator with a given dictionary?

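I am trying to build a simple translator that translates a sentence according to a given dictionary. Suppose we have two arrays of strings: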

string ENG[] = {"black","coffee", "want","yesterday"};
string SPA[] = {"negro", "café", "quiero", "ayer"};

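If the user enters "I want a black coffee", the result should be "I? quiero a? negro café". That is, any word that has no translation in the dictionary should be printed with a question mark next to it.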

#include <iostream>
#include <string>
using namespace std;

int main(int argc, char *argv[]) {

  string input;
  string ENG[] = {"black", "coffee", "want", "yesterday"};
  string SPA[] = {"negro", "café", "quiero", "ayer"};

  cout << "Enter a word";
  cin >> input;

  // Only 4 entries exist; looping up to 10 would read past the end of the arrays.
  for (int i = 0; i < 4; ++i) {
    if (ENG[i] == input) {
      cout << "You entered " << SPA[i] << endl;
    }
  }
  return 0;
}

What I wrote only translates single words. How can I write this code so that it works on whole sentences?

Here you go.

#include <iostream>
#include <string>
#include <vector>

using namespace std;

vector <string> split_sentence(const string& arg)
{

    vector <string> ret;

    auto it = arg.begin();
    while (it != arg.end()) {

        string tmp;

        while (it != arg.end() && *it == ' ') ++it;
        while (it != arg.end() && *it != ' ')
            tmp += *it++;

        if (tmp.size())
            ret.push_back(tmp);
    }

    return ret;
}

int main(int argc, char *argv[])
{
    string input = "I want a black     coffee .";

    string ENG[4] = {"black","coffee", "want","yesterday"};
    string SPA[4] = {"negro", "café", "quiero", "ayer"};

    cout << "Enter sentence\n";
    /*
        cin >> input;
    */

    for (auto& str: split_sentence(input)) {

        bool found = false;

        for (int j=0; j<4 && !found; ++j) {

            if (ENG[j] == str) {
                cout << SPA[j] << " ";
                found = true;
            }
        }

        if (!found)
            cout << str << "? ";
    }

    cout << endl;
}

Output:

Enter sentence
I? quiero a? negro café .?

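Split the sentence on spaces, then look up each word in the dictionary. If your dictionary is big enough, you will need a data structure such as a tree, or sorting or hashing, to speed up the lookup.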
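As one possible illustration of the hashing option, the sketch below swaps the two arrays for a std::unordered_map. The helper name translate_word is hypothetical and not part of the original answer; it assumes the "word plus question mark" fallback described in the question.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical helper (not from the original answer): looks up one word in a
// hash-based dictionary and falls back to "<word>?" when it is missing.
std::string translate_word(
    const std::unordered_map<std::string, std::string>& dict,
    const std::string& word)
{
    const auto it = dict.find(word);  // average-case constant-time lookup
    return it != dict.end() ? it->second : word + "?";
}
```

Calling translate_word(dict, str) inside the loop over split_sentence(input) would replace the inner loop over ENG and SPA.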

Edit:

A trie will be faster for this. For each query you can get the matching word in O(m), where m is the length of the query (the English word).
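A minimal sketch of such a trie might look like the following. The names TrieNode, Trie, insert and find are illustrative assumptions, and the code assumes C++14 for std::make_unique. Children are stored in a std::map for brevity, so each step costs a small lookup over the alphabet rather than a strict constant.

```cpp
#include <map>
#include <memory>
#include <string>

// Hypothetical minimal trie node (names are illustrative).
struct TrieNode {
    std::map<char, std::unique_ptr<TrieNode>> children;
    bool is_word = false;     // true if a dictionary word ends at this node
    std::string translation;  // only meaningful when is_word is true
};

class Trie {
public:
    // Walks (and creates) one node per character, then stores the translation.
    void insert(const std::string& word, const std::string& translation) {
        TrieNode* node = &root_;
        for (char c : word) {
            auto& child = node->children[c];
            if (!child) child = std::make_unique<TrieNode>();
            node = child.get();
        }
        node->is_word = true;
        node->translation = translation;
    }

    // Returns a pointer to the translation, or nullptr if the word is unknown.
    const std::string* find(const std::string& word) const {
        const TrieNode* node = &root_;
        for (char c : word) {
            const auto it = node->children.find(c);
            if (it == node->children.end()) return nullptr;
            node = it->second.get();
        }
        return node->is_word ? &node->translation : nullptr;
    }

private:
    TrieNode root_;
};
```

A lookup visits one node per character of the query, which is where the O(m) bound comes from; a prefix such as "blac" is rejected because is_word is false on its last node.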

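As suggested in the comments, two separate arrays are really cumbersome to use and hard to update. Imagine inserting a new pair of values in the middle and messing up the offsets...

So the better solution here is to use std::map, especially considering that this is supposed to be a simple 1:1 mapping.

This way you can define a std::map with std::string as the key (the original word) and std::string as the value (the translation).

With modern C++, the initialization could look like this: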

std::map<std::string, std::string> translations {
    {"black", "negro"},
    {"coffee", "café"},
    // ...
};
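To illustrate why the map is easier to update than two parallel arrays, here is a small sketch that adds entries after construction. The function name make_translations is a hypothetical wrapper used only to keep the example self-contained.

```cpp
#include <map>
#include <string>

// Hypothetical helper: builds the dictionary and then extends it.
std::map<std::string, std::string> make_translations() {
    std::map<std::string, std::string> translations {
        {"black", "negro"},
        {"coffee", "café"},
    };
    translations["want"] = "quiero";            // adds or overwrites an entry
    translations.emplace("yesterday", "ayer");  // adds only if the key is new
    return translations;
}
```

Each key stays attached to its value, so there are no offsets to keep in sync, and the map keeps its keys sorted regardless of insertion order.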

Now, to read your input string word by word, the quickest built-in approach is to use std::istringstream:

std::istringstream stream(myInputText);
std::string word;

while (stream >> word) {
    // do something with each word
}
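When the sentence comes from the user, note that `std::cin >> input` stops at the first space, so a whole line is usually read with std::getline first. The sketch below combines the two; split_words and read_sentence are hypothetical helper names, not part of the original answer.

```cpp
#include <iostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: splits one line of text on whitespace.
std::vector<std::string> split_words(const std::string& line) {
    std::istringstream stream(line);
    std::vector<std::string> words;
    std::string word;
    while (stream >> word)  // operator>> skips any run of whitespace
        words.push_back(word);
    return words;
}

// Hypothetical helper: reads a whole line (not just one word) from `in`.
std::vector<std::string> read_sentence(std::istream& in) {
    std::string line;
    std::getline(in, line);
    return split_words(line);
}
```

For interactive use, read_sentence(std::cin) would return the words of the line the user typed.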

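Looking up the actual translation becomes trivial as well. The search through the translations happens behind the scenes (inside the std::map class):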

const auto res = translations.find(word); // iterator to the entry, or end() if missing

if (res == translations.end()) // nothing found: keep the word and mark it
    std::cout << word << "? ";
else
    std::cout << res->second << " "; // `res->second` is the value, `res->first` would be the key, i.e. `word`

As a small complete example:

#include <iostream>
#include <string>
#include <sstream>
#include <map>

int main(int argc, char **argv) {
    std::map<std::string, std::string> translations {
        {"black", "negro"},
        {"coffee", "café"}
    };

    std::string source("I'd like some black coffee");
    std::istringstream stream(source);
    std::string word;

    while (stream >> word) {
        const auto t = translations.find(word);

        if (t != translations.end()) // found
            std::cout << word << ": " << t->second << "\n";
        else
            std::cout << word << ": ???\n";
    }

    return 0;
}

This particular example produces the following output:

I'd: ???
like: ???
some: ???
black: negro
coffee: café
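To get the single-line format asked for in the question, the same pieces can be wrapped into one function. This is only a sketch under the assumption that untranslated words are kept and followed by a question mark; translate_sentence is a hypothetical name.

```cpp
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper: translates a sentence word by word; unknown words are
// kept and marked with a trailing '?'.
std::string translate_sentence(
    const std::map<std::string, std::string>& translations,
    const std::string& sentence)
{
    std::istringstream stream(sentence);
    std::string word, result;
    while (stream >> word) {
        if (!result.empty()) result += ' ';  // single space between words
        const auto it = translations.find(word);
        result += (it != translations.end()) ? it->second : word + "?";
    }
    return result;
}
```

With entries for "black", "coffee" and "want", the input "I want a black coffee" yields "I? quiero a? negro café". Punctuation attached to a word (for example "coffee.") is not handled and would be treated as an unknown word.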

