Comparing iterated characters in a std::string with a Unicode character in C++

Question

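I have been struggling with this problem for quite a while, and this is essentially my first time dealing with Unicode or UTF-8.

Here is what I want to do: I just want to iterate over a std::string that contains a mix of ordinary letters and Unicode symbols, in my case the en dash “–”. More info: http://www.fileformat.info/info/unicode/char/2013/index.htm

This is the code I tried, which does not compile: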

#include <iostream>
#include <string>

int main()
{
    std::string str = "test string with symbol – and !";
    for (auto &letter : str) {
        if (letter == "–") {
            std::cout << "found!" << std::endl;
        }
    }
    return 0;
}

This is the output from my compiler:

main.cpp: In function 'int main()':
main.cpp:18:23: error: ISO C++ forbids comparison between pointer and 
integer [-fpermissive]
     if (letter == "–") {
                   ^

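Also, while browsing the internet I found a related question that looked like what I need for this kind of task: "How to search a non-ASCII character in a C++ string?"

But when I tried to modify my code to use those UTF-8 hex codes, it did not compile either: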

    if (letter == "\xE2\x80\x93") {
        std::cout << "found!" << std::endl;
    }

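The compiler gives exactly the same message: C++ forbids comparison between pointer and integer.

What am I missing? Or do I need a library such as ICU or Boost? Any help is much appreciated. Thank you!

Update

Based on UnholySheep's answer I have been improving my code, but it still does not work. It now compiles, but when I run it, it does not print "found!". How can I solve this? Thanks.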

Answer 1

How about this code?

#include <iostream>
#include <string>

int main()
{
    std::wstring str = L"test string with symbol – and !";
    for (auto &letter : str) {
        if (letter == L'–') {
            std::cout << "found!" << std::endl;
        }
    }
    return 0;
}

Answer 2

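As UnholySheep said in the comments, the literal "–" is not a single char but an array of char. Assuming a UTF-8 representation, char em_dash[] = "–"; is the same as char em_dash[] = {'\xe2', '\x80', '\x93', '\0'};.

With your current code you can only find characters that fit in a single char. For example, this works fine: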

...
if (letter == '!')
...

because '!' is a char constant.
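To make the difference concrete, here is a minimal sketch, assuming the string is UTF-8 encoded. The function names en_dash_byte_count and is_first_dash_byte are made up for illustration, and the dash is written with explicit byte escapes so the result does not depend on how the source file is saved.

```cpp
#include <cstddef>
#include <string>

// Returns the number of chars in the UTF-8 encoding of U+2013 (en dash).
// (Hypothetical helper, for illustration only.)
std::size_t en_dash_byte_count() {
    const char dash[] = "\xE2\x80\x93";   // 3 bytes plus the terminating '\0'
    static_assert(sizeof(dash) == 4, "three UTF-8 bytes plus the terminator");
    return std::string(dash).size();      // the terminator is not counted
}

// True when a single char equals the first byte of the dash. A per-char loop
// over a std::string can only make this kind of one-byte comparison, so it
// can never match the whole three-byte sequence at once.
bool is_first_dash_byte(char c) {
    return c == '\xE2';
}
```

A range-based for loop over the std::string visits these three bytes one at a time, which is why no single iteration ever sees the complete symbol.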

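If you only want to handle Unicode characters in the Basic Multilingual Plane (code points up to 0xFFFF), then using wide characters as in @ArashMohammadi's answer is enough. Keep in mind that the width of wchar_t is platform dependent: it is 16 bits on Windows and typically 32 bits on Linux. An alternative for characters outside the BMP (such as emoji) is to use std::u32string, in which each Unicode code point is represented by a single char32_t.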
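The std::u32string approach could look roughly like the sketch below. The helper name count_code_point is an assumption made for this example, and the text has to be available as UTF-32 already (here via a U"" literal); converting an existing UTF-8 std::string is a separate step that is not shown.

```cpp
#include <cstddef>
#include <string>

// Counts occurrences of one code point in a UTF-32 string. Each element of a
// std::u32string is a whole code point, so a plain == comparison is enough.
// (Hypothetical helper, for illustration only.)
std::size_t count_code_point(const std::u32string& text, char32_t wanted) {
    std::size_t count = 0;
    for (char32_t c : text) {
        if (c == wanted) {
            ++count;
        }
    }
    return count;
}
```

For instance, count_code_point(U"test string with symbol \u2013 and \U0001F600!", U'\u2013') finds the dash once, and the same call with U'\U0001F600' finds the emoji, which lies outside the BMP.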

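If you want to process the UTF-8 encoded std::string directly, you have to compare multi-byte substrings, for example with the compare method: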

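std::string em_dash = "–"; // or "\xe2\x80\x93"
...
    // stop once fewer than em_dash.size() bytes remain (also safe when str is shorter than em_dash)
    for (size_t pos = 0; pos + em_dash.size() <= str.size(); pos++) {
        // compare em_dash.size() bytes of str, starting at pos, with em_dash
        if (str.compare(pos, em_dash.size(), em_dash) == 0) {
            std::cout << "found!" << std::endl;
        }
    }
...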

Or use the find method directly:

...
    if (str.find(em_dash) != str.npos) {
        std::cout << "found!" << std::endl;
    }
...
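If every occurrence is needed rather than just a yes/no answer, find can be called repeatedly with a start offset. The following is a minimal sketch under the assumption that both strings hold UTF-8 bytes; find_all is a hypothetical helper name, not part of the standard library.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Returns the byte offset of every non-overlapping occurrence of `needle` in
// `haystack`. Both are treated as raw byte sequences (for example UTF-8).
// (Hypothetical helper, for illustration only.)
std::vector<std::size_t> find_all(const std::string& haystack,
                                  const std::string& needle) {
    std::vector<std::size_t> offsets;
    if (needle.empty()) {
        return offsets;                   // an empty needle would loop forever
    }
    std::size_t pos = haystack.find(needle);
    while (pos != std::string::npos) {
        offsets.push_back(pos);
        // continue searching just past the match that was found
        pos = haystack.find(needle, pos + needle.size());
    }
    return offsets;
}
```

The offsets are byte positions, not character positions: in "test string with symbol \xE2\x80\x93 and !" the dash starts at byte 24 and occupies three bytes.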
