Using regular expressions in C++ on Unix

Question

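I'm familiar with regex itself, but whenever I look for examples or documentation on using regular expressions on Unix machines, all I find are tutorials on how to write regexes or on how to use the .NET-specific libraries available for Windows. I've been searching for a while and can't find any good tutorials on C++ regex on Unix.

What I'm trying to do:

Parse a string with a regex, break it up, and then read the different subgroups. To make a PHP analogy: something like preg_match that returns all of $matches.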

Answer 1

Consider using Boost.Regex.

An example (from the website):

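#include <string>
#include <boost/regex.hpp>

// True if s is four groups of four digits separated by '-' or ' '.
bool validate_card_format(const std::string& s)
{
   static const boost::regex e("(\\d{4}[- ]){3}\\d{4}");
   return boost::regex_match(s, e);
}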

Another example:

// match any format with the regular expression:
const boost::regex e("\\A(\\d{3,4})[- ]?(\\d{4})[- ]?(\\d{4})[- ]?(\\d{4})\\z");
const std::string machine_format("\\1\\2\\3\\4");
const std::string human_format("\\1-\\2-\\3-\\4");

std::string machine_readable_card_number(const std::string s)
{
   return regex_replace(s, e, machine_format, boost::match_default | boost::format_sed);
}

std::string human_readable_card_number(const std::string s)
{
   return regex_replace(s, e, human_format, boost::match_default | boost::format_sed);
}
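Neither example above reads the subgroups back, which is what the question asks for. Here is a minimal sketch of that step. It is written against std::regex (the C++11 descendant of this interface) so that it compiles without Boost. Boost.Regex provides boost::smatch and boost::regex_match with the same shape, so swapping the namespace should be close to a drop-in change, but treat that as an assumption to check against the Boost documentation. The helper name card_groups is made up for illustration.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns the four digit groups of a card number,
// or an empty vector when the input does not match the pattern.
std::vector<std::string> card_groups(const std::string& s)
{
    // Same pattern shape as above, without \A and \z: regex_match
    // already requires the whole string to match.
    static const std::regex e("(\\d{3,4})[- ]?(\\d{4})[- ]?(\\d{4})[- ]?(\\d{4})");

    std::smatch m;
    std::vector<std::string> groups;
    if (std::regex_match(s, m, e)) {
        // m[0] is the whole match; m[1]..m[m.size()-1] are the subgroups.
        for (std::size_t i = 1; i < m.size(); ++i)
            groups.push_back(m[i].str());
    }
    return groups;
}
```

Calling card_groups("1234-5678-9012-3456") gives the four strings 1234, 5678, 9012 and 3456, which is roughly what preg_match puts in $matches[1] through $matches[4].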

Answer 2

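Look up the documentation for TR1 regexes or (almost equivalently) Boost regex. Both work quite nicely on various Unix systems. The TR1 regex classes have been accepted into C++0x, so although they're not exactly part of the standard yet, they will be reasonably soon.

Edit: to break a string into subgroups, you can use sregex_token_iterator. You can specify either what you want matched as tokens, or what you want matched as separators. Here's a quick demo of both: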

#include <iterator>
#include <regex>
#include <string>
#include <iostream>

int main() { 

    std::string line;

    std::cout << "Please enter some words: " << std::flush;
    std::getline(std::cin, line);

    std::tr1::regex r("[ .,:;\\t\\n]+");
    std::tr1::regex w("[A-Za-z]+");

    std::cout << "Matching words:\n";
    std::copy(std::tr1::sregex_token_iterator(line.begin(), line.end(), w),
        std::tr1::sregex_token_iterator(), 
        std::ostream_iterator<std::string>(std::cout, "\n"));

    std::cout << "\nMatching separators:\n";
    std::copy(std::tr1::sregex_token_iterator(line.begin(), line.end(), r, -1), 
        std::tr1::sregex_token_iterator(), 
        std::ostream_iterator<std::string>(std::cout, "\n"));

    return 0;
}

If you give it input like this: "This is some 999 text", the result is:

Matching words:
This
is
some
text

Matching separators:
This
is
some
999
text
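The token iterator can also hand back specific subgroups of each match instead of the whole match. A small sketch of that, using the std:: names from C++11 rather than std::tr1 (an assumption about the toolchain); the helper name key_value_tokens and the key=value pattern are invented for illustration:

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: collects the key and the value of every
// "key=value" pair found in the line, in order of appearance.
std::vector<std::string> key_value_tokens(const std::string& line)
{
    static const std::regex pair("([A-Za-z]+)=([0-9]+)");

    // For each match, yield subgroup 1 and then subgroup 2.
    const std::vector<int> wanted = {1, 2};

    std::vector<std::string> tokens;
    std::sregex_token_iterator it(line.begin(), line.end(), pair, wanted);
    const std::sregex_token_iterator end;
    for (; it != end; ++it)
        tokens.push_back(it->str());
    return tokens;
}
```

For the input "width=80, height=24" this produces width, 80, height, 24. Passing -1 in the index list would additionally yield the text between matches, as in the separator demo above.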

Answer 3

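You are looking for regcomp, regexec and regfree.

One thing to be careful about: POSIX regular expressions actually implement two different languages, basic (the default) and extended (pass the flag REG_EXTENDED in the call to regcomp). If you are coming from the PHP world, the extended language is closer to what you are used to.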
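To show how the three calls fit together and how subgroups come back, here is a minimal sketch. The wrapper name posix_match is made up; regcomp, regexec, regfree, regmatch_t and the re_nsub field are the POSIX pieces from <regex.h>.

```cpp
#include <regex.h>   // POSIX: regcomp, regexec, regfree

#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: returns the whole match followed by each subgroup,
// or an empty vector if the pattern does not compile or does not match.
std::vector<std::string> posix_match(const std::string& pattern,
                                     const std::string& text)
{
    std::vector<std::string> out;

    regex_t re;
    // REG_EXTENDED selects the extended syntax described above.
    if (regcomp(&re, pattern.c_str(), REG_EXTENDED) != 0)
        return out;

    // re_nsub is the number of parenthesized subgroups; slot 0 holds
    // the offsets of the whole match.
    std::vector<regmatch_t> m(re.re_nsub + 1);
    if (regexec(&re, text.c_str(), m.size(), m.data(), 0) == 0) {
        for (const regmatch_t& g : m) {
            if (g.rm_so == -1) {
                out.push_back("");   // subgroup did not take part in the match
            } else {
                out.push_back(text.substr(
                    static_cast<std::size_t>(g.rm_so),
                    static_cast<std::size_t>(g.rm_eo - g.rm_so)));
            }
        }
    }

    regfree(&re);
    return out;
}
```

For example, posix_match("([0-9]{4})-([0-9]{2})-([0-9]{2})", "on 2010-02-08 at noon") returns 2010-02-08, 2010, 02, 08. Note that the POSIX syntax does not define \d, so digit classes are written as [0-9] or [[:digit:]].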

Answer 4

For Perl-compatible regular expressions (pcre / preg), I'd suggest Boost.Regex.

Answer 5

My best bet would be boost::regex.

Answer 6

Try pcre, and pcrepp.

Answer 7

Feel free to have a look at this small color grep tool I wrote.

It's on GitHub.

It uses regcomp, regexec and regfree, which R Samuel Klatchko refers to.

Answer 8

I use "GNU regex": http://www.gnu.org/s/libc/manual/html_node/Regular-Expressions.html

It works well, but I can't find a clear solution for UTF-8 regular expressions.

Regards

Answer 1: 13 votes, accepted, 2010-02-08 20:51:23
Answer 2: 9 votes, 2010-02-08 20:49:13
Answer 3: 0 votes, 2010-02-08 20:48:23
Answer 4: 0 votes, 2010-02-08 20:49:19
Answer 5: 0 votes, 2010-02-08 20:49:51
Answer 6: 0 votes, 2010-02-08 20:50:02
Answer 7: 0 votes, 2010-02-08 20:53:15
Answer 8: 0 votes, 2010-02-08 21:48:42
