Using pcre2 in a C++ project

I am looking at using pcre2 in my simple C++ app (I am using VS2015). I have been looking at various regex libraries, and the general feeling is that pcre/pcre2 are the most flexible.

First I downloaded pcre2 from the official location (http://sourceforge.net/projects/pcre/files/pcre2/10.20/) and created a very simple example.

#define PCRE2_CODE_UNIT_WIDTH 8
#include <pcre2.h>
...
PCRE2_SPTR subject = (PCRE2_SPTR)std::string("this is it").c_str();
PCRE2_SPTR pattern = (PCRE2_SPTR)std::string("([a-z]+)|\\s").c_str();

...
int errorcode;
PCRE2_SIZE erroroffset;
pcre2_code *re = pcre2_compile(pattern, PCRE2_ZERO_TERMINATED, 
                                PCRE2_ANCHORED | PCRE2_UTF, &errorcode,  
                                &erroroffset, NULL);
...

First of all, the file "pcre2.h" does not exist, so I renamed pcre2.h.generic to pcre2.h.

But then I get linker errors with unresolved externals.

I am guessing I need to add one or more files from the source to my project, but I am reluctant to just randomly add files without knowing what they all do.

Can someone give some simple steps to follow to successfully build a project using pcre2?

UPDATE
This is not an import library issue: pcre2 does not come with a library (not one that I can see in their release location).

If you don't mind using a wrapper, here's mine: JPCRE2

You need to select the basic character type (char, wchar_t, char16_t, char32_t) according to the string class you will use (respectively std::string, std::wstring, std::u16string, std::u32string):

typedef jpcre2::select<char> jp;
//Selecting char as the basic character type will require
//8 bit PCRE2 library where char is 8 bit,
//or 16 bit PCRE2 library where char is 16 bit,
//or 32 bit PCRE2 library where char is 32 bit.
//If char is not 8, 16 or 32 bit, it's a compile error.
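
The rule in those comments can be sketched with the standard library alone. This is only an illustration and not JPCRE2's actual implementation: the helper names code_unit_width and has_matching_pcre2_width are hypothetical. The sketch computes the bit width of a character type, which is the value that has to match one of the PCRE2 library builds (8, 16 or 32).

```cpp
#include <climits>   // CHAR_BIT
#include <cstddef>   // std::size_t

// Hypothetical helper (not part of JPCRE2): bit width of one code unit
// for the character type T.
template <typename T>
constexpr std::size_t code_unit_width() {
    return sizeof(T) * CHAR_BIT;
}

// Hypothetical check: true only for the widths PCRE2 can be built for.
template <typename T>
constexpr bool has_matching_pcre2_width() {
    return code_unit_width<T>() == 8 ||
           code_unit_width<T>() == 16 ||
           code_unit_width<T>() == 32;
}

// Mirrors the "compile error" case described in the comments above.
static_assert(has_matching_pcre2_width<char>(),
              "char must be 8, 16 or 32 bits wide");
```

Note that wchar_t is platform dependent (commonly 16 bits on Windows and 32 bits on Linux), so the PCRE2 build you need for it depends on the target.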

Match Examples:

Check if a string matches a pattern:

if(jp::Regex("(\\d)|(\\w)").match("I am the subject")) 
    std::cout<<"\nmatched";
else
    std::cout<<"\nno match";

Match all and get the match count:

size_t count = 
jp::Regex("(\\d)|(\\w)","mi").match("I am the subject", "g");
// 'm' modifier enables multi-line mode for the regex
// 'i' modifier makes the regex case insensitive
// 'g' modifier enables global matching

Get numbered substrings/captured groups:

jp::VecNum vec_num;
count = 
jp::Regex("(\\w+)\\s*(\\d+)","im").initMatch()
                                  .setSubject("I am 23, I am digits 10")
                                  .setModifier("g")
                                  .setNumberedSubstringVector(&vec_num)
                                  .match();
std::cout<<"\nTotal match of first match: "<<vec_num[0][0];
std::cout<<"\nCaptured group 1 of first match: "<<vec_num[0][1];
std::cout<<"\nCaptured group 2 of first match: "<<vec_num[0][2];

std::cout<<"\nTotal match of second match: "<<vec_num[1][0];
std::cout<<"\nCaptured group 1 of second match: "<<vec_num[1][1];
std::cout<<"\nCaptured group 2 of second match: "<<vec_num[1][2];

Get named substrings/captured groups:

jp::VecNas vec_nas;
count = 
jp::Regex("(?<word>\\w+)\\s*(?<digit>\\d+)","m")
                         .initMatch()
                         .setSubject("I am 23, I am digits 10")
                         .setModifier("g")
                         .setNamedSubstringVector(&vec_nas)
                         .match();
std::cout<<"\nCaptured group (word) of first match: "<<vec_nas[0]["word"];
std::cout<<"\nCaptured group (digit) of first match: "<<vec_nas[0]["digit"];

std::cout<<"\nCaptured group (word) of second match: "<<vec_nas[1]["word"];
std::cout<<"\nCaptured group (digit) of second match: "<<vec_nas[1]["digit"];

Iterate through all matches and substrings:

//Iterating through numbered substring
for(size_t i=0;i<vec_num.size();++i){
    //i=0 is the first match found, i=1 is the second and so forth
    for(size_t j=0;j<vec_num[i].size();++j){
        //j=0 is the capture group 0 i.e the total match
        //j=1 is the capture group 1 and so forth.
        std::cout<<"\n\t("<<j<<"): "<<vec_num[i][j]<<"\n";
    }
}
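
If it helps to see the shape of that result without the wrapper, here is a rough stand-in built on std::regex from the standard library. It is not JPCRE2 and not PCRE2: the alias NumberedMatches and the helper collect_numbered are hypothetical names, and the sketch only mimics the "vector of matches, each holding group 0 and then the capture groups" layout that the loop above walks through.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Stand-in for the shape of jp::VecNum: one inner vector per match,
// holding group 0 (the whole match) followed by the capture groups.
using NumberedMatches = std::vector<std::vector<std::string>>;

// Hypothetical helper built on std::regex, not on JPCRE2.
NumberedMatches collect_numbered(const std::string& subject,
                                 const std::string& pattern) {
    NumberedMatches result;
    const std::regex re(pattern);
    for (std::sregex_iterator it(subject.begin(), subject.end(), re), end;
         it != end; ++it) {
        std::vector<std::string> groups;
        for (std::size_t j = 0; j < it->size(); ++j) {
            groups.push_back((*it)[j].str());  // j == 0 is the total match
        }
        result.push_back(groups);
    }
    return result;
}
```

Calling collect_numbered("I am 23, I am digits 10", "(\\w+)\\s*(\\d+)") gives two inner vectors, which can be iterated with exactly the same nested loop as vec_num.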

Replace/Substitute Examples:

std::cout<<"\n"<<
///replace all occurrences of a digit with @
jp::Regex("\\d").replace("I am the subject string 44", "@", "g");

///swap two parts of a string
std::cout<<"\n"<<
jp::Regex("^([^\t]+)\t([^\t]+)$")
             .initReplace()
             .setSubject("I am the subject\tTo be swapped according to tab")
             .setReplaceWith("$2 $1")
             .replace();
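
For comparison only, the same two replacements can be approximated with std::regex_replace from the standard library. This is a swapped-in technique, not JPCRE2 or PCRE2, and replace_all is a hypothetical helper name. std::regex_replace replaces every match by default, which roughly corresponds to the "g" modifier used above.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: global replacement via std::regex_replace,
// standing in for jp::Regex(pattern).replace(subject, replacement, "g").
std::string replace_all(const std::string& subject,
                        const std::string& pattern,
                        const std::string& replacement) {
    // $1, $2, ... in the replacement refer to capture groups
    // (default ECMAScript format rules).
    return std::regex_replace(subject, std::regex(pattern), replacement);
}
```

With this helper, replace_all("I am the subject string 44", "\\d", "@") replaces both digits, and the tab-swap pattern works with the same "$2 $1" replacement string.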

Replace with Match Evaluator:

jp::String callback1(const jp::NumSub& m, void*, void*){
    return "("+m[0]+")"; //m[0] is capture group 0, i.e total match (in each match)
}
int main(){
    jp::Regex re("(?<total>\\w+)", "n");
    jp::RegexReplace rr(&re);
    jp::String s3 = "I am ঋ আা a string 879879 fdsjkll ১ ২ ৩ ৪ অ আ ক খ গ ঘ আমার সোনার বাংলা";
    rr.setSubject(s3)
      .setPcre2Option(PCRE2_SUBSTITUTE_GLOBAL);
    std::cout<<"\n\n### 1\n"<<
            rr.nreplace(jp::MatchEvaluator(callback1));
            //nreplace() treats the returned string from the callback as literal,
            //while replace() will process the returned string
            //with pcre2_substitute()

    #if __cplusplus >= 201103L
    //example with lambda
    std::cout<<"\n\n### Lambda\n"<<
            rr.nreplace(
                jp::MatchEvaluator(
                    [](const jp::NumSub& m1, const jp::MapNas& m2, void*){
                        return "("+m1[0]+"/"+m2.at("total")+")";
                    }
                ));
    #endif
    return 0;
}

You can read the complete documentation here.

In case someone wants to build the library using Visual Studio:

  1. Download pcre2 from the website (http://www.pcre.org/).
  2. In Visual Studio 2015 (and maybe other versions), create an empty "Win32 project" and call it pcre2.
  3. Copy all the files in \pcre2\src\ to your newly created empty project.
  4. Add all the files listed in "NON-AUTOTOOLS-BUILD" (located in the base folder):
    • pcre2_auto_possess.c
    • pcre2_chartables.c
    • pcre2_compile.c
    • pcre2_config.c
    • etc...
  5. Rename the file config.h.generic to config.h.
  6. Add the config.h file to the project.
  7. In your project, select all the *.c files, then go to Properties > C/C++ > Precompiled Headers > "Not Using Precompiled Headers".
  8. Select the project, go to Properties > C/C++ > Preprocessor > Preprocessor Definitions, select the drop-down list, and add:
    • PCRE2_CODE_UNIT_WIDTH=8
    • HAVE_CONFIG_H

Compile, and the lib file should be created without problems.

PCRE2_SPTR pattern = (PCRE2_SPTR)std::string("([a-z]+)|\\s").c_str();

Using this pointer with any of the PCRE functions will result in undefined behavior. The std::string temporary is destroyed at the end of the definition of pattern, causing pattern to dangle.

My recommendation is to change the type of pattern to std::string and call c_str() when passing arguments to a PCRE function. c_str() is a very fast operation in C++11 (provided you are not using the old GCC 4 ABI).
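
A minimal sketch of that recommendation, using only the standard library so it can be tried without PCRE2 installed. The function subject_length is a hypothetical stand-in for a PCRE2 call: it takes a const unsigned char*, which is the kind of pointer the 8-bit PCRE2_SPTR parameters expect. The point is only the lifetime: the std::string is a named object that outlives the call, and c_str() is taken at the call site.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for a C API that takes a pointer to unsigned
// bytes, like the PCRE2_SPTR parameters of the 8-bit PCRE2 functions.
std::size_t subject_length(const unsigned char* text) {
    return std::strlen(reinterpret_cast<const char*>(text));
}

std::size_t measure(const std::string& pattern) {
    // `pattern` is owned by the caller and outlives this call, so the
    // pointer returned by c_str() stays valid while it is used.
    return subject_length(
        reinterpret_cast<const unsigned char*>(pattern.c_str()));
}
```

In the question's code this would mean declaring std::string pattern = "([a-z]+)|\\s"; and passing the cast of pattern.c_str() directly as the first argument of pcre2_compile.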

There are also several C++ wrappers for PCRE that might help you avoid such issues and make PCRE easier to use, but I do not know the status of their Windows support.

I don't know if this is still something you're looking at or not... but just in case, does this help?

From the pcre2api man page:

In a Windows environment, if you want to statically link an application program against a non-dll PCRE2 library, you must define PCRE2_STATIC before including pcre2.h.

You can follow these steps:

  1. Download and install cmake.
  2. Set the source folder location and a VS project folder.
  3. Hit configure and select your VS version.
  4. Once the configure process is done, you can select 8, 16, and/or 32 bit from the list.
  5. Press generate, then open the VS solution file in the project folder. This will open the solution in VS.
  6. There are about 6 projects. Highlight the pcre2._ project. Go to preferences and make sure the output file is for a DLL. Repeat this step for the pcre2posix project. Then set the greptest project to be built as an exe (executable), and the other one as an executable as well.
  7. At this point you can try to build all, but you might need to build the DLLs first because the executables rely on them (or rather their static libraries) for linking.
  8. After all 6 projects are built successfully, you should have shared/static libraries and test programs in your debug or release folders.

Here's more detail on Charles Thomas's answer...

If you're using this on Windows from C++ and you built PCRE2 as a static library... in pcre2.h, there's this...

#if defined(_WIN32) && !defined(PCRE2_STATIC)
#  ifndef PCRE2_EXP_DECL
#    define PCRE2_EXP_DECL  extern __declspec(dllimport)
#  endif
#endif

_WIN32 is defined because you're on Windows, but you need to define PCRE2_STATIC at the top of pcre2.h, like so...

#define PCRE2_STATIC 1

This makes it put extern "C" in front of each function instead of extern __declspec(dllimport), so you can link statically.
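
To see the effect of the macro without touching the real header, here is a small sketch that imitates the same preprocessor logic. The DEMO_ macros and chosen_prefix are hypothetical names chosen so they cannot clash with pcre2.h, and the extern "C" fallback is an assumption based on the description above rather than a copy of the header. It simply reports which declaration prefix gets selected when PCRE2_STATIC is defined first.

```cpp
#include <string>

// Simulates a project that defines the macro before the header logic runs.
#define PCRE2_STATIC 1

// Same shape as the pcre2.h excerpt above, with a hypothetical DEMO_ name.
#if defined(_WIN32) && !defined(PCRE2_STATIC)
#  ifndef DEMO_EXP_DECL
#    define DEMO_EXP_DECL extern __declspec(dllimport)
#  endif
#endif

// Assumed fallback, used when the dllimport branch was skipped.
#ifndef DEMO_EXP_DECL
#  define DEMO_EXP_DECL extern "C"
#endif

// Two-step stringification so the macro is expanded before being quoted.
#define DEMO_STR_IMPL(x) #x
#define DEMO_STR(x) DEMO_STR_IMPL(x)

// Returns the prefix that would be placed in front of each declaration.
std::string chosen_prefix() {
    return DEMO_STR(DEMO_EXP_DECL);
}
```

Removing the #define PCRE2_STATIC line and compiling on Windows would select the __declspec(dllimport) branch instead, which is what produces the unresolved externals when you link against a static library. Adding PCRE2_STATIC to the project's Preprocessor Definitions has the same effect as the #define and avoids editing pcre2.h.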
