Getting full name and values from string with regex and C++

I have a project where I am reading data from a text file in C++. Each line is one entry containing a person's name and up to four numbers, like this:

Dave Light 89 71 91 89
Hua Tran Du 81 79 80 

I am wondering whether regex would be an efficient way of splitting the name from the numerical values, or whether I should find an alternative method.

I would also like to detect errors in the text file when reading each entry, such as a letter where a number should be, as in an entry like this:

Andrew Van Den J 88 95 85

This non-regex solution:

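std::string str = "Dave Light 89 71 91 89";
// Index of the first digit, or std::string::npos if the line contains none
std::size_t firstDig = str.find_first_of("0123456789");
// Letter part: everything before the first digit (the whole line if there is no digit)
std::string str1 = str.substr(0, firstDig);
// Number part: guarded, because substr(npos) would throw std::out_of_range
std::string str2 = (firstDig == std::string::npos) ? "" : str.substr(firstDig);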

would give you the letter part in str1 and the number part in str2.

Check this code at ideone.com.

It sounds like this is roughly what you want, but I'm not quite sure what kind of errors you mean to catch. As paxdiablo pointed out, a name could be quite complex, so splitting off the letter part is probably the safest approach.
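To illustrate how the split above could be combined with the error checking asked about in the question, here is a minimal sketch. The `Entry` struct and the `allDigits` and `parseLine` helpers are hypothetical names introduced for this example, and the limit of four numbers is taken from the question.

```cpp
#include <cctype>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical result type for one parsed line (not part of the original answer).
struct Entry {
    std::string name;
    std::vector<int> scores;
    bool ok = false;   // stays false if the line is malformed
};

// Returns true if token is 1 to 9 decimal digits (so it always fits in an int).
bool allDigits(const std::string& token) {
    if (token.empty() || token.size() > 9) return false;
    for (unsigned char c : token)
        if (!std::isdigit(c)) return false;
    return true;
}

Entry parseLine(const std::string& line) {
    Entry e;
    std::size_t firstDig = line.find_first_of("0123456789");
    if (firstDig == std::string::npos) return e;   // no numbers at all

    // The name is everything before the first digit, minus trailing whitespace.
    e.name = line.substr(0, firstDig);
    std::size_t end = e.name.find_last_not_of(" \t");
    e.name = (end == std::string::npos) ? "" : e.name.substr(0, end + 1);
    if (e.name.empty()) return e;                  // line starts with a digit

    // Every remaining token must be a number, and at most four are allowed.
    std::istringstream rest(line.substr(firstDig));
    std::string token;
    while (rest >> token) {
        if (!allDigits(token) || e.scores.size() == 4) return e;
        e.scores.push_back(std::stoi(token));
    }
    e.ok = true;
    return e;
}
```

With this sketch, a line such as "Dave Light 89 J 91" is rejected because "J" appears after the first digit. A stray letter before the first digit, as in "Andrew Van Den J 88 95 85", is absorbed into the name, since nothing distinguishes it from a one-letter surname unless a fixed count of numbers is required.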

You would do better to use a dedicated separator instead of a space. The separator could be :, |, ^ or anything that cannot be part of your data. With this approach, your data would be stored as:

Dave Light:89:71:91:89
Hua Tran Du:81:79:80 

You can then use find, find_first_of, strchr, strstr or any other search function (repeatedly, if needed) to locate the relevant data.
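As a rough sketch of the separator approach, the following uses std::getline with a delimiter rather than the search functions named above, which is a swapped-in technique chosen for brevity. The helper names `splitFields` and `parseScore` are hypothetical, and ':' is assumed as the separator.

```cpp
#include <cctype>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: splits a line on sep. Interior empty fields are kept;
// a single trailing separator does not produce an extra empty field.
std::vector<std::string> splitFields(const std::string& line, char sep = ':') {
    std::vector<std::string> fields;
    std::istringstream in(line);
    std::string field;
    while (std::getline(in, field, sep))
        fields.push_back(field);
    return fields;
}

// Hypothetical helper: parses a field such as "80 " into value.
// Returns false if the field holds anything other than digits
// with optional surrounding spaces or tabs.
bool parseScore(const std::string& field, int& value) {
    std::size_t first = field.find_first_not_of(" \t");
    if (first == std::string::npos) return false;   // empty or blank field
    std::size_t last = field.find_last_not_of(" \t");
    std::string digits = field.substr(first, last - first + 1);
    if (digits.size() > 9) return false;            // keep the value inside int range
    for (unsigned char c : digits)
        if (!std::isdigit(c)) return false;
    value = std::stoi(digits);
    return true;
}
```

For "Dave Light:89:71:91:89", splitFields returns five fields: the first is the name, and each of the rest can be passed to parseScore. A field such as "J" makes parseScore return false, which pinpoints the bad column.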

Try this code.

#include <iostream>
#include <regex>
#include <string>
#include <vector>

int main() {
    // Sample lines; only those with exactly three numbers match the pattern below
    std::vector<std::string> data {"Dave Light 89 71 91 ", "Hua Tran Du 81 79 80", "zyx 1 2 3 4", "zyx 1 2"};
    // Group 1: name (letters and whitespace); groups 2-4: three numbers; group 5: trailing whitespace
    std::regex pat {R"((^[A-Za-z\s]*)(\d+)\s*(\d+)\s*(\d+)(\s*)$)"};
    for (const auto& line : data) {
        std::cout << line << std::endl;
        std::smatch matches; // matched sub-expressions go here
        if (std::regex_search(line, matches, pat)) {
            // matches[0] is the whole match; matches[1] to matches[5] are the capture groups
            std::cout << "Name:" << matches[1].str() << "\t"
                      << "data1:" << matches[2].str()
                      << "\tdata2:" << matches[3].str()
                      << "\tdata3:" << matches[4].str() << std::endl;
        }
    }
}

With a regex, the number of lines of code is greatly reduced. The main trick is choosing the right pattern.

Hope this will help you.
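The pattern in the code above accepts only lines with exactly three numbers, and because it separates them with `\s*`, a single run of digits can be split across groups. The following is one possible variant, offered as a sketch rather than the answer's own method, that accepts one to four numbers as described in the question. The `ParsedLine` struct and the `parseWithRegex` function are hypothetical names, and the assumption is that names consist of ASCII letters only.

```cpp
#include <regex>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical result type (not part of the original answer).
struct ParsedLine {
    std::string name;
    std::vector<int> values;
};

// Returns true and fills out when line has the form
// "<one or more alphabetic words> <one to four integers>".
bool parseWithRegex(const std::string& line, ParsedLine& out) {
    // Group 1: alphabetic words separated by whitespace.
    // Group 2: one to four digit runs (1 to 9 digits each, so they fit in an int),
    //          each preceded by at least one whitespace character.
    static const std::regex pat{
        R"(\s*([A-Za-z]+(?:\s+[A-Za-z]+)*)((?:\s+\d{1,9}){1,4})\s*)"};

    std::smatch m;
    // regex_match requires the whole line to match, so no ^ or $ is needed.
    if (!std::regex_match(line, m, pat)) return false;

    out.name = m[1].str();
    out.values.clear();
    std::istringstream nums(m[2].str());
    int v;
    while (nums >> v) out.values.push_back(v);
    return true;
}
```

A line such as "Dave Light 89 J 91" or one with five numbers fails to match, so the caller can report it as an error. As with any whitespace-separated format, a stray letter before the first number is indistinguishable from part of the name.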
