Getting full name and values from string with regex and C++
I have a project where I am reading data from a text file in C++. Each line holds one entry: a person's name followed by up to 4 numbers, like this:
Dave Light 89 71 91 89
Hua Tran Du 81 79 80
I am wondering whether regex would be an efficient way of splitting the name from the numerical values, or whether I should find an alternative method.
I would also like to be able to pick up errors in the text file when reading each entry, such as a letter where a number should be, as in an entry like this:
Andrew Van Den J 88 95 85
This non-regex solution:
std::string str = "Dave Light 89 71 91 89";
std::size_t firstDig = str.find_first_of("0123456789");
std::string str1 = str.substr (0,firstDig);
std::string str2 = str.substr (firstDig);
would give you the letter part in str1 and the number part in str2.
Check this code at ideone.com.
It sounds like this is roughly what you want, though I'm not quite sure what kind of errors you mean to pick up. As paxdiablo pointed out, a name could be quite complex, so splitting off the letter part first is probably the safest approach.
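The split above assumes the line contains at least one digit; if it does not, find_first_of returns std::string::npos and the second substr call throws. A minimal sketch of how the same idea could be extended to check for that and to validate the numeric part follows. The Entry type and the parseEntry helper are hypothetical names introduced here for illustration, and the 9-digit limit is an assumption made only to keep std::stoi from overflowing.

```cpp
#include <cctype>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical result type for one parsed line (not from the original answer).
struct Entry {
    std::string name;
    std::vector<int> scores;
    bool ok = false;  // stays false when the line is malformed
};

// Hypothetical helper: split at the first digit, then validate every token after it.
Entry parseEntry(const std::string& line) {
    Entry e;
    const std::size_t firstDig = line.find_first_of("0123456789");
    if (firstDig == std::string::npos || firstDig == 0) {
        return e;  // no numbers at all, or nothing before the first number
    }
    e.name = line.substr(0, firstDig);
    // Trim the whitespace that separated the name from the first number.
    while (!e.name.empty() &&
           std::isspace(static_cast<unsigned char>(e.name.back()))) {
        e.name.pop_back();
    }
    if (e.name.empty()) {
        return e;  // only whitespace before the first number
    }

    std::istringstream rest(line.substr(firstDig));
    std::string token;
    while (rest >> token) {
        // Assumption: at most 9 digits per number, so std::stoi cannot overflow.
        if (token.size() > 9) {
            return e;
        }
        for (char c : token) {
            if (!std::isdigit(static_cast<unsigned char>(c))) {
                return e;  // a letter (or other character) where a number should be
            }
        }
        e.scores.push_back(std::stoi(token));
    }
    // The question allows up to 4 numbers per entry.
    e.ok = !e.scores.empty() && e.scores.size() <= 4;
    return e;
}
```

With this sketch, a line such as "Dave Light 89 7x 91" is rejected, but "Andrew Van Den J 88 95 85" is accepted with the name "Andrew Van Den J": a stray letter before the first number cannot be told apart from part of the name when the fields are separated only by spaces.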
You would be better off using a separator other than a space. The separator could be : , | , ^ or anything else that cannot be part of your data. With this approach, your data should be stored as:
Dave Light:89:71:91:89
Hua Tran Du:81:79:80
And then you can use find, find_first_of, strchr or strstr, or any other searching (and re-searching) function, to find the relevant data.
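As one possible illustration of the separator approach, the sketch below splits a line with std::getline and a delimiter instead of the search functions named above. The Record type and the parseDelimited helper are hypothetical names, and the rules it enforces (a non-empty name, 1 to 4 numeric fields, at most 9 digits each) are assumptions based on the question, not part of this answer.

```cpp
#include <cctype>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record type; the field names are illustrative only.
struct Record {
    std::string name;
    std::vector<int> values;
    bool ok = false;  // stays false when the line is malformed
};

// Hypothetical helper: parse lines such as "Dave Light:89:71:91:89".
Record parseDelimited(const std::string& line, char sep = ':') {
    Record r;
    std::istringstream in(line);
    // The first field is the name; it may contain spaces but not the separator.
    if (!std::getline(in, r.name, sep) || r.name.empty()) {
        return r;
    }
    std::string field;
    while (std::getline(in, field, sep)) {
        // Assumption: at most 9 digits per number, so std::stoi cannot overflow.
        if (field.empty() || field.size() > 9) {
            return r;
        }
        for (char c : field) {
            if (!std::isdigit(static_cast<unsigned char>(c))) {
                return r;  // e.g. a letter where a number should be
            }
        }
        r.values.push_back(std::stoi(field));
    }
    // The question allows up to 4 numbers per entry.
    r.ok = !r.values.empty() && r.values.size() <= 4;
    return r;
}
```

Because the name is now a field of its own, an entry like "Hua Tran Du:81:J:80" can be flagged as an error, which is not possible when a stray letter could equally be part of a space-separated name.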
Try this code.
#include <iostream>
#include <regex>
#include <string>
#include <vector>

int main() {
    std::vector<std::string> data{"Dave Light 89 71 91 ", "Hua Tran Du 81 79 80",
                                  "zyx 1 2 3 4", "zyx 1 2"};
    // Group 1: name (letters and whitespace); groups 2-4: three runs of digits;
    // group 5: trailing whitespace. "zyx 1 2 3 4" and "zyx 1 2" do not match.
    std::regex pat{R"((^[A-Za-z\s]*)(\d+)\s*(\d+)\s*(\d+)(\s*)$)"};
    for (const auto& line : data) {
        std::cout << line << std::endl;
        std::smatch matches;  // matched strings go here
        if (std::regex_search(line, matches, pat)) {
            // matches holds the whole match plus the five groups, so size() is 6
            if (matches.size() == 6)
                std::cout << "Name:" << matches[1].str() << "\t"
                          << "data1:" << matches[2].str()
                          << "\tdata2:" << matches[3].str()
                          << "\tdata3:" << matches[4].str() << std::endl;
        }
    }
}
With regex, the number of lines of code is greatly reduced. The main trick with regex is using the right pattern.
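Since the question mentions up to 4 numbers per entry, one way the pattern could be adapted is with optional groups. The following is only a sketch under that assumption: Parsed and parseWithRegex are hypothetical names, names are assumed to consist of ASCII letters separated by whitespace, and each number is limited to 9 digits so that std::stoi cannot overflow.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical result type, introduced for this sketch only.
struct Parsed {
    std::string name;
    std::vector<int> numbers;
    bool ok = false;  // stays false when the line does not match
};

// Hypothetical helper: a name followed by one to four whitespace-separated numbers.
Parsed parseWithRegex(const std::string& line) {
    // Group 1: words made of ASCII letters; group 2: first number (required);
    // groups 3-5: optional further numbers, each preceded by whitespace.
    static const std::regex pat{
        R"(\s*([A-Za-z]+(?:\s+[A-Za-z]+)*)\s+(\d{1,9})(?:\s+(\d{1,9}))?(?:\s+(\d{1,9}))?(?:\s+(\d{1,9}))?\s*)"};
    Parsed p;
    std::smatch m;
    // regex_match requires the whole line to fit the pattern, so no ^ or $ is needed.
    if (!std::regex_match(line, m, pat)) {
        return p;
    }
    p.name = m[1].str();
    for (std::size_t i = 2; i < m.size(); ++i) {
        if (m[i].matched) {  // optional groups that did not take part are skipped
            p.numbers.push_back(std::stoi(m[i].str()));
        }
    }
    p.ok = true;
    return p;
}
```

Requiring whitespace before each optional number keeps a single number such as 12 from being split into 1 and 2. A line like "Dave Light 89 J 91" fails to match and can be reported as an error, while a stray letter directly after the name is still read as part of the name.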
Hope this will help you.