Extracting x, y, z coordinates from string?

I have a vector of strings, vector<string> shapes, holding this coordinate data:

Shape1, [3, 2]
Shape1, [6, 7]
Shape2, [7, 12, 3], [-9, 13, 68]
Shape1, [10, 3]
Shape2, [30, -120, 3], [-29, 1, 268]
Shape3, [15, 32], [1, 5]
Shape4, [24, 31, 56]

I am trying to print (with cout) the x and y coordinates from Shape1 and Shape3, and the x, y and z coordinates from Shape2 and Shape4. This is a reproducible example:

#include <stdio.h>
#include <iostream>
#include <vector>
#include <string>

using namespace std;

int main()
{
    vector <string> shapes;
    
    shapes.push_back("Shape1, [3, 2]");
    shapes.push_back("Shape1, [632, 73]");
    shapes.push_back("Shape2, [7, 12, 3], [-9, 13, 68]");
    shapes.push_back("Shape1, [10, 3]");
    shapes.push_back("Shape2, [30, -120, 3], [-29, 1, 268]");
    shapes.push_back("Shape3, [15, 32], [1, 5]");
    shapes.push_back("Shape4, [24, 31, 56]");
    
    for(int i = 0; i < shapes.size(); i++)
    {
        // attempt to extract x
        size_t string_start = shapes[i].find(", [");
        string extracted = shapes[i].substr(string_start + 3, 1);
        
        cout << extracted << endl;
    }
    
    return 0;
}

As it stands, my code can't print x properly: only the first character of x is printed. How should I handle the length of x? And how should I then print y and z? The delimiter is a comma, but there are commas everywhere.

Since you already have the starting point of the x coordinate, you can use that position to search for the next ',' from there, e.g.

size_t string_start = shapes[i].find(", [");
size_t x_start = string_start + 3; // skip past ", ["
size_t x_end = shapes[i].find_first_of(',', x_start);

std::string parsed_x = shapes[i].substr(x_start, x_end - x_start);

This does not cover a line like Shape2, which has more than one x position. For that case you can write a function that extracts the x coordinate and run it over your line multiple times.
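A minimal sketch of that idea, assuming every coordinate group is wrapped in [ and ] and contains only integers. The helper name next_group and its signature are hypothetical, not part of the answer above; it parses one whole group (x, y and, where present, z) per call rather than x alone.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: parses the next "[...]" group in `line`, starting the
// search at `pos`. On success it fills `coords`, moves `pos` past the closing
// bracket and returns true. It returns false when no complete group is left.
// std::stoi throws if a field is not a number; that case is not handled here.
bool next_group(const std::string& line, std::size_t& pos, std::vector<int>& coords)
{
    const std::size_t open = line.find('[', pos);
    if (open == std::string::npos) return false;
    const std::size_t close = line.find(']', open);
    if (close == std::string::npos) return false;

    coords.clear();
    std::size_t start = open + 1;
    while (start < close) {
        // The field ends at the next comma, or at the bracket for the last one.
        std::size_t comma = line.find(',', start);
        if (comma == std::string::npos || comma > close) comma = close;
        // std::stoi skips the leading space that follows each comma.
        coords.push_back(std::stoi(line.substr(start, comma - start)));
        start = comma + 1;
    }
    pos = close + 1;
    return true;
}
```

Starting with pos = 0 and calling it as while (next_group(line, pos, coords)) visits every group on a line, so Shape2 and Shape3 need no special case; coords.size() tells you whether a z value is present.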

The code below is adapted from this Stack Overflow answer; go there for the full explanation.

#include <iostream>
#include <vector>
#include <string>
#include <map>

using namespace std;

// Splits a string such as "7, 12, 3" on ", " and maps each value to its axis.
map<char, int> extract(string s) {
    map<char, int> r;
    size_t pos = 0;
    string token;
    char axes[] = {'x', 'y', 'z'};
    int count = 0;
    while ((pos = s.find(", ")) != string::npos) {
        token = s.substr(0, pos);
        r[axes[count++]] = stoi(token);
        s.erase(0, pos + 2); // ", ".length() == 2
    }
    r[axes[count]] = stoi(s);
    return r;
}

int main()
{
    vector <string> shapes;
    
    shapes.push_back("Shape1, [3, 2]");
    shapes.push_back("Shape1, [632, 73]");
    shapes.push_back("Shape2, [7, 12, 3], [-9, 13, 68]");
    shapes.push_back("Shape1, [10, 3]");
    shapes.push_back("Shape2, [30, -120, 3], [-29, 1, 268]");
    shapes.push_back("Shape3, [15, 32], [1, 5]");
    shapes.push_back("Shape4, [24, 31, 56]");

    for (size_t i = 0; i < shapes.size(); i++) {
        string s = shapes[i];
        size_t pos = 0;
        while ((pos = s.find(", [")) != string::npos) {
            size_t start = pos + 3;          // ", [".length() == 3
            size_t end = s.find("]", start); // closing bracket of this group
            auto r = extract(s.substr(start, end - start));
            cout << "X: " << r['x'] << ", Y: " << r['y'] << (r.count('z') ? ", Z: " + to_string(r['z']) : "") << endl;
            s.erase(0, start); // drop everything up to the start of this group
        }
    }
    
    return 0;
}

Further improvement

The above code can be enhanced by having it store the extracted values in a Shape class or struct of some kind. That way you only have to parse once and can then work with the data as many times as you like. But if your only goal is to print the data, the above code suffices.
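One way this could look, as a rough sketch only: the names Point, Shape and parse_shape are hypothetical, and the layout (a name plus a list of points, with a flag for whether z is present) is an assumption about what is convenient, not something the answer specifies.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical types, introduced here for illustration only.
struct Point {
    int x = 0;
    int y = 0;
    int z = 0;
    bool has_z = false; // true only when the group had three values
};

struct Shape {
    std::string name;          // text before the first comma, e.g. "Shape2"
    std::vector<Point> points; // one entry per "[...]" group
};

// Parses a line such as "Shape2, [7, 12, 3], [-9, 13, 68]".
// Groups that do not hold exactly two or three integers are skipped.
Shape parse_shape(const std::string& line)
{
    Shape shape;
    shape.name = line.substr(0, line.find(','));

    std::size_t open = 0;
    while ((open = line.find('[', open)) != std::string::npos) {
        const std::size_t close = line.find(']', open);
        if (close == std::string::npos) break;

        std::vector<int> values;
        std::size_t start = open + 1;
        while (start < close) {
            std::size_t comma = line.find(',', start);
            if (comma == std::string::npos || comma > close) comma = close;
            values.push_back(std::stoi(line.substr(start, comma - start)));
            start = comma + 1;
        }

        if (values.size() == 2 || values.size() == 3) {
            Point p;
            p.x = values[0];
            p.y = values[1];
            if (values.size() == 3) {
                p.z = values[2];
                p.has_z = true;
            }
            shape.points.push_back(p);
        }
        open = close + 1;
    }
    return shape;
}
```

After parsing each line once into a std::vector<Shape>, printing, filtering by name or doing arithmetic on the coordinates no longer touches the strings.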

A different approach, which I think is simpler, is to use regex pattern matching and searching. I think it is better suited to coordinate data of variable length and makes the strings easier to handle. std::regex_token_iterator can do what you need. It is (according to cppreference):

a read-only LegacyForwardIterator that accesses the individual sub-matches of every match of a regular expression within the underlying character sequence. It can also be used to access the parts of the sequence that were not matched by the given regular expression (e.g. as a tokenizer).

First of all, you can use a regex to get the coordinates in each shape string. The following regex matches a sequence beginning with [ and ending with ], capturing the text between these characters:

std::regex reg(R"(\[(.+?)\])");

Then, using the extracted string, we can tokenise it into individual coordinates. We use a regex for the delimiter ", " and pass -1 as the fourth parameter to std::sregex_token_iterator to get the text between the delimiters.
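The tokenising step on its own can be sketched as follows. The helper name split_coordinates is hypothetical, and it assumes the input is the text already captured from inside one pair of brackets, with fields separated by exactly ", ".

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: splits the text captured from inside one pair of
// brackets, e.g. "7, 12, 3", on the delimiter ", ".
std::vector<std::string> split_coordinates(const std::string& group)
{
    static const std::regex delimiter(", ");
    // Submatch index -1 selects the text between delimiter matches
    // instead of the matches themselves.
    std::sregex_token_iterator first(group.begin(), group.end(), delimiter, -1);
    std::sregex_token_iterator last; // a default-constructed iterator marks the end
    return std::vector<std::string>(first, last);
}
```

For example, split_coordinates("30, -120, 3") yields the three strings "30", "-120" and "3", which can be passed to std::stoi if numeric values are needed.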

I think this function does what you need:

#include <iostream>
#include <iterator>
#include <regex>
#include <string>
#include <map>
#include <vector>

namespace
{
    std::map<int, std::string> lookup = { {0, "x"}, {1, "y"}, {2, "z"} };
}

void PrintShape(const std::string &shape)
{
    std::regex reg(R"(\[(.+?)\])");

    size_t string_start = shape.find(",");
    std::cout << shape.substr(0, string_start) << ":" << "\t";

    auto start = std::sregex_iterator(shape.begin(), shape.end(), reg);
    auto end = std::sregex_iterator{};

    for (std::sregex_iterator it = start; it != end; ++it)
    {
        //Get the first capturing group: the text between [ and ]
        auto str = (*it)[1].str();

        //Tokenize group into x,y,z coordinates using delimiter ", "
        std::regex rgx(R"(, )");
        std::sregex_token_iterator iter(str.begin(), str.end(), rgx, -1);
        std::sregex_token_iterator iter_end{};

        //Print the coordinates
        int i = 0;
        std::cout << "[";
        for (; iter != iter_end; ++iter)
        {
            std::cout << lookup[i++] << " = " << *iter;

            if (std::next(iter) != iter_end)
            {
                std::cout << ", ";
            }
        }
        std::cout << "] ";
    }
    std::cout << "\n";
}

Here's a demo.
