
Expression: string iterator not dereferencable while using boost regex

I want to extract all the links from a page. When I execute this code, I get:

Microsoft Visual C++ Debug Library

Debug Assertion Failed!

Program: C:\\Users\\Gandalf\\Desktop\\proxy\\Debug\\Proxy.exe
File: C:\\Program Files\\Microsoft Visual Studio 10.0\\VC\\include\\xstring
Line: 78

Expression: string iterator not dereferencable

For information on how your program can cause an assertion failure, see the Visual C++ documentation on asserts.

(Press Retry to debug the application)

Abort Retry Ignore

void Deltacore::Client::get_links() {
    boost::smatch matches;
    boost::match_flag_type flags = boost::match_default;
    boost::regex URL_REGEX("^<a[^>]*(http://[^\"]*)[^>]*>([ 0-9a-zA-Z]+)</a>$");

    if(!response.empty()) {

        std::string::const_iterator alfa = this->response.begin();
        std::string::const_iterator omega = this->response.end();

        while (boost::regex_search(alfa, omega, matches, URL_REGEX))
        {
            std::cout << matches[0];
            //if(std::find(this->Links.begin(), this->Links.end(), matches[0]) != this->Links.end()) {
                this->Links.push_back(matches[0]);
            //}
            alfa = matches[0].second;
        }
    }
}

Any ideas?

Added more code:

    Deltacore::Client client;
    client.get_url(target);
    client.get_links();

    boost::property_tree::ptree props;
    for(size_t i = 0; i < client.Links.size(); i++)
        props.push_back(std::make_pair(boost::lexical_cast<std::string>(i), client.Links.at(i)));

    std::stringstream ss;
    boost::property_tree::write_json(ss, props, false);

    boost::asio::async_write(socket_,
        boost::asio::buffer(ss.str(), ss.str().length()),
        boost::bind(&session::handle_write, this,
            boost::asio::placeholders::error));

Thanks in advance.

The problem is on this line:

boost::asio::buffer(ss.str(), ss.str().length())

str() returns a temporary std::string object, so you are invalidating the buffer as soon as you create it: the temporary is destroyed at the end of the full expression, while the asynchronous write uses the buffer later. Vanilla UB, as I commented. ;-]

Token documentation citation:

The buffer is invalidated by any non-const operation called on the given string object.

Of course, destroying the string qualifies as a non-const operation.
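
One common way around this is to call str() once and give the resulting string an owner that outlives the write. The sketch below models that idea with the standard library only. BufferView, PendingWrite, start_write and bytes_seen_by_writer are hypothetical names made up for illustration; BufferView merely stands in for a non-owning buffer like the one boost::asio::buffer produces. In the real code, the assumption is that the shared_ptr would be stored in the session or bound into the completion handler so the string lives until handle_write runs.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <sstream>
#include <string>

// Hypothetical stand-in for a non-owning buffer: it records a pointer and a
// length but does not keep the underlying string alive.
struct BufferView {
    const char* data;
    std::size_t size;
};

BufferView make_view(const std::string& s) {
    return BufferView{s.data(), s.size()};
}

// Hypothetical pending write: it owns the payload for as long as the
// operation is outstanding, so the view never dangles.
struct PendingWrite {
    std::shared_ptr<const std::string> payload;  // keeps the bytes alive
    BufferView view;                             // points into *payload
};

PendingWrite start_write(const std::ostringstream& ss) {
    // Call str() exactly once and hand the copy to a long-lived owner,
    // instead of building the view from a temporary.
    auto payload = std::make_shared<const std::string>(ss.str());
    return PendingWrite{payload, make_view(*payload)};
}

// Simulates the writer reading the bytes some time after start_write returned.
std::string bytes_seen_by_writer(const PendingWrite& w) {
    assert(w.payload && "the payload must outlive the view");
    return std::string(w.view.data, w.view.size);
}
```

Because the view is built from the string held by the shared_ptr, it stays valid even if the stream is cleared or destroyed afterwards. A plain std::string member of the session object would work equally well, as long as it is not modified while the write is in flight.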

Skipping the lecture on using regex to parse HTML (and how you really shouldn't...), your regex doesn't look like it will work the way you intend. This is yours:

"^<a[^>]*(http://[^\"]*)[^>]*>([ 0-9a-zA-Z]+)</a>$"

The first character class is greedy, so it initially consumes your http and everything after it up to the closing >. The engine then has to backtrack, and if the tag contains more than one http://, the capture lands on the last one. Add a question mark to make it non-greedy:

"^<a[^>]*?(http://[^\"]*)[^>]*>([ 0-9a-zA-Z]+)</a>$"

This might or might not be related to the exception.
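
To see what the extra question mark changes, here is a minimal sketch that compares the two patterns on a tag containing two URLs. It rests on a few assumptions: it uses std::regex (ECMAScript grammar) instead of boost::regex, it drops the ^ and $ anchors so the comparison does not depend on line handling, and first_capture, kGreedy and kLazy are hypothetical names introduced only for this example.

```cpp
#include <cassert>
#include <regex>
#include <string>

// The pattern from the question, without the ^ and $ anchors.
const std::string kGreedy =
    R"re(<a[^>]*(http://[^"]*)[^>]*>([ 0-9a-zA-Z]+)</a>)re";

// The same pattern with a non-greedy first character class.
const std::string kLazy =
    R"re(<a[^>]*?(http://[^"]*)[^>]*>([ 0-9a-zA-Z]+)</a>)re";

// Returns the first capture group of the first match, or an empty string
// when the pattern does not match at all.
std::string first_capture(const std::string& pattern, const std::string& html) {
    const std::regex re(pattern);
    assert(re.mark_count() >= 1 && "pattern needs at least one capture group");
    std::smatch m;
    if (std::regex_search(html, m, re)) {
        return m[1].str();
    }
    return std::string();
}
```

For an input such as <a href="http://a.example/" title="http://b.example/">Link</a>, the greedy pattern captures http://b.example/ and the non-greedy one captures http://a.example/. Both patterns still match, so the greedy version is not what triggers the assertion; it just picks a different URL.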
