How to get IHTMLDocument2->get_body->get_innerHTML into a lowercase string?

I am trying to get the innerHTML of a web page's body in C++. I have this so far:

// I get "Document" from a parameter when calling this code
BSTR bstrContent = NULL;
IHTMLElement *p = 0;
Document->get_body( &p );

if( p )
{
    p->get_innerHTML( &bstrContent );
    p->Release();
}

Now I need to turn bstrContent into a lowercase std::string or LPSTR. I've tried this:

LPSTR pagecontent = NULL;

int responseLength = (int)wcslen(bstrContent);
pagecontent = new CHAR[ responseLength + 1 ];
wcstombs( pagecontent, bstrContent, responseLength);

But "pagecontent" does not always contain the full innerHTML, only the first chunk. Even if it worked, I don't know how to easily make it all lowercase; with a std::string I'd use "transform" + "tolower" to do it.

So, how can I turn bstrContent into a std::string?

I'm not sure I fully understand your question. I don't know of any reason why get_innerHTML would give you an incomplete body, but you can convert a BSTR to a std::string (assuming you don't need to support Unicode, in which case you should be using a std::wstring anyway) with a function found on the following page:

http://www.codeguru.com/forum/showthread.php?t=275978

If you're using ATL there is also the CW2A conversion utility (the wide-to-ANSI counterpart of CA2W), but the function I linked you to is better since it'll at least support UTF-8 if relevant.
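As a rough illustration of the conversion discussed here, the sketch below lowercases the wide text first and only then narrows it. It treats the BSTR as a plain null-terminated `wchar_t*` (true on Windows, though a BSTR can also carry embedded nulls, which this ignores). The helper names `to_lower_wide` and `narrow` are made up for this example. One plausible cause of the truncated result in the question is that `wcstombs` stops at the first character it cannot represent in the current locale, and that its third argument is a byte limit, so no terminator is written when the limit is reached. The sketch sizes the buffer by first calling `wcstombs` with a null destination, a widely supported extension (POSIX and MSVC) rather than a guarantee of the C standard.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdlib>
#include <cwctype>
#include <stdexcept>
#include <string>

// Hypothetical helper: copy a null-terminated wide string (such as a BSTR)
// into a std::wstring and lowercase it with std::towlower.
std::wstring to_lower_wide(const wchar_t* text) {
    std::wstring result = text ? text : L"";
    std::transform(result.begin(), result.end(), result.begin(),
                   [](wchar_t c) { return static_cast<wchar_t>(std::towlower(c)); });
    return result;
}

// Hypothetical helper: narrow a wide string using the current C locale.
// Throws if a character cannot be represented instead of silently truncating.
std::string narrow(const std::wstring& wide) {
    // Assumption: a null destination makes wcstombs report the required
    // byte count (POSIX / MSVC behaviour).
    const std::size_t needed = std::wcstombs(nullptr, wide.c_str(), 0);
    if (needed == static_cast<std::size_t>(-1)) {
        throw std::runtime_error("character not representable in current locale");
    }
    std::string out(needed, '\0');
    std::wcstombs(&out[0], wide.c_str(), needed);
    return out;
}
```

With these helpers, something like `std::string page = narrow(to_lower_wide(bstrContent));` would replace the manual buffer handling; remember to release the BSTR with `SysFreeString` afterwards. If the page can contain non-ASCII text, keeping the `std::wstring` and skipping `narrow` avoids the locale issue altogether.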

Hope that helps,

  • Taxilian

std::transform works fine if you have a start pointer and an end pointer, too. It works on anything that behaves like a sequence iterator (regular pointers qualify).
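A minimal sketch of that idea, assuming the converted page is sitting in a null-terminated `char` buffer such as the question's `pagecontent`; the function name `lowercase_in_place` is invented for illustration:

```cpp
#include <algorithm>
#include <cctype>
#include <cstring>

// Hypothetical helper: lowercase a null-terminated char buffer in place,
// passing raw pointers to std::transform as the begin/end iterators.
void lowercase_in_place(char* buffer) {
    char* end = buffer + std::strlen(buffer);
    std::transform(buffer, end, buffer, [](unsigned char c) {
        // Taking the argument as unsigned char avoids undefined behaviour
        // in std::tolower for negative char values.
        return static_cast<char>(std::tolower(c));
    });
}
```

The same pattern should work directly on the wide buffer by swapping in `wchar_t`, `std::wcslen`, and `std::towlower`.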
