How do I convert IHTMLDocument2->get_body->get_innerHTML to a lowercase string?

Question

I am trying to get the innerHTML of a webpage body in C++. I have this so far:

// I get "Document" from a parameter when calling this code
BSTR bstrContent = NULL;
IHTMLElement *p = 0;
Document->get_body( &p );

if( p )
{
    p->get_innerHTML( &bstrContent );
    p->Release();
}

Now I need to turn bstrContent into a lowercase std::string or LPSTR. I've tried this:

LPSTR pagecontent = NULL;

int responseLength = (int)wcslen(bstrContent);
pagecontent = new CHAR[ responseLength + 1 ];
wcstombs( pagecontent, bstrContent, responseLength);

But "pagecontent" does not always contain the full innerHTML, only a first chunk. Even if it worked, I don't know how to easily make it all lowercase; with a std::string I'd use "transform" + "tolower" to do it.

So, how can I turn bstrContent into a std::string?

Answer 1

I'm not sure I fully understand your question. I don't know of any reason why get_innerHTML would give you an incomplete body, but you can convert a BSTR to a std::string (assuming you don't need to support Unicode; if you do, you should be using a std::wstring anyway) using a function found on the following page:

http://www.codeguru.com/forum/showthread.php?t=275978

If you're using ATL there is also the CW2A conversion utility, but the function I linked to is better since it'll at least support UTF-8 if relevant.
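As a rough illustration of such a conversion using only the standard library, here is a minimal sketch. The helper name `to_lower_narrow` is hypothetical, and it assumes the BSTR contents can be treated as a null-terminated `wchar_t` string and that every character is representable in the current C locale. It lowercases while the text is still wide, then sizes the output buffer by `MB_CUR_MAX` and checks the return value of `std::wcstombs`.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdlib>
#include <cwctype>
#include <stdexcept>
#include <string>

// Hypothetical helper: converts a null-terminated wide string (for example the
// contents of a BSTR) to a lowercase narrow std::string in the current C locale.
// Throws if a character cannot be represented in that locale.
std::string to_lower_narrow(const wchar_t* wide) {
    if (wide == nullptr) {
        return std::string();  // a NULL BSTR is treated as an empty string
    }

    // Lowercase while the text is still wide, so multibyte sequences are never split.
    std::wstring lowered(wide);
    std::transform(lowered.begin(), lowered.end(), lowered.begin(),
                   [](wchar_t c) { return static_cast<wchar_t>(std::towlower(c)); });

    // Each wide character may need up to MB_CUR_MAX bytes in the narrow encoding.
    std::string narrow(lowered.size() * MB_CUR_MAX, '\0');
    std::size_t written = std::wcstombs(&narrow[0], lowered.c_str(), narrow.size());
    if (written == static_cast<std::size_t>(-1)) {
        throw std::runtime_error("character not representable in the current locale");
    }

    narrow.resize(written);  // drop the unused tail of the buffer
    return narrow;
}
```

In the question's snippet the byte limit passed to `wcstombs` equals the character count and the return value is ignored, so a character that cannot be converted in the current locale may be one reason only a first chunk appears. On Windows, `WideCharToMultiByte` with `CP_UTF8` is the usual alternative when the page contains non-ASCII text.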

Hope that helps,

Taxilian

Answer 2

std::transform works fine if you have a start pointer and an end pointer, too. It works on anything that behaves like a sequence iterator (regular pointers qualify).

Answer 1: 2011-01-15 03:15:09
Answer 2: 2011-01-15 03:33:29
