简体   繁体   English

使用VBA将HTML表转换为Excel

[英]Convert HTML-table to Excel using VBA

Convert HTML-table to Excel 将HTML表转换为Excel

The code below fetches the HTML-table at https://rasmusrhl.github.io/stuff , and converts it to Excel-format. 下面的代码在https://rasmusrhl.github.io/stuff上获取HTML表,并将其转换为Excel格式。

The problem is that: 问题是:

  • Numbers in parentheses are converted to negative numbers 括号中的数字将转换为负数
  • Numbers are rounded or truncated 数字被舍入或截断

Solution

Thank you all for your great contributions. 谢谢大家的贡献。 The varied anwers helped me understand, that for my purposes a workaround was the best solution: Because I generate the HTML-tables myself, I can control the CSS of each cell. 各种各样的anwers帮助我理解,为了我的目的,解决方法是最好的解决方案:因为我自己生成HTML表,我可以控制每个单元格的CSS。 CSS codes exists that instruct Excel how to interpret cell contents: http://cosicimiento.blogspot.dk/2008/11/styling-excel-cells-with-mso-number.html , also explained in this question: Format HTML table cell so that Excel formats as text? 存在CSS代码,指示Excel如何解释单元格内容: http//cosicimiento.blogspot.dk/2008/11/styling-excel-cells-with-mso-number.html ,也在此问题中进行了解释: 格式化HTML表格单元格以便Excel格式化为文本?

In my case the CSS should be text, which is mso-number-format:\\"\\\\@\\" . 在我的例子中,CSS应该是文本,它是mso-number-format:\\"\\\\@\\" It is integrated in R code below: 它集成在下面的R代码中:

library(htmlTable)
library(nycflights13)
library(dplyr)

nycflights13::planes %>% 
    slice(1:10) %>% mutate( seats = seats*1.0001,
                            s1    = c("1-5", "5-10", "1/2", "1/10", "2-3", "1", "1.0", "01", "01.00", "asfdkjlæ" ),
                            s2    = c("(10)", "(12)", "(234)", "(00)", "(01)", "(098)", "(01)", "(01.)", "(001.0)", "()" )) -> df 


rle_man <- rle(df$manufacturer)

css_matrix <- matrix( data = "mso-number-format:\"\\@\"", nrow = nrow(df), ncol = ncol(df))
css_matrix[,1] <- "padding-left: 0.4cm;mso-number-format:\"\\@\""
css_matrix[,2:10] <- "padding-left: 1cm;mso-number-format:\"\\@\""
css_matrix[,5] <- "padding-left: 2cm;mso-number-format:\"\\@\""


htmlTable( x = df,  
           rgroup   = rle_man$values, n.rgroup = rle_man$lengths, 
           rnames   = FALSE, align = c("l", "r" ), 
           cgroup   =  rbind(  c("", "Some text goes here. It is long and does not break", "Other text goes here", NA),
                               c( "", "Machine type<br>(make)", "Specification of machine", "Other variables")),
           n.cgroup = rbind(   c(1,8,2, NA),
                               c(1, 3, 5, 2)), 
           css.cell = css_matrix )            -> html_out

temp_file <- tempfile( pattern = "table", fileext = ".html" )
readr::write_file( x = html_out, path = temp_file)
utils::browseURL( temp_file)

That HTML-file can be dragged and dropped into Excel with all cells interpreted as text. 可以将HTML文件拖放到Excel中,并将所有单元格解释为文本。 Note, only dragging-and-dropping the html-file into excel works, it does not work to open the table in a browser and copy-pasting it into excel. 请注意,只将html文件拖放到excel中,才能在浏览器中打开表并将其复制粘贴到excel中。

The only thing missing from this method is the horizontal lines, but I can live with that. 这种方法唯一缺少的是水平线,但我可以忍受。

Below is VBA with the same effect as dragging and dropping: 下面是VBA,与拖放效果相同:

Sub importhtml()
'
' importhtml Macro
'

'
With ActiveSheet.QueryTables.Add(Connection:= _
                                 "URL;file:///C:/Users/INSERTUSERNAME/Desktop/table18b85c0a20f3html.HTML", Destination:=Range("$a$1"))

.FieldNames = True
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = False
.RefreshOnFileOpen = False
.BackgroundQuery = True
.RefreshStyle = xlInsertDeleteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = True
.RefreshPeriod = 0
.WebSelectionType = xlEntirePage
.WebFormatting = xlWebFormattingAll
.WebPreFormattedTextToColumns = True
.WebConsecutiveDelimitersAsOne = True
.WebSingleBlockTextImport = False
.WebDisableDateRecognition = True
.WebDisableRedirections = False
.Refresh BackgroundQuery:=False
End With

End Sub

For a client side solution 对于客户端解决方案

So run this code after the first block of code, it rewrites the final two columns. 因此,在第一个代码块之后运行此代码,它会重写最后两列。

Sub Test2()
    '* tools references ->
    '*   Microsoft HTML Object Library


    Dim oHtml4 As MSHTML.IHTMLDocument4
    Set oHtml4 = New MSHTML.HTMLDocument

    Dim oHtml As MSHTML.HTMLDocument
    Set oHtml = Nothing

    '* IHTMLDocument4.createDocumentFromUrl
    '* MSDN - IHTMLDocument4 createDocumentFromUrl method - https://msdn.microsoft.com/en-us/library/aa752523(v=vs.85).aspx
    Set oHtml = oHtml4.createDocumentFromUrl("https://rasmusrhl.github.io/stuff/", "")
    While oHtml.readyState <> "complete"
        DoEvents  '* do not comment this out it is required to break into the code if in infinite loop
    Wend
    Debug.Assert oHtml.readyState = "complete"


    Dim oTRs As MSHTML.IHTMLDOMChildrenCollection
    Set oTRs = oHtml.querySelectorAll("TR")
    Debug.Assert oTRs.Length = 17

    Dim lRowNum As Long
    For lRowNum = 3 To oTRs.Length - 1

        Dim oTRLoop As MSHTML.HTMLTableRow
        Set oTRLoop = oTRs.Item(lRowNum)
        If oTRLoop.ChildNodes.Length > 1 Then

            Debug.Assert oTRLoop.ChildNodes.Length = 14

            Dim oSecondToLastColumn As MSHTML.HTMLTableCell
            Set oSecondToLastColumn = oTRLoop.ChildNodes.Item(12)

            ActiveSheet.Cells(lRowNum + 2, 13).Value2 = "'" & oSecondToLastColumn.innerText


            Dim oLastColumn As MSHTML.HTMLTableCell
            Set oLastColumn = oTRLoop.ChildNodes.Item(13)

            ActiveSheet.Cells(lRowNum + 2, 14).Value2 = "'" & oLastColumn.innerText

        End If
        'Stop

    Next lRowNum

    ActiveSheet.Columns("M:M").EntireColumn.AutoFit
    ActiveSheet.Columns("N:N").EntireColumn.AutoFit


End Sub

For a Server Side Solution 对于服务器端解决方案

Now that we know you control the source script and that it is in R then one can change the R script to style the final columns with mso-number-format:'\\@' . 现在我们知道您控制源脚本并且它在R中,然后可以更改R脚本以使用mso-number-format设置最终列的样式:'\\ @'。 Here is a sample R script that achieves this, one builds a CSS matrix of the same dimensions as the data and passes the CSS matrix as a parameter into htmlTable . 下面是一个实现此目的的示例R脚本,一个构建与数据尺寸相同的CSS矩阵,并将CSS矩阵作为参数传递给htmlTable I have not tampered with your R source instead I give here a simple illustration for you to interpret. 我没有篡改你的R源代替我在这里给你一个简单的插图供你解释。

A=matrix(c("(2)","(4)","(3)","(1)","(5)","(7)"),nrow=2,ncol=3,byrow=TRUE)
css_matrix <- matrix(data="",nrow=2,ncol=3)
css_matrix[,3] <- "mso-number-format:\"\\@\""
htmlTable(x=A,css.cell=css_matrix)

Opening in Excel I get this 在Excel中打开我得到了这个 在此输入图像描述

Robin Mackenzie adds 罗宾麦肯齐补充道

you might mention in your server-side solution that OP just needs to add css_matrix[,10:11] <- "mso-number-format:\\"\\@\\"" to their existing R code (after the last css_matrix... line) and it will implement your solution for their specific problem 您可能会在服务器端解决方案中提到OP只需要将css_matrix [,10:11] < - “mso-number-format:\\”\\ @ \\“”添加到他们现有的R代码中(在最后一个css_matrix之后) 。)它将针对他们的具体问题实施您的解决方案

Thanks Robin 谢谢罗宾

To get the tabular data (keeping the format as it is) from that page, you can try like below: 要从该页面获取表格数据(保持格式不变),您可以尝试如下:

 Sub Fetch_Data()
    Dim http As New XMLHTTP60, html As New HTMLDocument
    Dim posts As Object, post As Object, elem As Object
    Dim row As Long, col As Long

    With http
        .Open "GET", "https://rasmusrhl.github.io/stuff/", False
        .send
        html.body.innerHTML = .responseText
    End With

    Set posts = html.getElementsByClassName("gmisc_table")(0)

    For Each post In posts.Rows
        For Each elem In post.Cells
            col = col + 1: Cells(row + 1, col).NumberFormat = "@": Cells(row + 1, col) = elem.innerText
        Next elem
        col = 0
        row = row + 1
    Next post
End Sub

Reference to add to the library: 参考添加到库:

1. Microsoft HTML Object Library
2. Microsoft XML, v6.0  'or whatever version you have

This is how that portion looks like when get parsed. 这是解析时该部分的样子。 在此输入图像描述

This works with a temp file. 这适用于临时文件。

What it does: Downloads Data Locally. 它的作用:本地下载数据。 Then, replaces the "(" with a "\\". Then, imports the data. Formats the data as text (to ensure we can change it back without error). Then, changes the text. This cannot be done with Range.Replace because that will reformat the cell contents. 然后,用“\\”替换“(”。然后,导入数据。将数据格式化为文本(以确保我们可以无错误地将其更改)。然后,更改文本。这不能用Range.Replace完成因为这将重新格式化单元格内容。

' Local Variables
Public FileName As String ' Temp File Path
Public FileUrl As String ' Url Formatted Temp File Path
Public DownloadUrl As String ' Where We're Going to Download From

' Declares Have to Be At Top
Private Declare Function GetTempPath Lib "kernel32" _
  Alias "GetTempPathA" _
  (ByVal nBufferLength As Long, _
  ByVal lpBuffer As String) As Long
Private Declare Function GetTempFileName Lib "kernel32" _
  Alias "GetTempFileNameA" _
  (ByVal lpszPath As String, _
  ByVal lpPrefixString As String, _
  ByVal wUnique As Long, _
  ByVal lpTempFileName As String) As Long

' Loads the HTML Content Without Bug
Sub ImportHtml()

    ' Set Our Download URL
    DownloadUrl = "https://rasmusrhl.github.io/stuff"

    ' Sets the Temporary File Path
    SetFilePath

    ' Downloads the File
    DownloadFile

    ' Replaces the "(" in the File With "\(", We Will Later Put it Back
    ' This Ensures Formatting of Content Isn't Modified!!!
    ReplaceStringInFile


    ' Our Query Table is Now Coming From the Local File, Instead
    Dim s As QueryTable
    Set s = ActiveSheet.QueryTables.Add(Connection:=("FINDER;file://" + FileUrl), Destination:=Range("$A$1"))

    With s

        .Name = "stuff"
        .FieldNames = True
        .RowNumbers = False
        .FillAdjacentFormulas = False
        .PreserveFormatting = False
        .RefreshOnFileOpen = False
        .BackgroundQuery = True
        .RefreshStyle = xlInsertDeleteCells
        .SavePassword = False
        .SaveData = True
        .AdjustColumnWidth = True
        .RefreshPeriod = 0
        .WebSelectionType = xlEntirePage
        .WebFormatting = xlWebFormattingAll
        .WebPreFormattedTextToColumns = True
        .WebConsecutiveDelimitersAsOne = True
        .WebSingleBlockTextImport = False
        .WebDisableDateRecognition = True
        .WebDisableRedirections = False
        .Refresh BackgroundQuery:=False

        ' Sets Formatting So When We Change Text the Data Doesn't Change
        .ResultRange.NumberFormat = "@"

        ' Loop Through Cells in Range
        ' If You Do Excel Replace, Instead It Will Change Cell Format
        Const myStr As String = "\(", myReplace As String = "("
        For Each c In .ResultRange.Cells
            Do While c.Value Like "*" & myStr & "*"
                c.Characters(InStr(1, c.Value, myStr), Len(myStr)).Text = myReplace
            Loop
        Next


    End With
End Sub

' This function replaces the "(" in the file with "\("
Sub ReplaceStringInFile()

    Dim sBuf As String
    Dim sTemp As String
    Dim iFileNum As Integer
    Dim sFileName As String

    ' Edit as needed
    sFileName = FileName

    iFileNum = FreeFile
    Open sFileName For Input As iFileNum

    Do Until EOF(iFileNum)
        Line Input #iFileNum, sBuf
        sTemp = sTemp & sBuf & vbCrLf
    Loop
    Close iFileNum

    sTemp = Replace(sTemp, "(", "\(")

    iFileNum = FreeFile
    Open sFileName For Output As iFileNum
    Print #iFileNum, sTemp
    Close iFileNum

End Sub

' This function sets file paths because we need a temp file
Function SetFilePath()

    If FileName = "" Then
        FileName = GetTempHtmlName
        FileUrl = Replace(FileName, "\", "/")
    End If

End Function

' This subroutine downloads the file from the specified URL
' The download is necessary because we will be editing the file
Sub DownloadFile()

    Dim myURL As String
    myURL = "https://rasmusrhl.github.io/stuff"

    Dim WinHttpReq As Object
    Set WinHttpReq = CreateObject("Microsoft.XMLHTTP")
    WinHttpReq.Open "GET", DownloadUrl, False, "username", "password"
    WinHttpReq.send

    myURL = WinHttpReq.responseBody
    If WinHttpReq.Status = 200 Then
        Set oStream = CreateObject("ADODB.Stream")
        oStream.Open
        oStream.Type = 1
        oStream.Write WinHttpReq.responseBody
        oStream.SaveToFile FileName, 2 ' 1 = no overwrite, 2 = overwrite
        oStream.Close
    End If

End Sub

'''''''''''''''''''''''''''''
' THIS BLOCK OF CODE GETS A TEMPORARY FILE PATH USING THE GetTempHtmlName Function
'''''''''''''''''''''''''''''


Public Function GetTempHtmlName( _
  Optional sPrefix As String = "VBA", _
  Optional sExtensao As String = "") As String
  Dim sTmpPath As String * 512
  Dim sTmpName As String * 576
  Dim nRet As Long
  Dim F As String
  nRet = GetTempPath(512, sTmpPath)
  If (nRet > 0 And nRet < 512) Then
    nRet = GetTempFileName(sTmpPath, sPrefix, 0, sTmpName)
    If nRet <> 0 Then F = Left$(sTmpName, InStr(sTmpName, vbNullChar) - 1)
    If sExtensao > "" Then
      Kill F
      If Right(F, 4) = ".tmp" Then F = Left(F, Len(F) - 4)
      F = F & sExtensao
    End If
    F = Replace(F, ".tmp", ".html")
    GetTempHtmlName = F
  End If
End Function

'''''''''''''''''''''''''''''
' End - GetTempHtmlName
'''''''''''''''''''''''''''''

You may give this a try to see if you get the desired output... 你可以尝试一下,看看你是否得到了想要的输出......

Sub GetWebData()
Dim IE As Object
Dim doc As Object
Dim TRs As Object
Dim TR As Object
Dim Cell As Object
Dim r As Long, c As Long

Application.ScreenUpdating = False

Set IE = CreateObject("InternetExplorer.Application")
IE.Visible = False
IE.navigate "https://rasmusrhl.github.io/stuff/"
Do While IE.Busy Or IE.readyState <> 4
    DoEvents
Loop
Set doc = IE.document

Set TRs = doc.getElementsByTagName("tr")
Cells.Clear

For Each TR In TRs
    r = r + 1
    For Each Cell In TR.Children
        c = c + 1
        Cells(r, c).NumberFormat = "@"
        Cells(r, c) = Cell.innerText
    Next Cell
    c = 0
Next TR
IE.Quit
Columns.AutoFit
Application.ScreenUpdating = True
End Sub

Solution 2: 解决方案2:

To make it work, you need to add the following two references by going to Tools (on VBA Editor) --> References and then find the below mentioned two references and check the checkboxes for them and click OK. 要使其工作,您需要通过转到工具(在VBA编辑器上) - >引用添加以下两个引用,然后找到下面提到的两个引用并选中它们的复选框,然后单击确定。

1) Microsoft XML, v6.0 (find the max version available) 1)Microsoft XML,v6.0(找到最大版本)

2) Microsoft HTML Object Library 2)Microsoft HTML对象库

Sub GetWebData2()
Dim XMLpage As New MSXML2.XMLHTTP60
Dim doc As New MSHTML.HTMLDocument
Dim TRs As IHTMLElementCollection
Dim TR As IHTMLElement
Dim Cell As IHTMLElement
Dim r As Long, c As Long

Application.ScreenUpdating = False

Set XMLpage = CreateObject("MSXML2.XMLHTTP")

XMLpage.Open "GET", "https://rasmusrhl.github.io/stuff/", False
XMLpage.send
doc.body.innerhtml = XMLpage.responsetext
Set TRs = doc.getElementsByTagName("tr")
Set TRs = doc.getElementsByTagName("tr")
Cells.Clear

For Each TR In TRs
    r = r + 1
    For Each Cell In TR.Children
        c = c + 1
        Cells(r, c).NumberFormat = "@"
        Cells(r, c) = Cell.innerText
    Next Cell
    c = 0
Next TR
Columns.AutoFit
Application.ScreenUpdating = True
End Sub
<style type=text/css>
    td {mso-number-format: '\@';}
</style>
<table ...

Putting the above global style definition for the cells ( <td> s) on the output you generate using R or rewriting the document on the client side like below just works. 将单元格( <td> s)的上述全局样式定义放在使用R生成的输出上, 或者在客户端重写文档,如下所示。

Sub importhtml()
    '*********** HTML document rewrite process ***************
    Const TableUrl = "https://rasmusrhl.github.io/stuff"

    Const adTypeBinary = 1, adSaveCreateOverWrite = 2, TemporaryFolder = 2
    Dim tempFilePath, binData() As Byte

    With CreateObject("Scripting.FileSystemObject")
        tempFilePath = .BuildPath(.GetSpecialFolder(TemporaryFolder), .GetTempName() & ".html")
    End With

    'download HTML document
    With CreateObject("MSXML2.ServerXMLHTTP")
        .Open "GET", TableUrl, False
        .Send
        If .Status <> 200 Then Err.Raise 3, "importhtml", "200 expected"
        binData = .ResponseBody
    End With

    With CreateObject("Adodb.Stream")
        .Charset = "x-ansi"
        .Open
        .WriteText "<style type=text/css>td {mso-number-format:'\@';}</style>"
        .Position = 0 'move to start
        .Type = adTypeBinary 'change stream type
        .Position = .Size 'move to end
        .Write binData 'append binary data end of stream
        .SaveToFile tempFilePath, adSaveCreateOverWrite 'save temporary file
        .Close
    End With
    '*********** HTML document rewrite process ***************

    With ActiveSheet.QueryTables.Add(Connection:= _
        "URL;" & tempFilePath, Destination:=Range("$A$1"))
        'load HTML document from rewritten local copy

        .Name = "stuff"
        .FieldNames = True
        .RowNumbers = False
        .FillAdjacentFormulas = False
        .PreserveFormatting = False
        .RefreshOnFileOpen = False
        .BackgroundQuery = True
        .RefreshStyle = xlInsertDeleteCells
        .SavePassword = False
        .SaveData = True
        .AdjustColumnWidth = True
        .RefreshPeriod = 0
        .WebSelectionType = xlEntirePage
        .WebFormatting = xlWebFormattingAll
        .WebPreFormattedTextToColumns = True
        .WebConsecutiveDelimitersAsOne = True
        .WebSingleBlockTextImport = False
        .WebDisableDateRecognition = True
        .WebDisableRedirections = False
        .Refresh BackgroundQuery:=False

    End With

    Kill tempFilePath
End Sub

Try this, to import the data as a table: 试试这个,将数据导入表:

Sub ImportDataAsTable()
    ActiveWorkbook.Queries.Add Name:="Table 0", Formula:= _
        "let" & Chr(13) & "" & Chr(10) & "    Source = Web.Page(Web.Contents(""https://rasmusrhl.github.io/stuff/""))," & Chr(13) & "" & Chr(10) & "    Data0 = Source{0}[Data]," & Chr(13) & "" & Chr(10) & "    #""Changed Type"" = Table.TransformColumnTypes(Data0,{{""tailnum"", type text}, {"""", type text}, {""Some text goes here. It is long and does not break Machine type (make) year"", type text}, {""Some text goes here. It is long and does not break Mach" & _
        "ine type (make) type"", type text}, {""Some text goes here. It is long and does not break Machine type (make) manufacturer"", type text}, {""Some text goes here. It is long and does not break"", type text}, {""Some text goes here. It is long and does not break Specification of machine model"", type text}, {""Some text goes here. It is long and does not break Specifi" & _
        "cation of machine engines"", type text}, {""Some text goes here. It is long and does not break Specification of machine seats"", type text}, {""Some text goes here. It is long and does not break Specification of machine speed"", type text}, {""Some text goes here. It is long and does not break Specification of machine engine"", type text}, {""2"", type text}, {""Oth" & _
        "er text goes here Other variables s1"", type text}, {""Other text goes here Other variables s2"", type text}})" & Chr(13) & "" & Chr(10) & "in" & Chr(13) & "" & Chr(10) & "    #""Changed Type"""
    ActiveWorkbook.Worksheets.Add
    With ActiveSheet.ListObjects.Add(SourceType:=0, Source:= _
        "OLEDB;Provider=Microsoft.Mashup.OleDb.1;Data Source=$Workbook$;Location=""Table 0"";Extended Properties=""""" _
        , Destination:=Range("$A$1")).QueryTable
        .CommandType = xlCmdSql
        .CommandText = Array("SELECT * FROM [Table 0]")
        .RowNumbers = False
        .FillAdjacentFormulas = False
        .PreserveFormatting = True
        .RefreshOnFileOpen = False
        .BackgroundQuery = True
        .RefreshStyle = xlInsertDeleteCells
        .SavePassword = False
        .SaveData = True
        .AdjustColumnWidth = True
        .RefreshPeriod = 0
        .PreserveColumnInfo = True
        .ListObject.DisplayName = "Table_0"
        .Refresh BackgroundQuery:=False
    End With
End Sub

Processing the HTML and then Copying and Pasting it into Excel will 处理HTML,然后将其复制并粘贴到Excel中

Here are the steps I used: 以下是我使用的步骤:

  • CreateObject("MSXML2.XMLHTTP") : get the URL's responseText CreateObject("MSXML2.XMLHTTP") :获取URL的responseText
  • CreateObject("HTMLFile") : create a HTML Document from responseText CreateObject("HTMLFile") :从responseText创建一个HTML文档
  • Replace grey with black to darken the borders 用黑色替换灰色以使边框变暗
  • Prefix columns s1 and s2 with @ to preserve formatting 使用@前缀列s1和s2以保留格式
  • Copy the HTML to the Windows Clipboard 将HTML复制到Windows剪贴板
    • Note: The HTML need to enclosed in HTML and Body tags to paste properly 注意:HTML需要包含在HTML和Body标签中才能正确粘贴
  • Setup the destination Worksheet 设置目标工作表
  • Paste the HTML into the Worksheet 将HTML粘贴到工作表中
  • Replace the @ sign with ' 更换@带符号'
    • Note: This preserves the formatting by storing the data as text 注意:这通过将数据存储为文本来保留格式
  • Finish formatting the Worksheet 完成格式化工作表

在此输入图像描述


Sub LoadTable()
    Const URL = "https://rasmusrhl.github.io/stuff/"
    Dim x As Long
    Dim doc As Object, tbl As Object, rw As Object
    With CreateObject("MSXML2.XMLHTTP")
        .Open "GET", URL, False
        .send
        If .readyState = 4 And .Status = 200 Then
            Set doc = CreateObject("HTMLFile")
            doc.body.innerHTML = .responseText
            doc.body.innerHTML = Replace(doc.body.innerHTML, "grey", "black")
            Set tbl = doc.getElementsByTagName("TABLE")(0)

            For x = 0 To tbl.Rows.Length - 1
                Set rw = tbl.Rows(x)

                If rw.Cells.Length = 14 Then
                    'If InStr(rw.Cells(12).innerText, "-") Or InStr(rw.Cells(12).innerText, "/") Then
                    rw.Cells(12).innerText = "@" & rw.Cells(12).innerText
                    rw.Cells(13).innerText = "@" & rw.Cells(13).innerText
                End If
            Next

            With CreateObject("new:{1C3B4210-F441-11CE-B9EA-00AA006B1A69}")
                .SetText "<html><body>" & doc.body.innerHTML & "</body></html>"
                .PutInClipboard
            End With

            With Worksheets("Sheet1")
                .Cells.Clear
                .Range("A1").PasteSpecial
                .Cells.Interior.Color = vbWhite
                .Cells.WrapText = False
                .Columns.AutoFit
                .Columns("M:N").Replace What:="@", Replacement:="'"
            End With

        Else
            MsgBox "URL:  " & vbCrLf & "Ready state: " & .readyState & vbCrLf & "HTTP request status: " & .Status, vbInformation, "URL Not Responding"
        End If
    End With
End Sub

Based on the documentation from Microsoft MSDN Library: WebFormatting Property you could try the below change to your code: 根据Microsoft MSDN Library:WebFormatting Property中的文档,您可以尝试以下对代码的更改:

 .WebFormatting = xlWebFormattingNone

This may allow the data to be copied without any number formatting - then you can set your own number format for those cells (using MSDN: Excel VBA NumberFormat property ) 这可能允许复制数据而不进行任何数字格式化 - 然后您可以为这些单元格设置自己的数字格式(使用MSDN:Excel VBA NumberFormat属性

A similar solution should solve the issue with numbers being truncated or rounding - set the decimal points for the affected cells in your target range... 类似的解决方案应解决数字被截断或舍入的问题 - 设置目标范围内受影响单元格的小数点...

With the url https://rasmusrhl.github.io/stuff , it's by luck that Excel can simply just open it directly and save as .xlsx (how come no one try this before the tedious process). 使用网址https://rasmusrhl.github.io/stuff ,幸运的是Excel可以直接打开它并保存为.xlsx(在繁琐的过程之前怎么没有人尝试这个)。 If direct open fails, all other methods here are great option! 如果直接打开失败,这里的所有其他方法都是很好的选择!

Option Explicit

Sub OpenWebFile()
    Const URL As String = "https://rasmusrhl.github.io/stuff"
    Dim oWB As Workbook
    On Error Resume Next
    Set oWB = Workbooks.Open(Filename:=URL, ReadOnly:=True)
    If oWB Is Nothing Then
        MsgBox "Cannot open the url " & URL, vbExclamation + vbOKOnly, "ERR " & Err.Number & ":" & Err.Description
        Err.Clear
    Else
        ' Change to your desired path and filename
        oWB.SaveAs Filename:="C:\Test\stuff.xlsx", FileFormat:=xlOpenXMLWorkbook
        Set oWB = Nothing
    End If
End Sub

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM