Very large Excel file - how to copy data between sheets?

I need to import some CSV files into Excel 2010 and create a very simple, but very large, database.
The whole thing will be five columns and thousands of rows.
The VBA is also simple: copy data from one sheet to another, and vice versa.
But I need to care about memory requirements, because the file could become very large.

Dim ws1 As Worksheet
Dim ws2 As Worksheet
Dim r1 As Range
Dim r2 As Range
Set ws1 = Sheets("01")
Set ws2 = Sheets("02")
Set r1 = ws1.Range("A1:B10") ' for example
Set r2 = ws2.Range("C5:D14")
r1.Copy Destination:=r2 'first way
r2.Value = r1.Value ' second way

Are there any differences between these two methods in terms of memory or time consumption?
In the end I will have over 10,000 rows. What will the size of the file be?

You can use ADO to query text files as if they were database tables. This allows you to write SQL queries to pull data out of your text files. You can do this with any text file, or even with .xls files if you want to.

The code for doing so is fairly simple. You'll need to reference the Microsoft ActiveX Data Objects 2.X Library first and then use something like the following:

Dim cn as New ADODB.Connection
Dim rs as New ADODB.Recordset
Dim i as Integer

With cn
    .Provider = "Microsoft.Jet.OLEDB.4.0"
    .ConnectionString = "Data Source=C:\SomeFolder;" & _
                        "Extended Properties=""text; HDR=Yes;FMT=Delimited"""
    .Open

    With rs
        .Open "SELECT * from fileName.txt", cn

        'Loop through each row in query
        While Not (.EOF Or .BOF)
            'Loop through each column in row
            For i = 0 to .Fields.Count - 1
                Debug.Print .Fields(i).Value 'Print value of field to Immediate Window
            Next i

            .MoveNext
        Wend

        .Close
    End With

    .Close
End With

Set rs = Nothing
Set cn = Nothing

This will loop through your text file and print the value of every field in each row to the VBA Immediate window. It also assumes that your file has a header row. If it does not, change HDR in your ConnectionString to No.
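If the goal is to land the rows on a worksheet rather than print them, the same recordset can be handed to Range.CopyFromRecordset in one call instead of looping field by field. A minimal sketch, assuming the same folder and file name as the code above; the procedure name is hypothetical and the sheet name "01" is borrowed from the question:

```vba
Sub ImportTextViaADO()
    ' Requires a reference to Microsoft ActiveX Data Objects 2.x Library.
    Dim cn As New ADODB.Connection
    Dim rs As New ADODB.Recordset

    cn.Provider = "Microsoft.Jet.OLEDB.4.0"
    cn.ConnectionString = "Data Source=C:\SomeFolder;" & _
                          "Extended Properties=""text;HDR=Yes;FMT=Delimited"""
    cn.Open

    rs.Open "SELECT * FROM fileName.txt", cn

    ' Write all rows starting at A1 of the target sheet (assumed name "01").
    ' CopyFromRecordset copies the data only, not the field names.
    ThisWorkbook.Worksheets("01").Range("A1").CopyFromRecordset rs

    rs.Close
    cn.Close
    Set rs = Nothing
    Set cn = Nothing
End Sub
```

Note that the Jet provider is only available to 32-bit Office; on a 64-bit install the usual substitute is Microsoft.ACE.OLEDB.12.0.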

The driver will automatically try to infer column types for you, but if it does not pick the correct type (for example, leading zeros being dropped) you can explicitly define a schema for your file. It's important to note that if you go the schema route, ConnectionString arguments like HDR and FMT WILL BE IGNORED. They will keep their default settings as defined in the Registry unless you override them in the schema definition. More info on schema.ini files can be found here: http://msdn.microsoft.com/en-us/library/windows/desktop/ms709353(v=vs.85).aspx
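As an illustration of the schema route, the sketch below writes a small schema.ini next to the text file from VBA. The column names and types (PartCode, Quantity) are hypothetical placeholders, and the folder and file name are the ones assumed in the ADO example; adjust them to match the real file.

```vba
Sub WriteSchemaIni()
    ' schema.ini must live in the same folder as the text file it describes.
    Dim f As Integer
    f = FreeFile

    Open "C:\SomeFolder\schema.ini" For Output As #f
    Print #f, "[fileName.txt]"        ' section name is the data file's name
    Print #f, "Format=CSVDelimited"   ' comma-separated values
    Print #f, "ColNameHeader=True"    ' first row holds column names (replaces HDR)
    Print #f, "Col1=PartCode Text"    ' hypothetical column; Text keeps leading zeros
    Print #f, "Col2=Quantity Long"    ' hypothetical column
    Close #f
End Sub
```

Run it once before opening the connection; the text driver then reads the schema from the folder named in Data Source.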

Here is another useful link: http://msdn.microsoft.com/en-us/library/ms974559.aspx. It's an article written by the Microsoft Scripting Guys and is how I originally learned about the process.

Lastly, if you ever use this process with .xls files, you should NEVER query an OPEN .xls file. There's a nasty memory leak bug with OPEN .xls files (more info here: http://support.microsoft.com/default.aspx?scid=kb;en-us;319998&Product=xlw). As long as you query CLOSED .xls documents you shouldn't have any issues whatsoever =D. The syntax in the SQL FROM clause is a bit different, since you have to target a particular sheet, but IIRC the Scripting Guys article I linked explains how to do so.
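For reference, a minimal sketch of what querying a closed workbook can look like. The workbook path, the sheet name Sheet1, and the procedure name are assumptions for illustration; the sheet is addressed in the FROM clause as its name followed by a dollar sign, wrapped in square brackets.

```vba
Sub QueryClosedWorkbook()
    ' Requires a reference to Microsoft ActiveX Data Objects 2.x Library.
    Dim cn As New ADODB.Connection
    Dim rs As New ADODB.Recordset

    cn.Provider = "Microsoft.Jet.OLEDB.4.0"
    ' Data Source is the workbook file itself (assumed path), which must be closed.
    cn.ConnectionString = "Data Source=C:\SomeFolder\Book1.xls;" & _
                          "Extended Properties=""Excel 8.0;HDR=Yes"""
    cn.Open

    ' [Sheet1$] targets the whole worksheet named Sheet1 (assumed name).
    rs.Open "SELECT * FROM [Sheet1$]", cn

    Do While Not rs.EOF
        Debug.Print rs.Fields(0).Value   ' first column of each row
        rs.MoveNext
    Loop

    rs.Close
    cn.Close
    Set rs = Nothing
    Set cn = Nothing
End Sub
```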

This code block has some specifics from a project I was on, but it should help get you started on importing CSV files (with some cleanup) through VBA:

Public Sub ImportCSV(strPath As String, strFile As String, strExt As String, wbDestination As Workbook, Optional wsDest As Worksheet, Optional strRange As String, Optional blHeaders As Boolean = True)
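'Imports the given CSV file into the given sheet at the given range.
'Defaults to comma-separated delimiters.
'Note: strExt is not used below; the ".csv" extension is hard-coded.
'strPath is concatenated as-is, so it needs a trailing path separator.

Dim wsDestination As Worksheet
Dim strFileName As String
strFileName = strPath & strFile & ".csv"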


'Use the supplied sheet, or add a new one after the last sheet
If wsDest Is Nothing Then
    Set wsDestination = wbDestination.Worksheets.Add(, wbDestination.Worksheets(wbDestination.Worksheets.Count))
Else
    Set wsDestination = wsDest
End If
If strRange = "" Then strRange = "$A$1"

With wsDestination.QueryTables.Add(Connection:="TEXT;" & strFileName, Destination:=wsDestination.Range(strRange))
        .FieldNames = False
        .RowNumbers = False
        .FillAdjacentFormulas = False
        .PreserveFormatting = True
        .RefreshOnFileOpen = False
        .RefreshStyle = xlInsertDeleteCells
        .SavePassword = False
        .SaveData = False
        .AdjustColumnWidth = False
        .RefreshPeriod = 0
        .TextFilePromptOnRefresh = False
        .TextFilePlatform = 437
        .TextFileStartRow = 1
        .TextFileParseType = xlDelimited
        .TextFileTextQualifier = xlTextQualifierDoubleQuote
        .TextFileConsecutiveDelimiter = False
        .TextFileTabDelimiter = False
        .TextFileSemicolonDelimiter = False
        .TextFileCommaDelimiter = True
        .TextFileSpaceDelimiter = False
        .TextFileTrailingMinusNumbers = True
        .Refresh BackgroundQuery:=False
        .Delete
    End With

If Not blHeaders Then wsDestination.Range(strRange).EntireRow.Delete

End Sub
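A short usage sketch for the routine above. The folder, the file name "data", and the sheet name "01" are hypothetical; note that the path carries a trailing backslash because the routine concatenates it directly with the file name.

```vba
Sub ImportExample()
    ' Import C:\SomeFolder\data.csv into a new sheet added after the last one,
    ' starting at A1 and keeping the header row.
    ImportCSV "C:\SomeFolder\", "data", "csv", ThisWorkbook

    ' Import the same file into an existing sheet (assumed name "01") at A1,
    ' deleting the first imported row because blHeaders is False.
    ImportCSV "C:\SomeFolder\", "data", "csv", ThisWorkbook, _
              ThisWorkbook.Worksheets("01"), "$A$1", False
End Sub
```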
