简体   繁体   中英

Excel VBA - Merge specific columns from multiple files to one sheet

I have hundreds of excel files that I want to combine into one file. The problem is that these files contain hundreds of columns of extra data that I do not need. Further complicating things is that the column positions differ between workbooks and workbooks have differing number of columns. I want to create a macro that will go through and open each file, search for the columns I need, and then copy those columns of data and combine them into one master file.

The way the below code works is as follows: place all the files you want to combine into one folder Type the headers you want to search for and combine within those files on a new workbook.

If you have 4 columns in your files named: Name Date Product and Time

Then typing Date and Time in A1 and B1 in a new worksheet will search all the files and combine any columns found with matching headers to a compilation sheet.

Thanks to Ron DeBruin for most of the filesystem selection.

    'Option Explicit

    'takes worksheet and returns last row
    Private Function LastRowUsed(sh As Worksheet) As Long
        On Error Resume Next
        LastRowUsed = sh.Cells.Find(What:="*", _
                        After:=sh.Range("A1"), _
                        LookAt:=xlPart, _
                        LookIn:=xlFormulas, _
                        SearchOrder:=xlByRows, _
                        SearchDirection:=xlPrevious, _
                        MatchCase:=False).Row
        On Error GoTo 0
    End Function
    'takes worksheet and returns last column
    Private Function LastColUsed(sh As Worksheet) As Long
On Error Resume Next
LastColUsed = sh.Cells.Find(What:="*", _
                        After:=sh.Range("A1"), _
                        LookAt:=xlPart, _
                        LookIn:=xlFormulas, _
                        SearchOrder:=xlByColumns, _
                        SearchDirection:=xlPrevious, _
                        MatchCase:=False).Column
        On Error GoTo 0
    End Function

    Function GetFileListArray() As String()
Dim fileDialogBox As FileDialog
Dim SelectedFolder As Variant
Dim MYPATH As String
Dim MYFILES() As String
Dim FILESINPATH
Dim FNUM, i As Integer

        '''''
        Set fileDialogBox = Application.FileDialog(msoFileDialogFolderPicker)

         'Use a With...End With block to reference the FileDialog object.
         With fileDialogBox
             If .Show = -1 Then 'the user chose a folder
         For Each SelectedFolder In .SelectedItems
            MYPATH = SelectedFolder 'asign mypath to the selected folder name
           ' MsgBox "The path is: " & SelectedFolder 'display folder selected
         Next SelectedFolder
         'The user pressed Cancel.
         Else
            MsgBox "Cancel was pressed or Invalid folder chosen, ending macro"
            Exit Function
         End If
         End With
         'Set the file dialog object variable to Nothing to clear memory
         Set fileDialogBox = Nothing
           If Right(MYPATH, 1) <> "\" Then
             MYPATH = MYPATH & "\"
           End If
        FILESINPATH = Dir(MYPATH & "*.csV")
        If FILESINPATH = "" Then
           MsgBox "No files found"
          Exit Function
        End If

        'Fill the array(myFiles)with the list of Excel files in the folder
        FNUM = 0
        Do While FILESINPATH <> ""
          FNUM = FNUM + 1
          ReDim Preserve MYFILES(1 To FNUM)
          MYFILES(FNUM) = FILESINPATH
          FILESINPATH = Dir()
        Loop


GetFileListArray = MYFILES()
End Function
    Sub RFSSearchThenCombine()
'search first worksheet in files opened, change to search other worksheets
Const SHEET_TO_SEARCH = 1

Dim FileList() As String
Dim CurrentFolder As String
Dim openedWorkBook As Workbook, HeadingWorkbook As Workbook
Dim OpenedWorkSheet As Worksheet, HeadingWorkSheet As Worksheet
Dim i, counter, x As Integer
Dim LRowHeading, LRowOpenedBook, LColHeading, LColOpenedBook As Long
Dim dict As dictionary
Dim searchValue
'set original workbook with headings to retrieve
Set HeadingWorkbook = ActiveWorkbook
Set HeadingWorkSheet = HeadingWorkbook.Sheets(1)
'find last column on heading worksheet
LColHeading = LastColUsed(HeadingWorkSheet)
'create dictionary to link headers to position in heading worksheet

    Set dict = CreateObject("Scripting.Dictionary")
    For x = 1 To LColHeading
        dict.Add HeadingWorkSheet.Cells(1, x).Value, x
    Next x

FileList() = GetFileListArray()

For counter = 1 To UBound(FileList)
    Set openedWorkBook = Workbooks.Open(CurrentFolder & FileList(counter))
    Set OpenedWorkSheet = openedWorkBook.Sheets(SHEET_TO_SEARCH)
    LColOpenedBook = LastColUsed(openedWorkBook.Sheets(1))
    LRowOpenedBook = LastRowUsed(openedWorkBook.Sheets(1))
    LRowHeading = LastRowUsed(HeadingWorkSheet)

           For i = 1 To LColOpenedBook 'search headers from a1 to last header
                searchValue = OpenedWorkSheet.Cells(1, i).Value 'set search value to current header
                If dict.Exists(searchValue) Then
                      OpenedWorkSheet.Range(OpenedWorkSheet.Cells(1, i), _
                      OpenedWorkSheet.Cells(LRowOpenedBook, i)).Copy _
                      (HeadingWorkSheet.Cells(LRowHeading, dict.Item(searchValue)))
                End If
            Next
        openedWorkBook.Close (False)
Next ' move on to next file

    End Sub

Here's how you'd use a dictionary to store the name and column number of the columns of interest (based on an arbitrarily named "COMPILATION SHEET"). Remember you need to enable the reference to "Microsoft Scripting Runtime".

Sub InitiateDictionary()
Dim d As Dictionary
Set d = CreateObject("Scripting.Dictionary")

Dim wb As Workbook, ws As Worksheet
Set wb = ThisWorkbook
Set ws = wb.Sheets("COMPILATION SHEET")

lastCol = LastColUsed(ws)
For x = 1 To lastCol
    d.Add ws.Cells(1, x), x
Next x
End Sub


Private Function LastColUsed(sh As Worksheet)
On Error Resume Next
LastColUsed = sh.Cells.Find(What:="*", _
                        After:=sh.Range("A1"), _
                        LookAt:=xlPart, _
                        LookIn:=xlFormulas, _
                        SearchOrder:=xlByColumns, _
                        SearchDirection:=xlPrevious, _
                        MatchCase:=False).Column
On Error GoTo 0
End Function

All you need to do is elaborate a way to know if an element is contained by the dictionary (define the function DContains(dictionary, string) ). There are examples on Google on how to do that. Once you know that the header is inside the dictionary, you can use that header name to know the column number it refers to. A bit like this :

colNumber = 0
headerToFind = "Header_A"
found = DContains(d, headerToFind)
if found then
    colNumber = d(headerToFind)
end if
if colNumber > 0 then
    'Perform copy to column "colNumber" !
end if

To determine how many entries are in the dictionary, simply use the .Count property.

And yes, in this case Cells(x,1) is the same as Cells(x,1).value .

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM