简体   繁体   中英

Migrating Powerpoint information to Access database using VBA

I am interning with a large firm that stores a lot of its source data in the form of PowerPoints. These PowerPpoints serve well when communicating across departments and between suppliers but, as you may guess, lack any robust analysis. Because of this, I have decided to database these Powerpoints into Access.

There is no direct way of doing this, that I know of. Due to strict IT policies, I am limited to VBA as my coding platform. I have spent the last week coding up a macro to solve my problem. Again, since there is no direct conversion of PowerPoint to Access, I have had to solve this problem rather inefficiently as there are a few caveats. I will list my steps and caveats below.

  1. The powerpoint information I want to database is formatted as a table instead of text. I have been unable to find a Macro that converts PPT tables directly to Excel or CSV files. Because of this, I will convert all PPT files (roughly 3000) to PDFs.

  2. From these generated PDF's I can use Adobe to convert them to Excel or CSV files.

  3. Using multiple online resources and a bit of my own experience, I have coded up a VBA script that will automatically format a folder of CSV files into a format that Access will store correctly. See Code 1.

    • (The "Personal.xlsb!Module1.FormatAccess" is a macro created mostly with "Record Macro." I omitted this code due to its length and redundancy.)
  4. After formatting the CSVs, I will then automate them all to Access.

  5. Following the Access automation, I will need to embed each PPT file to its respective Access entry

Again, this is not an efficient process. Because I am limited to Microsoft only applications, I have chosen this route. I thought about leaving the information as Excel files, but the idea is to make this data accessible and searchable by any department, hence why I chose Access to database them.

Now that I have explained to you where I am coming from and what I am doing, I ask: what recommendations do you have for me? I feel my round-about way is a good solution and practical, but I wonder if there is a better solution.

Code 1

Sub LoopCSVFile()

Dim fso As Object  'Scritping.FileSystemObject
Dim fldr As Object 'Scripting.Folder
Dim file As Object 'Scripting.File
Dim wb As Workbook

Set fso = CreateObject("Scripting.FileSystemObject")
Set fldr = fso.GetFolder("C:\Users\HMM105289\Documents\Powerpoint Parsing\Test Folder\Test Save Folder")

For Each file In fldr.Files

    Set wb = Workbooks.Open(file.Path)

    Application.Run "Personal.xlsb!Module1.FormatAccess"

    wb.Close SaveChanges = True

Next

Set file = Nothing
Set fldr = Nothing
Set fso = Nothing

End Sub

Edit 1

Having played around with some of Tim's suggestions, I have come up with this code to run a check on each PPT slide. The idea is to have it run his "ExtractTable" macro inside. As it stands, I am unable to get it to execute.

Sub PPTableXtraction()

Dim oSlide As Slide
Dim oSlides As Slides
Dim oPPT As Object: Set oPPT = ActivePresentation
Dim oShapes As Shape
Dim oTable As Object



For Each oSlide In oPPT.Slides
    For Each oShapes In oSlide.Shapes
        If oShapes.HasTable Then
            Application.Run "VBAProject.xlsb!Module3.ExtractTableContent"
        End If
    Next
Next


End Sub

Edit 2

I was able to build on Tim's code to create a code that loops each PowerPoint file and extracts the information into Excel. The code doesn't break into the debugger but for whatever reason it is not performing any functions. Would anyone have any idea why?

Sub Tester()
Dim ppts As PowerPoint.Application

Dim FolderPath As String
Dim FileName As String

FolderPath = "FolderPath"
FileName = Dir(FolderPath & "*.ppt*")

Do While FileName <> ""
    Set ppts = New PowerPoint.Application
    ppts.Visible = True
    ppts.Presentations.Open FileName:=FolderPath & FileName
    A = Cells.Find("*", SearchOrder:=xlByRows, SearchDirection:=xlPrevious).Row + 5
    B = "B" & A
    X = "A" & A
    Range(X).Value = "New"

Dim ppt As Object, tbl As Object
Dim slide As Object, pres As Object, shp
Dim rngDest As Range


    Set ppt = GetObject(, "Powerpoint.Application")

    Set pres = ppt.ActivePresentation

    Set rngDest = Sheets("Data").Range(B) '

    For Each slide In pres.Slides
        For Each shp In slide.Shapes
            If shp.HasTable Then
                ExtractTableContent shp.Table, rngDest
                Set rngDest = rngDest.Offset(shp.Table.Rows.Count + 3, 0)
            End If
        Next
    Next

    ppts.ActivePresentation.Close
    FileName = Dir
Loop
End Sub


Sub ExtractTableContent(oTable As Object, rng As Range)
    Dim r, c, offR As Long, offC As Long

    For Each r In oTable.Rows '<< Loop over each row in the PPT table

        offC = 0 '<< reset the column offset

        For Each c In r.Cells '<< Loop over each cell in the row

            'Copy the cell's text content to Excel, using the offsets
            '    offR and offC to select where it gets placed relative
            '    to the starting point (rng)
            rng.Offset(offR, offC).Value = c.Shape.TextFrame.TextRange.Text

            offC = offC + 1 '<< increment the column offset

        Next c

        offR = offR + 1 '<< increment the row offset

    Next r

End Sub

Sub N()
Range("A3").Value = "New"
End Sub

Here's an example of extracting a table from PPT to Excel.

Looping over the slides and tables (modified from your posted code)

Sub Tester()

    Dim ppt As Object, tbl As Object
    Dim slide As Object, pres As Object, shp
    Dim rngDest As Range


    Set ppt = GetObject(, "Powerpoint.Application")

    Set pres = ppt.ActivePresentation

    Set rngDest = Sheets("Data").Range("a1") '<< where to start placing ppt data

    For Each slide In pres.Slides
        For Each shp In slide.Shapes
            If shp.HasTable Then
                ExtractTableContent shp.Table, rngDest
                Set rngDest = rngDest.Offset(shp.Table.Rows.Count + 3, 0)
            End If
        Next
    Next

End Sub

The sub to extract each table's data:

Sub ExtractTableContent(oTable As Object, rng As Range)
    Dim r, c, offR As Long, offC As Long

    For Each r In oTable.Rows '<< Loop over each row in the PPT table

        offC = 0 '<< reset the column offset

        For Each c In r.Cells '<< Loop over each cell in the row

            'Copy the cell's text content to Excel, using the offsets
            '    offR and offC to select where it gets placed relative
            '    to the starting point (rng)
            rng.Offset(offR, offC).Value = c.Shape.TextFrame.TextRange.Text

            offC = offC + 1 '<< increment the column offset

        Next c

        offR = offR + 1 '<< increment the row offset

    Next r
End Sub

In case anyone skims this and wants the solution used

It is out of the box ready, with the exception of setting your file path.

Sub Tester()
Dim rng As Range

Set rng = Range("A1")    'This code is necessary to prevent a constant loop of the formatting for each extraction. It adds a "1" into "A1"
rng.Value = 1


Dim ppts As PowerPoint.Application

Dim FolderPath As String
Dim FileName As String

FolderPath = "FolderPath"                'Define your Folder Path
FileName = Dir(FolderPath & "*.ppt*")    'Locate .PPT files

Do While FileName <> ""
    Set ppts = New PowerPoint.Application 'Left this in after finding another fix. Opens new instance each time
    ppts.Visible = True
    ppts.Presentations.Open FileName:=FolderPath & FileName
    'The code below sets 3 variables to help in formatting Tim's extraction code. 
    'It searches for the last cell entry and then adds 5 rows before copying more information.
    A = Cells.Find("*", SearchOrder:=xlByRows, SearchDirection:=xlPrevious).Row + 5 
    B = "B" & A
    X = "A" & A
    Range(X).Value = "New" 

'Beginning of Tim's code
Dim ppt As Object, tbl As Object
Dim slide As Object, pres As Object, shp
Dim rngDest As Range


    Set ppt = GetObject(, "Powerpoint.Application")

    Set pres = ppt.ActivePresentation

    Set rngDest = Sheets("Data").Range(B) 'Moved it over one column for formatting

    For Each slide In pres.Slides
        For Each shp In slide.Shapes
            If shp.HasTable Then
                ExtractTableContent shp.Table, rngDest
                Set rngDest = rngDest.Offset(shp.Table.Rows.Count + 3, 0)
            End If
        Next
    Next

    ppts.ActivePresentation.Close    'Close PPT and loop for next one
    FileName = Dir
Loop
End Sub

'More of Tim's code

Sub ExtractTableContent(oTable As Object, rng As Range)
    Dim r, c, offR As Long, offC As Long

    For Each r In oTable.Rows '<< Loop over each row in the PPT table

        offC = 0 '<< reset the column offset

        For Each c In r.Cells '<< Loop over each cell in the row

            'Copy the cell's text content to Excel, using the offsets
            '    offR and offC to select where it gets placed relative
            '    to the starting point (rng)
            rng.Offset(offR, offC).Value = c.Shape.TextFrame.TextRange.Text

            offC = offC + 1 '<< increment the column offset

        Next c

        offR = offR + 1 '<< increment the row offset

    Next r

End Sub

Sub N()
Range("A3").Value = "New" 'Simply adds "New" next to each new file opened. Helps for deliniation between files
End Sub

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM