I am interning with a large firm that stores a lot of its source data in the form of PowerPoints. These PowerPpoints serve well when communicating across departments and between suppliers but, as you may guess, lack any robust analysis. Because of this, I have decided to database these Powerpoints into Access.
There is no direct way of doing this, that I know of. Due to strict IT policies, I am limited to VBA as my coding platform. I have spent the last week coding up a macro to solve my problem. Again, since there is no direct conversion of PowerPoint to Access, I have had to solve this problem rather inefficiently as there are a few caveats. I will list my steps and caveats below.
The powerpoint information I want to database is formatted as a table instead of text. I have been unable to find a Macro that converts PPT tables directly to Excel or CSV files. Because of this, I will convert all PPT files (roughly 3000) to PDFs.
From these generated PDF's I can use Adobe to convert them to Excel or CSV files.
Using multiple online resources and a bit of my own experience, I have coded up a VBA script that will automatically format a folder of CSV files into a format that Access will store correctly. See Code 1.
After formatting the CSVs, I will then automate them all to Access.
Following the Access automation, I will need to embed each PPT file to its respective Access entry
Again, this is not an efficient process. Because I am limited to Microsoft only applications, I have chosen this route. I thought about leaving the information as Excel files, but the idea is to make this data accessible and searchable by any department, hence why I chose Access to database them.
Now that I have explained to you where I am coming from and what I am doing, I ask: what recommendations do you have for me? I feel my round-about way is a good solution and practical, but I wonder if there is a better solution.
Sub LoopCSVFile()
Dim fso As Object 'Scritping.FileSystemObject
Dim fldr As Object 'Scripting.Folder
Dim file As Object 'Scripting.File
Dim wb As Workbook
Set fso = CreateObject("Scripting.FileSystemObject")
Set fldr = fso.GetFolder("C:\Users\HMM105289\Documents\Powerpoint Parsing\Test Folder\Test Save Folder")
For Each file In fldr.Files
Set wb = Workbooks.Open(file.Path)
Application.Run "Personal.xlsb!Module1.FormatAccess"
wb.Close SaveChanges = True
Next
Set file = Nothing
Set fldr = Nothing
Set fso = Nothing
End Sub
Having played around with some of Tim's suggestions, I have come up with this code to run a check on each PPT slide. The idea is to have it run his "ExtractTable" macro inside. As it stands, I am unable to get it to execute.
Sub PPTableXtraction()
Dim oSlide As Slide
Dim oSlides As Slides
Dim oPPT As Object: Set oPPT = ActivePresentation
Dim oShapes As Shape
Dim oTable As Object
For Each oSlide In oPPT.Slides
For Each oShapes In oSlide.Shapes
If oShapes.HasTable Then
Application.Run "VBAProject.xlsb!Module3.ExtractTableContent"
End If
Next
Next
End Sub
I was able to build on Tim's code to create a code that loops each PowerPoint file and extracts the information into Excel. The code doesn't break into the debugger but for whatever reason it is not performing any functions. Would anyone have any idea why?
Sub Tester()
Dim ppts As PowerPoint.Application
Dim FolderPath As String
Dim FileName As String
FolderPath = "FolderPath"
FileName = Dir(FolderPath & "*.ppt*")
Do While FileName <> ""
Set ppts = New PowerPoint.Application
ppts.Visible = True
ppts.Presentations.Open FileName:=FolderPath & FileName
A = Cells.Find("*", SearchOrder:=xlByRows, SearchDirection:=xlPrevious).Row + 5
B = "B" & A
X = "A" & A
Range(X).Value = "New"
Dim ppt As Object, tbl As Object
Dim slide As Object, pres As Object, shp
Dim rngDest As Range
Set ppt = GetObject(, "Powerpoint.Application")
Set pres = ppt.ActivePresentation
Set rngDest = Sheets("Data").Range(B) '
For Each slide In pres.Slides
For Each shp In slide.Shapes
If shp.HasTable Then
ExtractTableContent shp.Table, rngDest
Set rngDest = rngDest.Offset(shp.Table.Rows.Count + 3, 0)
End If
Next
Next
ppts.ActivePresentation.Close
FileName = Dir
Loop
End Sub
Sub ExtractTableContent(oTable As Object, rng As Range)
Dim r, c, offR As Long, offC As Long
For Each r In oTable.Rows '<< Loop over each row in the PPT table
offC = 0 '<< reset the column offset
For Each c In r.Cells '<< Loop over each cell in the row
'Copy the cell's text content to Excel, using the offsets
' offR and offC to select where it gets placed relative
' to the starting point (rng)
rng.Offset(offR, offC).Value = c.Shape.TextFrame.TextRange.Text
offC = offC + 1 '<< increment the column offset
Next c
offR = offR + 1 '<< increment the row offset
Next r
End Sub
Sub N()
Range("A3").Value = "New"
End Sub
Here's an example of extracting a table from PPT to Excel.
Looping over the slides and tables (modified from your posted code)
Sub Tester()
Dim ppt As Object, tbl As Object
Dim slide As Object, pres As Object, shp
Dim rngDest As Range
Set ppt = GetObject(, "Powerpoint.Application")
Set pres = ppt.ActivePresentation
Set rngDest = Sheets("Data").Range("a1") '<< where to start placing ppt data
For Each slide In pres.Slides
For Each shp In slide.Shapes
If shp.HasTable Then
ExtractTableContent shp.Table, rngDest
Set rngDest = rngDest.Offset(shp.Table.Rows.Count + 3, 0)
End If
Next
Next
End Sub
The sub to extract each table's data:
Sub ExtractTableContent(oTable As Object, rng As Range)
Dim r, c, offR As Long, offC As Long
For Each r In oTable.Rows '<< Loop over each row in the PPT table
offC = 0 '<< reset the column offset
For Each c In r.Cells '<< Loop over each cell in the row
'Copy the cell's text content to Excel, using the offsets
' offR and offC to select where it gets placed relative
' to the starting point (rng)
rng.Offset(offR, offC).Value = c.Shape.TextFrame.TextRange.Text
offC = offC + 1 '<< increment the column offset
Next c
offR = offR + 1 '<< increment the row offset
Next r
End Sub
In case anyone skims this and wants the solution used
It is out of the box ready, with the exception of setting your file path.
Sub Tester()
Dim rng As Range
Set rng = Range("A1") 'This code is necessary to prevent a constant loop of the formatting for each extraction. It adds a "1" into "A1"
rng.Value = 1
Dim ppts As PowerPoint.Application
Dim FolderPath As String
Dim FileName As String
FolderPath = "FolderPath" 'Define your Folder Path
FileName = Dir(FolderPath & "*.ppt*") 'Locate .PPT files
Do While FileName <> ""
Set ppts = New PowerPoint.Application 'Left this in after finding another fix. Opens new instance each time
ppts.Visible = True
ppts.Presentations.Open FileName:=FolderPath & FileName
'The code below sets 3 variables to help in formatting Tim's extraction code.
'It searches for the last cell entry and then adds 5 rows before copying more information.
A = Cells.Find("*", SearchOrder:=xlByRows, SearchDirection:=xlPrevious).Row + 5
B = "B" & A
X = "A" & A
Range(X).Value = "New"
'Beginning of Tim's code
Dim ppt As Object, tbl As Object
Dim slide As Object, pres As Object, shp
Dim rngDest As Range
Set ppt = GetObject(, "Powerpoint.Application")
Set pres = ppt.ActivePresentation
Set rngDest = Sheets("Data").Range(B) 'Moved it over one column for formatting
For Each slide In pres.Slides
For Each shp In slide.Shapes
If shp.HasTable Then
ExtractTableContent shp.Table, rngDest
Set rngDest = rngDest.Offset(shp.Table.Rows.Count + 3, 0)
End If
Next
Next
ppts.ActivePresentation.Close 'Close PPT and loop for next one
FileName = Dir
Loop
End Sub
'More of Tim's code
Sub ExtractTableContent(oTable As Object, rng As Range)
Dim r, c, offR As Long, offC As Long
For Each r In oTable.Rows '<< Loop over each row in the PPT table
offC = 0 '<< reset the column offset
For Each c In r.Cells '<< Loop over each cell in the row
'Copy the cell's text content to Excel, using the offsets
' offR and offC to select where it gets placed relative
' to the starting point (rng)
rng.Offset(offR, offC).Value = c.Shape.TextFrame.TextRange.Text
offC = offC + 1 '<< increment the column offset
Next c
offR = offR + 1 '<< increment the row offset
Next r
End Sub
Sub N()
Range("A3").Value = "New" 'Simply adds "New" next to each new file opened. Helps for deliniation between files
End Sub
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.