
How to get content from MS Word table cells with VBA and clean it for Excel

I am trying to read the content of table cells in an MS Word document and copy it into an Excel workbook with a VBA macro.

I am using this line to do it:

Cells(insertRow, 1) = WorksheetFunction.Clean(.cell(iRow, iCol).Range.Text)

But this only brings the text over with spaces; the line breaks (carriage returns) are lost.

I have also tried both of these:

Cells(insertRow, 1) = WorksheetFunction.Clean(.cell(iRow, iCol).Range)

And

Cells(insertRow, 1) = WorksheetFunction.Clean(.cell(iRow, iCol))

With the same result.

That is the main task. I would also like to know whether it is possible to convert this symbol from MS Word into something similar in Excel.

You could try adding a marker character at the end of each line in the MS Word file so that the line breaks can be recovered. For example, if you end each line with a semicolon, you could have Excel look for the semicolon and start a new line wherever it finds one.

To remove the spaces, you could consider running this code:
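As a rough sketch of that idea, assuming every line in the Word text really does end with a semicolon and that no semicolon occurs anywhere else in the text, the markers could be swapped for in-cell line breaks. The function name MarkersToLineBreaks is made up for this illustration.

```vba
' Hypothetical helper (not part of the original answer): turns each ";"
' marker into a line feed, which Excel treats as a line break inside a cell.
Function MarkersToLineBreaks(ByVal markedText As String) As String
    MarkersToLineBreaks = Replace(markedText, ";", vbLf)
End Function
```
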

newString = Replace(strString, " ", "")

There may be other ways to achieve what you wish to do, but this feels like the simplest way. Hope this helped!

It took me a long while to figure out how to do this, as it was my first time programming in Visual Basic and VBA, but this is how I solved it.

The first problem was that the imported text contained some junk characters I had to get rid of, which I tried to do with

WorksheetFunction.Clean(MyString)

The problem is that this deletes all the non-printing characters in the imported text (ASCII codes 0-31), and that range includes ASCII code 13 (the carriage return, i.e. the line break).

Even without WorksheetFunction.Clean, Excel did not interpret the line break correctly, which I solved like this:

Dim str As String
Dim clearer As String
str = .Cell(iRow, iCol).Range.Text 'Get the untreated content of the Word table cell
clearer = Replace(str, Chr(13), vbNewLine)

Now that I was no longer cleaning everything automatically, I had to do it manually, but I did not know the character codes of the strange characters.

I looked for them this way:

Dim Counter As Long
MsgBox "Word " & str
For Counter = 1 To Len(str)
    MsgBox "Letter is " & Mid(str, Counter, 1) & " and its character code is " & _
        CStr(AscW(Mid(str, Counter, 1)))
Next Counter
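Clicking through one message box per character gets tedious for longer cells. A possible alternative, sketched here with a made-up function name CharCodes, is to collect all the codes into a single string and print it once to the Immediate window.

```vba
' Hypothetical helper (not from the original post): returns the character
' codes of rawText as one space-separated string, e.g. "97 13 7".
Function CharCodes(ByVal rawText As String) As String
    Dim i As Long
    Dim result As String
    For i = 1 To Len(rawText)
        If i > 1 Then result = result & " "
        result = result & CStr(AscW(Mid$(rawText, i, 1)))
    Next i
    CharCodes = result
End Function
```

With the variable from the snippet above, this could be called as Debug.Print CharCodes(str), and the output read in the Immediate window (Ctrl+G in the VBA editor).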

Once I knew their codes, manual cleaning worked the same way for the other characters. Some of them represented a line break and others were junk, so I treated the two groups differently:

clearer = Replace(str, Chr(13), vbNewLine) 'Convert Word paragraph marks to Excel line breaks
clearer = Replace(clearer, Chr(11), vbNewLine) 'Convert Word manual line breaks to Excel line breaks
clearer = Replace(clearer, Chr(7), "") 'Remove the junk character

This way, I could clean the imported text the way I wanted. I hope this helps someone in the future.
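The replacements above could also be gathered into one reusable function. The following is only a sketch under a few assumptions: the name CleanWordCellText is made up, vbLf is used instead of vbNewLine because a lone line feed is what Excel uses for a line break inside a cell, and the Chr(13) & Chr(7) pair that Word appends to the end of every table cell's text is stripped first so that it does not become a trailing empty line.

```vba
' Hypothetical helper (not from the original post): converts the raw text of
' a Word table cell into text suitable for an Excel cell.
Function CleanWordCellText(ByVal rawText As String) As String
    Dim result As String
    result = rawText

    ' Word ends each table cell's text with Chr(13) & Chr(7); drop that
    ' marker first so it does not turn into a trailing line break.
    If Right$(result, 2) = Chr(13) & Chr(7) Then
        result = Left$(result, Len(result) - 2)
    End If

    ' Paragraph marks (Chr(13)) and manual line breaks (Chr(11)) both
    ' become line feeds.
    result = Replace(result, Chr(13), vbLf)
    result = Replace(result, Chr(11), vbLf)

    ' Remove any remaining Chr(7) characters.
    result = Replace(result, Chr(7), "")

    CleanWordCellText = result
End Function
```

It could then be used as Cells(insertRow, 1).Value = CleanWordCellText(.Cell(iRow, iCol).Range.Text). Depending on the cell's formatting, WrapText may need to be turned on for the line breaks to show.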
