I have an Excel file that pulls in data from a csv, manipulates it a bit, and then saves it down as a series of text files.
Some special characters in the source data trip things up, so I added this to strip them out:
Const SpecialCharacters As String = "!,@,#,$,%,^,&,*,(,),{,[,],},?,â,€,™"
Function ReplaceSpecialCharacters(myString As String) As String
    Dim newString As String
    Dim char As Variant
    newString = myString
    For Each char In Split(SpecialCharacters, ",")
        newString = Replace(newString, char, "")
    Next
    ReplaceSpecialCharacters = newString
End Function
The issue is that this doesn't catch all of them. When I try to process the following text it slips through the above code and causes Excel to error out.
Hero’s Village
I think the issue is that the special character isn't being recognized by Excel itself. I was only able to get the text to look like it does above by copying it out of Excel and pasting it into a different IDE. In Excel it displays as:
Based on this site, it looks like Excel is having trouble displaying the ' character, but how do I fix or filter it out if VBA can't even read it properly?
Option Explicit
Dim regex As New RegExp
Private Function rgclean(ByVal mystring As String) As String
    'Removes every character that matches the regex pattern
    'and returns the cleaned string
    With regex
        .Global = True
        .Pattern = "[^ \w]" 'matches anything that is not a space or a word character (A-Z, a-z, 0-9, _)
    End With
    rgclean = regex.Replace(mystring, "") 'replaces every match with ""
End Function
Try using a regular expression.
Make sure the regular expression library is enabled: Tools > References > check "Microsoft VBScript Regular Expressions 5.5".
Pass your string to the function (rgclean). It removes anything that is not a space, a letter [A-Za-z], a digit [0-9], or an underscore, and returns the result.
In other words, the function removes nearly every symbol in the string; letters, digits, spaces, and underscores are kept.
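If adding the library reference is inconvenient, the same pattern can be used with late binding. This is only a sketch: the names RgCleanLateBound and DemoRgClean are hypothetical, and the test string assumes the troublesome character is the right single quotation mark (U+2019), which may not be what your data actually contains.

```vba
Option Explicit

' Hypothetical late-bound variant of rgclean; needs no library reference.
Private Function RgCleanLateBound(ByVal inputText As String) As String
    Dim re As Object
    Set re = CreateObject("VBScript.RegExp")
    re.Global = True
    re.Pattern = "[^ \w]"   ' anything that is not a space or a word character
    RgCleanLateBound = re.Replace(inputText, "")
End Function

Sub DemoRgClean()
    ' ChrW(8217) builds U+2019 without typing it into the VBA editor
    Debug.Print RgCleanLateBound("Hero" & ChrW(8217) & "s Village!")
End Sub
```

Running DemoRgClean should print Heros Village in the Immediate window, since both the quotation mark and the exclamation mark fall outside the pattern's allowed set.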
Here is the opposite approach. Remove ALL characters that are not included in this group of 62:
ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789
The code:
Const ValidCharacters As String = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789"
Function ReplaceSpecialCharacters(myString As String) As String
    Dim newString As String, L As Long, i As Long
    Dim char As String
    newString = myString
    L = Len(newString)
    For i = 1 To L
        char = Mid(newString, i, 1)
        'Mark every character that is not in the valid set with "@"
        '(one-for-one replacement, so the string length does not change)
        If InStr(ValidCharacters, char) = 0 Then
            newString = Replace(newString, char, "@")
        End If
    Next i
    'Strip all the "@" markers in one pass
    ReplaceSpecialCharacters = Replace(newString, "@", "")
End Function
Note: You can also add characters to the string ValidCharacters if you want to retain them.
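When the characters to keep are easier to describe as a range of character codes than to type into a constant, the same whitelist idea can be written with AscW. The sketch below is not from the answer above: the function name KeepPrintableAscii and the choice of range 32 to 126 (printable ASCII, which keeps punctuation too) are assumptions made for illustration.

```vba
Option Explicit

' Hypothetical helper: keeps only characters with codes 32..126.
Function KeepPrintableAscii(ByVal inputText As String) As String
    Dim result As String
    Dim i As Long
    Dim code As Long
    For i = 1 To Len(inputText)
        ' AscW returns a signed Integer, so mask it to get 0..65535
        code = AscW(Mid$(inputText, i, 1)) And &HFFFF&
        If code >= 32 And code <= 126 Then
            result = result & Mid$(inputText, i, 1)
        End If
    Next i
    KeepPrintableAscii = result
End Function
```

Because the test is on the numeric code, characters that the VBA editor cannot display never have to appear in the source. Printing the value of code for each character with Debug.Print is also a quick way to find out what the problem character actually is.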