In a specific column, I want to search for a specific character in cells...say "(" or "/". Once this character is found in a cell, I want to extract the part from the beginning of the string upto the point that this character is found, in the cell adjacent to it.
Eg a few values in the column could look like -
Samsung (india)
Samsung/Dhamal
Blackberry (chikna)
Blackberry/Kala Anda
iPhone - egypt
iPhone 5 * yeda
The output will look like -
Samsung
Samsung
Blackberry
Blackberry
iPhone
iPhone 5
NOTE: The cell values in that specific column, are not static, have no pattern, may contain other special characters as well, are not of a specific length.
This question is well suited for regular expressions. The following function returns the position of the character preceding the first match of a simple regex pattern in a given string. If no match is found, the function returns the length of the string. The function can be combined with the LEFT function to extract the text preceding the match. ( The use of LEFT is necessary because, for the sake of simplicity, this function does not implement submatches. )
The following formula would extract the product names in your sample data:
=LEFT(A1,regexmatch(A1," \(|\/| -| \*"))
Breaking down the match pattern " \\(|\\/| -| \\*"
:
" \(" matches a space followed by a left parenthesis
[the backslash escapes the "(", a special character in regular expressions]
"|" signifies an alternative pattern to match
"\/" matches a forward slash (/)
" -" matches a space followed by a dash (-)
" \*" matches a space followed by an asterisk (*).
To learn more about regular expressions, see this regular expression tutorial , one of many available on the web.
In order for the function to work, you will need to set a reference to Microsoft VBScript Regular Expressions 5.5. To do this, select Tools/References from the VBA IDE and check this item, which will be well down the long list of references.
Function regexMatch(text As String, rePattern As String)
'Response to SO post 16591260
'Adapted from code at http://www.macrostash.com/2011/10/08/
' simple-regular-expression-tutorial-for-excel-vba/.
Dim regEx As New VBScript_RegExp_55.RegExp
Dim matches As Variant
regEx.pattern = rePattern
regEx.IgnoreCase = True 'True to ignore case
regEx.Global = False 'Return just the first match
If regEx.Test(text) Then
Set matches = regEx.Execute(text)
regexMatch = matches(0).FirstIndex
Else
regexMatch = Len(text)
End If
End Function
The following subroutine applies the string extraction to each cell in a specified data column and writes the new string to a specified result column. Although it would be possible to just call the function for each cell in the data column, this would incur the overhead of compiling the regular expression (which applies to all cells) each time the function was called. To avoid this overhead, the subroutine splits the match function in to two parts, with the pattern definition outside the loop through the data cells, and the pattern execution inside the loop.
Sub SubRegexMatch()
'Response to SO post 16591260
'Extracts from string content of each data cell in a specified source
' column of the active worksheet the characters to the left of the first
' match of a regular expression, and writes the new string to corresponding
' rows in a specified result column.
'Set the regular expression, source column, result column, and first
' data row in the "parameters" section
'Regex match code was adapted from http://www.macrostash.com/2011/10/08/
' simple-regular-expression-tutorial-for-excel-vba/
Dim regEx As New VBScript_RegExp_55.RegExp, _
matches As Variant, _
regexMatch As Long 'position of character *just before* match
Dim srcCol As String, _
resCol As String
Dim srcRng As Range, _
resRng As Range
Dim firstRow As Long, _
lastRow As Long
Dim srcArr As Variant, _
resArr() As String
Dim i As Long
'parameters
regEx.Pattern = " \(|\/| -| \*" 'regular expression to be matched
regEx.IgnoreCase = True
regEx.Global = False 'return only the first match found
srcCol = "A" 'source data column
resCol = "B" 'result column
firstRow = 2 'set to first row with data
With ActiveSheet
lastRow = .Cells(Cells.Rows.Count, srcCol).End(xlUp).Row
Set srcRng = .Range(srcCol & firstRow & ":" & srcCol & lastRow)
Set resRng = .Range(resCol & firstRow & ":" & resCol & lastRow)
srcArr = srcRng
ReDim resArr(1 To lastRow - firstRow + 1)
For i = 1 To srcRng.Rows.Count
If regEx.Test(srcArr(i, 1)) Then
Set matches = regEx.Execute(srcArr(i, 1))
regexMatch = matches(0).FirstIndex
Else
regexMatch = Len(srcArr(i, 1)) 'return length of original string if no match
End If
resArr(i) = Left(srcArr(i, 1), regexMatch)
Next i
resRng = WorksheetFunction.Transpose(resArr) 'assign result to worksheet
End With
End Sub
Something like this will work:
=IF(FIND("(",A1),LEFT(A1,FIND("(",A1)-1),IF(FIND("\",A1),LEFT(A1,FIND("\",A1)-1),""))
If you have more than just the two characters nest in some more IF statements. There is a limit on how much of this you can do before hitting the iteration limit for Cell functions.
You could use the Split()
function. Here is an example:
Dim text as String
Dim splt as Variant
text = "Samsung/Dhamal"
splt = Split(text, "/")
MsgBox splt(0)
Just do the same for any other character you want to split. More on this on MSDN: http://msdn.microsoft.com/fr-fr/library/6x627e5f%28v=vs.80%29.aspx
The other (better?) alternative I see would be to use InStr()
with Left()
. InStr()
returns the position of the first match it finds. Then you just have to crop your string. Here is an example:
Dim text as String
Dim position as Integer
text = "Samsung/Dhamal"
position = InStr(text, "/")
If position > 0 then MsgBox Left(text, position)
http://msdn.microsoft.com/fr-fr/library/8460tsh1%28v=vs.80%29.aspx
This should work for you:
Public Function IsAlphaNumeric(sChr As String) As Boolean
IsAlphaNumeric = sChr Like "[0-9A-Za-z]"
End Function
Sub LeftUntilNonAlphaNumericChar()
Dim cell As Range
Dim Counter As Integer
Dim NumCharsLeftOfNonAlphaNumChar As Long
Set colRng = ActiveSheet.Range("A1:A1000") 'specify range
For Each cell In colRng
If Len(cell.Text) > 0 Then
MyString = cell.Value
For Counter = Len(cell.Text) To Counter Step -1
If IsAlphaNumeric(cell.Characters(Counter, 1).Text) = False And cell.Characters(Counter, 1).Text <> " " Then
cell.Offset(0, 1).Value = Left(cell.Value, Counter - 1)
End If
Next
End If
Next cell
End Sub
It doesn't remove trailing whitespaces on the end but a simple addition to the sub could change that if you wanted. Good Luck.
ADDITION: You can get the row of the last cell with data in a column and use that in your range(see below):
Public Function IsAlphaNumeric(sChr As String) As Boolean
IsAlphaNumeric = sChr Like "[0-9A-Za-z]"
End Function
Sub LeftUntilNonAlphaNumericChar()
Dim cell As Range
Dim Counter As Integer
Dim NumCharsLeftOfNonAlphaNumChar As Long
Dim LastRow As Long
If Application.Version >= 12# Then
LastRow = ActiveSheet.Range("A1048575").End(xlUp).Row + 1
'MsgBox "You are using Excel 2007 or greater."
Else
LastRow = ActiveSheet.Range("A65535").End(xlUp).Row + 1
'MsgBox "You are using Excel 2003 or lesser."
End If
Set colRng = ActiveSheet.Range("A1:A" & LastRow) 'specify range
For Each cell In colRng
If Len(cell.Text) > 0 Then
MyString = cell.Value
For Counter = Len(cell.Text) To Counter Step -1
If IsAlphaNumeric(cell.Characters(Counter, 1).Text) = False And cell.Characters(Counter, 1).Text <> " " Then
cell.Offset(0, 1).Value = Left(cell.Value, Counter - 1)
End If
Next
End If
Next cell
End Sub
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.