简体   繁体   中英

Extracting text from string between two identical characters using VBA

Let's say I have the following string within a cell:

E. Stark, T. Lannister, A. Martell, P Baelish, B. Dondarrion, and J. Mormont. Increased levels of nudity across Westeros contributes to its sporadic seasonal climate. Nat. Proc. Aca. Sci. (2011) 3: 142-149.

And I want to extract only the title from this. The approach I am considering is to write a script that says "Pull text from this string, but only if it is more than 50 characters long." This way it only returns the title, and not stuff like " Stark, T" and " Martell, P". The code I have so far is:

Sub TitleTest()
    Dim txt As String
    Dim Output As String
    Dim i As Integer
    Dim rng As Range
    Dim j As Integer
    Dim k As Integer

    j = 5
    Set rng = Range("A" & j) 'text is in cell A5
    txt = rng.Value 'txt is string
    i = 1

    While j <= 10 'there are five references between A5 and A10
    k = InStr(i, txt, ".") - InStr(i, txt, ". ") + 1 'k is supposed to be the length of the string returned, but I can't differenciate one "." from the other.

    Output = Mid(txt, InStr(i, txt, "."), k)
            If Len(Output) < 100 Then
                i = i + 1
            ElseIf Len(Output) > 10 Then
                Output = Mid(txt, InStr(i, txt, "."), InStr(i, txt, ". "))
                Range("B5") = Output
                j = j + 1
            End If
    Wend
End Sub

Of course, this would work well if it wasn't two "." I was trying to full information from. Is there a way to write the InStr function in such a way that it won't find the same character twice? Am I going about this in the wrong way?

Thanks in advance,

EDIT: Another approach that might work (if possible), is if I could have one character be " any lower case letter ." and ".". Would even this be possible? I can't find any example of how this could be achieved...

Here you go, it works exactly as you wish. Judging from your code I am sure that you can adapt it for your needs quite quickly:

Option Explicit

Sub ExtractTextSub()

    Debug.Print ExtractText("E. Stark, T. Lannister, A. Martell, P Baelish, B. Dondarrion, and J. Mormont. Increased levels of nudity across Westeros contributes to its sporadic seasonal climate. Nat. Proc. Aca. Sci. (2011) 3: 142-149.")

End Sub

Public Function ExtractText(str_text As String) As String

    Dim arr         As Variant
    Dim l_counter   As Long
    arr = Split(str_text, ".")

    For l_counter = LBound(arr) To UBound(arr)

        If Len(arr(l_counter)) > 50 Then
            ExtractText = arr(l_counter)
        End If

    Next l_counter

End Function

Edit: 5 votes in no time made me improve my code a bit :) This would return the longest string, without thinking of the 50 chars. Furthermore, on Error handlaer and a constant for the point. Plus adding a point to the end of the extract.

Option Explicit

Public Const STR_POINT = "."

Sub ExtractTextSub()

    Debug.Print ExtractText("E. Stark, T. Lannister, A. Martell, P Baelish, B. Dondarrion, and J. Mormont. Increased levels of nudity across Westeros contributes to its sporadic seasonal climate. Nat. Proc. Aca. Sci. (2011) 3: 142-149.")

End Sub

Public Function ExtractText(str_text As String) As String

    On Error GoTo ExtractText_Error

    Dim arr             As Variant
    Dim l_counter       As Long
    Dim str_longest     As String

    arr = Split(str_text, STR_POINT)

    For l_counter = LBound(arr) To UBound(arr)

        If Len(arr(l_counter)) > Len(ExtractText) Then
            ExtractText = arr(l_counter)
        End If

    Next l_counter

ExtractText = ExtractText & STR_POINT

On Error GoTo 0
Exit Function

ExtractText_Error:

MsgBox "Error " & Err.Number & Err.Description 

End Function

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM