
How to extract dynamic string from a text using regular expression

I am working with the sample text below:

  1. text here text here —(1) Without prejudice to any special rights previously conferred on the holders of any existing shares or class of shares but subject to the Act, shares in the company may be issued by the directors. (2) Shares referred to in paragraph (1) may be issued with preferred, deferred, or other special rights or restrictions, whether in regard to dividend, voting, return of capital, or otherwise, as the directors, subject to any ordinary resolution of the company, determine.

I want to split the text in the following manner:

  1. text here text here —

(1) Without prejudice to any special rights previously conferred on the holders of any existing shares or class of shares but subject to the Act, shares in the company may be issued by the directors.

(2) Shares referred to in paragraph (1) may be issued with preferred, deferred, or other special rights or restrictions, whether in regard to dividend, voting, return of capital, or otherwise, as the directors, subject to any ordinary resolution of the company, determine.

The text is dynamic: instead of (1), (2) the markers could be (a), (b), or a., b., or i, ii, iii. To handle the first case, I have used the following regular expression in VBA:

Pattern = "([(][\d][)])([A-Z,a-z,.,\-,’,(,),_, ,:,\n,“,”,"",:,;,—,-,\—,\t,\r,]*)"

I am looking for a way to split the contents, and it does not have to be specific to VBA or regular expressions. Any other approach is also appreciated.

You did not answer the clarification questions, but please try the following VBA function:

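Function SpecSplitText(x As String) As String
  Dim strFin As String, strInt As String, arr, arr1, El
  
  ' Separate the heading from the body at the first em dash only
  arr = Split(x, "—", 2)
  ' Without an em dash there is nothing to split: return the input unchanged
  If UBound(arr) < 1 Then SpecSplitText = x: Exit Function
  strFin = arr(0) & vbCrLf
  ' Break the body into pieces at every period
  arr1 = Split(arr(1), ".")
  
  For Each El In arr1
    If Len(LTrim(El)) = 1 Then
        ' A one-character piece is a marker such as "a" or "b": hold it for the next piece
        strInt = LTrim(El) & ". "
    Else
        strFin = strFin & strInt & LTrim(El) & "." & vbCrLf
        strInt = ""
    End If
  Next
  ' Remove the last 3 characters ("." & vbCrLf) appended after the final piece
  strFin = Left(strFin, Len(strFin) - 3)
  SpecSplitText = strFin
End Function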

You can test the function with the following simple Sub:

Sub testSpecSplitText()
    Dim x As String, y As String, z As String
    x = "7. text here text here —(1) Without prejudice to any special rights previously conferred on the holders of any existing shares or class of shares but subject to the Act, shares in the company may be issued by the directors. (2) Shares referred to in paragraph (1) may be issued with preferred, deferred, or other special rights or restrictions, whether in regard to dividend, voting, return of capital, or otherwise, as the directors, subject to any ordinary resolution of the company, determine."
    y = "8. text here text here —a. Without prejudice to any special rights previously conferred on the holders of any existing shares or class of shares but subject to the Act, shares in the company may be issued by the directors. b. Shares referred to in paragraph (1) may be issued with preferred, deferred, or other special rights or restrictions, whether in regard to dividend, voting, return of capital, or otherwise, as the directors, subject to any ordinary resolution of the company, determine."
    z = "9. text here text here —I Without prejudice to any special rights previously conferred on the holders of any existing shares or class of shares but subject to the Act, shares in the company may be issued by the directors. II Shares referred to in paragraph (1) may be issued with preferred, deferred, or other special rights or restrictions, whether in regard to dividend, voting, return of capital, or otherwise, as the directors, subject to any ordinary resolution of the company, determine."
    
    Debug.Print SpecSplitText(x)
    Debug.Print SpecSplitText(y)
    Debug.Print SpecSplitText(z)
End Sub

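Note that the function splits at every period. If the text contains "etc." or any other abbreviation ending in a period, it will be split there as well, so such abbreviations have to be listed explicitly and protected before splitting.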
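Since the question is open to approaches outside VBA, here is a minimal Python sketch of a different idea: split only where an enumeration marker follows the end of a sentence, so that abbreviations such as "etc." and in-sentence references such as "paragraph (1)" are left alone. The names `MARKER`, `BOUNDARY` and `split_clauses` are hypothetical, and the marker shapes covered are an assumption based only on the examples given in the question, so the pattern will likely need adjusting for real documents.

```python
import re

# Assumed marker shapes, taken from the examples in the question:
#   (1) (a) (ii)   number or short letter group in parentheses
#   1.  a.         number or single letter followed by a period
#   i ii III       bare Roman-style numerals, optionally followed by a period
MARKER = (
    r"(?:\((?:\d{1,3}|[A-Za-z]{1,4})\)"
    r"|(?:\d{1,3}|[A-Za-z])\."
    r"|[ivxIVX]{1,4}\.?)"
)

# A clause boundary is whitespace that comes right after ., ; or :
# and right before a marker that is itself followed by whitespace.
BOUNDARY = re.compile(r"(?<=[.;:])\s+(?=" + MARKER + r"\s)")


def split_clauses(text):
    """Return the heading (up to the first em dash) followed by each clause."""
    head, dash, body = text.partition("\u2014")
    if not dash:
        # No em dash: treat the whole text as the body.
        head, body = "", text
    parts = [head.strip() + " \u2014"] if dash else []
    parts.extend(p.strip() for p in BOUNDARY.split(body) if p.strip())
    return parts


if __name__ == "__main__":
    sample = (
        "7. text here text here \u2014(1) Shares may be issued by the directors. "
        "(2) Shares referred to in paragraph (1) may carry special rights, "
        "etc. as the directors determine."
    )
    for part in split_clauses(sample):
        print(part)
```

One known limitation of this sketch: a sentence that starts with the pronoun "I" or with a single-letter initial such as "J." would be mistaken for a marker, so those cases would still need special handling.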
