How do I find a specific substring in a string?

My full question title was too long, so I am asking it here:

How do I find all instances of a specific substring in a string, accounting for spaces and special characters potentially being on either side of the substring?

What I mean is this. I am writing a SQL code formatting assistance program in VB.Net. This program will help when I am following up on truly poorly written SQL. For instance (and please ignore the syntax failure here, I am not good at writing bad code in SQL):

if exists(
    select *
    from dbo.table
    where field1 = (if exists (select field1
                               from dbo.table1
                               where field2 = '123')
                    select field1 from table2)

My program is still in the early stages. I have already identified most of the keywords and written the code that puts them in the proper case. So in the bad code example above, every select becomes Select. To do this I have created a list of keywords in array form, and I use this array with the following function:

Private Function FindAndReplace(ByVal findWhat As String, _
        ByVal replaceWith As String, ByVal focusLine As String) As String
    focusLine = Microsoft.VisualBasic.Strings.Replace(focusLine, findWhat, _
        replaceWith, 1, -1, Constants.vbTextCompare)
    Return focusLine
End Function

The good news is that this works really well with words like Select. Words like If, Go, On, and End are more challenging. If I have the word Send, it is replaced with SEnd because End is a keyword. In many of these cases I can account for this by putting the smaller words before the larger words. I have added Send as a keyword because of the number of times that word appears in user messages on our systems.

I cannot seem to account for words like On, If, or Go. I considered searching for " Go ", " On ", ")Go ", " On(", etc., but there are times when Go is going to be the first word on the line, or the only word.

What I need is a VB.Net means of searching a string for all of the instances of a given substring (such as If). I was thinking I would check whether it is the first word in the string, or whether it is surrounded by any combination of spaces or special characters (that is, not surrounded by other letters, underscores, etc.). I would update the instances that meet my requirements and leave the others alone.

I am drawing a blank on how to do this, and I could really use some assistance.

I am writing a SQL code formatting assistance program

I'd recommend starting with an existing SQL parser.

Peter Sestoft's excellent Programming Language Concepts book introduces parsing fundamentals, including writing Lexer and Parser specifications for Micro-SQL in Chapter 3.

The open source Irony project includes an SQL grammar sample.

Use your favourite search engine to find others.

What I need is a VB.Net means of searching a string for all of the instances of a given substring

There are a number of ways of achieving this:

  1. Split the string into words and then search those words for instances.
  2. Use a state machine to iterate over the string and check words after white space.
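
As a rough illustration of option 1, here is a minimal VB.Net sketch. It assumes a "word" is any run of letters, digits, or underscores, and that everything between words should be kept as-is. The names RecaseKeywords and IsWordChar are made up for this example and are not from the question's code.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text

Module KeywordCasing

    ' Assumption: a character belongs to a word if it is a letter, digit, or underscore.
    Private Function IsWordChar(ByVal c As Char) As Boolean
        Return Char.IsLetterOrDigit(c) OrElse c = "_"c
    End Function

    ' Splits the line into whole words and the runs of other characters
    ' between them, then re-cases only the words that equal a keyword.
    Public Function RecaseKeywords(ByVal focusLine As String, _
            ByVal keywords As IEnumerable(Of String)) As String
        ' Case-insensitive lookup from any spelling of a keyword to its proper form.
        Dim lookup As New Dictionary(Of String, String)(StringComparer.OrdinalIgnoreCase)
        For Each keyword As String In keywords
            lookup(keyword) = keyword
        Next

        Dim result As New StringBuilder()
        Dim i As Integer = 0
        While i < focusLine.Length
            Dim start As Integer = i
            Dim inWord As Boolean = IsWordChar(focusLine(i))
            ' Advance to the end of the current run (word or separator).
            While i < focusLine.Length AndAlso IsWordChar(focusLine(i)) = inWord
                i += 1
            End While
            Dim token As String = focusLine.Substring(start, i - start)
            Dim proper As String = Nothing
            If inWord AndAlso lookup.TryGetValue(token, proper) Then
                result.Append(proper)   ' whole word matched a keyword
            Else
                result.Append(token)    ' separator, or a word that is not a keyword
            End If
        End While
        Return result.ToString()
    End Function

    Sub Main()
        Dim keywords As String() = {"Select", "From", "If", "End", "Go", "On"}
        Console.WriteLine(RecaseKeywords("if exists (select send_on from gopher) end", keywords))
    End Sub

End Module
```

Because whole words are compared rather than substrings, Send, send_on, and gopher are left alone without needing extra entries in the keyword list. This version does not know about quoted strings, so a keyword inside a string literal would still be re-cased.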

With option 2 you can handle quoted strings and maintain an index for each word. Here is a short example in F#: http://fssnip.net/f6
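
For readers who prefer VB.Net, the state machine idea might look roughly like the sketch below. It is not a translation of the F# snippet. It assumes string literals use single quotes (as in the question's SQL) and that a word is a run of letters, digits, or underscores. FindWholeWords and ScanState are hypothetical names chosen for this example.

```vbnet
Imports System
Imports System.Collections.Generic

Module KeywordScanner

    Private Enum ScanState
        Separator   ' between words
        Word        ' inside a run of letters, digits, or underscores
        Quoted      ' inside a '...' string literal
    End Enum

    ' Assumption: a character belongs to a word if it is a letter, digit, or underscore.
    Private Function IsWordChar(ByVal c As Char) As Boolean
        Return Char.IsLetterOrDigit(c) OrElse c = "_"c
    End Function

    ' Returns the start index of every whole-word, case-insensitive match of
    ' findWhat in focusLine, skipping text inside single-quoted literals.
    Public Function FindWholeWords(ByVal focusLine As String, _
            ByVal findWhat As String) As List(Of Integer)
        Dim hits As New List(Of Integer)()
        Dim state As ScanState = ScanState.Separator
        Dim wordStart As Integer = 0

        ' Iterate one position past the end so a trailing word is closed too.
        For i As Integer = 0 To focusLine.Length
            Dim atEnd As Boolean = (i = focusLine.Length)
            Dim c As Char = If(atEnd, " "c, focusLine(i))

            Select Case state
                Case ScanState.Quoted
                    ' A closing quote ends the literal; a doubled quote ('')
                    ' simply leaves and re-enters the Quoted state.
                    If c = "'"c Then state = ScanState.Separator
                Case ScanState.Word
                    If Not IsWordChar(c) Then
                        Dim word As String = focusLine.Substring(wordStart, i - wordStart)
                        If String.Equals(word, findWhat, StringComparison.OrdinalIgnoreCase) Then
                            hits.Add(wordStart)
                        End If
                        state = If(c = "'"c, ScanState.Quoted, ScanState.Separator)
                    End If
                Case ScanState.Separator
                    If c = "'"c Then
                        state = ScanState.Quoted
                    ElseIf IsWordChar(c) Then
                        wordStart = i
                        state = ScanState.Word
                    End If
            End Select
        Next
        Return hits
    End Function

    Sub Main()
        Dim line As String = "go on 'do not go' gopher; GO"
        For Each index As Integer In FindWholeWords(line, "Go")
            Console.WriteLine(index)
        Next
    End Sub

End Module
```

For the sample line, the matches are at indexes 0 and 26: the go inside the quoted literal and the one at the start of gopher are both skipped. With the indexes in hand, the replacement can be applied from the last match backwards so that earlier indexes stay valid.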
