
Regex finding all commas between two words

I am trying to clean up a large .csv file that contains many comma-separated words, parts of which I need to consolidate. In one subsection I want to change all the commas to slashes. Let's say my file contains this text:

Foo,bar,spam,eggs,extra,parts,spoon,eggs,sudo,test,example,blah,pool

I want to select all commas between the unique words bar and blah. The idea is to then replace the commas with slashes (using find and replace), such that I get this result:

Foo,bar,spam/eggs/extra/parts/spoon/eggs/sudo/test/example,blah,pool

Following @EganWolf's input: how do I include the unique words in the search but exclude them from the selection, and how do I then match only the commas between them?

So far I have only managed to select all the text between the unique words, including the words themselves: bar,.*,blah , bar:*, *,blah , (bar:.+?,blah)*,*\2

I experimented with negative lookahead but can't get any search results from my statements.

Using Notepad++, you can do:

  • Ctrl + H
  • Find what: (?:\bbar,|\G(?!^))\K([^,]*),(?=.+\bblah\b)
  • Replace with: $1/
  • check Wrap around
  • check Regular expression
  • UNCHECK . matches newline
  • Replace all

Explanation:

(?:             # start non capture group
    \bbar,      # word boundary then bar then a comma
  |             # OR
    \G          # restart from last match position
    (?!^)       # negative lookahead, make sure we are not at the beginning of a line
)               # end group
\K              # forget all we've seen until this position
([^,]*)         # group 1, 0 or more non comma
,               # a comma
(?=             # positive lookahead
    .+          # 1 or more of any character except newline
    \bblah\b    # word boundary, blah, word boundary
)               # end lookahead

Result for given example:

Foo,bar,spam/eggs/extra/parts/spoon/eggs/sudo/test/example,blah,pool
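
Outside Notepad++, the same replacement can be sketched in Python. Python's built-in re module supports neither \G nor \K, so this is a different technique rather than a port of the regex above: it matches the whole span from bar to blah once and swaps the commas inside it in a callback. The function name slash_between and its parameters are made up for illustration.

```python
import re


def slash_between(line, start="bar", end="blah"):
    """Replace the commas strictly between `start` and `end` with slashes.

    Hypothetical helper: assumes `start` and `end` are unique words in the line.
    """
    pattern = re.compile(
        r"(\b" + re.escape(start) + r",)"   # group 1: start word and its comma
        r"(.*?)"                            # group 2: everything in between (lazy)
        r"(," + re.escape(end) + r"\b)"     # group 3: comma and end word
    )

    def fix(match):
        # Only the middle part is rewritten; the delimiters are kept as-is.
        return match.group(1) + match.group(2).replace(",", "/") + match.group(3)

    return pattern.sub(fix, line)


line = "Foo,bar,spam,eggs,extra,parts,spoon,eggs,sudo,test,example,blah,pool"
print(slash_between(line))
```

A line that does not contain both words in that order is returned unchanged.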

The following regex will capture the minimally required text to access the commas you want:

(?<=bar,)(.*?(,))*(?=.*?,blah)

See Regex Demo.

If you want to replace the commas, you will need to replace everything in capture group 2. Capture group 0 has your entire match.
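
As a rough illustration, here is how this pattern behaves in Python's re module (an assumption; the answer does not say which engine the demo uses). In that module a repeated group keeps only its last iteration, so group 2 holds a single comma rather than all of them. One workaround is to rewrite the commas inside the whole match (group 0) with a callback. The function name is hypothetical.

```python
import re

# The pattern from the answer above, unchanged.
PATTERN = re.compile(r"(?<=bar,)(.*?(,))*(?=.*?,blah)")


def replace_commas_in_match(line):
    """Swap every comma inside the overall match (group 0) for a slash."""
    return PATTERN.sub(lambda m: m.group(0).replace(",", "/"), line)


line = "Foo,bar,spam,eggs,extra,parts,spoon,eggs,sudo,test,example,blah,pool"
match = PATTERN.search(line)
print(repr(match.group(2)))           # only the last captured comma survives
print(replace_commas_in_match(line))
```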

An alternative approach would be to split your string on commas to create an array of words. Then join the words between bar and blah using / and rejoin that with the remaining words using commas.

Here is a PowerShell example of split and join:

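# Sample line from the question
$a = "Foo,bar,spam,eggs,extra,parts,spoon,eggs,sudo,test,example,blah,pool"
$split = $a -split ","
# First and last index of the words that should be joined with slashes
$slashBegin = $split.IndexOf("bar") + 1
$commaEnd = $split.IndexOf("blah") - 1
$str1 = $split[0..($slashBegin - 1)] -join ","                # up to and including bar
$str2 = $split[$slashBegin..$commaEnd] -join "/"              # words between bar and blah
$str3 = $split[($commaEnd + 1)..($split.Count - 1)] -join "," # blah and everything after
@($str1, $str2, $str3) -join ","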

Foo,bar,spam/eggs/extra/parts/spoon/eggs/sudo/test/example,blah,pool

This could easily be made into a function with your entire line and keywords as inputs.
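
A minimal sketch of such a function, written in Python rather than PowerShell. The name join_between and its parameters are hypothetical, and it assumes the two keywords appear as whole fields in the line.

```python
def join_between(line, start="bar", end="blah", sep=",", inner="/"):
    """Join the fields strictly between `start` and `end` with `inner`.

    Returns the line unchanged if either keyword is missing or if there
    is nothing between them.
    """
    words = line.split(sep)
    try:
        i = words.index(start)
        j = words.index(end, i + 1)   # look for `end` only after `start`
    except ValueError:
        return line
    if j - i < 2:                     # keywords are adjacent, nothing to join
        return line
    middle = inner.join(words[i + 1:j])
    return sep.join(words[:i + 1] + [middle] + words[j:])


line = "Foo,bar,spam,eggs,extra,parts,spoon,eggs,sudo,test,example,blah,pool"
print(join_between(line))
```

Applied line by line, this would process a whole file without any regex.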
