Using GREP / RegEx to find and replace string

So, I'm trying to migrate a database from Textpattern CMS to something more generic. Some articles contain Textpattern-specific tags that pull in images, and I want to turn these into generic HTML image tags. At the moment, they look like this in the SQL file:

<txp:upm_image image_id="4" form="dose" />

I want to turn these into something more like this:

<img src="4.jpg" class="dose" />

I've had some luck with TextWrangler doing some regex stuff, but I'm stumped. Any ideas on how to find & replace all of these image paths?

EDIT: For future reference, here's what I ended up doing in PHP to output it:

$body = $post['Body_html'];
// Match only the inside of the tag; the surrounding "<" and " />" are left untouched.
$pattern = '/txp:upm_image image_id="([0-9]+)" form="([^"]*)"/i';
$replacement = 'img src="/images/$1.jpg" class="$2"';
$body = preg_replace($pattern, $replacement, $body);
// Outputs: <img src="/images/59.jpg" class="dose" />

I wouldn't use grep; it's sed you want:

$ echo '<txp:upm_image image_id="4" form="dose" />' | sed -e 's/^.*image_id="\([[:digit:]]*\)".*form="\([[:alpha:]]*\)".*/<img src="\1.jpg" class="\2" \/>/' 
<img src="4.jpg" class="dose" /> 
$

If your class names contain digits as well as letters, use [[:alnum:]] in place of [[:alpha:]].

(Works on macOS / Darwin.)
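One caveat with the command above: the leading `^.*` and trailing `.*` match the whole line, so everything else on that line is dropped, and only one tag per line is handled. For a dump where tags sit inside long lines of article text, a tighter pattern with the `g` flag may be closer to what is needed. The following is a minimal sketch, not a tested migration script: the function name `convert_txp_images` is made up for illustration, and it assumes the double quotes in the file are not backslash-escaped (some SQL dumps escape them, in which case the pattern would need adjusting).

```shell
# Hypothetical helper: rewrite every upm_image tag read from stdin,
# leaving all surrounding text on the line intact.
convert_txp_images() {
  # -E enables extended regular expressions (supported by both GNU and BSD sed).
  # The trailing "g" replaces every tag on a line, not just the first one.
  sed -E 's/<txp:upm_image[[:space:]]+image_id="([0-9]+)"[[:space:]]+form="([^"]*)"[[:space:]]*\/>/<img src="\1.jpg" class="\2" \/>/g'
}

# Example: two tags embedded in one line of other text.
printf '%s\n' 'a <txp:upm_image image_id="4" form="dose" /> b <txp:upm_image image_id="59" form="x1" /> c' \
  | convert_txp_images
# prints: a <img src="4.jpg" class="dose" /> b <img src="59.jpg" class="x1" /> c
```

To run it over a whole file, redirect as usual (for example `convert_txp_images < dump.sql > converted.sql`, with both file names being placeholders), and compare the result against the original before importing it.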

I'm not sure which tool you are using, but try this regex solution. Search for this:

<txp:upm_image\s+image_id="(\d+)"\s+form="([^"]*)"\s*\/>

And replace with this:

<img src="$1.jpg" class="$2" />

Note that this only works for txp tags having the same form as your example. It will fail if there are txp tags having extra attributes, or if they are in a different order.
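If the two attributes can appear in either order, one way to cope is to run a second substitution for the reversed order. This is only a sketch under that assumption: the function name `convert_txp_images_any_order` is hypothetical, and tags carrying extra attributes are still left unchanged so they can be found and handled by hand afterwards.

```shell
# Hypothetical helper: handle both attribute orders with two sed expressions.
convert_txp_images_any_order() {
  sed -E \
    -e 's/<txp:upm_image[[:space:]]+image_id="([0-9]+)"[[:space:]]+form="([^"]*)"[[:space:]]*\/>/<img src="\1.jpg" class="\2" \/>/g' \
    -e 's/<txp:upm_image[[:space:]]+form="([^"]*)"[[:space:]]+image_id="([0-9]+)"[[:space:]]*\/>/<img src="\2.jpg" class="\1" \/>/g'
}

# Example: the reversed order is converted; a tag with an extra attribute is not.
printf '%s\n' \
  '<txp:upm_image form="dose" image_id="7" />' \
  '<txp:upm_image image_id="8" form="dose" wraptag="p" />' \
  | convert_txp_images_any_order
# prints:
# <img src="7.jpg" class="dose" />
# <txp:upm_image image_id="8" form="dose" wraptag="p" />
```

Searching the output for any remaining `txp:upm_image` afterwards shows which tags did not fit either pattern.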
