Replacing scan with gsub in Ruby: how to allow code in the gsub block?

I am parsing wiki text from an XML dump. A string named 'section' contains templates in double braces, each with some arguments that I want to reorganize.

Here is an example with a template named TextTerm:

section="Sample of a text with a first template {{TextTerm|arg1a|arg2a|arg3a...}}  and then a second {{TextTerm|arg1b|arg2b|arg3b...}} etc."

I can use scan and a regex to get each template and work on it in a loop using:

section.scan(/\{\{(TextTerm)\|(.*?)\|(.*?)\}\}/i).each { |item| puts "1=" + item[1] } # arg1a etc.

With this, I have been able to build a database of the first argument of each template.

Now I also want to rename the template to "NewTextTerm" and reorganize its arguments by placing the second argument in place of the first.

Can I do it in the same loop? For example, by replacing scan with gsub(regexp) { block }:

section.gsub!(/\{\{(TextTerm)\|(.*?)\|(.*?)\}\}/) { |item| '{{NewTextTerm|\2|\1}}'}

I get:

"Sample of a text with a first template {{NewTextTerm|\\2|\\1}}  and then a second {{NewTextTerm|\\2|\\1}} etc."

meaning that the capture groups of the regexp are not substituted. Even if it worked, I would like to be able to work on the arguments inside the gsub block. For example, it seems I can't put a puts in the gsub block as I can in the scan().each block, only a string to be substituted.

Any ideas are welcome.

PS: Edited: braces and "section=" added; the code is now complete.

When you have the replacement as a string argument, you can use '\1' (or, equivalently inside single quotes, '\\1'), etc. like this:

string.gsub!(regex, '...\1...\2...')

When you have the replacement as a block, you can use "#$1" , etc. like this:

string.gsub!(regex){"...#$1...#$2..."}

You are mixing the two forms. Stick to one or the other.
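
As a minimal sketch of the two forms side by side, here is a shortened version of the sample string from the question run through both. The variable names (text, regex, by_string, by_block) are only illustrative and do not come from the original post.

```ruby
# Illustrative only: these variable names are assumptions, not from the post.
text  = "a {{TextTerm|arg1a|arg2a}} b"
regex = /\{\{(TextTerm)\|(.*?)\|(.*?)\}\}/

# String replacement: backreferences are written \1, \2, ... in single quotes.
by_string = text.gsub(regex, '{{New\1|\3|\2}}')

# Block replacement: capture groups are read from $1, $2, ... for each match.
by_block = text.gsub(regex) { "{{New#{$1}|#{$3}|#{$2}}}" }

puts by_string
puts by_block
```

Both calls should produce the same string, with the template renamed and its two arguments swapped.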

Yes, changing the single quotes to double quotes isn't enough; #$1 is the answer. Here is the complete code:

section="Sample of a text with a first template {{TextTerm|arg1a|arg2a|arg3a...}}  and then a second {{TextTerm|arg1b|arg2b|arg3b...}} etc."
section.gsub(/\{\{(TextTerm)\|(.*?)\|(.*?)\}\}/) { |item| "{{New#$1|#$3|#$2}}"}
"Sample of a text with a first template {{NewTextTerm|arg2a|arg3a...|arg1a}}  and then a second {{NewTextTerm|arg2b|arg3b...|arg1b}} etc."

Thus, it works. Thanks.

But now I want to replace the string with a "function" that returns the changed string:

def stringreturn(arg1,arg2,arg3) strr = "{{New"+arg1 + arg3 +arg2 + "}}"; return strr ; end

and

section.gsub(/\{\{(TextTerm)\|(.*?)\|(.*?)\}\}/) { |item| stringreturn("#$1","|#$2","|#$3") }

will return:

"Sample of a text with a first template {{NewTextTerm|arg2a|arg3a...|arg1a}}  and then a second {{NewTextTerm|arg2b|arg3b...|arg1b}} etc."

Thanks to all! There is probably a better way to manipulate arguments in MediaWiki templates using Ruby.
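
One possible variation, offered as a sketch rather than a definitive approach: since the gsub block is ordinary Ruby, it can hold several statements, and only the value of its last expression is used as the replacement. The helper name rotate_template_args and its parameters are hypothetical, and the sketch assumes templates are not nested and that arguments contain no "|" of their own.

```ruby
# Hypothetical helper, not from the original post: renames a template and
# moves its first argument to the end, whatever the number of arguments.
# Assumes no nested templates and no "|" inside an argument.
def rotate_template_args(text, old_name, new_name)
  pattern = /\{\{#{Regexp.escape(old_name)}\|(.*?)\}\}/
  text.gsub(pattern) do
    args  = Regexp.last_match(1).split("|")  # every argument of this template
    first = args.shift                       # arbitrary code can run here
    # The value of the last expression becomes the replacement text.
    "{{#{new_name}|#{(args + [first]).join('|')}}}"
  end
end

section = "first {{TextTerm|arg1a|arg2a|arg3a}} and second {{TextTerm|arg1b|arg2b|arg3b}}"
puts rotate_template_args(section, "TextTerm", "NewTextTerm")
```

Capturing the whole argument list in one group and splitting it avoids having the last group swallow "arg2a|arg3a..." as a single value, which is what happens with the two lazy groups above.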
