Find text enclosed by @ and replace the inside

The problem:

Find pieces of text in a file that are enclosed by @ and replace what is inside them.

Input:

@abc@ abc @ABC@
cba @cba CBA@

Desired output:

абц abc АБЦ
cba цба ЦБА

I have the following:

#!/usr/bin/perl
use strict;
use warnings;
use Encode;
my $output;
open FILE,"<", 'test.txt';
while (<FILE>) {
    chomp(my @chars = split(//, $_));
    for (@chars) {
        my @char;
        $_ =~ s/a/chr(0x430)/eg;
        $_ =~ s/b/chr(0x431)/eg;
        $_ =~ s/c/chr(0x446)/eg;
        $_ =~ s/d/chr(0x434)/eg;
        $_ =~ s/e/chr(0x435)/eg;
        $_ =~ s/A/chr(0x410)/eg;
        $_ =~ s/B/chr(0x411)/eg;
        $_ =~ s/C/chr(0x426)/eg;
        push @char, $_;
        $output = join "", @char;
        print encode("utf-8",$output);}
print "\n";
}
close FILE;

But I'm stuck on how to proceed from here.

Thanks in advance for your help!

Kluther

Here is my solution. (It is only a prototype, so you will need to adapt it.)

while (my $data = <DATA>) {
    $data =~ s/[@]([\s\w]+)[@]/func($1)/ge;
    print $data;
#   while ($data =~ m/[@]([\s\w]+)[@]/g) {
#      print "marked: ", $1, "\n";
#      print "position:", pos($data);
#   }
#      print "not marked: ";
}
sub func {
   # do your magic here ;)
   return "<< @_ >>";
}
__DATA__
@abc@ abc @ABC@ cba @cba CBA@

What happens here?

First, I read the data. You can do that part yourself.

while (my $data = <DATA>) {...}

Next, I need to search for your pattern and replace it.
How?

Use the substitution operator: s/pattern/replacement/

But in an interesting form:

s/pattern/func($1)/ge

The g modifier means global: every match is replaced, not just the first one.

The e modifier means evaluate: the replacement is run as Perl code and its result is substituted.

So I think you need to write your own func function ;)
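To make the effect of the two modifiers concrete, here is a small sketch. The helper shout and the sample string are made up for this illustration; only s///, /g and /e come from the answer.

```perl
use strict;
use warnings;

# Hypothetical helper, named only for this illustration.
sub shout { return uc $_[0] }

my $text = '@abc@ abc @def@';

# Without /e the replacement is an ordinary string: $1 is interpolated,
# but "shout(...)" is inserted literally and never called.
(my $plain = $text) =~ s/\@(\w+)\@/shout($1)/g;

# With /e the replacement is evaluated as Perl code for every match.
(my $evaluated = $text) =~ s/\@(\w+)\@/shout($1)/ge;

print "$plain\n";
print "$evaluated\n";
```

The first print shows the literal text shout(abc) in place of each marked group, while the second shows the value returned by the function.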

It may be better to use the transliteration operator: tr/listOfSymbolsToBeReplaced/listOfSymbolsThatBePlacedInstead/
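One way the transliteration operator could serve as the body of func is sketched below. It assumes the eight-letter mapping from the question, and the wrapper name convert_marked is invented for this example. Treat it as an illustration, not a tested drop-in.

```perl
use strict;
use warnings;

# tr/// maps characters by position: a -> U+0430, b -> U+0431,
# c -> U+0446, d -> U+0434, e -> U+0435, A -> U+0410, B -> U+0411,
# C -> U+0426 (the same code points the question uses).
sub func {
    my ($inside) = @_;
    $inside =~ tr/abcdeABC/\x{430}\x{431}\x{446}\x{434}\x{435}\x{410}\x{411}\x{426}/;
    return $inside;
}

# Hypothetical wrapper: convert only the text between pairs of @ and
# drop the @ markers themselves.
sub convert_marked {
    my ($line) = @_;
    $line =~ s/\@([^@]*)\@/func($1)/ge;
    return $line;
}

binmode STDOUT, ':encoding(UTF-8)';
print convert_marked('@abc@ abc @ABC@'), "\n";
print convert_marked('cba @cba CBA@'), "\n";
```

Because tr/// never touches characters outside its search list, spaces and any other letters inside the markers pass through unchanged.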

Try this after $output has been processed:

$output =~ s/\@//g;
my @split_output = split(//, $output);
$output = "";
my $len = scalar(@split_output) ;
while ($len--) {
    $output .= shift(@split_output);
}
print $output;

It can be done with a single regex and no splitting of the string:

use strict;
use warnings;
use Encode;

my %chars = (
    a => chr(0x430),
    b => chr(0x431),
    c => chr(0x446),
    d => chr(0x434),
    e => chr(0x435),
    A => chr(0x410),
    B => chr(0x411),
    C => chr(0x426),
);

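my $regex = '(' . join('|', keys %chars) . ')';

while (<DATA>) {
    # Each pass can replace only one character per @...@ group, because
    # every match has to start at the opening @, so repeat until no
    # substitution happens.
    1 while ($_ =~ s|\@(?!\s)[^@]*?\K$regex(?=[^@]*(?!\s)\@)|$chars{$1}|eg);
    tr/@//d;    # drop the @ markers themselves
    print encode("utf-8",$_);
}

__DATA__
@abc@ abc @ABC@
cba @cba CBA@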

It does require repeated runs of the regex, because the matches overlap.
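A sketch of an alternative that avoids the repeated runs, assuming the same letter table as above: capture each @...@ group whole and convert its contents inside the replacement. The helper names map_chars and convert_line are made up for this illustration.

```perl
use strict;
use warnings;

my %chars = (
    a => chr(0x430), b => chr(0x431), c => chr(0x446), d => chr(0x434),
    e => chr(0x435), A => chr(0x410), B => chr(0x411), C => chr(0x426),
);

# Hypothetical helper: map each character through %chars, leaving
# characters without an entry unchanged.
sub map_chars {
    my ($text) = @_;
    return join '', map { exists $chars{$_} ? $chars{$_} : $_ } split //, $text;
}

sub convert_line {
    my ($line) = @_;
    # One pass: each @...@ group is matched as a whole and replaced by its
    # converted contents, so the @ markers disappear as well.
    $line =~ s/\@([^@]*)\@/map_chars($1)/ge;
    return $line;
}

binmode STDOUT, ':encoding(UTF-8)';
print convert_line('@abc@ abc @ABC@'), "\n";
print convert_line('cba @cba CBA@'), "\n";
```

Since the groups themselves do not overlap, a single global substitution is enough here. The trade-off is that the per-character work moves out of the regex and into ordinary Perl code.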

With minimal changes to your algorithm, you need to keep track of whether or not you are inside the @ marks, so add something like this:

my $bConvert = 0;
chomp(my @chars = split(//, $_));
for (@chars) {
    my $char = $_;
    if (/@/) {
        $bConvert = ($bConvert + 1) % 2;
        next;
    }
    elsif ($bConvert) {
        $char =~ s/a/chr(0x430)/eg;
        $char =~ s/b/chr(0x431)/eg;
        $char =~ s/c/chr(0x446)/eg;
        $char =~ s/d/chr(0x434)/eg;
        $char =~ s/e/chr(0x435)/eg;
        $char =~ s/A/chr(0x410)/eg;
        $char =~ s/B/chr(0x411)/eg;
        $char =~ s/C/chr(0x426)/eg;
    }
    print encode("utf-8",$char);
}
