Find text enclosed in @ and replace the inside

Question

The problem:

Find pieces of text in a file that are enclosed by @ and replace the characters inside.

Input:

@abc@ abc @ABC@
cba @cba CBA@

Desired output:

абц abc АБЦ
cba цба ЦБА

I have the following:

#!/usr/bin/perl
use strict;
use warnings;
use Encode;
my $output;
open FILE,"<", 'test.txt';
while (<FILE>) {
    chomp(my @chars = split(//, $_));
    for (@chars) {
        my @char;
        $_ =~ s/a/chr(0x430)/eg;
        $_ =~ s/b/chr(0x431)/eg;
        $_ =~ s/c/chr(0x446)/eg;
        $_ =~ s/d/chr(0x434)/eg;
        $_ =~ s/e/chr(0x435)/eg;
        $_ =~ s/A/chr(0x410)/eg;
        $_ =~ s/B/chr(0x411)/eg;
        $_ =~ s/C/chr(0x426)/eg;
        push @char, $_;
        $output = join "", @char;
        print encode("utf-8",$output);}
print "\n";
}
close FILE;

But I'm stuck on how to proceed from here.

Thanks in advance for your help!

Kluther

Answer 1

Here is my solution. (It is a prototype, so you will need to adapt it.)

use strict;
use warnings;

# Read each line and replace every @...@ section with the result of func().
while (my $data = <DATA>) {
    $data =~ s/[@]([\s\w]+)[@]/func($1)/ge;
    print $data;
#   while ($data =~ m/[@]([\s\w]+)[@]/g) {
#      print "marked: ", $1, "\n";
#      print "position:", pos($data);
#   }
#      print "not marked: ";
}

sub func {
    # do your magic here ;)
    return "<< @_ >>";
}
__DATA__
@abc@ abc @ABC@ cba @cba CBA@

What happens here? 这里会发生什么？

First, I read the data line by line. You can adapt this part to your own input.

while (my $data = <DATA>) {...}

Next, I need to find your pattern and replace it. How?

Use the substitution operator: s/pattern/replace/

But in an interesting form:

s/pattern/func($1)/ge

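The g modifier means global: every match is replaced, not only the first.

The e modifier means evaluate: the replacement is executed as Perl code and its result is substituted.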
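
To make the two modifiers concrete, here is a small sketch. The sample string and the variable names are made up for illustration and have nothing to do with your data.

```perl
use strict;
use warnings;

my $text = 'a1 b2 c3';

# /e alone: only the first digit is matched, and "$1 * 10" is run as code.
(my $first_only = $text) =~ s/(\d)/$1 * 10/e;     # a10 b2 c3

# /g and /e together: every digit is matched and evaluated.
(my $all = $text) =~ s/(\d)/$1 * 10/ge;           # a10 b20 c30

# /g without /e: the replacement is only an interpolated string.
(my $literal = $text) =~ s/(\d)/$1 * 10/g;        # a1 * 10 b2 * 10 c3 * 10

print "$first_only\n$all\n$literal\n";
```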

So I think you need to write your own func function ;)

It may be better to use the transliteration operator: tr/listOfSymbolsToBeReplaced/listOfSymbolsThatBePlacedInstead/
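
Putting the two ideas together could look roughly like the sketch below. The names to_cyrillic and convert_line are placeholders of mine, the letter mapping is taken from the code in the question, and I assume the script file itself is saved as UTF-8 (that is what use utf8 tells Perl).

```perl
use strict;
use warnings;
use utf8;    # the tr/// lists below contain literal Cyrillic letters

# Hypothetical helper: transliterate the text found between two @ marks.
sub to_cyrillic {
    my ($text) = @_;
    (my $out = $text) =~ tr/abcdeABC/абцдеАБЦ/;
    return $out;
}

# Hypothetical helper: replace every @...@ section of one line.
sub convert_line {
    my ($line) = @_;
    $line =~ s/\@([^@]*)\@/to_cyrillic($1)/ge;
    return $line;
}

binmode STDOUT, ':encoding(UTF-8)';
print convert_line($_), "\n" for '@abc@ abc @ABC@', 'cba @cba CBA@';
```

Here tr/// does the whole mapping in one pass, so the eight separate s/// calls are no longer needed, and the @ marks disappear because they are part of the matched text that gets replaced.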

Answer 2

Try this after $output is processed.

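# Remove every @ mark from the processed string.
$output =~ s/\@//g;
# Rebuild the string one character at a time (the content stays the same).
my @split_output = split(//, $output);
$output = "";
my $len = scalar(@split_output);
while ($len--) {
    $output .= shift(@split_output);
}
print $output;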

Answer 3

It can be done with a single regex and no splitting of the string:

use strict;
use warnings;
use Encode;

my %chars = (
    a => chr(0x430),
    b => chr(0x431),
    c => chr(0x446),
    d => chr(0x434),
    e => chr(0x435),
    A => chr(0x410),
    B => chr(0x411),
    C => chr(0x426),
);

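# Alternation of every character that has a replacement: (a|b|c|...)
my $regex = '(' . join('|', keys %chars) . ')';

while (<DATA>) {
    # Repeat until no mapped character is left between a pair of @ marks.
    1 while ($_ =~ s|\@(?!\s)[^@]*?\K$regex(?=[^@]*(?!\s)\@)|$chars{$1}|eg);
    print encode("utf-8", $_);
}
__DATA__
@abc@ abc @ABC@
cba @cba CBA@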

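It does require repeated runs of the regex because the matches overlap: every match starts at the opening @, so a single pass can replace only one character per @...@ pair. Note that the @ marks themselves are left in place.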

Answer 4

With minimal changes to your algorithm, you need to keep track of whether you are inside the @ marks or not, so add something like this:

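my $bConvert = 0;    # 1 while between a pair of @ marks; starts at 0 for every line
chomp(my @chars = split(//, $_));
for (@chars) {
    my $char = $_;
    if (/@/) {
        # Toggle the state on each @ and skip printing the mark itself.
        $bConvert = ($bConvert + 1) % 2;
        next;
    }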
    elsif ($bConvert) {
        $char =~ s/a/chr(0x430)/eg;
        $char =~ s/b/chr(0x431)/eg;
        $char =~ s/c/chr(0x446)/eg;
        $char =~ s/d/chr(0x434)/eg;
        $char =~ s/e/chr(0x435)/eg;
        $char =~ s/A/chr(0x410)/eg;
        $char =~ s/B/chr(0x411)/eg;
        $char =~ s/C/chr(0x426)/eg;
    }
    print encode("utf-8",$char);
}
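
For reference, the same toggle idea as a complete script might look like the sketch below. The lookup hash %map and the sub toggle_convert are names I made up; the mapping itself is the one from the question, and the two input lines are hard-coded here instead of being read from test.txt.

```perl
use strict;
use warnings;

# Same letter mapping as in the question, kept in one lookup table.
my %map = (
    a => chr(0x430), b => chr(0x431), c => chr(0x446), d => chr(0x434),
    e => chr(0x435), A => chr(0x410), B => chr(0x411), C => chr(0x426),
);

# Hypothetical helper: walk one line character by character,
# flipping $inside at every @ and converting only while inside.
sub toggle_convert {
    my ($line) = @_;
    my $inside = 0;
    my $out    = '';
    for my $char (split //, $line) {
        if ($char eq '@') {
            $inside = !$inside;    # enter or leave a marked section
            next;                  # the mark itself is not copied
        }
        $out .= ($inside && exists $map{$char}) ? $map{$char} : $char;
    }
    return $out;
}

binmode STDOUT, ':encoding(UTF-8)';
print toggle_convert($_), "\n" for '@abc@ abc @ABC@', 'cba @cba CBA@';
```

Because $inside is declared inside the sub, the state starts fresh for every line, so an unmatched @ on one line cannot leak into the next.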

4 answers

Answer 1: score 2, 2013-03-08 11:24:42

Answer 2: score 0, 2013-03-08 11:16:10

Answer 3: score 0

Answer 4: score 0, accepted, 2013-03-08 11:21:27