
How to replace the nth space before a string on each line in a file using sed

I am trying to replace the space before the surname on each line of a file with a comma using sed.

Example source:

George W Heong§New York§USA
Elizabeth Black§Sheffield, Yorkshire§England
Lucy Jones§Cardiff§Wales
James G K Shackleton§Dallas, Texas§USA
Carl Seddon§Canberra,Australia

Example output:

George W,Heong§New York§USA
Elizabeth,Black§Sheffield, Yorkshire§England
Lucy,Jones§Cardiff§Wales
James G K,Shackleton§Dallas, Texas§USA
Carl,Seddon§Canberra,Australia

I think I've worked out a method to obtain the index of the relevant space, as follows:

int idx$ = str.indexOf("§");
int nthSpace = str.lastIndexOf(" ", idx$);

but I haven't been able to work out how to replace the nth instance using the variable nthSpace. This is what I have got so far:

sed "s/$nthSpace" "/,/" datain.txt > dataout.txt

Any assistance would be appreciated.

With gensub, available in GNU awk, you can do this:

awk 'BEGIN{FS=OFS="§"} {$1=gensub(/[[:blank:]]([^[:blank:]]+)$/, ",\\1", 1, $1)} 1' file

Output:

George W,Heong§New York§USA
Elizabeth,Black§Sheffield, Yorkshire§England
Lucy,Jones§Cardiff§Wales
James G K,Shackleton§Dallas, Texas§USA
Carl,Seddon§Canberra,Australia
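Since gensub is a GNU extension, here is a rough sketch of the same idea using only match and substr, which POSIX awk provides. The function name comma_before_surname_awk is made up for illustration, and the sketch assumes the awk in use accepts § as a field separator in the same way the command above does.

```shell
# Hypothetical helper: same idea as the gensub one-liner, without GNU extensions.
comma_before_surname_awk() {
  awk 'BEGIN { FS = OFS = "§" }
  {
    # Find the last blank in the first field (pos is 0 if there is none).
    pos = match($1, /[ \t][^ \t]+$/)
    if (pos > 0)
      $1 = substr($1, 1, pos - 1) "," substr($1, pos + 1)
    print
  }'
}

printf '%s\n' 'George W Heong§New York§USA' | comma_before_surname_awk
```

Because only the first field is edited, spaces in later fields such as "New York" are left alone.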

With sed:

sed 's/ \([^ ]*§\)/,\1/' sourcefile

The pattern looks for the first occurrence of:

  • a space
  • followed by any number of non-space characters
  • followed by §

The name is captured in a group, which the substitution writes back prefixed with a comma.
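To try the substitution without creating a file, the sample lines can be piped in with printf. This is only a minimal sketch; the function name comma_before_surname is hypothetical and simply wraps the sed command shown above.

```shell
# Hypothetical helper wrapping the sed command from the answer.
comma_before_surname() {
  # Replace the space in front of the last word before the first § with a comma.
  sed 's/ \([^ ]*§\)/,\1/'
}

printf '%s\n' 'George W Heong§New York§USA' 'Lucy Jones§Cardiff§Wales' | comma_before_surname
```

Without the g flag, sed replaces only the first match on each line, so the space in "New York" is left untouched.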

UPDATE:

To prevent strings such as name § (a space directly before the delimiter) from being matched, first strip any spaces before the § with s/ *§/§/. The final command becomes:

sed 's/ *§/§/;s/ \([^ ]*§\)/,\1/' sourcefile
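A quick check of the two-step command on a line with a stray space before the delimiter. This sketch writes the first expression as " *§" (zero or more spaces), because + is not a repetition operator in POSIX basic regular expressions. The function name and the sample line are made up for illustration.

```shell
# Hypothetical helper: strip spaces before the first §, then insert the comma.
comma_before_surname_trimmed() {
  sed 's/ *§/§/;s/ \([^ ]*§\)/,\1/'
}

printf '%s\n' 'Lucy Jones §Cardiff§Wales' | comma_before_surname_trimmed
```

Without the first expression, the space directly before § would be the one replaced, giving "Lucy Jones,§Cardiff§Wales" rather than "Lucy,Jones§Cardiff§Wales".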

As noted in the comments on the question, multipart surnames (separated by spaces) will be split unless those lines are rewritten manually.
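The limitation is easy to reproduce with an invented line whose surname has two parts. In this sketch the function name split_demo and the sample data are assumptions, not part of the original question.

```shell
# Hypothetical demo of the multipart-surname limitation.
split_demo() {
  # Same substitution as above: only the last word before § counts as the surname.
  sed 's/ \([^ ]*§\)/,\1/'
}

printf '%s\n' 'Ludwig van Beethoven§Bonn§Germany' | split_demo
```

The comma lands before "Beethoven" rather than before "van", since the pattern has no way to tell that "van" belongs to the surname.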
