
awk: output files contain ^M character in filename

I have a very long file which looks like this:

0a
190  0.121212
191  0.232323
...
0b
190  0.1212
191  0.4545
...
16c
190  0.34654
191  0.567565
...

I use awk to split the file into many smaller files using this command:

awk '/[0-9][a-c]/{close(x); x=$0;}{print > x;}' spectrum.tsv

This works, but the names of the output files all seem to contain a newline character at the end of the filename.

I have tried to remove the newline character with "sub" like so:

awk '/[0-9][a-c]/{close(x); x=$0;}{sub(/^M/,"",x)}{print > x;}' spectrum.tsv 

But that leads to the same result.

So my question is: how can I avoid the newline character in the output filenames? I am working on OS X 10.10, by the way. The input file is from a Windows machine.

Run dos2unix on your files before you let awk process them! It will remove DOS-style line endings, which is probably what is causing your headache.
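
If dos2unix is not available on the machine, the same cleanup can be approximated with tr, which is part of every POSIX system. The following is only a sketch: the sample data and the intermediate file name spectrum.unix.tsv are made up for illustration, and it assumes that every carriage return in the input is unwanted.

```shell
# Work in a scratch directory so the demo does not touch real data.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# Hypothetical sample input with Windows (CRLF) line endings.
printf '0a\r\n190  0.121212\r\n191  0.232323\r\n0b\r\n190  0.1212\r\n' > spectrum.tsv

# Delete every carriage return; the copy has plain LF line endings.
tr -d '\r' < spectrum.tsv > spectrum.unix.tsv

# The split command from the question, run on the cleaned copy.
awk '/[0-9][a-c]/{close(x); x=$0;}{print > x;}' spectrum.unix.tsv
ls
```

Writing to a separate file keeps the original Windows file intact, which may be useful if it has to be sent back unchanged.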

You can just set an appropriate record separator in awk to take care of \r in input files:

awk -v RS='\r?\n' '/[0-9][a-c]/{close(x); x=$0;}{print > x;}' spectrum.tsv

Here RS='\r?\n' sets RS to an optional \r (^M) followed by \n.
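
One caveat, as far as I know: treating a multi-character RS as a regular expression is an extension (gawk and mawk support it), and POSIX leaves the behavior unspecified, so the awk shipped with OS X may not honor it. A variant that should work in any POSIX awk is to strip the trailing carriage return from each record inside the script. This is a sketch with made-up sample data, not a tested fix for the asker's exact file. It may also explain why the sub() attempt in the question had no effect: if ^M there was typed as a caret followed by M, the regex means "an M at the start of the string" and never matches a carriage return.

```shell
# Work in a scratch directory so the demo does not touch real data.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# Hypothetical sample input with Windows (CRLF) line endings.
printf '0a\r\n190  0.121212\r\n0b\r\n190  0.1212\r\n' > spectrum.tsv

# Strip one trailing carriage return from each record first, so that
# neither the file name (x) nor the lines written to it contain \r.
awk '{ sub(/\r$/, "") }
     /[0-9][a-c]/ { close(x); x = $0 }
     { print > x }' spectrum.tsv
ls
```

Because sub() with two arguments edits $0 in place, the header test and the print both see the cleaned record.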

Open the file in the vi editor, switch to command mode, and enter ":%s/[CTRL+V][CTRL+M]//g" (press Ctrl+V and then Ctrl+M to insert a literal ^M).
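
Whichever fix is used, it can help to confirm that no carriage returns remain before running awk again. A minimal check, assuming a hypothetical file named sample.txt, counts the \r bytes with tr and wc:

```shell
# Work in a scratch directory so the demo does not touch real data.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# Hypothetical sample: two CRLF-terminated lines and one LF-terminated line.
printf 'a\r\nb\r\nc\n' > sample.txt

# Keep only the carriage returns, then count the remaining bytes.
cr_count=$(tr -cd '\r' < sample.txt | wc -c | tr -d ' ')
echo "carriage returns: $cr_count"
```

A count of 0 means the file has no DOS-style line endings left.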
