
How to remove illegal newlines within a text file using the Vim text editor?

I am trying to repair a data file so that I can use MySQL's LOAD DATA INFILE to import the data into a database.

The problem with the file is that one field holds lengthy text that contains newlines. Since a newline also marks the start of a new record, this makes it hard to import the records into MySQL.

How can I use Vim on Linux to search for illegal newlines and replace them with a space?

Illegal newline: a newline is illegal if it is found between a comma ( , ) and ( ,012d000 ).

This is sample data from the file:

VST-65654,a0Jd000000FM8cBEAT,Blah,2013-10-22 10:46:30.000000,Blah Blah,2014-01-20 20:27:42.000000,2013-10-18 14:00:00.000000,005d0000002biR4AAI,001d000001NEh0oAAD,In Person,Unscheduled,Grow Applications,High,this is the body

of this 
log test
where I need to

remove all extra new lines,012d0000000ppiXAAQ
VST-122549,a0Jd000000GVwtyEAD,Blah,2013-10-31 18:17:50.000000,Blah,2013-11-06 18:07:47.000000,2013-10-31 18:10:00.000000,005d0000002biR9AAI,001d000001NEaQgAAL,In Person,Scheduled,Grow Applications,Medium,One more long paragraph

where I need to remove all extra

new lines

,012d0000000ppiABCD

The fields are separated by a comma ( , ), and a new record should begin only when a newline \n is found. How can I do such a search and replace to fix this issue?

Or how can I replace all unescaped commas with double quotes? That is, if I find \, leave it alone, but if I find a bare comma, replace it with ","

Thanks

:g/^VST/,-/,012d000/j!

Use the global command, :g, to join (:j) the line starting with VST with all the lines through the next instance of 012d000. Note that :j! joins lines without inserting or removing any spaces; plain :j normally inserts a single space at each join.

For more help see:

:h :g
:h :j
:h [range]
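
For readers who would rather preprocess the file outside Vim, the same joining idea can be sketched in Python. This is only an illustration, not the Vim command itself: the function name `join_records` and both constants are made up for this example, and it assumes (as the question does) that a record starts on a line beginning with VST and ends on the first line containing ,012d000. Unlike `:j!`, it puts a single space where each newline was and drops blank lines.

```python
import re

# Assumption: a record starts on a line beginning with "VST" and ends on the
# first line (possibly the same one) that contains ",012d000".
RECORD_START = re.compile(r"^VST")
RECORD_END = ",012d000"


def join_records(lines):
    """Merge the physical lines of each record into a single line."""
    records = []
    parts = None  # pieces of the record currently being collected
    for line in lines:
        if parts is None:
            if not RECORD_START.match(line):
                records.append(line)  # not part of any record; keep as-is
                continue
            parts = []
        parts.append(line)
        if RECORD_END in line:
            # Join with single spaces and skip blank lines, so each removed
            # newline becomes one space (what the question asked for).
            records.append(" ".join(p.strip() for p in parts if p.strip()))
            parts = None
    if parts is not None:
        records.extend(parts)  # unterminated record: leave it untouched
    return records
```

To run it over a file, pass the lines with their newlines stripped, for example `join_records(open(path).read().split("\n"))`, and write the result back joined with "\n".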

My regex-fu isn't powerful enough to do that in a single command, but you could create a macro to achieve what you want. The following worked for the input you gave:

Go to start of file

gg

Start recording into register q

qq

Find the next ,012d

/,012d<CR>

Go up one line

k

Enter visual mode

v

Go to the previous comma

?,<CR>

Replace all newline characters in the selected lines

:s/\n//g<CR>

Go down one line

j

Finish recording

q

Repeat

@q

Result

VST-65654,a0Jd000000FM8cBEAT,Blah,2013-10-22 10:46:30.000000,Blah Blah,2014-01-20 20:27:42.000000,2013-10-18 14:00:00.000000,005d0000002biR4AAI,001d000001NEh0oAAD,In Person,Unscheduled,Grow Applications,High,this is the body of this log test where I need to remove all extra new lines,012d0000000ppiXAAQ
VST-122549,a0Jd000000GVwtyEAD,Blah,2013-10-31 18:17:50.000000,Blah,2013-11-06 18:07:47.000000,2013-10-31 18:10:00.000000,005d0000002biR9AAI,001d000001NEaQgAAL,In Person,Scheduled,Grow Applications,Medium,One more long paragraph where I need to remove all extra new lines ,012d0000000ppiABCD
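
Before loading the repaired file into MySQL, it may be worth checking that every line is now a whole record. The sketch below is a rough check, not part of the macro: `incomplete_records` and its parameters are hypothetical names, and it assumes every record starts with VST- and that its last field begins with 012d000, as in the sample data.

```python
def incomplete_records(lines, prefix="VST-", end_marker=",012d000"):
    """Return (line_number, text) pairs for lines that are not whole records."""
    problems = []
    for number, line in enumerate(lines, start=1):
        # Final field including its leading comma; if the line has no comma,
        # this is just the last character and will not match the marker.
        last_field = line[line.rfind(","):]
        if not (line.startswith(prefix) and last_field.startswith(end_marker)):
            problems.append((number, line))
    return problems
```

An empty list means every line looks like a complete record; anything else points at the lines that still need joining.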

I like @Peter Rincker's answer. As for the question you asked at the end, you can replace all the un-escaped commas with "," using

:%s/\\\@<!,/","/g

Here, \\ represents a literal backslash and \@<! is a negative look-behind modifier. (See :help /\@<! .)

The problem with this solution is that you have not correctly defined what an un-escaped comma is. For example, \\, is an escaped backslash followed by an un-escaped comma. I believe that /\\\@<!\%(\\\\\)*\zs,/ is the correct pattern, but I do not say it is pretty. It is a little better if you use the "very magic" version: /\v\\@<!%(\\\\)*\zs,/ .
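
The difference between the two definitions is easier to see on a small input. Here is a sketch using Python's `re` module instead of Vim's regex dialect; the names `NAIVE`, `STRICT` and both functions are invented for this example. The stricter rule treats a comma as un-escaped when it follows an even number of backslashes (including zero).

```python
import re

# Naive rule: a comma is un-escaped if the character before it is not a backslash.
NAIVE = re.compile(r"(?<!\\),")

# Stricter rule: a comma is un-escaped if it follows an even number
# (possibly zero) of backslashes.
STRICT = re.compile(r"(?<!\\)((?:\\\\)*),")


def quote_commas_naive(text):
    return NAIVE.sub('","', text)


def quote_commas_strict(text):
    # Keep the backslash pairs (group 1) and replace only the comma.
    return STRICT.sub(lambda m: m.group(1) + '","', text)
```

On the text `c\\,d` (an escaped backslash followed by a comma), the naive version leaves the comma alone, while the strict version replaces it.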
