How to fill the white-space with info while leaving the rest unchanged?
I am constructing scenery for a flight simulator and need to figure out how to edit many lines in a text file (3,579,189 of them).
I have TextCrawler Pro, Node, Python SVN and Notepad++ as tools.
Raw, pre-edit portion:
POLYGON_POINT -79.750000000217,42.017498354525,0
POLYGON_POINT -79.750000000217,42.016478251402,0
POLYGON_POINT -79.750598748133,42.017193264943,0
POLYGON_POINT -79.750000000217,42.017498354525,0

POLYGON_POINT -79.750000000217,42.085882815878,0
POLYGON_POINT -79.750000000217,42.082008734634,0
POLYGON_POINT -79.751045507507,42.082126409633,0
POLYGON_POINT -79.750281907508,42.083166574215,0
POLYGON_POINT -79.750781149174,42.084212672130,0
POLYGON_POINT -79.750000000217,42.085882815878,0

POLYGON_POINT -79.750000000217,42.088955814831,0
POLYGON_POINT -79.750456566883,42.087544672125,0
POLYGON_POINT -79.751642899173,42.088273325249,0
POLYGON_POINT -79.751461052298,42.088916154415,0
POLYGON_POINT -79.750000000217,42.088955814831,0
With Notepad++'s replace function, it is easy enough to add the POLYGON_POINT prefix to each line. Now I need some assistance in making the file look like this:
BEGIN_POLYGON
POLYGON_POINT -79.750000000217,42.017498354525,0
POLYGON_POINT -79.750000000217,42.016478251402,0
POLYGON_POINT -79.750598748133,42.017193264943,0
POLYGON_POINT -79.750000000217,42.017498354525,0
END_POLY
BEGIN_POLYGON
POLYGON_POINT -79.750000000217,42.085882815878,0
POLYGON_POINT -79.750000000217,42.082008734634,0
POLYGON_POINT -79.751045507507,42.082126409633,0
POLYGON_POINT -79.750281907508,42.083166574215,0
POLYGON_POINT -79.750781149174,42.084212672130,0
POLYGON_POINT -79.750000000217,42.085882815878,0
END_POLY
BEGIN_POLYGON
POLYGON_POINT -79.750000000217,42.088955814831,0
POLYGON_POINT -79.750456566883,42.087544672125,0
POLYGON_POINT -79.751642899173,42.088273325249,0
POLYGON_POINT -79.751461052298,42.088916154415,0
POLYGON_POINT -79.750000000217,42.088955814831,0
END_POLY
i.e. add BEGIN_POLYGON before each block and END_POLY after each.
How can I do this?
I would group the lines by whether they are blank or not, using itertools.groupby (keeping only the non-blank groups with the if k condition), and add the header/footer to each group. Then flatten the groups using itertools.chain:
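import itertools

with open("file.txt") as f, open("fileout.txt", "w") as fw:
    # Group consecutive lines by blank/non-blank and keep only the non-blank
    # groups (k is True). A generator is used so one block is held at a time.
    fw.writelines(itertools.chain.from_iterable(
        ["BEGIN_POLYGON\n"] + list(v) + ["END_POLY\n"]
        for k, v in itertools.groupby(f, key=lambda l: bool(l.strip()))
        if k))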
key = lambda l : bool(l.strip()) is the grouping key: it tests whether the line is empty apart from its line terminator.
This method doesn't need to read the file fully, so it's suited for very big files. It processes the file line by line, so it doesn't hog memory.
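The same grouping can also be written as an explicit loop, which may be easier to follow and adapt. The sketch below is only an illustration under stated assumptions: the function name wrap_blocks and its parameters are hypothetical (they are not part of the answer above), and blocks are assumed to be separated by one or more blank lines. It also copes with a last line that has no trailing newline.

```python
import io


def wrap_blocks(src, dst, header="BEGIN_POLYGON", footer="END_POLY"):
    """Copy src to dst, wrapping each run of non-blank lines in header/footer.

    src is any iterable of lines, dst any object with a write() method.
    Blank lines between blocks are dropped. Returns the number of blocks.
    """
    in_block = False
    blocks = 0
    for line in src:
        if line.strip():
            if not in_block:
                # First non-blank line after a gap: open a new block.
                dst.write(header + "\n")
                in_block = True
                blocks += 1
            # Normalise the line ending so the footer never shares a line.
            dst.write(line.rstrip("\r\n") + "\n")
        elif in_block:
            # First blank line after a block: close it.
            dst.write(footer + "\n")
            in_block = False
    if in_block:
        # The file ended without a trailing blank line: close the last block.
        dst.write(footer + "\n")
    return blocks


if __name__ == "__main__":
    # Tiny made-up sample, not real scenery data.
    sample = io.StringIO("a,1,0\nb,2,0\n\n\nc,3,0")
    out = io.StringIO()
    wrap_blocks(sample, out)
    print(out.getvalue(), end="")
```

For the real file, the two StringIO objects would be replaced by open("file.txt") and open("fileout.txt", "w"). Only one line is held in memory at a time.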
A quick solution using sed:
cat -s file.txt |\
    sed -e 's/^$/END_POLY\nBEGIN_POLYGON/' \
        -e '1i BEGIN_POLYGON' \
        -e '$a END_POLY'
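cat -s squeezes each run of consecutive blank lines into a single blank line. Note that \n in the replacement and the one-line forms of i and a are GNU sed extensions.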
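To see what the pipeline does step by step, here is a rough Python rendering of the same three operations. It is a sketch for illustration, not a replacement for sed: the function name sed_like is hypothetical, the input is assumed to be a non-empty list of lines without their newlines, and only completely empty lines count as blank, which is what ^$ matches.

```python
def sed_like(lines):
    """Apply the three steps of the pipeline to a list of lines (no newlines)."""
    # cat -s: collapse each run of empty lines into one empty line.
    squeezed = []
    for line in lines:
        if line == "" and squeezed and squeezed[-1] == "":
            continue
        squeezed.append(line)

    # '1i BEGIN_POLYGON': header before the first line.
    out = ["BEGIN_POLYGON"]
    for line in squeezed:
        if line == "":
            # 's/^$/END_POLY\nBEGIN_POLYGON/': each remaining empty line
            # closes one block and opens the next.
            out.extend(["END_POLY", "BEGIN_POLYGON"])
        else:
            out.append(line)
    # '$a END_POLY': footer after the last line.
    out.append("END_POLY")
    return out


if __name__ == "__main__":
    # Tiny made-up sample, not real scenery data.
    print("\n".join(sed_like(["p1", "p2", "", "", "p3"])))
```

One consequence visible in the sketch: an empty line at the very start or end of the input yields an empty BEGIN_POLYGON/END_POLY pair, so it may be worth trimming such lines first.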