Part 2 of a successful result regarding blank-space filling

Question

So, my first question was answered correctly. For reference, you can go here...

In short, I needed this...

POLYGON_POINT -79.750000000217,42.017498354525,0
POLYGON_POINT -79.750000000217,42.016478251402,0
POLYGON_POINT -79.750598748133,42.017193264943,0
POLYGON_POINT -79.750000000217,42.017498354525,0


POLYGON_POINT -79.750000000217,42.085882815878,0
POLYGON_POINT -79.750000000217,42.082008734634,0
POLYGON_POINT -79.751045507507,42.082126409633,0
POLYGON_POINT -79.750281907508,42.083166574215,0
POLYGON_POINT -79.750781149174,42.084212672130,0
POLYGON_POINT -79.750000000217,42.085882815878,0

to become this...

BEGIN_POLYGON
POLYGON_POINT -79.750000000217,42.017498354525,0
POLYGON_POINT -79.750000000217,42.016478251402,0
POLYGON_POINT -79.750598748133,42.017193264943,0
POLYGON_POINT -79.750000000217,42.017498354525,0
END_POLY
BEGIN_POLYGON
POLYGON_POINT -79.750000000217,42.085882815878,0
POLYGON_POINT -79.750000000217,42.082008734634,0
POLYGON_POINT -79.751045507507,42.082126409633,0
POLYGON_POINT -79.750281907508,42.083166574215,0
POLYGON_POINT -79.750781149174,42.084212672130,0
POLYGON_POINT -79.750000000217,42.085882815878,0
END_POLY

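This was done successfully with a Python script. Now I find that I need to remove a duplicated line, specifically the last point line in each block. That line closes the polygon, but the build batch process reports an error because it closes the polygon itself. Basically, I need the end result to be...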

BEGIN_POLYGON
POLYGON_POINT -79.750000000217,42.017498354525,0
POLYGON_POINT -79.750000000217,42.016478251402,0
POLYGON_POINT -79.750598748133,42.017193264943,0
END_POLY
BEGIN_POLYGON
POLYGON_POINT -79.750000000217,42.085882815878,0
POLYGON_POINT -79.750000000217,42.082008734634,0
POLYGON_POINT -79.751045507507,42.082126409633,0
POLYGON_POINT -79.750281907508,42.083166574215,0
POLYGON_POINT -79.750781149174,42.084212672130,0
END_POLY

There are 3,415,978 lines in total. Every other duplicate remover also takes out the blank lines and all of the wording. Hmm.

Answer 1

As pointed out in the comments, keep a reference to the previous line:

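with open('in.txt') as fin, open('out.txt', 'w') as fout:
    prev = None  # most recently read line, not yet written
    for i, line in enumerate(fin):
        # Write the buffered line unless the current line is END_POLY,
        # which drops the duplicated closing point just before END_POLY.
        if prev is not None and line.strip() != 'END_POLY':
            fout.write(prev)
        prev = line
        if not i % 10000:
            print('Processing line {}'.format(i))
    # Flush the last buffered line (the final END_POLY) unless the file was empty.
    if prev is not None:
        fout.write(prev)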
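
To see the previous-line technique at work on a small sample without touching any files, here is a minimal sketch. The helper name drop_line_before_end and the in-memory sample are assumptions made for illustration, not part of the original answer, and it assumes every block ends with a line that is exactly END_POLY.

```python
def drop_line_before_end(lines, end_marker="END_POLY"):
    """Yield every line except the one immediately preceding end_marker.

    Hypothetical helper for illustration: it keeps a one-line lookbehind
    buffer, the same idea as holding a reference to the previous line.
    """
    prev = None
    for line in lines:
        # Emit the buffered line unless the current line closes a block.
        if prev is not None and line.strip() != end_marker:
            yield prev
        prev = line
    # The final buffered line (normally the last END_POLY) is always emitted.
    if prev is not None:
        yield prev


# First polygon from the question, with its closing duplicate point.
sample = [
    "BEGIN_POLYGON\n",
    "POLYGON_POINT -79.750000000217,42.017498354525,0\n",
    "POLYGON_POINT -79.750000000217,42.016478251402,0\n",
    "POLYGON_POINT -79.750598748133,42.017193264943,0\n",
    "POLYGON_POINT -79.750000000217,42.017498354525,0\n",
    "END_POLY\n",
]

result = list(drop_line_before_end(sample))
print("".join(result), end="")
```

For the real file, the same generator can be fed an open input file and written out with fout.writelines(drop_line_before_end(fin)). Note that the line before END_POLY is dropped unconditionally, so an empty block (BEGIN_POLYGON directly followed by END_POLY) would lose its BEGIN_POLYGON line.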

Answer 2

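If you don't want duplicated data, you can convert each block to a set-like structure and back to a list. An OrderedDict is used here instead of a plain set so that the order of the points is kept (this is a slight modification of @Jean-François Fabre's code from the other question). Note that it reads the original blank-line-separated file and adds the block markers itself: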

import itertools, collections

with open("file.txt") as f, open("fileout.txt", "w") as fw:
    # groupby splits the input into runs of blank and non-blank lines; each non-blank run is one polygon
    for nonblank, block in itertools.groupby(f, key=lambda l: bool(l.strip())):
        if nonblank:
            unique = collections.OrderedDict.fromkeys(block)  # drops repeats, keeps first-seen order
            fw.writelines(["BEGIN_POLYGON\n", *unique, "END_POLY\n"])

As you can see, if you run:

print(list(collections.OrderedDict.fromkeys([1,1,1,1,1,1,2,2,2,2,5,3,3,3,3,3]).keys()))

the result is [1, 2, 5, 3], so the duplicates are removed and the order is preserved.
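
On Python 3.7 and later a plain dict also preserves insertion order, so dict.fromkeys gives the same result as OrderedDict.fromkeys. The sketch below applies it to a single polygon block; the helper name dedupe_block is an assumption for illustration only. Be aware that this removes every repeated line within a block, not only the closing point, which is safe only if no vertex legitimately appears twice in a polygon.

```python
def dedupe_block(lines):
    """Return the lines of one block with repeats removed, first-seen order kept."""
    # dict keys are unique and (Python 3.7+) iterate in insertion order.
    return list(dict.fromkeys(lines))


# First polygon from the question, without the BEGIN/END markers.
block = [
    "POLYGON_POINT -79.750000000217,42.017498354525,0\n",
    "POLYGON_POINT -79.750000000217,42.016478251402,0\n",
    "POLYGON_POINT -79.750598748133,42.017193264943,0\n",
    "POLYGON_POINT -79.750000000217,42.017498354525,0\n",  # repeats the first point
]

deduped = dedupe_block(block)
print(len(block), "->", len(deduped))  # 4 -> 3
```

The comparison is on the raw line strings, so a final line in the file that lacks a trailing newline will not match its earlier copy and would be kept.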

Answer 3

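Although it is not Python, this kind of edit is fairly simple with sed:

sed '$!N;/\nEND_POLY$/!P;D' file.txt

Basically, it uses N to keep a sliding window of two lines. If the second line is END_POLY, the first line is dropped; otherwise P prints the first line. D then discards the first line and the cycle restarts with the second one, so END_POLY is caught wherever it falls in the file.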
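
Whichever approach is used, with over three million lines it may be worth checking the output mechanically. The following is a rough sketch and not part of any of the answers: the function name find_closed_polygons is hypothetical, the sample coordinates are made up, and it assumes the BEGIN_POLYGON / END_POLY markers shown in the question. It reports the blocks whose last point still repeats their first point.

```python
def find_closed_polygons(lines):
    """Return 1-based indices of blocks whose last point equals their first point."""
    closed = []
    block_index = 0
    points = None  # None means we are currently outside a block
    for raw in lines:
        line = raw.strip()
        if line == "BEGIN_POLYGON":
            block_index += 1
            points = []
        elif line == "END_POLY":
            # A block with at least two points that starts and ends on the
            # same point is still explicitly closed.
            if points is not None and len(points) > 1 and points[0] == points[-1]:
                closed.append(block_index)
            points = None
        elif points is not None and line:
            points.append(line)
    return closed


# Made-up sample: the first block is already open, the second is still closed.
sample = """BEGIN_POLYGON
POLYGON_POINT 0,0,0
POLYGON_POINT 0,1,0
POLYGON_POINT 1,1,0
END_POLY
BEGIN_POLYGON
POLYGON_POINT 0,0,0
POLYGON_POINT 0,1,0
POLYGON_POINT 0,0,0
END_POLY
""".splitlines()

print(find_closed_polygons(sample))  # [2]
```

Passing an open file object instead of the sample list works the same way; an empty result means no block still ends on its starting point.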

3 answers

Answer 1 0 2017-10-27 13:35:03
Answer 2 0 2017-10-27 13:49:08
Answer 3 0 2017-10-27 14:45:10
