
Modify a table in a .txt file using Python

I have a .txt file containing a set of data organized as follows:

(id1) (name1) (x coordinate1) (y coordinate1) (value1)
(id2) (name2) (x coordinate2) (y coordinate2) (value2) 
(id3) (name3) (x coordinate3) (y coordinate3) (value3) 

..... 

Now I want to move all the names from column 2 to column 4. The code I wrote is this:

with open("C:\\path\\to\\input\\file.txt","r") as f:
    rows = list(f)
    table = [["."],["."],["."],["."],["."],["."]]*len(rows)
    for i in range(len(rows)):
        row = rows[i].split(" ")
        table[6*i] = row[0]+" "
        table[6*i+1] = row[2]+" "
        table[6*i+2] = row[3]+" "
        table[6*i+3] = row[1]+" "
        table[6*i+4] = row[4]
        table[6*i+5] = "\n"
    with open("C:\\path\\to\\output\\file.txt","w") as o:
        o.writelines(table)

It performs the task, but the output contains a blank line after each row. I have spent hours trying to get rid of them, but I cannot figure out how to get the correct output. The wrong output is this:

(id1) (x coordinate1) (y coordinate1) (name1) (value1)

(id2) (x coordinate2) (y coordinate2) (name2) (value2) 

(id3) (x coordinate3) (y coordinate3) (name3) (value3) 

..... 

You are adding a line break. Try removing this line:

table[6*i+5] = "\n"

Since the lines you read from the file already end with line breaks, that line break is automatically included in the last item of each split row.

Edit: Your source file might be a little wonky. You can also change that last line to:

table[6*i+5] = ""
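To see the retained line break directly, here is a minimal sketch that reads from an in-memory stand-in for the input file. The field values are made up for illustration and are not the asker's data.

```python
import io

# In-memory stand-in for the input file; the values are hypothetical.
f = io.StringIO("1 alice 10 20 0.5\n2 bob 30 40 0.7\n")

rows = list(f)                  # each element keeps its trailing "\n"
fields = rows[0].split(" ")     # splitting on " " leaves "\n" on the last field

print(repr(fields[-1]))               # the last field still ends with a line break
print(repr(fields[-1].rstrip("\n")))  # rstrip("\n") removes it if needed
```

Because the last field already ends the line, writing an additional "\n" after it produces the blank lines seen in the output.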

Welcome to StackOverflow!

When you read the data from the file, the line breaks are part of the data. So when you manipulate the data and write it to another file, those line breaks are written out as well. There is no need to add line breaks manually, since that only adds extra, unnecessary ones (which is the problem here).

So you must remove this line: table[6*i+5] = "\n". I hope this fixes your problem.

As already noted, you are adding a line break to each line, but the last field of each row (row[4]) already ends with a line break, resulting in two breaks.

However, there is another problem with your code. file.writelines expects an iterable of strings, usually with a line break at the end of each. But you create a table that is instead a long list of lists, each containing one string:

table = [["."],["."],["."],["."],["."],["."]]*len(rows)

You then replace these sub-lists, one by one, with strings:

table[6*i] = row[0]+" "   # etc.

If any are not replaced, writelines will raise a TypeError because there is a list where it expects a string. So you need to make some additional adjustments to make your existing code work:

with open("input.txt") as f:
    rows = list(f)
table = ["", "", "", "", ""]*len(rows)
for i in range(len(rows)):
    row = rows[i].split(" ")
    table[5*i] = row[0]+" "
    table[5*i+1] = row[2]+" "
    table[5*i+2] = row[3]+" "
    table[5*i+3] = row[1]+" "
    table[5*i+4] = row[4]
with open("output.txt","w") as o:
    o.writelines(table)
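The writelines failure described above can be reproduced in isolation. This is a small sketch using io.StringIO as a stand-in for the output file. The table contents are made up, and the first element is deliberately left as an unreplaced placeholder list.

```python
import io

out = io.StringIO()             # in-memory stand-in for the output file
table = [["."], "a ", "b\n"]    # hypothetical: first placeholder never replaced

try:
    out.writelines(table)       # fails on the first element, which is a list
    error = None
except TypeError as exc:
    error = exc

print(type(error).__name__)
```

Initialising the table with empty strings, as in the adjusted code, avoids this because every element is a string even if it is never overwritten.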

However, a better way of doing this would be to build the table one row at a time, like this:

with open("input.txt") as f:
    table = []
    for row in f:
        row = row.strip().split(" ")  # strip removes any line breaks / extra spaces
        table.append([row[0], row[2], row[3], row[1], row[4]])

with open("output.txt","w") as o:
    o.writelines(" ".join(row) + "\n" for row in table)
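As a variation on the row-at-a-time approach, the reordering can be pulled into a small helper so it can be checked on in-memory data. The name reorder_line and the sample values are hypothetical. This sketch also uses split() with no argument, which differs from split(" ") in that it treats any run of whitespace as one separator.

```python
def reorder_line(line):
    """Return the line with the second field moved to the fourth position."""
    # split() with no argument drops the trailing "\n" and collapses
    # repeated spaces, so no separate strip() call is needed.
    fields = line.split()
    return " ".join([fields[0], fields[2], fields[3], fields[1], fields[4]]) + "\n"


# Hypothetical sample rows; the last one has no trailing line break.
lines = ["1 alice 10 20 0.5\n", "2 bob 30 40 0.7"]
result = [reorder_line(line) for line in lines]
print("".join(result), end="")
```

Each output line ends with exactly one line break, whether or not the input line had one.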

Better still, use the csv module, which is designed for this and will deal with the newlines automatically:

import csv
with open("input.txt") as in_file, open("output.txt", "w", newline="") as out_file:
    writer = csv.writer(out_file, delimiter=" ")
    for row in csv.reader(in_file, delimiter=" "):
        writer.writerow([row[0], row[2], row[3], row[1], row[4]])

or pandas:

import pandas as pd
pd.read_csv("input.txt", sep=" ", header=None)[[0, 2, 3, 1, 4]] \
    .to_csv("output.txt", sep=" ", header=False, index=False)
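One detail of the csv variant is worth knowing: csv.writer ends each row with "\r\n" by default, so a file opened with newline="" receives Windows-style line endings on every platform. If plain "\n" endings are wanted, the lineterminator parameter can be set explicitly. The sketch below uses in-memory buffers and made-up sample values in place of the real files.

```python
import csv
import io

# In-memory stand-ins for input.txt and output.txt; the values are hypothetical.
src = io.StringIO("1 alice 10 20 0.5\n2 bob 30 40 0.7\n")
dst = io.StringIO()

# lineterminator="\n" overrides the default "\r\n" row ending.
writer = csv.writer(dst, delimiter=" ", lineterminator="\n")
for row in csv.reader(src, delimiter=" "):
    writer.writerow([row[0], row[2], row[3], row[1], row[4]])

output = dst.getvalue()
print(output, end="")
```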

Pity you are required to use Python. You could do it on the command line (assuming your data is in a file data.txt):

sed -e 's/) (/);(/g' data.txt | awk -F ";" '{print $1 ";" $3 ";" $4 ";" $2 ";" $5}' | sed -e 's/;/ /g'
