How to split a text file and modify it in Python?
I currently have a text file that reads like this:
101, Liberia, Monrovia, 111000, 3200000, Africa, English, Liberia Dollar;
102, Uganda, Kampala, 236000, 34000000, Africa, English and Swahili, Ugandan Shilling;
103, Madagascar, Antananarivo, 587000, 21000000, Africa, Magalasy and Frances, Malagasy Ariary;
I'm currently printing the file using this code:
with open("base.txt", 'r') as f:
    for line in f:
        words = line.split(';')
        for word in words:
            print(word)
What I would like to know is: how can I modify a line by using its id number (101, for example) while keeping the format it has, and how can I add or remove lines based on their id number?
My understanding is that you're asking how to modify a word in a line and then insert the modified line back into the file.
def change_value(new_value, line_number, column):
    with open("base.txt", 'r+') as f:  # r+ means we can read and write to the file
        lines = f.read().split('\n')  # lines is now a list of all the lines in the file
        words = lines[line_number].split(',')
        words[column] = new_value
        lines[line_number] = ','.join(words)  # rebuilds the line with each word separated by a ','
        f.seek(0)
        f.write('\n'.join(lines))  # writes our new lines back into the file
        f.truncate()  # drops leftover bytes if the new content is shorter than the old
In order to use this function to set line 3, word 2 to Not_Madagascar, call it like this:
change_value("Not_Madagascar", 2, 1)
You will always have to subtract 1 from the line/word number because the first line/word is 0.
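Since the question asks about ids rather than line positions, here is a minimal sketch of the same idea keyed on the id. It assumes every record looks like the sample (fields separated by ", " and a trailing ";"); the name change_field_by_id and its path parameter are made up for this illustration, and the demo runs on a temporary copy of the sample data rather than on base.txt.

```python
import os
import tempfile


def change_field_by_id(path, record_id, column, new_value):
    """Replace one field in the record whose first field equals record_id.

    Returns True if a record was changed, False if the id was not found.
    Assumes fields are separated by ", " and each record ends with ";".
    """
    with open(path, "r") as f:
        lines = f.read().splitlines()

    changed = False
    for i, line in enumerate(lines):
        fields = line.rstrip().rstrip(";").split(", ")
        if fields[0].strip() == str(record_id):
            fields[column] = new_value
            lines[i] = ", ".join(fields) + ";"  # restore the original format
            changed = True
            break

    if changed:
        # Opening with "w" truncates first, so a shorter line leaves no leftovers.
        with open(path, "w") as f:
            f.write("\n".join(lines) + "\n")
    return changed


# Demo on a temporary copy of the sample data from the question.
sample = (
    "101, Liberia, Monrovia, 111000, 3200000, Africa, English, Liberia Dollar;\n"
    "102, Uganda, Kampala, 236000, 34000000, Africa, English and Swahili, Ugandan Shilling;\n"
    "103, Madagascar, Antananarivo, 587000, 21000000, Africa, Magalasy and Frances, Malagasy Ariary;\n"
)
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "base.txt")
    with open(demo_path, "w") as f:
        f.write(sample)
    found = change_field_by_id(demo_path, 103, 1, "Not_Madagascar")
    with open(demo_path) as f:
        updated = f.read()

print(updated)
```

Here column 0 is the id itself, so column 1 is the country name. Rewriting the whole file is the simplest option for a file this small; it would not be a good fit for very large files.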
def add_line(words, line_number):
    with open("base.txt", 'r+') as f:
        lines = f.readlines()
        lines.insert(line_number, ','.join(words) + '\n')
        f.seek(0)
        f.writelines(lines)
In order to use this function to add a line at the end containing the words this, line, is, at, the, end, call it like this:
add_line(['this', 'line', 'is', 'at', 'the', 'end'], 4)  # 4 is the line number
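Adding and removing by id instead of by position could look roughly like the sketch below. It works on a plain list of lines (for example, the result of f.read().splitlines()), so writing the list back to the file is left out. The helper names remove_by_id and add_record are invented for this example, and the field values of record 104 are placeholders, not real data.

```python
def record_id_of(line):
    """Return the id (text before the first comma) of a record line."""
    return line.split(",")[0].strip()


def remove_by_id(lines, record_id):
    """Return a new list without the record whose id matches record_id."""
    return [ln for ln in lines if record_id_of(ln) != str(record_id)]


def add_record(lines, fields):
    """Return a new list with the record appended in the 'a, b, c;' format.

    Raises ValueError if the id (fields[0]) is already present.
    """
    new_id = str(fields[0])
    if any(record_id_of(ln) == new_id for ln in lines):
        raise ValueError("id already present: " + new_id)
    return lines + [", ".join(str(x) for x in fields) + ";"]


lines = [
    "101, Liberia, Monrovia, 111000, 3200000, Africa, English, Liberia Dollar;",
    "102, Uganda, Kampala, 236000, 34000000, Africa, English and Swahili, Ugandan Shilling;",
    "103, Madagascar, Antananarivo, 587000, 21000000, Africa, Magalasy and Frances, Malagasy Ariary;",
]

lines = remove_by_id(lines, 101)
# Placeholder values, only here to show the resulting format.
lines = add_record(lines, [104, "Country", "Capital", 0, 0, "Continent", "Language", "Currency"])

for ln in lines:
    print(ln)
```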
For more information on opening files see here.
For more information on reading from and modifying files see here.
pandas is a strong tool for solving your requirements. It provides the tools for easily working with CSV files. You can manage your data in DataFrames.
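import pandas as pd

# read the CSV file into a DataFrame, using the first column (the id) as the index;
# skipinitialspace drops the blank that follows each comma
df = pd.read_csv('file.csv', sep=',', header=None, index_col=0, skipinitialspace=True)
print(df)

# eliminating the `;` character at the end of the last column
df[7] = df[7].map(lambda x: str(x).rstrip(';'))
print(df)

# eliminating the #101 row of data
df.drop(101, axis=0, inplace=True)
print(df)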
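To change a value by id and get back to the original "a, b, c;" layout, something along these lines may work. This is only a sketch: it reads the sample from a string via io.StringIO instead of a file, and it rebuilds the lines by hand, because to_csv only accepts a single-character separator and so cannot emit ", " directly. The names raw and out_lines are just for this example.

```python
import io

import pandas as pd

raw = (
    "101, Liberia, Monrovia, 111000, 3200000, Africa, English, Liberia Dollar;\n"
    "102, Uganda, Kampala, 236000, 34000000, Africa, English and Swahili, Ugandan Shilling;\n"
    "103, Madagascar, Antananarivo, 587000, 21000000, Africa, Magalasy and Frances, Malagasy Ariary;\n"
)

# The first column (the id) becomes the index; the rest are labelled 1..7.
df = pd.read_csv(io.StringIO(raw), header=None, index_col=0, skipinitialspace=True)
df[7] = df[7].str.rstrip(";")  # strip the record terminator

df.loc[103, 1] = "Not_Madagascar"  # modify by id: row 103, column 1
df = df.drop(101)                  # remove by id

# Rebuild each record in the original layout: id first, ", " between fields, ";" at the end.
out_lines = [", ".join(str(v) for v in row) + ";" for row in df.itertuples()]

for line in out_lines:
    print(line)
```

Writing the result back would then be a matter of joining out_lines with newlines and writing them to the file.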
Reading this file into an OrderedDict would probably be helpful if you are trying to preserve the original file ordering as well as have the ability to reference lines in the file for modification/addition/deletion. There are quite a few assumptions about the full format of the file in the following example, but it will work for your test case:
from collections import OrderedDict

content = OrderedDict()

with open('base.txt', 'r') as f:
    for line in f:
        if line.strip():
            print(line)
            words = line.split(',')  # Assuming that you meant ',' vs ';' to split the line into words
            content[int(words[0])] = ','.join(words[1:])

print(content[101])  # Prints " Liberia, Monrovia, etc"...
content.pop(101, None)  # Remove line w/ 101 as the "id"
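To get from the dictionary back to lines in the original format, one possible sketch is shown below. It parses from a list of strings rather than from base.txt so that it runs on its own, and the helper names parse_records and format_records are invented for this example. It relies on the id being everything before the first comma.

```python
from collections import OrderedDict


def parse_records(lines):
    """Map each id to the rest of its line, keeping file order."""
    content = OrderedDict()
    for line in lines:
        if line.strip():
            record_id, _, rest = line.strip().partition(",")
            content[int(record_id)] = rest  # rest keeps its leading space and ';'
    return content


def format_records(content):
    """Rebuild the lines exactly as they were: id, comma, untouched remainder."""
    return ["{},{}".format(record_id, rest) for record_id, rest in content.items()]


sample_lines = [
    "101, Liberia, Monrovia, 111000, 3200000, Africa, English, Liberia Dollar;",
    "102, Uganda, Kampala, 236000, 34000000, Africa, English and Swahili, Ugandan Shilling;",
    "103, Madagascar, Antananarivo, 587000, 21000000, Africa, Magalasy and Frances, Malagasy Ariary;",
]

content = parse_records(sample_lines)
content.pop(101, None)  # remove by id
content[103] = content[103].replace("Madagascar", "Not_Madagascar", 1)  # modify by id

rebuilt = format_records(content)
for line in rebuilt:
    print(line)
```

Writing "\n".join(rebuilt) back to the file (opened with 'w') would then persist the changes.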