How to split a text file and modify it in Python?

I currently have a text file that reads like this:

101, Liberia, Monrovia, 111000, 3200000, Africa, English, Liberia Dollar;
102, Uganda, Kampala, 236000, 34000000, Africa, English and Swahili, Ugandan Shilling;
103, Madagascar, Antananarivo, 587000, 21000000, Africa, Magalasy and Frances, Malagasy Ariary;

I'm currently printing the file using this code:

with open("base.txt", 'r') as f:
    for line in f:
        words = line.split(';')
        for word in words:
            print(word)

What I would like to know is: how can I modify a line by its id number (101, for example) while keeping the existing format, and how can I add or remove lines based on their id number?

My understanding is that you're asking how to modify a word in a line and then write the modified line back into the file.

Change a word in the file

def change_value(new_value, line_number, column):
    with open("base.txt", 'r+') as f:  # r+ means we can read and write to the file
        lines = f.read().split('\n')  # lines is now a list of all the lines in the file
        words = lines[line_number].split(',')
        words[column] = new_value
        lines[line_number] = ','.join(words)  # rebuilds the line with each word separated by a ','
        f.seek(0)  # go back to the start of the file before writing
        f.write('\n'.join(lines))  # writes our new lines back into the file
        f.truncate()  # drops leftover bytes if the new content is shorter than the old

To use this function to set line 3, word 2 to Not_Madagascar, call it like this:

change_value("Not_Madagascar", 2, 1)

Line and word numbers are zero-based (the first line/word is 0), so you always have to subtract 1 from the position you would count by hand: line 3, word 2 becomes 2, 1.
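
The question asks about addressing a line by its id rather than by its position. A minimal sketch of that idea, assuming every record starts with its id, fields are separated by commas, and each record ends with ';' as in the sample data. The name change_value_by_id and its path parameter are hypothetical and not part of the answer above:

```python
def change_value_by_id(path, record_id, column, new_value):
    """Replace one field of the record whose first field equals record_id.

    Returns True if a record was changed, False if the id was not found.
    Assumes the "id, field, field, ...;" layout from the question.
    """
    with open(path, "r") as f:
        lines = f.read().splitlines()

    changed = False
    for i, line in enumerate(lines):
        body = line.strip().rstrip(";")
        if not body:
            continue  # skip blank lines
        fields = [field.strip() for field in body.split(",")]
        if fields[0] == str(record_id):
            fields[column] = str(new_value)
            # Rebuild the line in the original ", "-separated, ";"-terminated form
            lines[i] = ", ".join(fields) + ";"
            changed = True
            break

    if changed:
        # Rewrite the whole file only when something actually changed
        with open(path, "w") as f:
            f.write("\n".join(lines) + "\n")
    return changed
```

For example, change_value_by_id("base.txt", 103, 1, "Not_Madagascar") would rewrite only the record with id 103 and leave the other lines as they were.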

Add a new line to the file

def add_line(words, line_number):
    with open("base.txt",'r+') as f:
        lines = f.readlines()
        lines.insert(line_number, ','.join(words) + '\n')
        f.seek(0)
        f.writelines(lines)

To use this function to add a line at the end containing the words "this line is at the end", call it like this:

add_line(['this','line','is','at','the','end'], 3) # 3 is the zero-based index, i.e. the fourth line
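
Adding and removing lines can also be keyed on the id instead of the position. A rough sketch under the same assumptions about the file layout ("id, field, ...;" on each line). The helper names _record_id, remove_by_id and add_record are made up for illustration, and this version drops blank lines when it rewrites the file:

```python
def _record_id(line):
    """Return the id (the text before the first comma) of a record line."""
    return line.split(",", 1)[0].strip()


def remove_by_id(path, record_id):
    """Drop the record with the given id; return True if one was removed."""
    with open(path, "r") as f:
        lines = [line for line in f.read().splitlines() if line.strip()]
    kept = [line for line in lines if _record_id(line) != str(record_id)]
    with open(path, "w") as f:
        f.write("".join(line + "\n" for line in kept))
    return len(kept) != len(lines)


def add_record(path, fields):
    """Append a record unless its id (fields[0]) is already present."""
    with open(path, "r") as f:
        lines = [line for line in f.read().splitlines() if line.strip()]
    new_id = str(fields[0]).strip()
    if any(_record_id(line) == new_id for line in lines):
        raise ValueError("id %s already exists" % new_id)
    # Same ", "-separated, ";"-terminated format as the existing records
    lines.append(", ".join(str(field).strip() for field in fields) + ";")
    with open(path, "w") as f:
        f.write("".join(line + "\n" for line in lines))
```

Opening the file a second time in 'w' mode, rather than using 'r+', means the file is truncated automatically, so a shorter result never leaves stale text at the end.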

For more information on opening, reading from, and modifying files, see the "Reading and Writing Files" section of the Python tutorial.

pandas is a strong tool for this kind of task. It provides tools for working with CSV files easily, and you can manage your data in DataFrames.

import pandas as pd

# read the CSV file into DataFrame
df = pd.read_csv('file.csv', sep=',', header=None, index_col=0)
print(df)

# strip the trailing `;` character from the last column
df[7] = df[7].map(lambda x: str(x).rstrip(';'))
print(df)

# drop the row whose id (the index) is 101
df.drop(101, axis=0, inplace=True)
print(df)

Reading this file into an OrderedDict would probably be helpful if you are trying to preserve the original file ordering and also want to reference lines in the file by id for modification, addition, or deletion. The following example makes quite a few assumptions about the full format of the file, but it will work for your test case:

from collections import OrderedDict

content = OrderedDict()

with open('base.txt', 'r') as f:
    for line in f:
        if line.strip():
            print(line)
            words = line.split(',')  # Assuming that you meant ',' vs ';' to split the line into words
            content[int(words[0])] = ','.join(words[1:])

print(content[101])  # Prints " Liberia, Monrovia, etc"...

content.pop(101, None)  # Remove line w/ 101 as the "id"
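
The example above only changes the dictionary in memory. A possible way to round-trip it through the file, assuming each non-blank line contains at least one comma and starts with an integer id. The function names load_records and save_records are illustrative, not from the answer:

```python
from collections import OrderedDict


def load_records(path):
    """Map each id to the rest of its line, preserving file order."""
    content = OrderedDict()
    with open(path, "r") as f:
        for line in f:
            if line.strip():
                # Split only on the first comma so the rest stays untouched
                record_id, rest = line.rstrip("\n").split(",", 1)
                content[int(record_id)] = rest
    return content


def save_records(path, content):
    """Write the records back as 'id,rest' lines, one per record."""
    with open(path, "w") as f:
        for record_id, rest in content.items():
            f.write("%d,%s\n" % (record_id, rest))
```

Because the text after the first comma is stored verbatim, including the leading space and the trailing ';', unmodified records are written back exactly as they were read. A new record added with content[104] = " Kenya, Nairobi, ...;" ends up at the end of the file, since an OrderedDict keeps insertion order.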
