Converting a text file into a CSV file using Python

I need to convert my text files to CSV, and I am using Python to do it. My text file looks like this:

Employee Name : XXXXX
Employee Number : 12345
Age : 45
Hobbies: Tennis
Employee Name: xxx
Employee Number :123456
Hobbies : Football

I want the CSV file to have the columns Employee Name, Employee Number, Age and Hobbies. When a value is not present for a record, NA should appear in its place. Is there a simple way to do this? Thanks in advance.

Maybe this helps you get started? It only produces the static output for the first employee's data; you would still need to wrap it in some sort of iteration over the file. There is very likely a more elegant solution, but this is how you would do it without a single import statement ;)

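with open('test.txt', 'r') as f:
    content = f.readlines()

# Take the text after the first colon on each of the first four lines
# (the first employee's record) and join the values with semicolons.
output_line = ";".join(line.split(':', 1)[1].strip() for line in content[0:4])
print(output_line)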
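
The iteration mentioned above could look roughly like the following. This is only a sketch: it assumes every record starts with an Employee Name line, and the `to_rows` helper, the `COLUMNS` list and the inlined `SAMPLE` text are illustrative names that are not part of the original answer. It still needs no imports.

```python
# Sample input inlined so the sketch runs on its own; in practice, read test.txt instead.
SAMPLE = """Employee Name : XXXXX
Employee Number : 12345
Age : 45
Hobbies: Tennis
Employee Name: xxx
Employee Number :123456
Hobbies : Football"""

COLUMNS = ["Employee Name", "Employee Number", "Age", "Hobbies"]


def to_rows(lines):
    """Group 'key : value' lines into one row per employee, filling gaps with NA."""
    rows = []
    current = None
    for line in lines:
        if ":" not in line:
            continue  # skip blank or malformed lines
        key, value = (part.strip() for part in line.split(":", 1))
        if key == "Employee Name":
            # Assumption: an "Employee Name" line marks the start of a new record.
            current = dict.fromkeys(COLUMNS, "NA")
            rows.append(current)
        if current is not None and key in current:
            current[key] = value
    return [";".join(row[col] for col in COLUMNS) for row in rows]


for output_line in to_rows(SAMPLE.splitlines()):
    print(output_line)
```

With the sample data from the question, the second employee has no Age line, so that position is filled with NA.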

You can do something like this:

records = """Employee Name : XXXXX
Employee Number : 12345
Age : 45
Hobbies: Tennis
Employee Name: xxx
Employee Number :123456
Hobbies : Football"""

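for record in records.split('Employee Name'):
    if not record.strip():
        continue  # splitting leaves an empty chunk before the first record
    name = 'NA'
    number = 'NA'
    age = 'NA'
    hobbies = 'NA'
    for field in record.split('\n'):
        if ':' not in field:
            continue  # skip the blank line left at the end of a record
        field_name, field_value = (part.strip() for part in field.split(':', 1))
        if field_name == "":  # This is the employee name, since we split on it
            name = field_value
        elif field_name == "Employee Number":
            number = field_value
        elif field_name == "Age":
            age = field_value
        elif field_name == "Hobbies":
            hobbies = field_value
    print(','.join([name, number, age, hobbies]))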

Of course, this method assumes that every record contains (at least) an Employee Name field.
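
To get an actual CSV file out of this, the parsed values could be collected into dictionaries and handed to the standard library's `csv.DictWriter`, whose `restval` argument fills in missing fields. A minimal sketch under the same assumption (each record begins with Employee Name); the `parse_records` and `write_csv` helpers are made-up names for illustration:

```python
import csv
import io

COLUMNS = ["Employee Name", "Employee Number", "Age", "Hobbies"]


def parse_records(text):
    """Return one dict per employee; a new record starts at each Employee Name line."""
    records = []
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # no colon: blank or malformed line
        key, value = key.strip(), value.strip()
        if key == "Employee Name":
            records.append({})
        if records and key in COLUMNS:
            records[-1][key] = value
    return records


def write_csv(records, out):
    """Write records to an open text stream; restval fills missing fields with NA."""
    writer = csv.DictWriter(out, fieldnames=COLUMNS, restval="NA")
    writer.writeheader()
    writer.writerows(records)


text = (
    "Employee Name : XXXXX\nEmployee Number : 12345\nAge : 45\nHobbies: Tennis\n"
    "Employee Name: xxx\nEmployee Number :123456\nHobbies : Football"
)
buffer = io.StringIO()  # in-memory stand-in for a real file
write_csv(parse_records(text), buffer)
print(buffer.getvalue())
```

To write a real file instead of the in-memory buffer, pass a file opened with `open("employees.csv", "w", newline="")` to `write_csv`; the file name here is just an example.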

I followed a few very simple steps for this. They may not be optimal, but they solve the problem. One important case I can see here is that the same key ("Employee Name" etc.) can appear multiple times in a single file. Steps:

  1. Read the text file into a list of lines.
  2. Convert the list to a dict (this logic could be improved, or more complex lambdas added here).
  3. Simply use pandas to convert the dict to CSV.

Below is the code:

import pandas

txt_file = r"test.txt"
with open(txt_file, "r") as txt:
    txt_lines = txt.read().splitlines()

txt_dict = {}

for txt_line in txt_lines:
    if ":" not in txt_line:
        continue  # skip blank lines
    k, v = txt_line.split(":", 1)
    # Collect every value seen for a key into one list per key
    txt_dict.setdefault(k.strip(), []).append(v.strip())

print(pandas.DataFrame.from_dict(txt_dict, orient="index"))

Output:

                      0         1
Employee Number   12345    123456
Age                  45      None
Employee Name     XXXXX       xxx
Hobbies          Tennis  Football
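
For the CSV layout asked for in the question (one row per employee, NA for gaps), it may be safer to build one dict per employee rather than one list per key. With per-key lists, a field missing from an earlier record would shift later values into the wrong column. The following is a sketch only, assuming each record starts with an Employee Name line; the `records` and `csv_text` names are illustrative:

```python
import pandas as pd

COLUMNS = ["Employee Name", "Employee Number", "Age", "Hobbies"]

# Sample lines inlined so the sketch runs on its own; in practice, read them from test.txt.
lines = """Employee Name : XXXXX
Employee Number : 12345
Age : 45
Hobbies: Tennis
Employee Name: xxx
Employee Number :123456
Hobbies : Football""".splitlines()

records = []
for line in lines:
    key, sep, value = line.partition(":")
    key = key.strip()
    if not sep:
        continue  # skip lines without a colon
    if key == "Employee Name":
        records.append({})  # assumption: this key opens a new record
    if records and key in COLUMNS:
        records[-1][key] = value.strip()

# Missing keys become NaN in the frame; replace them with the literal string NA.
df = pd.DataFrame(records, columns=COLUMNS).fillna("NA")
csv_text = df.to_csv(index=False)  # pass a path instead to write a file
print(csv_text)
```

Calling `df.to_csv("employees.csv", index=False)` would write the same content to disk; the file name is only an example.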

I hope this helps.
