
How to use Python to read a text file into a CSV by picking certain parts

I have a text file which looks like this:

Current job title:
meter engineer
Current salary:
£30,000
Experience:
2 years
Desired location:
Not supplied
Desired job title:
smart meter engineer
Desired salary:
£30,000
Job Type:
Permanent | Contract | Temp

Current job title:
dual fuel smart meter engineer
Current salary:
£30,000
Experience:
4 years
Desired location:
Not supplied
Desired job title:
Not supplied
Desired salary:
£34,999
Job Type:
Permanent | Contract | Temp

Each value sits on its own line under its heading, and the sets of data are separated by a blank line. I want to use Python to extract the data under the headings. For example, "meter engineer" under "Current job title:" would go in the Current job title column. Then grab the next set and put it on the next row.

How do I achieve this using Python?

I am new to Python. All I can get it to do is read the file. Picking out the data using if statements doesn't work.

f = open("test.txt", "r")
lines = f.readlines()
for line in lines:
    print(line)
f.close()

import csv

# 'rU' is deprecated (and removed in Python 3.11), so plain 'r' is used here
with open('test.csv', 'r') as infile:
    reader = csv.DictReader(infile)
    data = {}
    for row in reader:
        for header, value in row.items():
            try:
                data[header].append(value)
            except KeyError:
                data[header] = [value]

You'll first need to restructure your data before converting it to CSV format.

Try this:

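import csv
from collections import OrderedDict

with open('data.txt', 'r') as data, open('output.csv', 'w') as file:
  # records are separated by a blank line; strip() avoids an empty trailing record
  rows = data.read().strip().split('\n\n')
  output = [
    OrderedDict(
      # even lines are the headings, odd lines are the values beneath them
      (k.rstrip(':'), v) for k, v in zip(row.split('\n')[::2], row.split('\n')[1::2])
    )
    for row in rows
  ]
  # the first record's headings become the CSV column names
  writer = csv.DictWriter(file, fieldnames=list(output[0].keys()), lineterminator='\n')
  writer.writeheader()
  writer.writerows(output)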


Output:

Current job title,Current salary,Experience,Desired location,Desired job title,Desired salary,Job Type
meter engineer,"£30,000",2 years,Not supplied,smart meter engineer,"£30,000",Permanent | Contract | Temp
dual fuel smart meter engineer,"£30,000",4 years,Not supplied,Not supplied,"£34,999",Permanent | Contract | Temp

Note that you need OrderedDict here because dictionary objects are not ordered on Python 2.7, so the column order of your CSV file would otherwise be arbitrary. The pseudo OrderedDict comprehension was inspired by this answer: Is there an OrderedDict comprehension?

As a side note, Python 2.7 is being sunset. You should really consider moving your project to Python 3.x.
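If you do move to Python 3, the same idea might look roughly like the sketch below. It assumes Python 3.7+, where plain dicts keep insertion order, so OrderedDict is no longer needed. The helper names parse_records and write_csv and the shortened two-heading sample are made up for illustration, and it still assumes that headings and values strictly alternate and that records are separated by exactly one empty line.

```python
import csv
import io


def parse_records(text):
    """Turn alternating heading/value lines into one dict per blank-line-separated block."""
    records = []
    for block in text.strip().split("\n\n"):
        lines = [line.strip() for line in block.splitlines() if line.strip()]
        # Even positions are headings, odd positions are the values beneath them.
        records.append({k.rstrip(":"): v for k, v in zip(lines[::2], lines[1::2])})
    return records


def write_csv(records, out):
    """Write the records to a file-like object, using the first record's keys as columns."""
    writer = csv.DictWriter(out, fieldnames=list(records[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)


# Shortened sample in the same layout as the question's file (illustrative only).
sample = (
    "Current job title:\nmeter engineer\nCurrent salary:\n£30,000\n"
    "\n"
    "Current job title:\ndual fuel smart meter engineer\nCurrent salary:\n£30,000\n"
)

buffer = io.StringIO()
write_csv(parse_records(sample), buffer)
print(buffer.getvalue())
```

With real files you would pass an open file instead of the StringIO buffer, for example open('output.csv', 'w', newline='', encoding='utf-8'), so the £ sign and the line endings are handled consistently.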

This code is basic, but it might just do the trick. It reads each value from the line where it is expected to be, and it knows which line that is because it assumes the file is laid out exactly like this example.

with open("test.txt", "r") as f:
    lines = f.read().split("\n")   # one list entry per line of the file

# counts how many times 'Current job title:' is found, i.e. how many records there are
repeat = lines.count("Current job title:")

for record in range(repeat):
    # each record takes 15 lines: 7 headings, 7 values and one blank line
    start = record * 15

    print("Job Title:")            # the value is on the line right after its heading
    JobTitle = lines[start + 1]
    print(JobTitle)                # prints the job title

    print("Current salary:")       # same as above, two lines further down
    Pay = lines[start + 3]
    print(Pay)
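If the file does not always match the example exactly, one alternative to fixed line positions is to let the headings drive the parsing. The sketch below is only an illustration: iter_records is a made-up name, and it assumes that every heading ends with a colon, that its value is the next non-empty line, and that a blank line ends a record.

```python
def iter_records(lines):
    """Yield one dict per record; a blank line marks the end of a record."""
    record, heading = {}, None
    for raw in lines:
        line = raw.strip()
        if not line:
            # Blank line: emit the record collected so far, if any.
            if record:
                yield record
            record, heading = {}, None
        elif heading is None and line.endswith(":"):
            # A heading such as "Current salary:"; remember it without the colon.
            heading = line[:-1]
        elif heading is not None:
            # The line after a heading is its value.
            record[heading] = line
            heading = None
        # Any other line (a value with no heading before it) is ignored.
    if record:
        yield record


# Shortened sample in the same layout as the question's file (illustrative only).
sample_lines = [
    "Current job title:", "meter engineer", "Experience:", "2 years",
    "",
    "Current job title:", "dual fuel smart meter engineer", "Experience:", "4 years",
]

for rec in iter_records(sample_lines):
    print(rec["Current job title"], "-", rec["Experience"])
```

Because an open file can be iterated line by line, the same function should also accept open("test.txt") directly in place of the sample list.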

