
How can I create a new csv after finding the header row?

I am reading a CSV file whose first 7-8 lines are a description of the file rather than data. I locate the header row with the following code:

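    import csv
    import glob
    import os

    # '*csv' matches any name ending in "csv"; use '*.csv' to require the extension
    list_of_files = glob.glob('C:/payment_reports/*csv')
    latest_file = max(list_of_files, key=os.path.getctime)
    print(latest_file)
    line_count = None
    with open(latest_file, newline='') as infile:
        for row in csv.reader(infile):
            # blank lines come back as empty lists, so check before indexing
            if row and row[0] == 'date/time':
                print(row)
                break
        else:
            print("{} not found".format('date/time'))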

I am getting to the correct line, since the row that prints is:

['date/time', 'settlement id', 'type', 'order id', 'sku', 'description', 'quantity', 'marketplace', 'fulfillment', 'order city', 'order state', 'order postal', 'product sales', 'shipping credits', 'gift wrap credits', 'promotional rebates', 'sales tax collected', 'Marketplace Facilitator Tax', 'selling fees', 'fba fees', 'other transaction fees', 'other', 'total']

Now how do I save the header row plus all the rows after it as a new CSV? I have a line_count variable, but before I try that approach, I suspect the csv module has functions that use the row index and would make this simpler. What do you suggest is the best way to do this?

Solution: thanks @bruno desthuilliers

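    import csv
    import glob
    import os

    # '*csv' matches any name ending in "csv"; use '*.csv' to require the extension
    list_of_files = glob.glob('C:/payment_reports/*csv')
    latest_file = max(list_of_files, key=os.path.getctime)
    print(latest_file)
    with open(latest_file, "r", newline="") as infile:
        reader = csv.reader(infile)
        for row in reader:
            # blank lines come back as empty lists, so check before indexing
            if row and row[0] == 'date/time':
                print(row)
                break
        else:
            # the loop ended without a break: no header row in this file
            row = None
            print("{} not found".format('date/time'))
        if row is not None:
            with open("C:/test.csv", "w", newline="") as outfile:
                writer = csv.writer(outfile)
                writer.writerow(row)      # headers
                writer.writerows(reader)  # remaining rows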

csv.reader returns an iterator. It reads one row from the CSV each time next() is called on it (the .next() method in Python 2, __next__() in Python 3).

Here's the documentation: http://docs.python.org/2/library/csv.html .

An iterator can return values from a source that is too big to read all at once. Using a for loop with an iterator effectively calls next() on it each time through the loop, so after a break the iterator still holds the rows that have not been read yet. Hope this helps.
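To make the iterator behaviour concrete, here is a minimal sketch. The sample data is made up for illustration (it is not the asker's report), and io.StringIO stands in for a real file:

```python
import csv
import io

# Hypothetical sample data: two description lines, then the header and data rows.
sample = io.StringIO(
    "Report for seller X\n"
    "\n"
    "date/time,type,total\n"
    "Jan 1,Order,10.00\n"
    "Jan 2,Refund,-5.00\n"
)

reader = csv.reader(sample)

# The for loop calls next(reader) once per pass and stops at the header row.
for row in reader:
    if row and row[0] == "date/time":
        header = row
        break

# The reader keeps its position, so only the rows not yet read are left.
remaining = list(reader)
print(header)
print(remaining)
```

Because the reader is consumed as it goes, passing it to writer.writerows() after the break writes only the rows that follow the header.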

Once you have found the header row, you can write it and the remaining rows to your outfile:

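import csv

def copy_from_header(latest_file):  # hypothetical wrapper so that `return` is valid
    # Python 3: text mode with newline=""; Python 2 would use "rb" / "wb" instead
    with open(latest_file, "r", newline="") as infile:
        reader = csv.reader(infile)
        for row in reader:
            # blank lines come back as empty lists, so check before indexing
            if row and row[0] == 'date/time':
                break
        else:
            print("{} not found".format('date/time'))
            return
        with open("path/to/new.csv", "w", newline="") as outfile:
            writer = csv.writer(outfile)
            writer.writerow(row) # headers
            writer.writerows(reader) # remaining rows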
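As an alternative to the for/else loop, the same skip-then-copy idea could be sketched with itertools.dropwhile. This is a different technique from the answer above, not a restatement of it; the function name write_from_header and its parameters are hypothetical, and Python 3 is assumed:

```python
import csv
import itertools


def write_from_header(in_path, out_path, marker="date/time"):
    """Copy the header row and everything after it; return True if a header was found."""

    def before_header(row):
        # True for blank rows and for any row whose first cell is not the marker.
        return not row or row[0] != marker

    with open(in_path, "r", newline="") as infile:
        # dropwhile discards rows until before_header() is first False,
        # then yields that row (the header) and every row after it.
        rows = itertools.dropwhile(before_header, csv.reader(infile))
        header = next(rows, None)
        if header is None:
            # The marker never appeared, so no output file is created.
            return False
        with open(out_path, "w", newline="") as outfile:
            writer = csv.writer(outfile)
            writer.writerow(header)
            writer.writerows(rows)
    return True
```

With the paths from the question, a call might look like write_from_header(latest_file, "C:/test.csv"). Returning a boolean instead of printing lets the caller decide how to report a missing header.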
