简体   繁体   中英

How to copy unique individuals from a csv file into new files

I am very new to python so please bear with me. I have a csv file that looks like this:

动物位置 .

I am trying to loop through the file and for every unique individual there are, create a new csv file and copy rows over. I was successful in doing this for one animal but I am having trouble creating the syntax for a more generic approach. Here is what I currently have:

import arcpy
import csv
from csv import DictReader

WS = arcpy.env.workspace = raw_input("Where if your workspace")
infile = raw_input("where is your file?") 
outfile = raw_input("What is your outfile name?") 
arcpy.env.overwriteOutput = True


with open(infile, "r") as csvFile, open(outfile, "w") as out, open("outfile2.csv", "w") as out2:
    reader = csv.DictReader(csvFile)
    writer = csv.writer(out)
    writer.writerow(reader.fieldnames)
    for row in reader:
         if row["Animal"] == "1":
            values = [row[field] for field in reader.fieldnames]
            writer.writerow(values)

To write each Animal into its own CSV file, you would need to open a different file for each type of animal. This could be done by using a dictionary to store the file object and csv writer object for each animal. At the end, this could then be used to close all the files correctly:

import csv

output_csvs = {}    # e.g. {'1' : [file_object, csv_object]}

with open('input.csv', 'rb') as f_input:
    csv_reader = csv.reader(f_input)
    header = next(csv_reader)

    for row in csv_reader:
        animal = row[0]

        if animal in output_csvs:
            output_csvs[animal][1].writerow(row)
        else:
            f_output = open('animal_{}.csv'.format(animal), 'wb')
            csv_output = csv.writer(f_output)
            output_csvs[animal] = [f_output, csv_output]
            csv_output.writerow(header)
            csv_output.writerow(row)

for csv_file, csv_writer in output_csvs.values():
    csv_file.close()

This would give you a set of output CSV files named according to the animal, eg animal_1.csv


Alternatively, if the data is small enough to be read into memory, it could be sorted by animal and output one block at a time by making use of Python's itertools.groupby() function:

from itertools import groupby
import csv

with open('input.csv', 'rb') as f_input:
    csv_reader = csv.reader(f_input)
    header = next(csv_reader)

    for animal, group in groupby(sorted(csv_reader), lambda x: x[0]):
        with open('animal_{}.csv'.format(animal), 'wb') as f_output:
            csv_output = csv.writer(f_output)
            csv_output.writerow(header)
            csv_output.writerows(group)

Using sorted() ensures that all animals of the same kind are grouped together. If this is already the case in the data, sorting is not needed.


To access these files you could use glob.glob() :

import matplotlib.pyplot as plt            
import glob

for animal_filename in glob.glob('animal_*.csv'):
    with open(animal_filename, 'rb') as f_input:
        csv_input = csv.reader(f_input)
        heading = next(csv_input)
        x, y = [], []

        for row in csv_input:
            x.append(int(row[1]))
            y.append(int(row[2]))

        fig, ax = plt.subplots()
        plt.title(animal_filename)
        ax.scatter(x, y)

plt.show()

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM