
Pulling data from one CSV file to another in Python

I'm trying to read the data in filteredData.csv and work out the average amount of snow for each location during 2016. Then I want to write those averages to a new CSV file called average2016.csv.

So far I have tried opening average2016.csv while filteredData.csv is still open and looping over the rows to write out each location and its average snow.

data2 = open('average2016.csv','w')
for row in csv1:
    print (location + "," + average_snow)
data2.close()

My whole code looks like this:

import csv
data = open('filteredData.csv','r')
# Create Dictionaries to store location values
# Snow_Fall is the number of inches for that location
# Number_Days is the number of days that there is Snowfall data for
Snow_Fall = {}
Number_Days = {}

# Create CSV reader
csv1 = csv.DictReader(data,delimiter=',')
# read each row of the CSV file and process it
for row in csv1:
    # Check the date column and see if it is in 2016
    if "2016" in row["DATE"]:
        # Check to see if the value in the snow column is null/none if so then skip processing that row
        if (row["SNOW"] is None) or (row["SNOW"] == ""):
            pass
        else:
            # Check to see if the location has been added to the dict if it has then add the data to itself
            # If it has not then just assign the data to the location.
            if row["NAME"] in Snow_Fall:
                Snow_Fall[row["NAME"]] = Snow_Fall[row["NAME"]] + float(row["SNOW"])
                Number_Days[row["NAME"]] = Number_Days[row["NAME"]] + 1
            else:
                Snow_Fall[row["NAME"]] = float(row["SNOW"])
                Number_Days[row["NAME"]] = 1

# For each location we want to print the data for that location
for location in Snow_Fall:
   print ("The number of inches for location " + location + " is " + str(Snow_Fall[location]))            
   print ("The number of days of snowfall for location " + location + " is " + str(Number_Days[location]))
   print ("The average Number of Inches for location " + location + " is " + str(Snow_Fall[location] / Number_Days[location]))
data2 = open('average2016.csv','w')
for row in csv1:
    print (location + "," + average_snow)
data2.close()
data.close()

And this is a sample of my filteredData file (posted as a screenshot, not reproduced here).

pandas is definitely your friend here. Consider something along these lines:

import pandas as pd

df = pd.read_csv('filteredData.csv')

# Assuming that you are trying to find the average snow in each STATION
by_location = df.groupby('STATION')
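# Get the mean of the SNOW column for each location
# (selecting the column first avoids averaging non-numeric columns such as NAME or DATE)
snow_means = by_location['SNOW'].mean()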

# The following 2 lines are just to make everything more readable in your new csv (you could skip them if wanted):
snow_means = snow_means.to_frame()
# Rename your column to 'AVG_SNOW'
snow_means.columns = ['AVG_SNOW']

# Finally, write your new dataframe to CSV
snow_means.to_csv('average2016.csv', header=True)

Note that this is untested (but should work). If you post a minimal example with some rows from your data (instead of a screenshot), I can test and debug it to make sure everything is fine. I highly recommend following a pandas tutorial if you plan to replace Excel with Python.
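The snippet above averages every row in the file and groups by STATION, whereas the question filters on 2016 and keys on NAME. Here is a rough sketch that adds both, assuming the column names DATE, NAME and SNOW from the question's code. The function name average_snow_by_name is made up for illustration, and the year test is the same substring check the question uses.

```python
import pandas as pd


def average_snow_by_name(df, year="2016"):
    """Return the mean SNOW per NAME for rows whose DATE contains `year`.

    Assumes the columns DATE, NAME and SNOW used in the question's code.
    """
    # Same test as the question: keep rows whose DATE text contains the year
    in_year = df[df["DATE"].astype(str).str.contains(year, regex=False)]
    # Drop rows with no SNOW value so they do not count as days
    in_year = in_year.dropna(subset=["SNOW"])
    # Average the SNOW column within each location
    means = in_year.groupby("NAME")["SNOW"].mean()
    return means.rename("AVG_SNOW")
```

To produce the file, something like average_snow_by_name(pd.read_csv('filteredData.csv')).to_csv('average2016.csv', header=True) should do it.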

# Use this loop in place of "for row in csv1:", between data2 = open(...) and data2.close()
for location in Snow_Fall:
    print(location + "," + str(Snow_Fall[location] / Number_Days[location]), file=data2)
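For context, the original write loop fails for three reasons: csv1 is already exhausted by the first for loop, average_snow is never defined, and print without file= writes to the console rather than to data2. Below is a minimal sketch of the whole task using only the csv module, assuming the DATE, NAME and SNOW columns from the question. The function name write_average_snow and the output header names are placeholders, not anything from the original post. It uses csv.writer rather than joining strings by hand, so a location name that itself contains a comma is quoted properly.

```python
import csv
from collections import defaultdict


def write_average_snow(src_path, dst_path, year="2016"):
    """Write the average SNOW per NAME for rows whose DATE contains `year`."""
    totals = defaultdict(float)  # location -> total inches of snow
    days = defaultdict(int)      # location -> number of rows with a SNOW value

    with open(src_path, newline="") as src:
        for row in csv.DictReader(src):
            # Same filters as the question: year match and a non-empty SNOW value
            if year not in row["DATE"] or not row["SNOW"]:
                continue
            totals[row["NAME"]] += float(row["SNOW"])
            days[row["NAME"]] += 1

    with open(dst_path, "w", newline="") as dst:
        writer = csv.writer(dst)
        writer.writerow(["NAME", "AVG_SNOW"])  # header names are an assumption
        for location in totals:
            writer.writerow([location, totals[location] / days[location]])
```

Calling write_average_snow('filteredData.csv', 'average2016.csv') would then produce the file the question asks for. The with blocks close both files automatically, so no explicit close() calls are needed.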
