
Python 3 - Calculate average & write to .csv

(A) Python code

import csv
from collections import defaultdict

data = defaultdict(str)

#Make a list with the predefined variables
definition = ["record_id", "abbreviation", "patient_id", "study_id",
"step_count", "distance", "ambulation_time", "velocity", "cadence",
"normalized_velocity", "step_time_differential", "step_length_differential",
"cycle_time_differential", "step_time", "step_length", "step_extremity",
"cycle_time", "stride_length", "hh_base_support", "swing_time",
"stance_time", "single_supp_time", "double_supp_time", "toe_in_out"]

#Read the GaitRite .csv
with open('C:/Users/Kay_v/Documents/School/Exports/Export 3.csv', 'r')  as f, open('C:/Users/Kay_v/Documents/School/Exports/result.csv', 'w') as outfile: 
    reader = csv.reader(f, delimiter=';')
    next(reader, None)  # skip the headers
    writer = csv.DictWriter(outfile, fieldnames=definition, lineterminator='\n')
    writer.writeheader()

#Read the .csv row by row
    for row in reader:
        #print(row)
        for item in definition:
            h = item.replace('_', '')
            r0 = row[0].lower().replace(' ', '')
            if h in r0:
                try:
                    avg = round((float(row[1].replace(',', '.')) + float(row[2].replace(',', '.'))) / 2, 2)
                except ValueError:
                    avg = 0  # for cases with entry strings or commas
                #print(avg)
                print(h, r0, row[1], row[2])
                data[item] = row[1]

    data['record_id'] = 1

# Write the clean result.csv
    writer.writerow(data)

(B) The problem

The problem is getting the averages into result.csv. I use the following part of the code to calculate the average whenever a variable has two values. The average is calculated, but it does not show up in result.csv.

try:
    avg = round((float(row[1].replace(',', '.')) + float(row[2].replace(',', '.'))) / 2, 2)
except ValueError:
    avg = 0  # for cases with entry strings or commas

I hope someone can help me get the average to show up in result.csv as well; it would be highly appreciated!

Feel free to experiment with the export file I am using; you can download it here: CSV export file

Try this:

if h in r0:
    try:
        avg = round((float(row[1].replace(',', '.').replace(';', '.')) + float(row[2].replace(',', '.').replace(';', '.'))) / 2, 2)
        data[item] = avg
    except ValueError:
        data[item] = 0  # for cases with entry strings or commas
        #print(avg)
        print(h, r0, row[1], row[2])
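
To see why the assignment to data[item] matters, here is a minimal sketch using an in-memory buffer instead of result.csv. The shortened field list and the hard-coded avg value are assumptions made up for illustration; the point is only that csv.DictWriter writes what the dict holds, so a value that is computed but never stored ends up as an empty cell.

```python
import csv
import io

# Shortened, illustrative field list (the real script uses the full `definition` list)
fieldnames = ["record_id", "step_count", "step_time"]
row = {"record_id": 1, "step_count": "3"}

avg = 0.56               # hypothetical average: computed, but not yet stored in `row`
buffer = io.StringIO()   # in-memory stand-in for result.csv
writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
writer.writeheader()
writer.writerow(row)     # step_time is missing from the dict, so its cell stays empty

row["step_time"] = avg   # store the average under its field name
writer.writerow(row)     # now the value is written

output = buffer.getvalue()
print(output)
```

The first data row ends with an empty step_time cell, the second one contains 0.56.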

You are calculating the average but never adding it to the CSV file; that is the first problem you are facing, as far as I can tell from your ambiguous question. First add another column, "average", to the list definition, then store the value under the key average in the dict data. Here's the modified code:

import csv
from collections import defaultdict

data = defaultdict(str)

#Make a list with the predefined variables
definition = ["record_id", "abbreviation", "patient_id", "study_id",
"step_count", "distance", "ambulation_time", "velocity", "cadence",
"normalized_velocity", "step_time_differential", "step_length_differential",
"cycle_time_differential", "step_time", "step_length", "step_extremity",
"cycle_time", "stride_length", "hh_base_support", "swing_time",
"stance_time", "single_supp_time", "double_supp_time", "toe_in_out", "average"]

#Read the GaitRite .csv
with open('Export 3.csv', 'r')  as f, open('result.csv', 'w') as outfile: 
    reader = csv.reader(f, delimiter=';')
    next(reader, None)  # skip the headers
    writer = csv.DictWriter(outfile, fieldnames=definition, lineterminator='\n')
    writer.writeheader()

#Read the .csv row by row
    for row in reader:
        #print(row)
        for item in definition:
            h = item.replace('_', '')
            r0 = row[0].lower().replace(' ', '')
            if h in r0:
                try:
                    avg = round((float(row[1].replace(',', '.')) + float(row[2].replace(',', '.'))) / 2, 2)
                except ValueError:
                    avg = 0  # for cases with entry strings or commas
                # print(avg)
                # print(h, r0, row[1], row[2])
                data[item] = row[1]
                data['average'] = avg    
    data['record_id'] = 1

# Write the clean result.csv
    print(data)
    writer.writerow(data)

I will try to explain it in a better way. I would like the result.csv to eventually output the following:

Desired output

record_id  abbreviation  patient_id  study_id  step_count  distance  ambulation_time  velocity  cadence  normalized_velocity  step_time_differential  step_length_differential  cycle_time_differential  step_time  step_length  step_extremity  cycle_time  stride_length  hh_base_support  swing_time  stance_time  single_supp_time  double_supp_time  toe_in_out 
1                                              3           292,34    1,67             175,1     107,8                         0,004                   1,051                     0,008                    0,56       97,27                        1,11        194,64         4,65             0,47        0,65         0,47              0,18              1,45

The problem is that some of the variables in the .csv I am reading from contain 2 values (like step_time [0,558;0,554]), while others contain just 1 value (like step_count [3]). The ones with just 1 value can be passed to result.csv right away. For the ones with 2 values, the average of those 2 values should be calculated, and that average is what should be written to result.csv.
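
The rule described above (pass a single value through, average two values) could be sketched roughly as follows. The helper names to_float and single_or_average are hypothetical and not part of the original script, and the sketch assumes the values arrive as decimal-comma strings in the cells after the label, as they do when the file is read with delimiter=';'.

```python
def to_float(cell):
    """Convert a decimal-comma string such as '0,558' to a float, or return None."""
    try:
        return float(cell.strip().replace(',', '.'))
    except ValueError:
        return None  # empty cell or non-numeric text


def single_or_average(cells, ndigits=2):
    """Return the lone value unchanged, the rounded mean of several values,
    or '' when no cell holds a number."""
    parsed = [(cell.strip(), to_float(cell)) for cell in cells]
    numeric = [(text, value) for text, value in parsed if value is not None]
    if not numeric:
        return ''
    if len(numeric) == 1:
        return numeric[0][0]  # pass the original text through untouched
    mean = sum(value for _, value in numeric) / len(numeric)
    # Write the result back with a decimal comma, like the source file
    return str(round(mean, ndigits)).replace('.', ',')


print(single_or_average(['0,558', '0,554']))  # two values: averaged
print(single_or_average(['3', '']))           # one value: passed through
```

Inside the loop this would be used as data[item] = single_or_average(row[1:3]). Slicing with row[1:3] instead of indexing row[2] also avoids an IndexError on rows that have only one value column.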
