(A) Python code
import csv
from collections import defaultdict

data = defaultdict(str)

# Make a list with the predefined variables
definition = ["record_id", "abbreviation", "patient_id", "study_id",
              "step_count", "distance", "ambulation_time", "velocity", "cadence",
              "normalized_velocity", "step_time_differential", "step_length_differential",
              "cycle_time_differential", "step_time", "step_length", "step_extremity",
              "cycle_time", "stride_length", "hh_base_support", "swing_time",
              "stance_time", "single_supp_time", "double_supp_time", "toe_in_out"]

# Read the GaitRite .csv
with open('C:/Users/Kay_v/Documents/School/Exports/Export 3.csv', 'r') as f, open('C:/Users/Kay_v/Documents/School/Exports/result.csv', 'w') as outfile:
    reader = csv.reader(f, delimiter=';')
    next(reader, None)  # skip the headers
    writer = csv.DictWriter(outfile, fieldnames=definition, lineterminator='\n')
    writer.writeheader()

    # Read the .csv row by row
    for row in reader:
        #print(row)
        for item in definition:
            h = item.replace('_', '')
            r0 = row[0].lower().replace(' ', '')
            if h in r0:
                try:
                    avg = round((float(row[1].replace(',', '.')) + float(row[2].replace(',', '.'))) / 2, 2)
                except ValueError:
                    avg = 0  # for cases with entry strings or commas
                #print(avg)
                print(h, r0, row[1], row[2])
                data[item] = row[1]

    data['record_id'] = 1

    # Write the clean result.csv
    writer.writerow(data)
(B) The problem
The problem is getting the averages into result.csv. I use the following part of the code to calculate the average whenever a variable has two values. The average is calculated, but it does not show up in result.csv:
try:
    avg = round((float(row[1].replace(',', '.')) + float(row[2].replace(',', '.'))) / 2, 2)
except ValueError:
    avg = 0  # for cases with entry strings or commas
I hope someone can help me get the average to show up in result.csv as well. It would be highly appreciated!
Feel free to play with the export file I am using. You can download it here: CSV export file
Try this:
if h in r0:
    try:
        avg = round((float(row[1].replace(',', '.').replace(';', '.')) + float(row[2].replace(',', '.').replace(';', '.'))) / 2, 2)
        data[item] = avg
    except ValueError:
        data[item] = 0  # for cases with entry strings or commas
    #print(avg)
    print(h, r0, row[1], row[2])
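One thing to watch with the snippet above: if a single-value row has an empty second cell (an assumption about the export file, which is not shown here), float('') raises ValueError and that row ends up as 0 instead of its one real value. A minimal sketch of a helper that averages two values but falls back to the single value could look like this. The names to_float and average_or_single are made up for illustration and are not part of the original code.

```python
def to_float(text):
    """Convert a decimal-comma string such as '0,558' to a float.

    Returns None when the text is empty or not numeric.
    """
    try:
        return float(text.strip().replace(',', '.'))
    except ValueError:
        return None


def average_or_single(first, second):
    """Return the rounded average of two numeric cells, or the one numeric
    cell unchanged when only one is present (hypothetical helper)."""
    values = [v for v in (to_float(first), to_float(second)) if v is not None]
    if not values:
        return 0  # same fallback the snippet above uses for non-numeric rows
    if len(values) == 1:
        return values[0]  # single value: pass it through without rounding
    return round(sum(values) / len(values), 2)


print(average_or_single('0,558', '0,554'))  # two values: averaged
print(average_or_single('3', ''))           # one value: kept as is
```

Inside the loop this would replace the try/except with data[item] = average_or_single(row[1], row[2]).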
You are calculating the average but never adding it to the CSV file; that is the first problem you are facing, as far as I understand your question. First add another column, average, to the list definition, then add a key named average to the dict data. Here's the modified code:
import csv
from collections import defaultdict

data = defaultdict(str)

# Make a list with the predefined variables
definition = ["record_id", "abbreviation", "patient_id", "study_id",
              "step_count", "distance", "ambulation_time", "velocity", "cadence",
              "normalized_velocity", "step_time_differential", "step_length_differential",
              "cycle_time_differential", "step_time", "step_length", "step_extremity",
              "cycle_time", "stride_length", "hh_base_support", "swing_time",
              "stance_time", "single_supp_time", "double_supp_time", "toe_in_out", "average"]

# Read the GaitRite .csv
with open('Export 3.csv', 'r') as f, open('result.csv', 'w') as outfile:
    reader = csv.reader(f, delimiter=';')
    next(reader, None)  # skip the headers
    writer = csv.DictWriter(outfile, fieldnames=definition, lineterminator='\n')
    writer.writeheader()

    # Read the .csv row by row
    for row in reader:
        #print(row)
        for item in definition:
            h = item.replace('_', '')
            r0 = row[0].lower().replace(' ', '')
            if h in r0:
                try:
                    avg = round((float(row[1].replace(',', '.')) + float(row[2].replace(',', '.'))) / 2, 2)
                except ValueError:
                    avg = 0  # for cases with entry strings or commas
                # print(avg)
                # print(h, r0, row[1], row[2])
                data[item] = row[1]
                data['average'] = avg

    data['record_id'] = 1

    # Write the clean result.csv
    print(data)
    writer.writerow(data)
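Why the new column has to go into definition before the key goes into data can be seen in a small standalone sketch. With its default settings, csv.DictWriter raises ValueError for a dict key that is not in fieldnames, while a fieldname missing from the dict is simply written as an empty cell. The field names and values here are a shortened, made-up sample, and the output goes to an in-memory buffer instead of result.csv.

```python
import csv
import io

fieldnames = ["record_id", "step_time"]  # shortened stand-in for definition
row = {"record_id": 1, "step_time": 0.56, "average": 0.56}

buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator='\n')
writer.writeheader()

try:
    writer.writerow(row)  # 'average' is not listed in fieldnames
    error = None
except ValueError as exc:
    error = str(exc)

print("writerow failed:", error)

# A key missing from the dict is fine: it is filled with restval,
# which defaults to an empty string.
writer.writerow({"record_id": 1})
print(buffer.getvalue())
```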
I will try to explain it in a better way. I would like the result.csv to eventually output the following:
Desired output
record_id abbreviation patient_id study_id step_count distance ambulation_time velocity cadence normalized_velocity step_time_differential step_length_differential cycle_time_differential step_time step_length step_extremity cycle_time stride_length hh_base_support swing_time stance_time single_supp_time double_supp_time toe_in_out
1 3 292,34 1,67 175,1 107,8 0,004 1,051 0,008 0,56 97,27 1,11 194,64 4,65 0,47 0,65 0,47 0,18 1,45
The problem is that some of the variables in the .csv I am reading from contain 2 values (like step_time [0,558;0,554]) and others contain just 1 value (like step_count [3]). The ones with just 1 value can be passed to result.csv right away. For the ones with 2 values, the average of those 2 values should be calculated, and that average should then be passed to result.csv as well.
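To make the requirement concrete, here is a rough end-to-end sketch run on an in-memory sample rather than the real export. Everything about the sample is an assumption: the three-column label;first;second layout, the header row, and the row labels are invented, and the field list is shortened. The helpers summarise_cells, best_field and build_record are hypothetical names. Two choices differ from the code in the question and are not taken from it: the output uses ';' as delimiter so that decimal-comma values need no quoting, and each row is assigned to the longest matching field name so that step_time does not also pick up the step time differential row.

```python
import csv
import io

# Made-up sample in the assumed layout: label;first value;second value
SAMPLE = """Parameter;Left;Right
Step Count;3;
Step Time (sec);0,558;0,554
Step Time Differential (sec);0,004;
"""

FIELDS = ["record_id", "step_count", "step_time", "step_time_differential"]


def summarise_cells(cells):
    """Return a single value unchanged, or the rounded average of two values
    as a decimal-comma string. Returns None if no cell is numeric."""
    numeric = []
    for cell in cells:
        text = cell.strip()
        try:
            numeric.append((text, float(text.replace(',', '.'))))
        except ValueError:
            continue  # empty or non-numeric cell
    if not numeric:
        return None
    if len(numeric) == 1:
        return numeric[0][0]  # pass the original text through untouched
    average = round(sum(number for _, number in numeric) / len(numeric), 2)
    return str(average).replace('.', ',')


def best_field(label, fields):
    """Pick the longest field name whose underscore-free form occurs in the label."""
    key = label.lower().replace(' ', '')
    matches = [name for name in fields if name.replace('_', '') in key]
    return max(matches, key=len) if matches else None


def build_record(source, fields):
    """Collect one output record from all rows of the export."""
    reader = csv.reader(source, delimiter=';')
    next(reader, None)  # skip the header row
    record = {}
    for row in reader:
        if not row:
            continue
        field = best_field(row[0], fields)
        value = summarise_cells(row[1:3])
        if field is not None and value is not None:
            record[field] = value
    record['record_id'] = 1
    return record


record = build_record(io.StringIO(SAMPLE), FIELDS)

out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=FIELDS, delimiter=';', lineterminator='\n')
writer.writeheader()
writer.writerow(record)  # one row per export, written after the loop
print(out.getvalue())
```

For the real files, the two io.StringIO objects would be replaced by the open() calls from the question.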