I have a CSV file containing survey data on 60 participants. The first column is the participant's number, and each row holds all the data collected from that participant. It looks something like:
Participant number: 1, Gender: Female, Level of study: Postgrad
I would like to create a dictionary where the key is the Participant Number and the value is the whole of the row with all the data, to have something like this:
{1: Female, Postgrad, American, Yes, No, No, Yes, Yes, No...}
and so on. I am still a newbie, and so far this is what I have tried:
import csv

with open('surveys.csv', 'r') as f:
    reader = csv.reader(f, delimiter=' ')
    with open('new_surveys.csv', mode='w') as outfile:
        writer = csv.writer(outfile)
        mydict = {rows[0]:rows for rows in reader}
        print(mydict)
But this prints something like:
{'"': ['"'], 'Participant/Question","1.': ['Participant/Question","1.', 'Gender'], ',2.': [',2.', 'Level', 'of', 'study'],}
which does not make any sense to me at the moment...
Thank you!
Edit:
This is one complete row of data:
You can try this:
import csv

with open('surveys.csv', 'r') as f:
    reader = csv.reader(f, delimiter=' ')
    mydict = {}
    iterreader = iter(reader)
    # Skip the header row
    next(iterreader)
    for row in iterreader:
        # Each line arrives as one field; split it on tabs
        elementsList = row[0].split("\t")
        nonEmptyElements = []
        # Keep every non-empty cell after the participant number
        for element in elementsList[1:]:
            if element.strip():
                nonEmptyElements.append(element)
        valuesList = ",".join(nonEmptyElements)
        mydict[elementsList[0]] = valuesList
print(mydict)
My CSV looks like this:
Participant Name Gender
1 Rupin Male
2 Poonam Female
3 Jeshan Male
The code skips the first (header) row.
My output looks like this:
{'1': 'Rupin,Male', '2': 'Poonam,Female', '3': 'Jeshan,Male'}
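If the file really is tab-separated, as the split on "\t" suggests, the same result can probably be had by handing the tab to csv.reader as the delimiter. This is only a sketch: the sample string below is a hypothetical stand-in for the file, and io.StringIO is used so the snippet runs without any file on disk.

```python
import csv
import io

# Hypothetical tab-separated sample matching the layout shown above.
sample = (
    "Participant\tName\tGender\n"
    "1\tRupin\tMale\n"
    "2\tPoonam\tFemale\n"
    "3\tJeshan\tMale\n"
)

# Let csv.reader split on tabs instead of splitting each line by hand.
reader = csv.reader(io.StringIO(sample, newline=''), delimiter='\t')
next(reader)  # skip the header row

# Key: participant number; value: the remaining non-empty cells joined by commas.
by_participant = {
    row[0]: ",".join(cell for cell in row[1:] if cell.strip())
    for row in reader
}
print(by_participant)
# {'1': 'Rupin,Male', '2': 'Poonam,Female', '3': 'Jeshan,Male'}
```

With a real file, the StringIO object would be replaced by the file object returned from open().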
From the comments, we know the first 100 bytes of the raw file are:
b'\xef\xbb\xbf"\nParticipant/Question","1. Gender\n","2. Level of study\n","3. How often visit SC\n","4. Time of vi'
This looks like a CSV export from Excel, with embedded newlines in the cells. The initial b'\xef\xbb\xbf' is a UTF-8 byte order mark, which means the file should be decoded with Python's 'utf-8-sig' codec.
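To see what the byte order mark does in practice, here is a small sketch. The bytes in raw are a made-up, shortened stand-in for the start of the file, not the real data.

```python
import codecs

# Hypothetical sample: the start of a UTF-8 file saved with a byte order mark.
raw = b'\xef\xbb\xbf"Participant","Gender"\n'

# codecs.BOM_UTF8 is the three-byte sequence b'\xef\xbb\xbf'.
has_bom = raw.startswith(codecs.BOM_UTF8)

# Plain 'utf-8' keeps the mark as the character '\ufeff' at the start of the text;
# 'utf-8-sig' strips it while decoding.
decoded_plain = raw.decode('utf-8')
decoded_sig = raw.decode('utf-8-sig')

print(has_bom)                 # True
print(repr(decoded_plain[0]))  # '\ufeff'
print(repr(decoded_sig[0]))    # '"'
```

Decoded as plain 'utf-8', the stray '\ufeff' would end up glued to the first cell of the header row.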
Based on this information, this code should create the desired dictionary:
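import csv

# newline='' is what the csv docs recommend, so embedded newlines are left to the reader
with open('surveys.csv', 'r', encoding='utf-8-sig', newline='') as f:
    reader = csv.reader(f, dialect='excel')
    # Advance the iterator to skip the header row
    next(reader)
    mydict = {row[0]:row for row in reader}
print(mydict)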
Passing the 'utf-8-sig' encoding ensures that the byte-order-mark doesn't get treated as part of the data. It's probably a good idea to set this encoding when reading and writing csv files if you are working with Excel.
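Passing dialect='excel' to the reader tells it to use the defaults associated with CSV files created by Excel, such as using a comma as the delimiter. ('excel' is also the csv module's default dialect, so passing it mainly makes the intent explicit.)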
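Here is a self-contained sketch that tries the approach on in-memory data. The sample bytes and the helper name rows_by_participant are made up for illustration; it also assumes the participant numbers are integers and that the value should be the row without its first cell, as the question asks for.

```python
import csv
import io

# Hypothetical stand-in for surveys.csv: a BOM, header cells with embedded
# newlines, and two data rows.
sample_bytes = (
    b'\xef\xbb\xbf"\nParticipant/Question","1. Gender\n","2. Level of study\n"\r\n'
    b'1,Female,Postgrad\r\n'
    b'2,Male,Undergrad\r\n'
)

def rows_by_participant(text_stream):
    """Map each participant number to the remaining cells of its row."""
    reader = csv.reader(text_stream, dialect='excel')
    next(reader)  # skip the (multi-line) header row
    # Assumption: the first cell is always an integer participant number.
    return {int(row[0]): row[1:] for row in reader if row}

# Decode with 'utf-8-sig' to drop the BOM; newline='' leaves line endings untouched.
text = io.StringIO(sample_bytes.decode('utf-8-sig'), newline='')
result = rows_by_participant(text)
print(result)
# {1: ['Female', 'Postgrad'], 2: ['Male', 'Undergrad']}
```

For the real file, the same helper could be called with the object returned by open('surveys.csv', encoding='utf-8-sig', newline='').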