
Conditional Sum/Average/etc… CSV file in Python

First off, I've found similar questions, but I haven't been able to figure out how to translate their answers to my own problem. Secondly, I'm new to Python, so I apologize for being a noob.

Here's my question: I want to perform conditional calculations (average, proportion, etc.) on values within a text file.

More concretely, I have a file that looks a little something like below

0    Diamond    Correct
0    Cross      Incorrect
1    Diamond    Correct
1    Cross      Correct

Thus far, I am able to read in the file and collect all of the rows.

import pandas as pd
fileLocation = r'C:/Users/Me/Desktop/LogFiles/SubjectData.txt'
# read_csv takes the column labels via `names` (there is no `name` keyword)
df = pd.read_csv(fileLocation, header=None, sep='\t', index_col=False,
                 names=["Session Number", "Image", "Outcome"])

I'm looking to query the file such that I can ask questions like:

--What is the proportion of "Correct" values in the 'Outcome' column when the first column ('Session Number') is 0? So this would be 0.5, because there is one "Correct" and one "Incorrect".

I have other calculations I'd like to perform, but I should be able to figure out where to go once I know how to do this, hopefully simple, command.

Thanks!

# getting the number of rows that have 0 for 'Session Number'
# (the proportion is taken within session 0, not over the whole file)
total = len(df[df['Session Number'] == 0])

# getting the number of rows that have 'Correct' for 'Outcome' and 0 for 'Session Number'
correct_and_session_zero = len(df[(df['Outcome'] == 'Correct') &
                                  (df['Session Number'] == 0)])

# if you're using Python 2, convert correct_and_session_zero or total to float first,
# otherwise integer division truncates the result
print(correct_and_session_zero / total)
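
The proportion asked for in the question (Correct outcomes among the session-0 rows) can also be computed as the mean of a boolean mask. This is a minimal sketch, not the only way to do it. It assumes the four sample rows from the question, rebuilt inline so it runs without the file; the names session_zero and proportion_correct are made up for illustration.

```python
import pandas as pd

# Sample rows from the question, rebuilt inline (assumption: same column names)
df = pd.DataFrame({
    "Session Number": [0, 0, 1, 1],
    "Image": ["Diamond", "Cross", "Diamond", "Cross"],
    "Outcome": ["Correct", "Incorrect", "Correct", "Correct"],
})

# Keep only the rows that belong to session 0
session_zero = df[df["Session Number"] == 0]

# Comparing a column to a value gives a boolean Series;
# its mean is the fraction of True values
proportion_correct = (session_zero["Outcome"] == "Correct").mean()
print(proportion_correct)  # 0.5
```

Because the mean is always a float, this does not need the Python 2 float conversion.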

You can also do it this way:

In [467]: df.groupby('Session#')['Outcome'].apply(lambda x: (x == 'Correct').sum()/len(x))
Out[467]:
Session#
0    0.5
1    1.0
Name: Outcome, dtype: float64

This groups your DataFrame by Session# (the column called 'Session Number' in the question) and calculates the ratio of Correct outcomes within each group.
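
The same per-group idea extends to the other conditional calculations mentioned in the question. The sketch below assumes the question's column names and its four sample rows, rebuilt inline. The variables is_correct, per_session and per_image are hypothetical names used only here.

```python
import pandas as pd

# Sample rows from the question, rebuilt inline (assumption: same column names)
df = pd.DataFrame({
    "Session Number": [0, 0, 1, 1],
    "Image": ["Diamond", "Cross", "Diamond", "Cross"],
    "Outcome": ["Correct", "Incorrect", "Correct", "Correct"],
})

# Boolean Series: True where the outcome is Correct
is_correct = df["Outcome"] == "Correct"

# Proportion of Correct outcomes per session (mean of booleans within each group)
per_session = is_correct.groupby(df["Session Number"]).mean()
print(per_session)

# The same calculation conditioned on a different column
per_image = is_correct.groupby(df["Image"]).mean()
print(per_image)
```

On the sample data, per_session is 0.5 for session 0 and 1.0 for session 1. Swapping .mean() for .sum() gives the count of Correct rows per group instead of the proportion.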
