I have a dataset that looks something like this:
ID Result1 Result2
1 Yes Pos
2 No Neg
3 No Pos
4 Yes Neg
5 Yes Neg
6 No Pos
My main goal is to split the dataset (which is much larger than this) into subsets based on certain criteria. I want to run the split by first selecting the column that contains the deciding criterion, and then selecting the value in that column to filter on.
For example:
Please enter column to segment by:
-Result2
Please enter the criteria in [Result2] to segment by:
-Pos
Then it should output the same dataset, but containing only the rows whose value in the Result2 column is "Pos".
So far I can select the column with this code:
import pandas as pd
# Read in the data from the csv
df = pd.read_csv("filename.csv", sep=";")
# Get the column headers
headers = list(df.columns.values)
# Display the header options for the user
for k, header in enumerate(headers):
    print k, header
# Get the column to sort the file by
header_idx = int(raw_input("\nPlease choose the header to segment by: (0,1,2 etc) \n"))
Now I want to list all the available options in the selected column to the user and call the segmentation function (which is a bridge I still need to cross).
Not sure I follow, but is this what you are looking to do?
col = raw_input('Please enter column to segment by: ')
val = raw_input('Please enter the criteria to segment by: ')
# Keep only the rows where the chosen column equals the entered value
df2 = df[df[col] == val]
print df2
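To also cover the "list all the available options" step from the question, here is a minimal sketch of how the two pieces could fit together. It is written for Python 3 (so `input()` and `print()` instead of `raw_input` and the `print` statement), and the helper names `list_options` and `segment` are made up for illustration; they are not part of pandas. The sample data is the table from the question, assumed to be semicolon-separated like the real file.

```python
import io

import pandas as pd


def list_options(df, col):
    """Return the distinct non-missing values in column `col`,
    in order of first appearance."""
    return df[col].dropna().unique().tolist()


def segment(df, col, val):
    """Return only the rows of `df` whose entry in `col` equals `val`."""
    if col not in df.columns:
        raise KeyError("unknown column: %r" % col)
    # Compare as strings so that typed-in input also matches numeric columns.
    return df[df[col].astype(str) == str(val)]


# Sample data from the question (assumption: ';' as the separator).
sample = io.StringIO(
    "ID;Result1;Result2\n"
    "1;Yes;Pos\n"
    "2;No;Neg\n"
    "3;No;Pos\n"
    "4;Yes;Neg\n"
    "5;Yes;Neg\n"
    "6;No;Pos\n"
)
df = pd.read_csv(sample, sep=";")

# In an interactive script these two values would come from input().
col = "Result2"
options = list_options(df, col)   # the choices to show the user
subset = segment(df, col, "Pos")  # the filtered dataset
print(options)
print(subset)
```

With the sample table this lists `Pos` and `Neg` as the options for `Result2` and keeps the rows with ID 1, 3 and 6. Casting to `str` before comparing is a design choice for typed-in input; if the value already has the column's type, a plain `df[df[col] == val]` does the same job.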