Get the unique options in a specific column using pandas

I have a dataset that looks something like this:

ID  Result1 Result2
1   Yes     Pos
2   No      Neg
3   No      Pos
4   Yes     Neg
5   Yes     Neg
6   No      Pos

My main goal is to split the dataset (which is much larger than this) into subsets based on certain criteria. I want to run this splitting process by first selecting the column that contains the deciding criteria, and then selecting the value in that column to split on.

For example:

Please enter column to segment by:
-Result2

Please enter the criteria in [Result2] to segment by:
-Pos

Then it should output the same dataset, but containing only the rows where the Result2 column is "Pos".

So far I can select the column with this code:

import pandas as pd

# Read in the data from the CSV file (semicolon-delimited)
df = pd.read_csv("filename.csv", sep=";")

# Get the column headers
headers = list(df.columns)

# Display the header options for the user
for k, header in enumerate(headers):
    print(k, header)

# Get the index of the column to segment the file by
header_idx = int(input("\nPlease choose the header to segment by: (0,1,2 etc) \n"))

Now I want to list all the available options in the selected column to the user and call the segmentation function (which is a bridge I still need to cross).

Not sure I follow, but is this what you are looking to do?

col = input('Please enter column to segment by: ')
val = input('Please enter the criteria to segment by: ')
# Keep only the rows whose value in the chosen column equals the entered text
df2 = df[df[col] == val]
print(df2)
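
To also show the user which options exist in the chosen column before filtering, something along these lines might work. This is only a sketch built on the sample data from the question: the helper names `unique_options` and `segment` are made up for illustration, and comparing values as strings is an assumption, made so that text typed at a prompt can also match numeric columns such as ID.

```python
import pandas as pd


def unique_options(df, column):
    """Return the distinct non-null values in `column`, in order of first appearance."""
    return df[column].dropna().unique().tolist()


def segment(df, column, value):
    """Return the rows of `df` whose `column` equals `value`."""
    # Assumption: compare as strings so that prompt input also matches numeric columns.
    return df[df[column].astype(str) == str(value)]


# Sample data copied from the question
df = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5, 6],
    "Result1": ["Yes", "No", "No", "Yes", "Yes", "No"],
    "Result2": ["Pos", "Neg", "Pos", "Neg", "Neg", "Pos"],
})

# List the available options in the selected column
print(unique_options(df, "Result2"))  # ['Pos', 'Neg']

# Keep only the rows matching the chosen option (IDs 1, 3 and 6 here)
print(segment(df, "Result2", "Pos"))
```

In an interactive script, the hard-coded "Result2" and "Pos" would be replaced by the values read from the user, for example headers[header_idx] for the column.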
