
How to divide a pandas DataFrame into separate DataFrames based on the unique values of one column, and iterate over them?

I have a DataFrame with three columns.

The first column has 3 unique values. I used the code below to create one DataFrame per unique value, but I am unable to iterate over those DataFrames and am not sure how to use the result to do so.

import pandas as pd

df = pd.read_excel("input.xlsx")

unique_groups = list(df.iloc[:,0].unique())    # let's assume the unique values are 0, 1, 2
mtlist = []

for index, value in enumerate(unique_groups):
    globals()['df%s' % index] = df[df.iloc[:,0] == value]
    mtlist.append('df%s' % index)
print(mtlist)

Output:

['df0', 'df1', 'df2']

For example, let's say I want to find the length of the first of these DataFrames. If I manually type the name of the DataFrame, I get the correct output:

len(df0)

Output:
35

But I am trying to automate the code, so I want to find the length of each DataFrame and iterate over it just as I would if I had typed its name.

What I'm looking for is this: if I run the code below,

len('df%s' % 0)

I want to get the actual length of the DataFrame instead of the length of the string. Could someone please guide me on how to do this?

I have also tried to create a dictionary using the code below, but I can't figure out how to iterate over it when the DataFrame has more than two columns, where the key would be the unique group and the value contains the other two columns on the same line.

df = pd.read_excel("input.xlsx")

unique_groups = list(df["Assignment Group"].unique())
length_of_unique_groups = len(unique_groups)
mtlist = []

df_dict = {name: df.loc[df['Assignment Group'] == name] for name in unique_groups}

Can someone please provide a better solution?

UPDATE

SAMPLE DATA

Assignment_group    Description                         Document
Group A             Text to be updated on the ticket 1  doc1.pdf
Group B             Text to be updated on the ticket 2  doc2.pdf
Group A             Text to be updated on the ticket 3  doc3.pdf
Group B             Text to be updated on the ticket 4  doc4.pdf
Group A             Text to be updated on the ticket 5  doc5.pdf
Group B             Text to be updated on the ticket 6  doc6.pdf
Group C             Text to be updated on the ticket 7  doc7.pdf
Group C             Text to be updated on the ticket 8  doc8.pdf

Let's assume there are 100 rows of data.

I'm trying to automate ServiceNow ticket creation with the above data. My end goal is that all Group A tickets go to one group, but a unique task has to be created for each description. We can bundle up to 10 tasks together and submit them as one request, so if I split the DataFrame into separate DataFrames based on Assignment_group, it would be easier to iterate over (that's the only idea I could think of).

For example, let's say we have REQUEST001; within that request there will be multiple sub-tasks such as STASK001, STASK002 ... STASK010.

Hope this helps.

Your problem is easily solved by groupby, one of the most useful tools in pandas:

length_of_unique_groups = df.groupby('Assignment Group').size()

You can do all kinds of operations (sum, count, std, etc.) on the remaining columns, such as getting the mean price for each group if there were a price column.
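Since the question is also about iterating, here is a minimal sketch of looping over the groups that groupby produces and slicing each one into batches of at most 10 rows. The small inline DataFrame is a stand-in for the sample data, and BATCH_SIZE and the batches list are names made up for this illustration; the actual ServiceNow submission is left out.

```python
import pandas as pd

# Stand-in for the sample data in the question (values are illustrative).
df = pd.DataFrame({
    "Assignment_group": ["Group A", "Group B", "Group A", "Group B",
                         "Group A", "Group B", "Group C", "Group C"],
    "Description": [f"Text to be updated on the ticket {i}" for i in range(1, 9)],
    "Document": [f"doc{i}.pdf" for i in range(1, 9)],
})

BATCH_SIZE = 10  # hypothetical name: maximum tasks per request, per the question

batches = []  # (group name, batch number, sub-DataFrame)
for name, group_df in df.groupby("Assignment_group"):
    # group_df is an ordinary DataFrame holding only this group's rows
    for start in range(0, len(group_df), BATCH_SIZE):
        batch = group_df.iloc[start:start + BATCH_SIZE]
        batches.append((name, start // BATCH_SIZE + 1, batch))

for name, number, batch in batches:
    print(name, number, len(batch))
    for row in batch.itertuples(index=False):
        # row.Description and row.Document would feed one sub-task here
        pass
```

Each batch is a regular DataFrame, so len(batch), batch.itertuples() and column access work on it exactly as they do on df.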

I think you want to try something like len(eval('df%s' % 0))
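eval does resolve the name as long as the variables were created through globals(), but a dictionary keyed by the same generated names is usually easier to work with and avoids evaluating strings as code. A minimal sketch, assuming a small made-up DataFrame with columns named group and value; the frames dictionary is a name introduced only for this example:

```python
import pandas as pd

# Made-up data: the "group" column plays the role of the first column.
df = pd.DataFrame({
    "group": [0, 1, 0, 2, 1, 0],
    "value": list("abcdef"),
})

# Key each sub-DataFrame by a generated name instead of writing to globals().
frames = {
    "df%s" % index: sub_df
    for index, (_, sub_df) in enumerate(df.groupby("group"))
}

name = "df%s" % 0
print(len(frames[name]))  # number of rows in the first group

# Iterating over every sub-DataFrame by name:
for key, sub_df in frames.items():
    print(key, len(sub_df))
```

Note that groupby orders the groups by sorted key by default; passing sort=False keeps them in order of first appearance, which matches what unique() returns in the question.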
