Performing union on three Group by Resultant dataframes with same columns, different order

Question

I have created three pandas DataFrames by applying a group-by to three different data sets, each with columns A, B and C, using:

Resultdf=SessionDev.query(AppDetails).filter(text(" A in ('20170727L00319')")).all()

df1= Resultdf.groupby(["A", "B","C"]).size().reset_index(name='Count')

[df1]

  | A              | B               | C   | Count
0 | 20170727L00319 | 423605030008907 | 319 | 1
1 | 20170727L00319 | 42360604002461  | 319 | 1

[df2]

  | A              | B               | C   | Count
0 | 20170727L00319 | 423605030008907 | 319 | 2
1 | 20170727L00319 | 423606040002461 | 319 | 2

[df3]

  | A              | B               | C   | Count
0 | 20170727L00319 | 423605030008907 | 319 | 1
1 | 20170727L00319 | 423606040002461 | 319 | 2

I want to perform a union (excluding duplicates) of the three grouped DataFrames above, producing a single DataFrame with distinct rows.

I tried concatenating the three DataFrames and then removing duplicates with drop_duplicates, but I am unable to get the desired result:

  | A              | B               | C
0 | 20170727L00319 | 423605030008907 | 319
1 | 20170727L00319 | 423606040002461 | 319
2 | 20170727L00319 | 423605030008907 | 319
3 | 20170727L00319 | 42360604002461  | 319
5 | 20170727L00319 | 423606040002461 | 319

Using

# join_axes was removed in pandas 1.0; drop the Count column after concatenating instead
FinalUnion = pd.concat([df1, df2, df3], ignore_index=True).drop(columns=['Count'])

FinalUnion = FinalUnion.drop_duplicates(['B', 'C'], keep='first')

I am expecting the result below:

  | A              | B               | C
0 | 20170727L00319 | 423605030008907 | 319
1 | 20170727L00319 | 423606040002461 | 319
3 | 20170727L00319 | 42360604002461  | 319

Update:

After performing drop_duplicates on columns A and B, I got a distinct result. But performing drop_duplicates on any other combination of columns seems to fail.

Answer 1

The issue was simple. I had loaded data from three different tables into three different models and then into three different pandas DataFrames, and then performed a group-by, a concat and a drop_duplicates to get the distinct result.

Resolution: column C had datatype varchar in the first two tables, whereas in the third table it was bigint. Because the string '319' and the integer 319 do not compare as equal, drop_duplicates failed to give the appropriate result.

Changing the datatype gave the exact result. Another way to convert the datatype dynamically is df1[["C"]] = df1[["C"]].apply(pd.to_numeric)
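To illustrate the cause and the fix, here is a minimal sketch. The three frames are hand-built stand-ins for the grouped results in the question (the real ones came from database tables), with column C held as strings in the first two and as integers in the third. The helper name union_distinct is hypothetical, and treating B as a string column is an assumption.

```python
import pandas as pd

# Hypothetical stand-ins for the grouped frames, using the values from the question.
a = "20170727L00319"
df1 = pd.DataFrame({"A": [a, a], "B": ["423605030008907", "42360604002461"],
                    "C": ["319", "319"], "Count": [1, 1]})   # varchar -> str
df2 = pd.DataFrame({"A": [a, a], "B": ["423605030008907", "423606040002461"],
                    "C": ["319", "319"], "Count": [2, 2]})   # varchar -> str
df3 = pd.DataFrame({"A": [a, a], "B": ["423605030008907", "423606040002461"],
                    "C": [319, 319], "Count": [1, 2]})       # bigint -> int


def union_distinct(frames, keys):
    """Concatenate the frames, keep only the key columns, drop duplicate rows."""
    combined = pd.concat(frames, ignore_index=True)[keys]
    return combined.drop_duplicates(keys, keep="first").reset_index(drop=True)


keys = ["A", "B", "C"]

# Mixed dtypes: "319" and 319 are different values, so duplicates survive.
before = union_distinct([df1, df2, df3], keys)

# Normalize column C to a numeric dtype in every frame, then retry.
for frame in (df1, df2, df3):
    frame["C"] = pd.to_numeric(frame["C"])

after = union_distinct([df1, df2, df3], keys)

print(len(before), len(after))
```

Checking df1.dtypes, df2.dtypes and df3.dtypes before concatenating is a quick way to spot this kind of mismatch.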
