
Translating Table.Distinct from Power Query Editor to Python

I need to port a data transformation program from Power Query Editor to Python. I am not very experienced with Power Query.

I have a table with 30ish columns, with a task like this:

= Table.Distinct(#"Previous task", {"column1"})

When I do this, which row does it keep? The first one? The last one? A random one? How can I translate this to Python Pandas so that I am sure to get the same data?

Thanks for your answer.

By default, I believe Table.Distinct keeps the first row.

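In Pandas, you could use something like:

df.drop_duplicates(subset=['column1'], keep='first', inplace=True)

The subset parameter restricts the comparison to column1, matching the {"column1"} argument of Table.Distinct; without it, Pandas compares entire rows. The keep parameter specifies which of the duplicate rows to keep, and the inplace parameter makes the change on the dataframe itself instead of returning a new one.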
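As a minimal sketch of the behaviour described above, the following builds a small dataframe and removes duplicates on column1 only. The sample data and the "value" column are made up for illustration; only "column1" comes from the question. It uses the non-inplace form, which returns a new dataframe and leaves the original untouched.

```python
import pandas as pd

# Hypothetical sample data; only "column1" comes from the question.
df = pd.DataFrame({
    "column1": ["a", "b", "a", "c", "b"],
    "value": [1, 2, 3, 4, 5],
})

# Keep the first row seen for each distinct value of column1,
# comparing on column1 only (not on all columns).
deduped = df.drop_duplicates(subset=["column1"], keep="first")

print(deduped)
```

Here the rows with value 1, 2 and 4 survive, because they are the first occurrences of "a", "b" and "c" in the current row order.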

See the pandas.DataFrame.drop_duplicates documentation for more details.

Also, here's some more information on Table.Distinct and how you can preserve the sort order of a table before performing the operation.
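On the Pandas side, one way to avoid depending on the incoming row order is to sort explicitly before removing duplicates. The sketch below is an illustration under assumptions, not a translation of any specific Power Query step: the "updated" column is hypothetical and stands in for whatever column decides which duplicate should win.

```python
import pandas as pd

# Hypothetical sample data; "updated" is an assumed ordering column.
df = pd.DataFrame({
    "column1": ["a", "b", "a", "c", "b"],
    "updated": [3, 1, 7, 2, 5],
})

latest = (
    # A stable sort keeps the original relative order of ties.
    df.sort_values("updated", kind="stable")
      # After sorting, the last row per column1 has the highest "updated".
      .drop_duplicates(subset=["column1"], keep="last")
      # Restore the original row order of the surviving rows.
      .sort_index()
)

print(latest)
```

With an explicit sort, the row that is kept is determined by the data itself, which makes it easier to check that both tools produce the same result.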

The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint them, please cite the site URL or the original address. For any questions, please contact: yoyou2525@163.com.
