pd.read_sql_query("""SELECT Tab1.Title, NewTab.NewCol1 FROM
(SELECT Col1 AS NewCol, COUNT(*) AS NewCol1
FROM Tab2 GROUP BY Col1) AS NewTab
JOIN Tab1 ON NewTab.NewCol=Tab1.Id
WHERE Tab1.Num=1
ORDER BY NewCol1 DESC""", conn)
My goal is to rewrite it using only pandas methods and functions. As a first step, I'd like to create the new column NewCol together with the new column PostId, but I highly doubt that I should do it in two steps. Could anyone please guide me towards a solution or provide full code I could analyze?
Do you want to rewrite this query in pandas as a single line? It can be done, but the result is hard to read. Splitting it into steps looks much neater. Start with the aggregated subquery:
NewTab = Tab2.groupby('Col1').size().reset_index(name = 'NewCol1').rename(columns = {'Col1': 'NewCol'})
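As a quick illustration of this aggregation step, here is a minimal sketch on made-up data. Only the column names Col1, NewCol and NewCol1 come from the question; the values in Tab2 are assumptions chosen for the example.

```python
import pandas as pd

# Hypothetical sample data; only the column name Col1 comes from the question.
Tab2 = pd.DataFrame({"Col1": [10, 10, 20, 10, 30, 20]})

# Equivalent of: SELECT Col1 AS NewCol, COUNT(*) AS NewCol1 FROM Tab2 GROUP BY Col1
# size() counts the rows in each group, like COUNT(*).
NewTab = (
    Tab2.groupby("Col1")
    .size()
    .reset_index(name="NewCol1")        # turn the Series into a DataFrame with a named count column
    .rename(columns={"Col1": "NewCol"})  # mirror the SQL alias
)
print(NewTab)
```

Note that groupby drops rows whose key is NaN by default, whereas SQL's GROUP BY keeps a NULL group. For this query the difference should not matter, because a NULL key would not match any Id in the join anyway.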
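Now you can merge the two tables and keep only the rows where Num equals 1. The filter has to be a separate step, because result_df does not exist yet while the merge expression is being evaluated:
result_df = pd.merge(NewTab, Tab1, left_on = 'NewCol', right_on = 'Id')
result_df = result_df[result_df.Num == 1]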
Finally, sort the data frame in descending order (to match ORDER BY NewCol1 DESC) and select the columns:
result_df = result_df.sort_values(by='NewCol1', ascending=False)
result_df = result_df[['Title','NewCol1']]
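To check that the pandas steps reproduce the SQL query, one option is to run both against the same small data set. The sketch below assumes an in-memory SQLite database and made-up rows for Tab1 and Tab2; only the table and column names are taken from the question, and the names sql_df, merged and pandas_df are introduced here purely for illustration.

```python
import sqlite3
import pandas as pd

# Hypothetical sample data; only the table and column names come from the question.
Tab1 = pd.DataFrame({
    "Id": [10, 20, 30],
    "Title": ["A", "B", "C"],
    "Num": [1, 1, 2],
})
Tab2 = pd.DataFrame({"Col1": [10, 10, 20, 10, 30, 20]})

# Load the same data into an in-memory SQLite database (an assumption for this demo).
conn = sqlite3.connect(":memory:")
Tab1.to_sql("Tab1", conn, index=False)
Tab2.to_sql("Tab2", conn, index=False)

# The original query from the question.
sql_df = pd.read_sql_query("""SELECT Tab1.Title, NewTab.NewCol1 FROM
(SELECT Col1 AS NewCol, COUNT(*) AS NewCol1
FROM Tab2 GROUP BY Col1) AS NewTab
JOIN Tab1 ON NewTab.NewCol=Tab1.Id
WHERE Tab1.Num=1
ORDER BY NewCol1 DESC""", conn)

# The pandas version: aggregate, join, filter, sort, select.
NewTab = (
    Tab2.groupby("Col1")
    .size()
    .reset_index(name="NewCol1")
    .rename(columns={"Col1": "NewCol"})
)
merged = pd.merge(NewTab, Tab1, left_on="NewCol", right_on="Id")  # inner join by default
pandas_df = (
    merged[merged["Num"] == 1]
    .sort_values(by="NewCol1", ascending=False)
    .loc[:, ["Title", "NewCol1"]]
    .reset_index(drop=True)
)

print(sql_df)
print(pandas_df)
conn.close()
```

The counts in the sample data are deliberately distinct, since the order of tied rows is not guaranteed to be the same in SQL and in pandas.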