I have a SQL query:
pd.read_sql_query("""SELECT UvTab.A, UvTab.Uv,
IFNULL(DvTab.Dv, 0) AS Dv
FROM
(
SELECT A, COUNT(*) AS Uv FROM B
WHERE Vtype = 2 GROUP BY A
) AS UvTab
LEFT JOIN
(
SELECT A, COUNT(*) AS Dv FROM B
WHERE Vtype = 3 GROUP BY A
) AS DvTab
ON UvTab.A = DvTab.A
""", conn)
My goal is to get the same result using only pandas methods. What I have so far is:
UvTab = B.loc[B.Vtype == 2].groupby("A").size()
UvTab = pd.DataFrame({'A' : UvTab.index, 'Uv' : UvTab.values})
DvTab = B.loc[B.Vtype == 3].groupby("A").size()
DvTab = pd.DataFrame({'A' : DvTab.index, 'Dv' : DvTab.values})
df = pd.merge(UvTab, DvTab, how='left', on='A')
df['Dv'] = df['Dv'].fillna(0)
It seems to work, but is this the simplest and best way to express the query?
One idea is to aggregate with sum to count the matching rows, and then use DataFrame.join:
UvTab = (B.Vtype == 2).astype(int).groupby(B["A"]).sum().reset_index(name='Uv')
DvTab = (B.Vtype == 3).astype(int).groupby(B["A"]).sum().to_frame('Dv')
df = UvTab.join(DvTab, on='A').fillna({'Dv':0})
Or, alternatively, with merge:
UvTab = (B.Vtype == 2).astype(int).groupby(B["A"]).sum().reset_index(name='Uv')
DvTab = (B.Vtype == 3).astype(int).groupby(B["A"]).sum().reset_index(name='Dv')
df = UvTab.merge(DvTab, on='A', how='left').fillna({'Dv':0})
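One thing to watch for: summing a boolean mask grouped by A keeps every value of A, including those with no Vtype == 2 rows (they get Uv = 0), whereas the SQL only returns the A values present in UvTab. If you need the SQL behaviour exactly, another option might be a single groupby followed by unstack. This is only a sketch: the contents of B below are made-up sample data, and the intermediate name counts is my own.

```python
import pandas as pd

# Hypothetical sample data standing in for table B
B = pd.DataFrame({
    "A":     [1, 1, 1, 2, 2, 3, 4],
    "Vtype": [2, 2, 3, 2, 1, 3, 2],
})

# Count rows per (A, Vtype) for the two types of interest, then pivot
# Vtype into columns; combinations that never occur are filled with 0.
counts = (
    B[B["Vtype"].isin([2, 3])]
    .groupby(["A", "Vtype"])
    .size()
    .unstack(fill_value=0)
    .reindex(columns=[2, 3], fill_value=0)  # guard against a type being absent entirely
    .rename(columns={2: "Uv", 3: "Dv"})
)

# Keep only the A values with at least one Vtype == 2 row,
# which mirrors the LEFT JOIN starting from UvTab.
df = counts[counts["Uv"] > 0].rename_axis(columns=None).reset_index()
print(df)
```

Here A = 3 has only a Vtype == 3 row, so it is dropped, just as it would be by the LEFT JOIN. Because the missing counts are filled during unstack rather than after a merge, Dv should also stay an integer column instead of being upcast to float by NaN.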