So I have two tables (let's say x and y). Most of the data from both table are duplicated but there are some rows that are different. I insert whole data from those two tables into a new table (let's say table_mixed). There's one column that indicate table's date eg 20190307 for x and 20190308 for y So, for any duplicated rows, there will be a date column that is different.
num Code col1 col2 col3..... import_date file_date
-------- --------- ----------------- ---------- ----------
01 AA ...... 20190308 20190307
01 AA ...... 20190308 20190308
02 AA ...... 20190308 20190307
03 BB ...... 20190308 20190308
What I am trying to do is, I want to query a data such that, show a non-duplicated row from both table and for any duplicated row, shows only a row with recent date.
I have done some finding and I have tried this:
select *,max(file_date) over (partition by stx_import_date) max_date
from table_mixed;
where file_date is a date that tell the different date from each table and every row from both table has the same import_date.
num Code col1 col2 col3...... import_date file_date max_date
-------- --------- ------------ ---------- ----------
01 AA ...... 20190308 20190307 20190308
01 AA ...... 20190308 20190308 20190308
02 AA ...... 20190308 20190307 20190307
03 BB ...... 20190308 20190308 20190308
The result from this query show every row (including all duplicated row) and add another column (max_date) that show only the recent file_date for each of the row. But I want the result to show only what I mentioned above and no additional column (max_date).
This is the result that I am looking for:
num Code col1 col2 col3... import_date file_date
-------- --------- ------------ ---------- --------
01 AA ...... 20190308 20190308
02 AA ...... 20190308 20190307
03 BB ...... 20190308 20190308
Thank you
PS Not only column num, code and import date that need to be duplicated but also other columns that I ..... So, what I mean duplicated row >> every column except file_date (which I have 10+ columns)
PS2 I edited the example so that you guy want get me wrong. There're other columns (like col1, col2, col3 and so on) that also used. How should I use partitioned by in this case
Use row_number window function.
num,code,import_date..etc
columns. file_date
desc Sample query:
Select * from (
select *,row_number() over (partition by num,code,stx_import_date order by file_date desc) row_number
from table_mixed)t
where t.row_number = 1;
You seem to care about the num
column and only want the most recent data. You can do this with your mixed table as:
select tm.*
from (select tm.*,
row_number() over (partition by num, code, . . . order by file_date desc) as seqnum
from table_mixed
) tm
where seqnum = 1;
Note: If the file dates are the same, then an arbitrary row will be chosen.
This may be more efficient to do when you create the mixed table. You can just do:
select y.*
from y
union all
select x.*
from x left join
y
on x.num = y.num and
x.code = y.code and
. . .
where y.num is null;
This returns all rows from y
(the more recent) along with any non-matching rows from x
(and assumes none of the column values are NULL
).
Maybe grouping by the MAX() of whatever date field will be the differentiator?
;WITH get_max_dt AS (
SELECT TM.[num]
, TM.[Code]
, TM.[import_date]
, MAX(TM.[file_date]) AS [file_date]
FROM table_mixed AS TM
GROUP BY TM.[num],TM.[Code],TM.[import_date]
)
SELECT *
FROM get_max_dt
Output:
num Code import_date file_date
01 AA 20190308 20190308
02 AA 20190308 20190307
03 BB 20190308 20190308
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.