Comparing values of rows…row by row?

Question

I have a small query that joins some temp tables together, like so:

select u.batch_uid, u.user_id, u.firstname, u.middlename,
       u.lastname, u.email, u.student_id, u.row_status,
       uff.batch_uid, uff.user_id, uff.firstname, uff.middlename,
       uff.lastname, uff.email, uff.student_id, uff.row_status
from users u
full outer join users_feed_file uff on u.user_id = uff.user_id
where u.data_src_pk1 = 83

The results would, for example, look something like this:

u.batch_uid  u.user_name  u.row_status  uff.batch_uid  uff.user_name  uff.row_status
johndoe      johndoe      2             johndoe        johndoe        0

(u is users, uff is users_feed_file.)

The first three columns come from a source table that is replicated from a live table. The last three columns come from a feed file that is processed and inserted into a temp table, which is dropped once the run completes (and reloaded later with new data).

What I'm trying to do is examine each row and decide which operation to perform on it. I'll be checking nearly 25,000 rows. In this case, I'd like to check something like:

if u.batch_uid, u.user_name, u.row_status are not null
and uff.batch_uid, uff.user_name, uff.row_status are not null
and u.row_status is equal to 2 and uff.row_status is equal to 0
then add the user to the feed file to enable him

However, these (and other kinds of conditions and checks) need to be run against all 25k returned rows, which are then processed in C# row by row to determine whether my code needs to insert a line into a file.

Thank you.

Answer 1

You have a couple of different issues to address in your question.

First, your initial SELECT uses a FULL OUTER JOIN, but you're explicitly looking for records where the three fields from users and the three fields from users_feed_file are all present. That only happens when a user exists in both tables, so you will process a lot fewer records, and should see much better performance, with an INNER JOIN, like this:

SELECT u.batch_uid, u.user_id, u.firstname, 
u.middlename, u.lastname, u.email, u.student_id, 
u.row_status, uff.batch_uid, uff.user_id, uff.firstname, 
uff.middlename, uff.lastname, uff.email, uff.student_id, 
uff.row_status
FROM users u 
INNER JOIN users_feed_file uff 
ON u.user_id = uff.user_id
WHERE u.data_src_pk1 = 83
AND u.row_status = 2
AND uff.row_status = 0;

That will give you just the rows that match your complete condition--it should be a relatively small set of rows.

But if you're just retrieving records from users to compare with users_feed_file, why get the user's name, email, etc.? No need. Just get the data you want:

SELECT U.user_id
FROM users U
INNER JOIN users_feed_file UFF
ON U.user_id = UFF.user_id
WHERE U.row_status = 2
AND UFF.row_status = 0
AND U.data_src_pk1 = @PacketNumber;   -- That's a parameter

The next question is: what are you going to do with those rows? If you're going to change data in another table (or, perhaps, in users_feed_file itself), you can do it directly with an INSERT or UPDATE statement. To update users_feed_file, do this:

UPDATE UFF
SET enabled = 1
FROM users U
INNER JOIN users_feed_file UFF
ON U.user_id = UFF.user_id
WHERE U.row_status = 2
AND UFF.row_status = 0
AND U.data_src_pk1 = @PacketNumber;

(You can change the second line, the SET clause, to update whichever column or columns you choose.)
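
The INSERT variant follows the same pattern. As a rough sketch, assuming you have somewhere to put the matching users (users_to_enable is a made-up staging table here, not something from your schema, and 83 is just the value from your original WHERE clause), it might look like this:

```sql
-- Sketch only: users_to_enable is a hypothetical staging table
-- with a single user_id column. It is not part of the question's schema.
DECLARE @PacketNumber int = 83;  -- value taken from the question's WHERE clause

-- Insert every matching user in one set-based statement,
-- instead of looping over the rows in application code.
INSERT INTO users_to_enable (user_id)
SELECT U.user_id
FROM users U
INNER JOIN users_feed_file UFF
    ON U.user_id = UFF.user_id
WHERE U.row_status = 2
  AND UFF.row_status = 0
  AND U.data_src_pk1 = @PacketNumber;
```

Your C# code can then read users_to_enable and write one feed-file line per row, with no per-row decision logic left on the client.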

As a general rule, SQL databases work best on sets of data. If you find yourself iterating over a data set a row at a time, and ESPECIALLY if you are going to surface the data to a different process (potentially across the network on a different machine) to handle each row in .NET code, stop and think about how to do it within SQL Server, using sets. The performance difference is usually dramatic.
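
If you really do have several kinds of conditions to check, one set-based option is to let the query label each row with a CASE expression, so the C# side only has to read the label. This is a sketch under assumptions: the action names are invented for illustration, user_id is assumed to be non-null in both tables, and the data_src_pk1 filter is moved into a derived table so that feed-only rows survive the FULL OUTER JOIN (a plain WHERE on a users column would discard them).

```sql
-- Sketch only: the feed_action labels are hypothetical; use whatever
-- categories your processing actually needs.
DECLARE @PacketNumber int = 83;  -- value taken from the question's WHERE clause

SELECT COALESCE(U.user_id, UFF.user_id) AS user_id,
       CASE
           WHEN U.user_id IS NULL   THEN 'only_in_feed'   -- no matching users row
           WHEN UFF.user_id IS NULL THEN 'only_in_users'  -- no matching feed row
           WHEN U.row_status = 2 AND UFF.row_status = 0 THEN 'enable'
           ELSE 'no_action'
       END AS feed_action
FROM (SELECT user_id, row_status
      FROM users
      WHERE data_src_pk1 = @PacketNumber) AS U   -- filter before the join
FULL OUTER JOIN users_feed_file AS UFF
    ON U.user_id = UFF.user_id;
```

All 25k rows are classified in a single pass on the server, and adding another check is a matter of adding another WHEN branch.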
