简体   繁体   中英

Getting the count of results returned by a MySQL Query with JDBC in the most performance efficient way

The accepted way of getting the number of results from a JDBC result seems to be, to do resultSet.last() , and then resultSet.getRow() , according to this answer . But, in that answer, the author also says:

it may not be a good idea as it can mean reading the entire table over the network and throwing away the data. Do a SELECT COUNT(*) FROM ... query instead.

I'm looking for a definite answer on this. Performance wise, will it be better to do a separate COUNT(*) query to get the number of results, or will it be better to do resultSet.last() and resultSet.getRow() , followed by resultSet.first() again?

If the ResultSet has already fetched the results and is holding them in memory, then it'd undoubtedly be better just to do last() and getRow() as (I assume) it would just loop over the results in the memory. But the OP's answer above seems to imply that it lazy loads the results from the db as they're requested.

Making a separate query is not a good solution (on my opinion) because server will search rows twice for one result set.

To prevent "double" query MySQL has function FOUND_ROWS(), it returns a number of rows founded for current conditions from WHERE clause. This is very useful when you use LIMIT and OFFSET in the query. I believe that using this function is a better solution. http://dev.mysql.com/doc/refman/5.0/en/information-functions.html#function_found-rows

I had read some time ago that JDBC streams rows from server, but this was not official documentation and I can't tell how it works in the present, but queries can be different - with difficult sub-queries and joins. I think that server can finish all resources if will "stream" these rows only when client asked for them.

According to me it depends on your use case. If you resultset is small enough then the performance difference would be negligible. I highly doubt that the resultset fetches the whole result in the main memory for larger resultsets. To second this here is the link that talks about this particular scenario. In such case calculating size using last() , getrows() followed by first() would be obviously inefficient as it has to first load the portion of resultset in memory and perform these operations(and don't forget the network transfer time) to calculate the net size. On the other hand count(*) would just go in and count the rows in your result set.

This is my understanding of which one out performs other. I am open to any inputs from others

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM