
hadoop orc table taking only one mapper all the time

Original 2016-01-26 15:28:28 5 1 sql/ hadoop/ hive/ orc/ bigdata

In my current project I am working with ORC files with Snappy compression. Whatever query I run, it runs with only one mapper. I tried configuring mapred.max.split.size and mapred.min.split.size, but that made no difference to the number of mappers. The reducer count is fine, but because there is only a single mapper, even a simple query like

select x, max(y) from z group by x;

takes almost 20 minutes to complete the map phase. Is there anything else I can do to increase the number of mappers?

Please don't suggest using partitions or buckets, as I have already used them in my table.

1 answer

Try adjusting the orc.stripe.size table property (set through tblproperties).

The default stripe size is 256 MB, and technically there is one mapper per stripe. By decreasing the size of a single stripe you can increase the number of mappers.
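As a rough sketch of the suggestion above, the stripe size can be set when the table is created and the data then rewritten into it. The table name z_small_stripes, the column types, and the 64 MB value are assumptions chosen for illustration; only z, x, and y come from the question, and the partitioning and bucketing mentioned there are left out for brevity. The default stripe size differs between Hive releases, so it is worth checking the value for your version.

```sql
-- Hypothetical copy of table z with a smaller ORC stripe size.
-- Column types are assumed; adjust them to match the real table.
CREATE TABLE z_small_stripes (
  x STRING,
  y BIGINT
)
STORED AS ORC
TBLPROPERTIES (
  "orc.compress" = "SNAPPY",
  "orc.stripe.size" = "67108864"  -- stripe size in bytes (64 MB here)
);

-- The stripe size only affects files written after it is set,
-- so the existing data has to be rewritten into the new table.
INSERT OVERWRITE TABLE z_small_stripes
SELECT x, y FROM z;

-- Re-run the original query against the rewritten table.
SELECT x, MAX(y) FROM z_small_stripes GROUP BY x;
```

Changing the property on an existing table does not restripe files that are already on disk, which is why the sketch rewrites the data.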


The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.


solution1 0 2016-04-05 12:11:52
