
Spark temporary table is not shown in beeline

I have a Spark cluster on AWS EMR and am trying to start the Thrift server with the following code:

...
JavaSparkContext jsc = new JavaSparkContext(SparkContext.getOrCreate());
HiveContext hiveContext = new HiveContext(jsc);
JavaRDD<Person> people = jsc.textFile("people.txt").map(
  new Function<String, Person>() {
    public Person call(String line) throws Exception {
      ...
    }
});
DataFrame schemaPeople = hiveContext.createDataFrame(people, Person.class);
schemaPeople.registerTempTable("people_temp"); // temporary, session-scoped table
schemaPeople.saveAsTable("people");            // persisted to the Hive metastore

HiveThriftServer2.startWithContext(hiveContext);
...

I run this code with the command: sudo ./sbin/start-thriftserver.sh --jars /home/ec2-user/some.jar --class spark.jobs.thrift.ThriftServerInit

After the Thrift server has started, I connect to it with beeline (!connect jdbc:hive2://localhost:10001), run show tables; and get this result:

+--------------+--------------+--+
|  tableName   | isTemporary  |
+--------------+--------------+--+
| people       | false        |
+--------------+--------------+--+

I expect to see the temporary table people_temp too. Why is people_temp absent?

On the latest Spark 1.6.x releases I found that you need to explicitly put the Thrift server into single-session mode for it to see temporary tables: set spark.sql.hive.thriftServer.singleSession=true. Take a look at the migration guide: http://spark.apache.org/docs/latest/sql-programming-guide.html#upgrading-from-spark-sql-15-to-16 Hope this helps
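For example, the flag can be passed with --conf when launching the server. This is a sketch based on the launch command from the question; the jar path and class name are the asker's and should be adjusted to your own setup:

```shell
# Enable single-session mode so beeline sessions share the context
# that registered the temporary tables
sudo ./sbin/start-thriftserver.sh \
  --conf spark.sql.hive.thriftServer.singleSession=true \
  --jars /home/ec2-user/some.jar \
  --class spark.jobs.thrift.ThriftServerInit
```

Alternatively, the same property can be set once in conf/spark-defaults.conf (as spark.sql.hive.thriftServer.singleSession true) so every launch picks it up.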

Rod
