Which JAR file do I need to be able to import "org.apache.parquet" in Scala?

When I try this:

scala> import org.apache.parquet

It fails with:

<console>:23: error: object parquet is not a member of package org.apache
       import org.apache.parquet

Question: which jar do I need to include in the Spark conf for this import to work?

Note that this works fine:

scala> import org.apache.hadoop
import org.apache.hadoop

CDH jars I have access to:

$ ls /opt/cloudera/parcels/CDH-5.14.4-1.cdh5.14.4.p3503.3712/jars -al|grep parq
-rw-r--r--.  1 root root     12717 Jun 24  2019 kite-morphlines-hadoop-parquet-avro-1.0.0-cdh5.14.4.jar
-rw-r--r--.  1 root root    106448 Jun 24  2019 parquet-avro-1.5.0-cdh5.14.4.jar
-rw-r--r--.  1 root root     25490 Jun 24  2019 parquet-cascading-1.5.0-cdh5.14.4.jar
-rw-r--r--.  1 root root    956035 Jun 24  2019 parquet-column-1.5.0-cdh5.14.4.jar
-rw-r--r--.  1 root root     41084 Jun 24  2019 parquet-common-1.5.0-cdh5.14.4.jar
-rw-r--r--.  1 root root    278926 Jun 24  2019 parquet-encoding-1.5.0-cdh5.14.4.jar
-rw-r--r--.  1 root root    384620 Jun 24  2019 parquet-format-2.1.0-cdh5.14.4.jar
-rw-r--r--.  1 root root    132777 Jun 24  2019 parquet-format-2.1.0-cdh5.14.4-javadoc.jar
-rw-r--r--.  1 root root      6474 Jun 24  2019 parquet-format-2.1.0-cdh5.14.4-sources.jar
-rw-r--r--.  1 root root     23679 Jun 24  2019 parquet-generator-1.5.0-cdh5.14.4.jar
-rw-r--r--.  1 root root    212644 Jun 24  2019 parquet-hadoop-1.5.0-cdh5.14.4.jar
-rw-r--r--.  1 root root   2776911 Jun 24  2019 parquet-hadoop-bundle-1.5.0-cdh5.14.4.jar
-rw-r--r--.  1 root root    927867 Jun 24  2019 parquet-jackson-1.5.0-cdh5.14.4.jar
-rw-r--r--.  1 root root     84853 Jun 24  2019 parquet-pig-1.5.0-cdh5.14.4.jar
-rw-r--r--.  1 root root   2855960 Jun 24  2019 parquet-pig-bundle-1.5.0-cdh5.14.4.jar
-rw-r--r--.  1 root root     49233 Jun 24  2019 parquet-protobuf-1.5.0-cdh5.14.4.jar
-rw-r--r--.  1 root root     33088 Jun 24  2019 parquet-scala_2.10-1.5.0-cdh5.14.4.jar
-rw-r--r--.  1 root root     22932 Jun 24  2019 parquet-scrooge_2.10-1.5.0-cdh5.14.4.jar
-rw-r--r--.  1 root root      6287 Jun 24  2019 parquet-test-hadoop2-1.5.0-cdh5.14.4.jar
-rw-r--r--.  1 root root    207188 Jun 24  2019 parquet-thrift-1.5.0-cdh5.14.4.jar
-rw-r--r--.  1 root root     67029 Jun 24  2019 parquet-tools-1.5.0-cdh5.14.4.jar
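
One way to check which of these jars could satisfy a given import is to scan the directory for the package path. The following is only a sketch: jarContains and jarsWithPackage are hypothetical helper names, not part of Spark or Parquet, and the JavaConverters import assumes an older Scala such as the 2.10 build these jars target (on 2.13+ scala.jdk.CollectionConverters plays the same role).

```scala
import java.io.File
import java.util.jar.JarFile
import scala.collection.JavaConverters._

// Hypothetical helper: true if the jar has at least one entry under the
// given path prefix, e.g. "org/apache/parquet/" or "parquet/".
def jarContains(jar: File, prefix: String): Boolean = {
  val jf = new JarFile(jar)
  try jf.entries().asScala.exists(_.getName.startsWith(prefix))
  finally jf.close()
}

// Hypothetical helper: sorted names of all jars in `dir` that contain the prefix.
// A missing or unreadable directory yields an empty result.
def jarsWithPackage(dir: File, prefix: String): Seq[String] =
  Option(dir.listFiles()).getOrElse(Array.empty[File]).toSeq
    .filter(f => f.isFile && f.getName.endsWith(".jar"))
    .filter(jarContains(_, prefix))
    .map(_.getName)
    .sorted

// Directory taken from the listing above.
val jarDir = new File("/opt/cloudera/parcels/CDH-5.14.4-1.cdh5.14.4.p3503.3712/jars")
println("org/apache/parquet/ found in: " + jarsWithPackage(jarDir, "org/apache/parquet/"))
println("parquet/ found in: " + jarsWithPackage(jarDir, "parquet/"))
```

If the first list comes back empty and the second does not, none of the installed jars ship the org.apache.parquet package and the classes have to be imported under a different name.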

It looks like parquet is a base package, so the classes import from parquet.* rather than org.apache.parquet.*:

scala> import parquet.hadoop
import parquet.hadoop

scala> import parquet.hadoop.metadata
import parquet.hadoop.metadata

scala> import parquet.hadoop.metadata.ParquetMetadata
import parquet.hadoop.metadata.ParquetMetadata

scala> val nof = parquet.format.converter.ParquetMetadataConverter.NO_FILTER
nof: parquet.format.converter.ParquetMetadataConverter.MetadataFilter = NO_FILTER

scala> nof
res1: parquet.format.converter.ParquetMetadataConverter.MetadataFilter = NO_FILTER

scala> print(nof)
NO_FILTER

scala> import parquet.hadoop.ParquetFileReader
import parquet.hadoop.ParquetFileReader
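
The same check can be done at runtime without triggering a compile error, by loading the class by name and asking where it came from. This is a minimal sketch, assuming it is pasted into the same shell so the Parquet jars are on the classpath; locate is a hypothetical helper name, and the two class names are simply the two package spellings discussed above.

```scala
import scala.util.Try

// Hypothetical helper: the location (usually a jar path) a class was loaded
// from, or None if the class cannot be found or has no code source
// (for example, classes loaded by the bootstrap class loader).
def locate(className: String): Option[String] =
  for {
    cls <- Try(Class.forName(className)).toOption
    src <- Option(cls.getProtectionDomain.getCodeSource)
    loc <- Option(src.getLocation)
  } yield loc.toString

// Probe both package spellings of the same class.
Seq(
  "org.apache.parquet.hadoop.ParquetFileReader",
  "parquet.hadoop.ParquetFileReader"
).foreach { name =>
  println(name + " -> " + locate(name).getOrElse("not on classpath"))
}
```

Whichever name resolves shows both the package to import and the jar that provides it.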

In PySpark:

>>> pfr = sc._gateway.jvm.parquet.hadoop.ParquetFileReader
>>> pfr
<py4j.java_gateway.JavaClass object at 0x7f6fca6d1c90>


>>> nof =  sc._gateway.jvm.parquet.format.converter.ParquetMetadataConverter.NO_FILTER
>>> nof
JavaObject id=o64
>>> str(nof)
'NO_FILTER'
