
Drill to Hive connectivity error (org.apache.thrift.transport.TTransportException: java.net.SocketException: Broken pipe (Write failed))

I am getting the following error while trying to connect Drill to Hive:

ERROR hive.log - Got exception: org.apache.thrift.transport.TTransportException java.net.SocketException: Broken pipe (Write failed)

Hive runs on Microsoft Azure HDInsight (Ranger enabled) with a remote metastore (MS SQL Server), and Drill runs on another VM in the same VNet as the cluster. I was able to create the Drill storage plugin with the configuration below:

{
  "type": "hive",
  "enabled": true,
  "configProps": {
    "hive.metastore.uris": "thrift://hn0-xyz.cloudapp.net:9083,thrift://hn1-xyz.cloudapp.net:9083",
    "hive.metastore.warehouse.dir": "/hive/warehouse",
    "fs.default.name": "wasb://qwerty@demo.blob.core.windows.net",
    "hive.metastore.sasl.enabled": "false"
  }
}
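Before digging into Drill itself, it can help to confirm that the metastore port is even reachable from the Drill VM. Below is a minimal TCP reachability sketch in Python; the hostname and port are taken from the plugin config above, and `can_reach` is just an illustrative helper, not part of Drill or Hive:

```python
import socket

def can_reach(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

if __name__ == "__main__":
    # Hostname/port from the storage plugin config above.
    print(can_reach("hn0-xyz.cloudapp.net", 9083))
```

If this prints False from the Drill VM, the problem is network-level (firewall/NSG), not a Drill or Hive misconfiguration.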

Stack Trace of error:

17:57:19.515 [2779bbff-d7a9-058c-d133-b41795a0ee58:foreman] ERROR hive.log - Got exception: org.apache.thrift.transport.TTransportException java.net.SocketException: Broken pipe (Write failed)
org.apache.thrift.transport.TTransportException: java.net.SocketException: Broken pipe (Write failed)
        at org.apache.thrift.transport.TIOStreamTransport.flush(TIOStreamTransport.java:161) ~[drill-hive-exec-shaded-1.9.0.jar:1.9.0]
        at org.apache.thrift.TServiceClient.sendBase(TServiceClient.java:65) ~[drill-hive-exec-shaded-1.9.0.jar:1.9.0]
        at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.send_get_all_databases(ThriftHiveMetastore.java:733) ~[hive-metastore-1.2.1.jar:1.2.1]
        at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_all_databases(ThriftHiveMetastore.java:726) ~[hive-metastore-1.2.1.jar:1.2.1]
        at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getAllDatabases(HiveMetaStoreClient.java:1031) ~[hive-metastore-1.2.1.jar:1.2.1]
        at org.apache.drill.exec.store.hive.DrillHiveMetaStoreClient.getDatabasesHelper(DrillHiveMetaStoreClient.java:205) [drill-storage-hive-core-1.9.0.jar:1.9.0]

core-site.xml:

<configuration>
  <property>
    <name>fs.azure.account.keyprovider.kkhdistore.blob.core.windows.net</name>
    <value>org.apache.hadoop.fs.azure.ShellDecryptionKeyProvider</value>
  </property>
  <property>
    <name>fs.azure.shellkeyprovider.script</name>
    <value>/usr/lib/python2.7/dist-packages/hdinsight_common/decrypt.sh</value>
  </property>
  <property>
    <name>fs.azure.account.key.kkhdistore.blob.core.windows.net</name>
    <value>{COPY FROM CLUSTER core-site.xml}</value>
  </property>
  <property>
    <name>fs.AbstractFileSystem.wasb.impl</name>
    <value>org.apache.hadoop.fs.azure.Wasb</value>
  </property>
</configuration>

According to the Non-public ports section of the official document Ports and URIs used by HDInsight, and the note quoted below, I suspect the Hive you are using was installed manually on the Azure HDInsight cluster rather than provisioned as a Hive cluster type.

Some services are only available on specific cluster types. For example, HBase is only available on HBase cluster types.

So the Thrift port 9083 is a non-public port, unreachable by other services such as Drill even within the same VNet. The solution is to follow the document Extend HDInsight capabilities by using Azure Virtual Network and create an inbound rule allowing that port in the cluster's NSG. Hope it helps.
