Write a file on HDFS using Apache Ignite

I want to insert data into HDFS with the help of an Ignite write-through cache. I am using the following example config file to run an Ignite node.

ignite.sh /app/apache-ignite-fabric-1.9.0-bin/examples/config/filesystem/example-igfs.xml

This is my core-site.xml file:

<configuration>
<configuration>
<property>
    <name>fs.defaultFS</name>
    <value>hdfs://hmaster:9000/</value>
</property>
<property>
   <name>fs.file.impl</name>
   <!-- value>org.apache.hadoop.fs.LocalFileSystem</value  -->
<value>org.apache.ignite.hadoop.fs.v1.IgniteHadoopFileSystem</value>
   <description>The FileSystem for file: uris.</description>
</property>

<property>
   <name>fs.hdfs.impl</name>
   <value>org.apache.hadoop.hdfs.DistributedFileSystem</value>
   <description>The FileSystem for hdfs: uris.</description>
</property>

 <property>
      <name>fs.igfs.impl</name>
      <value>org.apache.ignite.hadoop.fs.v1.IgniteHadoopFileSystem</value>
  </property>
</configuration>
</configuration>

When I run hadoop fs -cat igfs:/// it shows the IGFS file system. If I run any Hadoop job with the command below, it inserts data into IGFS. But I need the data to be inserted into the HDFS file system. How do I insert data into HDFS?

hadoop --config /app/apache-ignite-fabric-1.9.0-bin/examples/config/filesystem  jar /app/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar wordcount igfs:///workDir/myFile1 /outputWC

You should configure IGFS with a secondary file system to get write-through caching in Ignite.

This documentation page covers the topic: https://apacheignite-fs.readme.io/docs/secondary-file-system

The config (default-config.xml) of the Ignite Hadoop edition contains the following code, which is commented out by default:

  <property name="secondaryFileSystem">
      <bean class="org.apache.ignite.hadoop.fs.IgniteHadoopIgfsSecondaryFileSystem">
          <property name="fileSystemFactory">
              <bean class="org.apache.ignite.hadoop.fs.CachingHadoopFileSystemFactory">
                  <property name="uri" value="hdfs://your_hdfs_host:9000/"/>
              </bean>
          </property>
      </bean>
  </property>

You need to uncomment it and provide the appropriate secondary file system URI. Note a known bug: the secondary file system URI must end with a trailing slash, as in hdfs://your_hdfs_host:9000/ . By default, DUAL_ASYNC mode is used. To use DUAL_SYNC mode, set the "defaultMode" property of the "fileSystemConfiguration" bean.
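As a rough sketch of how the two settings above might fit together, the fragment below sets "defaultMode" to DUAL_SYNC and enables the secondary file system on the same bean. The URI hdfs://hmaster:9000/ is an assumption taken from the core-site.xml in the question; the remaining properties of the bean are assumed to stay as they are in your existing config file.

```xml
<bean class="org.apache.ignite.configuration.FileSystemConfiguration">
    <!-- Keep the other properties already present on this bean unchanged. -->

    <!-- Propagate writes to the secondary file system synchronously
         instead of using the default DUAL_ASYNC mode. -->
    <property name="defaultMode" value="DUAL_SYNC"/>

    <!-- Secondary file system that IGFS writes through to. -->
    <property name="secondaryFileSystem">
        <bean class="org.apache.ignite.hadoop.fs.IgniteHadoopIgfsSecondaryFileSystem">
            <property name="fileSystemFactory">
                <bean class="org.apache.ignite.hadoop.fs.CachingHadoopFileSystemFactory">
                    <!-- Assumed NameNode address from the question; note the trailing slash. -->
                    <property name="uri" value="hdfs://hmaster:9000/"/>
                </bean>
            </property>
        </bean>
    </property>
</bean>
```

With a configuration along these lines, files written through igfs:/// paths should also appear under the corresponding paths in HDFS.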

General comments:

  1. Hadoop configuration files should not contain nested <configuration> tags.
  2. You likely don't need to redefine 'fs.file.impl' and 'fs.hdfs.impl'; please use $IGNITE_HOME/config/hadoop/core-site.ignite.xml as a template for your core-site.xml file.
  3. hadoop fs -cat ... does not work for directories; please use hadoop fs -ls ... instead.
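Applying the first two comments to the core-site.xml from the question might give something like the sketch below. It keeps a single <configuration> root, drops the 'fs.file.impl' and 'fs.hdfs.impl' overrides, and retains the 'fs.defaultFS' and 'fs.igfs.impl' values exactly as posted. Treat it as an illustration only, and check it against the core-site.ignite.xml template shipped with your Ignite version.

```xml
<?xml version="1.0"?>
<!-- Sketch of a core-site.xml with a single <configuration> root element. -->
<configuration>
    <!-- Default file system, as given in the question. -->
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://hmaster:9000/</value>
    </property>

    <!-- Map the igfs:// scheme to the Ignite Hadoop file system. -->
    <property>
        <name>fs.igfs.impl</name>
        <value>org.apache.ignite.hadoop.fs.v1.IgniteHadoopFileSystem</value>
    </property>
</configuration>
```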
