Checksum Exception when reading from or copying to hdfs in apache hadoop

I am trying to implement a parallelized algorithm using Apache Hadoop, but I am facing some issues when trying to transfer a file from the local file system to HDFS. A checksum exception is thrown when trying to read from or transfer a file.

The strange thing is that some files are copied successfully while others are not (I tried with two files, one slightly bigger than the other, though both are small). Another observation I have made is that the Java FileSystem.getFileChecksum method returns null in all cases.

A little background on what I am trying to achieve: I am trying to write a file to HDFS, to be able to use it as a distributed cache for the MapReduce job that I have written.
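For reference, here is a simplified sketch of the kind of Java code I am using (not my exact code; the configuration setup and the distributed-cache call are placeholders to show the intent, and the paths are the same ones used in the command below):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.filecache.DistributedCache;
    import org.apache.hadoop.fs.FileChecksum;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class CopyAttrToHdfs {
        public static void main(String[] args) throws Exception {
            // Assumes core-site.xml / hdfs-site.xml are on the classpath
            Configuration conf = new Configuration();
            FileSystem fs = FileSystem.get(conf);

            Path src = new Path("/home/name/Desktop/dtlScaleData/attr.txt");
            Path dst = new Path("/tmp/hadoop-name/dfs/data/attr2.txt");

            // This is where the ChecksumException is thrown for some files
            fs.copyFromLocalFile(false, true, src, dst);

            // Returns null in every case I have tried
            FileChecksum checksum = fs.getFileChecksum(dst);
            System.out.println("checksum = " + checksum);

            // The copied file is then meant to be used as a distributed cache entry
            DistributedCache.addCacheFile(dst.toUri(), conf);
        }
    }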

I have also tried the hadoop fs -copyFromLocal command from the terminal, and the result is exactly the same behaviour as when it is done through the Java code.

I have looked all over the web, including other questions here on Stack Overflow, but I haven't managed to solve the issue. Please be aware that I am still quite new to Hadoop, so any help is greatly appreciated.

I am attaching the stack trace below, which shows the exceptions being thrown. (In this case I have posted the stack trace resulting from the hadoop fs -copyFromLocal command run from the terminal.)

name@ubuntu:~/Desktop/hadoop2$ bin/hadoop fs -copyFromLocal ~/Desktop/dtlScaleData/attr.txt /tmp/hadoop-name/dfs/data/attr2.txt

13/03/15 15:02:51 INFO util.NativeCodeLoader: Loaded the native-hadoop library
    13/03/15 15:02:51 INFO fs.FSInputChecker: Found checksum error: b[0, 0]=
    org.apache.hadoop.fs.ChecksumException: Checksum error: /home/name/Desktop/dtlScaleData/attr.txt at 0
        at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.readChunk(ChecksumFileSystem.java:219)
        at org.apache.hadoop.fs.FSInputChecker.readChecksumChunk(FSInputChecker.java:237)
        at org.apache.hadoop.fs.FSInputChecker.read1(FSInputChecker.java:189)
        at org.apache.hadoop.fs.FSInputChecker.read(FSInputChecker.java:158)
        at java.io.DataInputStream.read(DataInputStream.java:100)
        at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:68)
        at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:47)
        at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:100)
        at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:230)
        at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:176)
        at org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:1183)
        at org.apache.hadoop.fs.FsShell.copyFromLocal(FsShell.java:130)
        at org.apache.hadoop.fs.FsShell.run(FsShell.java:1762)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
        at org.apache.hadoop.fs.FsShell.main(FsShell.java:1895)
    copyFromLocal: Checksum error: /home/name/Desktop/dtlScaleData/attr.txt at 0

You are probably hitting the bug described in HADOOP-7199. What happens is that when you download a file with copyToLocal, it also copies a .crc file into the same directory, so if you modify your file and then try to do copyFromLocal, it will compute a checksum of your new file, compare it to your local .crc file, and fail with a non-descriptive error message.

To fix it, check whether you have this .crc file; if you do, just remove it and try again.
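In case it helps, here is a small sketch of that check (the path is the one from the question; Hadoop's ChecksumFileSystem names the local checksum file ".<filename>.crc" in the same directory as the data file):

    import java.io.File;

    public class RemoveStaleCrc {
        public static void main(String[] args) {
            // Local file from the question; adjust the path as needed
            File data = new File("/home/name/Desktop/dtlScaleData/attr.txt");

            // The checksum file is hidden: ".attr.txt.crc" next to attr.txt
            File crc = new File(data.getParentFile(), "." + data.getName() + ".crc");

            if (crc.exists()) {
                System.out.println("Found stale checksum file: " + crc);
                if (crc.delete()) {
                    System.out.println("Deleted it, retry the copyFromLocal now");
                }
            } else {
                System.out.println("No local .crc file found for " + data);
            }
        }
    }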

I solved the same problem by deleting the .crc file.

OK, so I managed to solve this issue, and I'm writing the answer here just in case someone else encounters the same problem.

What I did was simply create a new file and copy all the contents from the problematic file into it.

From what I can presume, it looks like some .crc file was created and attached to that particular file, so by trying with another file a fresh CRC check is carried out. Another reason could be that I named the file attr.txt, which might conflict with the name of some other resource. Maybe someone could expand on my answer, since I am not 100% sure about the technical details and these are just my observations.

The CRC file holds the serial number for a particular block of data. The entire data set is split into blocks, and each block stores its metadata along with a CRC file inside the /hdfs/data/dfs/data folder. If someone makes a change to the CRC files, the actual and recorded CRC serial numbers will mismatch, and that causes the error. The best practice to fix this error is to overwrite the metadata file along with the CRC file.

I got the exact same problem and didn't find any solution. Since this was my first Hadoop experience, I could not follow some of the instructions available on the internet. I solved this problem by formatting my namenode:

hadoop namenode -format
