
Map-reduce program to implement a data structure in the Hadoop framework

This is a data structure implementation in Hadoop. I want to implement indexing in Hadoop using map-reduce programming.

Part 1: store each word of a text file in a table together with its index number. [Able to complete]
Part 2: apply hashing to this newly created table. [Not able to complete]

I have completed the first part, but I am facing difficulty with the second. Suppose I have a text file containing 3 lines:

  how is your job
  how is your family
  hi how are you

I want to store this text file using indexing. I have map-reduce code that returns the index value of every word, and I am able to store these index values in an index table (hash table). Output containing the index values of every word:

  how 0, how 14, is 3, is 18, job 12, your 7,

Now, to store the words in the hash table, I apply hashing to every word's index value using the modulus (the number of distinct elements in the file), let's say 4. For every index value, the hash function (modulus, '%') gives the location in the hash table. If there is a collision at that location, go to the next location and store the word there.

  0%4=0  (store 'how' at hash index 0)
  14%4=2 (store 'how' at hash index 2)
  18%4=2 (store 'is' at hash index 3 because of collision)
  7%4=3  (store 'your' at hash index 4 because of collision)

You can create a Hashtable object and put the key and value into it.

import java.util.Hashtable;
Hashtable<Integer, String> hashtable = new Hashtable<>();

How do you find the key? You have the total count of distinct words and each word's index:

  key = index % number of distinct words
  value = word
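To make the key computation concrete, here is a minimal sketch in plain Java (no Hadoop involved). The class name `HashKeyDemo` and both method names are hypothetical, and it assumes words are separated by whitespace and that index values are non-negative. Note that the question uses 4 as the modulus only as an example; the three sample lines actually contain 8 distinct words.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class HashKeyDemo {
    // Counts the distinct whitespace-separated words in the text
    // (assumes a non-empty text).
    public static int countDistinctWords(String text) {
        Set<String> distinct = new HashSet<>(Arrays.asList(text.trim().split("\\s+")));
        return distinct.size();
    }

    // key = index % number of distinct words (index assumed non-negative).
    public static int hashKey(int index, int distinctWords) {
        return index % distinctWords;
    }

    public static void main(String[] args) {
        String text = "how is your job\nhow is your family\nhi how are you";
        int distinctWords = countDistinctWords(text);
        System.out.println("distinct words = " + distinctWords);
        // With the modulus of 4 assumed in the question, index 14 maps to key 2.
        System.out.println("key for index 14 with modulus 4 = " + hashKey(14, 4));
    }
}
```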

Before inserting a record into the hashtable, check whether a collision occurs for that key. How can you check for a collision?

boolean collision = hashtable.containsKey(key);

If collision is true, linearly check key+1, key+2, ... until containsKey returns false, then insert the key and value into the hashtable using the line below.

hashtable.put(key,value);
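Putting the steps together, the following is a rough sketch of the insert-with-linear-probing idea, run on the four index values from the question's worked example with the modulus assumed to be 4. The class name `LinearProbeDemo` and the `insert` helper are hypothetical names introduced for illustration. As in the question's example, the probe does not wrap around, so a key can end up larger than the modulus minus one.

```java
import java.util.Hashtable;

public class LinearProbeDemo {
    // Starts at index % modulus and moves to key + 1, key + 2, ...
    // while the slot is occupied, then stores the word.
    // Returns the key that was actually used.
    public static int insert(Hashtable<Integer, String> table, int index, String word, int modulus) {
        int key = index % modulus;
        while (table.containsKey(key)) {
            key++; // collision: try the next location
        }
        table.put(key, word);
        return key;
    }

    public static void main(String[] args) {
        Hashtable<Integer, String> table = new Hashtable<>();
        int modulus = 4; // assumed value, taken from the question's example
        System.out.println("how  -> " + insert(table, 0, "how", modulus));   // 0
        System.out.println("how  -> " + insert(table, 14, "how", modulus));  // 2
        System.out.println("is   -> " + insert(table, 18, "is", modulus));   // 3 (2 is taken)
        System.out.println("your -> " + insert(table, 7, "your", modulus));  // 4 (3 is taken)
    }
}
```

Looking a word up later would need the same probe sequence, comparing the stored word at each step, since the final key alone no longer identifies the original index.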
