
HBase Schema Nested Entity

Does anyone have an example of how to create an HBase table with a nested entity?

Example

UserName (string)
SSN  (string)
  + Books (collection)

The books collection would look like this, for example:

Books

isbn
title
etc...

I cannot find a single example of how to create a table like this. I see many people talk about it, and how it is a best practice in certain scenarios, but I cannot find an example of how to do it anywhere.

Thanks...

Nested entities aren't an official feature of HBase; it's just a way some people talk about one usage pattern. In this pattern, you use the fact that "columns" in HBase are really just a big map (a bunch of key/value pairs) to model a dimension of cardinality inside the row, by adding one column per "row" of the nested entity.
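To make the "big map" idea concrete, here is a minimal Python sketch that stands in for a single row. It does not talk to HBase at all: the `row` dict, `add_book`, and `columns_with_prefix` are hypothetical names for illustration, and the `book_id_` prefix is borrowed from the shell example in this answer. Explicit sorting is used to imitate the way HBase keeps a row's columns ordered by qualifier.

```python
from bisect import bisect_left

# Hypothetical in-memory stand-in for one HBase row: qualifier -> value.
# This is NOT the HBase client API; it only illustrates the data layout.
row = {
    "username": "my username",
    "ssn": "my ssn",
}

def add_book(row, isbn, title):
    """Add one nested 'book' as its own column, keyed by a prefixed qualifier."""
    row[f"book_id_{isbn}"] = title

def columns_with_prefix(row, prefix):
    """Return (qualifier, value) pairs whose qualifier starts with prefix, in sorted order."""
    qualifiers = sorted(row)                  # imitate sorted-by-qualifier storage
    start = bisect_left(qualifiers, prefix)   # jump to the first candidate qualifier
    result = []
    for q in qualifiers[start:]:
        if not q.startswith(prefix):
            break                             # sorted order: nothing later can match
        result.append((q, row[q]))
    return result

add_book(row, "67890", "the importance of being earnest")
add_book(row, "12345", "mary had a little lamb")

# The parent's own fields and its nested books live side by side in one map.
books = columns_with_prefix(row, "book_id_")
print(books)
```

The point of the prefix is that all the nested "rows" of one kind sit next to each other in qualifier order, so they can be picked out without touching the parent's other columns.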

Schema-wise, you don't need to do much on the table itself; when you create a table in HBase, you just specify the name and column family (and associated properties), like so (in hbase shell):

hbase:001:0> create 'UsersWithBooks', 'cf1'

Then, it's up to you what you put in it, column-wise. You could insert values like:

hbase:002:0> put 'UsersWithBooks', 'userid1234', 'cf1:username', 'my username'
hbase:003:0> put 'UsersWithBooks', 'userid1234', 'cf1:ssn', 'my ssn'
hbase:004:0> put 'UsersWithBooks', 'userid1234', 'cf1:book_id_12345', '<isbn>12345</isbn><title>mary had a little lamb</title>'
hbase:005:0> put 'UsersWithBooks', 'userid1234', 'cf1:book_id_67890', '<isbn>67890</isbn><title>the importance of being earnest</title>'

The column names are totally up to you, and there's no limit to how many you can have (within reason: see the HBase Reference Guide for more on this). Of course, if you do this, you have to do your own legwork for putting values in and getting them out (and you'd probably do it with the Java client in a more sophisticated way than I'm doing with these shell commands; they're just for explanatory purposes). And while you can efficiently scan just a portion of the columns in a table by key (using a column pagination filter), you can't do much with the contents of the cells other than pull them and parse them elsewhere.
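As a rough illustration of that "legwork", the following Python sketch builds and parses cell values in the same tag format used in the shell commands. It is only a sketch under assumptions: `encode_book`, `decode_book`, and `books_from_row` are hypothetical helpers, and `cells` is a hand-written stand-in for whatever a real client would return for row 'userid1234' (with the `cf1:` family prefix omitted). No HBase client library is used.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

BOOK_PREFIX = "book_id_"  # qualifier prefix taken from the shell example

def encode_book(isbn, title):
    """Build the cell value in the same tag format as the shell example."""
    return f"<isbn>{escape(isbn)}</isbn><title>{escape(title)}</title>"

def decode_book(cell_value):
    """Parse a cell value back into a dict; a root tag is added so it is valid XML."""
    root = ET.fromstring(f"<book>{cell_value}</book>")
    return {child.tag: child.text for child in root}

def books_from_row(cells):
    """Pick the book columns out of one row's qualifier -> value mapping."""
    return [
        decode_book(value)
        for qualifier, value in sorted(cells.items())
        if qualifier.startswith(BOOK_PREFIX)
    ]

# Hypothetical result of fetching one row; not produced by a real client call.
cells = {
    "username": "my username",
    "ssn": "my ssn",
    "book_id_12345": encode_book("12345", "mary had a little lamb"),
    "book_id_67890": encode_book("67890", "the importance of being earnest"),
}

for book in books_from_row(cells):
    print(book["isbn"], book["title"])
```

Any serialization would do here (JSON, Avro, and so on); the tag format is kept only to match the values inserted above. Either way, HBase treats the cell as opaque bytes, so all of this encoding and decoding happens in your application.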

Why would you do this? Probably only if you wanted atomicity around all the nested rows for one parent row. It's not very common; your best bet is probably to start by modeling them as separate tables, and only move to this approach if you really understand the tradeoffs.

There are some limitations to this. First, this technique only works one level deep: your nested entities can't themselves have nested entities. You can still have multiple different nested child entities in a single parent, with the column qualifier holding their identifying attributes. Second, it's not as efficient to access an individual value stored as a nested column qualifier inside a row, as compared to accessing a row in another table, as you learned earlier in the chapter. Still, there are compelling cases where this kind of schema design is appropriate. If the only way you get at the child entities is via the parent entity, and you'd like to have transactional protection around all children of a parent, this can be the right way to go.

