
Understanding Google App Engine datastore

I am in the early stages of designing a VERY large system (it's an enterprise-level point of sale system). As some of you know, the data models on these things can get very complicated. I want to run this on Google App Engine because I want to put more of my resources into developing the software rather than building and maintaining infrastructure.

In that spirit, I've been doing a lot of reading on GAE and the datastore. I'm an old-school relational database modeler, and I've seen several different concepts of what a schemaless database is. I think I've figured out what the datastore is, but I want to make sure I have it right.

So, if I'm right, GAE is a sort of table-based system. If I create a Java entity

class User {
    public String firstname;
    public String lastname;
}

and deploy it, the "table" User is automatically created and running. Then, in subsequent releases, if I modify class User

import java.util.Date;

class User {
    public String firstname;
    public String lastname;
    public Date addDate; // field added in a later release
}

and deploy it, the "table" User is automatically updated with the new field.

Now, in relating data, as I understand it, it's very similar to some of the massively complex systems like SAP, where the data is in fact very organized, but due to the volume, referential integrity is a function of the application, not the database engine. So I would have code that looks like this

class User {
    public long id;
    public String firstname;
    public String lastname;
}

class Phone {
    public String phonenumber;
    public User userentity; // reference to the owning user
}

and to pull up the phone numbers for a user from scratch, instead of

select phone.* from phone inner join user on phone.userentity = user.id where user.id = 5
(the syntax is only approximate, but you get the point)

I would do something like

select user from user where user.id = 5
then
select phone from phone where phone.userentity = user

and that would retrieve all the phone numbers for the user.
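For what it's worth, the two-step lookup above can be sketched in plain Java. This is only an in-memory stand-in, not the App Engine API: `ManualJoinSketch`, `addUser`, `addPhone`, `findUser`, and `phoneNumbersFor` are hypothetical names, and the phone stores the user's id rather than an object reference. It also shows the application checking referential integrity itself.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// In-memory stand-in for the datastore. All names here are hypothetical.
class ManualJoinSketch {
    static class User {
        final long id;
        final String firstname;
        final String lastname;

        User(long id, String firstname, String lastname) {
            this.id = id;
            this.firstname = firstname;
            this.lastname = lastname;
        }
    }

    static class Phone {
        final String phonenumber;
        final long userId; // plain reference; only our own code enforces it

        Phone(String phonenumber, long userId) {
            this.phonenumber = phonenumber;
            this.userId = userId;
        }
    }

    private final Map<Long, User> users = new HashMap<>();
    private final List<Phone> phones = new ArrayList<>();

    void addUser(long id, String firstname, String lastname) {
        users.put(id, new User(id, firstname, lastname));
    }

    // Referential integrity is the application's job, so it is checked here.
    void addPhone(String phonenumber, long userId) {
        if (!users.containsKey(userId)) {
            throw new IllegalArgumentException("no user with id " + userId);
        }
        phones.add(new Phone(phonenumber, userId));
    }

    // Step 1: "select user from user where user.id = ?"
    User findUser(long id) {
        return users.get(id);
    }

    // Step 2: "select phone from phone where phone.userentity = user"
    List<String> phoneNumbersFor(User user) {
        List<String> result = new ArrayList<>();
        for (Phone phone : phones) {
            if (phone.userId == user.id) {
                result.add(phone.phonenumber);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        ManualJoinSketch store = new ManualJoinSketch();
        store.addUser(5L, "Joe", "Smith");
        store.addUser(6L, "Ann", "Jones");
        store.addPhone("555-0100", 5L);
        store.addPhone("555-0101", 5L);
        store.addPhone("555-0199", 6L);

        User user = store.findUser(5L);
        System.out.println(store.phoneNumbersFor(user)); // prints [555-0100, 555-0101]
    }
}
```

The "join" is just two reads glued together by application code, which is the pattern the question describes.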

So, as I understand it, it's not so much a huge change in how to think about structuring and organizing data as it is a big change in how to access it. I do joins manually in code instead of having the database engine do them automatically. Beyond that, it's the same. Am I correct, or am I clueless?

There are really no tables at all. If you make some users with only a first and last name, and then later add addDate, your original entities will still not have an addDate property. None of the User entities are connected at all, in any way. They are not in a table of Users.
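A minimal sketch of that point, modelling an entity as a plain property map rather than using the real datastore API. The names `SchemalessSketch`, `oldUser`, `newUser`, and `describeAddDate` are made up for illustration.

```java
import java.util.Date;
import java.util.HashMap;
import java.util.Map;

// Each entity is modelled as an independent bag of properties (hypothetical model).
class SchemalessSketch {
    // An entity written by the first release: no addDate property.
    static Map<String, Object> oldUser(String firstname, String lastname) {
        Map<String, Object> entity = new HashMap<>();
        entity.put("firstname", firstname);
        entity.put("lastname", lastname);
        return entity;
    }

    // An entity written by the later release: has addDate.
    static Map<String, Object> newUser(String firstname, String lastname, Date addDate) {
        Map<String, Object> entity = oldUser(firstname, lastname);
        entity.put("addDate", addDate);
        return entity;
    }

    // Reading code has to tolerate entities that predate the new property.
    static String describeAddDate(Map<String, Object> user) {
        Object addDate = user.get("addDate");
        return addDate == null ? "unknown" : addDate.toString();
    }

    public static void main(String[] args) {
        Map<String, Object> before = oldUser("Joe", "Smith");
        Map<String, Object> after = newUser("Ann", "Jones", new Date(0L));

        System.out.println(before.containsKey("addDate")); // prints false
        System.out.println(after.containsKey("addDate"));  // prints true
        System.out.println(describeAddDate(before));       // prints unknown
    }
}
```

Deploying a new class definition changes nothing about entities already stored. If old entities need the property, the application has to rewrite them itself.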

You can access all of the objects you wrote to the datastore that have the name "User" because App Engine keeps big, long lists (indexes) of all of the objects that have each name. So any object you put in there that has the name (kind) "User" gets an entry in this list. Later, you can read that index to get the location of each of your objects, and use those locations (keys) to fetch the objects. They are not in a table; they're just floating around. Some of them have some properties in common, but this is a coincidence, not a requirement.
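To make the "big, long list per name" idea concrete, here is a rough in-memory model. It is an assumption-laden sketch, not how App Engine is implemented: `KindIndexSketch`, `put`, and `fetchAll` are hypothetical, and keys are just strings like `User:1`.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model: entities float around by key; a per-kind list finds them.
class KindIndexSketch {
    private final Map<String, Map<String, Object>> entitiesByKey = new HashMap<>();
    private final Map<String, List<String>> keysByKind = new HashMap<>();
    private long nextId = 1;

    // Store the entity and add its key to the list for its kind.
    String put(String kind, Map<String, Object> properties) {
        String key = kind + ":" + nextId++;
        entitiesByKey.put(key, properties);
        keysByKind.computeIfAbsent(kind, k -> new ArrayList<>()).add(key);
        return key;
    }

    // Read the kind index to get keys, then fetch each entity by key.
    List<Map<String, Object>> fetchAll(String kind) {
        List<Map<String, Object>> result = new ArrayList<>();
        for (String key : keysByKind.getOrDefault(kind, Collections.<String>emptyList())) {
            result.add(entitiesByKey.get(key));
        }
        return result;
    }

    public static void main(String[] args) {
        KindIndexSketch store = new KindIndexSketch();

        Map<String, Object> joe = new HashMap<>();
        joe.put("firstname", "Joe");
        joe.put("lastname", "Smith");
        store.put("User", joe);

        // Same kind, different properties: nothing forces a common shape.
        Map<String, Object> odd = new HashMap<>();
        odd.put("nickname", "JJ");
        store.put("User", odd);

        System.out.println(store.fetchAll("User").size()); // prints 2
    }
}
```

The only thing the two "User" entities share is an entry in the same list.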

If you want to fetch all of the User objects that have a certain name (Select * from User where firstname="Joe"), then you have to maintain another big, long index of keys. Each row of this index holds the firstname property and the key of an entity. Later you can scan the index for a certain firstname, get all the keys, and then go look up the actual entities stored under those keys. All of THOSE entities will have the firstname property (because an entity without a firstname property would never be entered in your firstname index), but they may not have any other fields in common, because they are not in a table that enforces any data structure at all.
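The firstname index described above might be modelled roughly like this. Again, this is a sketch under assumptions: `PropertyIndexSketch`, `put`, `queryByFirstname`, and `indexRowCount` are invented names, a sorted map stands in for the index, and updates and deletes are ignored.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical model of a single-property index: firstname value -> entity keys.
class PropertyIndexSketch {
    private final Map<String, Map<String, Object>> entitiesByKey = new HashMap<>();
    private final TreeMap<String, List<String>> firstnameIndex = new TreeMap<>();

    // Store the entity; add an index row only if it actually has a firstname.
    void put(String key, Map<String, Object> properties) {
        entitiesByKey.put(key, properties);
        Object firstname = properties.get("firstname");
        if (firstname instanceof String) {
            firstnameIndex.computeIfAbsent((String) firstname, v -> new ArrayList<>()).add(key);
        }
    }

    // Scan the index for the value, collect keys, then fetch entities by key.
    List<Map<String, Object>> queryByFirstname(String firstname) {
        List<Map<String, Object>> result = new ArrayList<>();
        for (String key : firstnameIndex.getOrDefault(firstname, Collections.<String>emptyList())) {
            result.add(entitiesByKey.get(key));
        }
        return result;
    }

    // Total number of (firstname, key) rows currently in the index.
    int indexRowCount() {
        int rows = 0;
        for (List<String> keys : firstnameIndex.values()) {
            rows += keys.size();
        }
        return rows;
    }

    public static void main(String[] args) {
        PropertyIndexSketch store = new PropertyIndexSketch();

        Map<String, Object> joe = new HashMap<>();
        joe.put("firstname", "Joe");
        joe.put("lastname", "Smith");
        store.put("User:1", joe);

        Map<String, Object> nameless = new HashMap<>();
        nameless.put("lastname", "Doe"); // no firstname, so no index row
        store.put("User:2", nameless);

        System.out.println(store.queryByFirstname("Joe").size()); // prints 1
        System.out.println(store.indexRowCount());                // prints 1
    }
}
```

The entity without a firstname is still stored and can be fetched by key, but a query on firstname can never find it.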

These complications affect the way data is accessed pretty dramatically, and really affect things like transactions and complex queries. You're basically right that you don't have to change your thinking too much, but you should definitely understand how indexes and transactions work before planning your data structures. It is not always simple to efficiently tack on extra queries that you didn't think of before you got started, and it's pretty expensive to maintain these indexes, so the fewer you can get by with, the better.
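To get a feel for why extra indexes cost something, here is a deliberately simplified cost model. It assumes one write for the entity itself plus one index row per indexed property the entity has. The real datastore's accounting differs, and `IndexCostSketch` and `writesForPut` are hypothetical names.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Simplified, hypothetical cost model: not the datastore's actual billing rules.
class IndexCostSketch {
    // One write for the entity, plus one index row for each indexed property it has.
    static int writesForPut(Map<String, Object> properties, Set<String> indexedProperties) {
        int writes = 1;
        for (String name : indexedProperties) {
            if (properties.containsKey(name)) {
                writes++;
            }
        }
        return writes;
    }

    public static void main(String[] args) {
        Map<String, Object> user = new HashMap<>();
        user.put("firstname", "Joe");
        user.put("lastname", "Smith");
        user.put("addDate", "2010-01-01");

        Set<String> none = Collections.emptySet();
        Set<String> one = new HashSet<>(Arrays.asList("firstname"));
        Set<String> all = new HashSet<>(Arrays.asList("firstname", "lastname", "addDate"));

        System.out.println(writesForPut(user, none)); // prints 1
        System.out.println(writesForPut(user, one));  // prints 2
        System.out.println(writesForPut(user, all));  // prints 4
    }
}
```

Under this model every put gets more expensive with each index that touches the entity, which is the reason to index only the properties you actually query on.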

A great introduction to the Google datastore was written by the creator of the Objectify framework: Fundamental Concepts of the Datastore
