How to design my dataset using Neo4j and Gremlin

I have a dataset containing fields like the following:

id  amount  date       s_pName  s_cName  b_pName  b_cName
1   100     2/3/2012   IBM      IBM_USA  Pepsi    Pepsi_USA
2   200     21/3/2012  IBM      IBM_USA  Coke     Coke_UK
3   300     12/3/2012  IBM      IBM_USA  Pepsi    Pepsi_USA
4   1100    22/3/2012  Pepsi    IBM_Aus  IBM      IBM_USA

Here the companies in all four fields (s_pName, s_cName, b_pName, b_cName) can be either sellers or buyers. How should I model this dataset in Neo4j so that I can run a Gremlin query equivalent to the following?

select b_cName, id, amount, date from tableName where s_cName in ('IBM_USA', 'IBM_AUS');

I noticed your question on the gremlin-users mailing list as well (where you provided a bit more information about the things you had tried): https://groups.google.com/forum/#!topic/gremlin-users/AxsF2eJvpOA

I'm sure there are a few ways to approach this modelling issue, so I'll just offer some things to consider, and hopefully they will point you toward a solution. First, instead of thinking in terms of buyers and sellers, think about the fact that you have "companies" that sell things to other companies, and that companies have a hierarchy (meaning that a company can have a parent). Your model then comes down to:

company --sellsTo--> company
company --parent--> company
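
As a rough illustration of this shape, here is a minimal sketch in plain Python rather than Neo4j or Gremlin. It loads the four rows from the question into company vertices plus sellsTo and parent edges, with the transaction amount and date carried on each sellsTo edge. The dict-based representation, the name build_graph, and the choice to point the parent edge from the child company to its parent are all assumptions made for illustration. The sellsTo edges connect the child companies (s_cName to b_cName), which matches the lookup by IBM_USA in the query.

```python
# Pure-Python stand-in for the graph: plain dicts and lists, no Neo4j or Gremlin calls.
# Rows are copied from the question:
# (id, amount, date, s_pName, s_cName, b_pName, b_cName)
ROWS = [
    (1, 100, "2/3/2012", "IBM", "IBM_USA", "Pepsi", "Pepsi_USA"),
    (2, 200, "21/3/2012", "IBM", "IBM_USA", "Coke", "Coke_UK"),
    (3, 300, "12/3/2012", "IBM", "IBM_USA", "Pepsi", "Pepsi_USA"),
    (4, 1100, "22/3/2012", "Pepsi", "IBM_Aus", "IBM", "IBM_USA"),
]


def build_graph(rows):
    """Return (companies, edges) for the company/sellsTo/parent model."""
    companies = {}        # companyName -> vertex; doubles as the key index
    edges = []            # each edge: label, tail ("out"), head ("in"), properties
    seen_parents = set()  # avoid duplicating the same parent edge

    for tx_id, amount, date, s_parent, s_child, b_parent, b_child in rows:
        for name in (s_parent, s_child, b_parent, b_child):
            companies.setdefault(name, {"companyName": name})

        # Assumption: the parent edge points from the child company to its parent.
        for child, parent in ((s_child, s_parent), (b_child, b_parent)):
            if (child, parent) not in seen_parents:
                seen_parents.add((child, parent))
                edges.append({"label": "parent", "out": child, "in": parent})

        # One sellsTo edge per row, carrying the transaction properties.
        edges.append({"label": "sellsTo", "out": s_child, "in": b_child,
                      "id": tx_id, "amount": amount, "date": date})
    return companies, edges


companies, edges = build_graph(ROWS)
```

Because vertices are keyed by name, a company that appears as a seller in one row and as a buyer in another (IBM_USA here) ends up as a single vertex with both outgoing and incoming sellsTo edges.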

Place your transaction amount and date on the "sellsTo" edge, creating one such edge per row in your dataset. Create a key index on the "companyName" field of the company vertex so that you can look up the company. Your Gremlin would then be something like:

['IBM_USA','IBM_AUS'].collect{g.V('companyName',it).next()}._().outE('sellsTo').as('tx').inV.as('buyer').select{[it.id, it.amount, it.date]}{it.companyName}

Breaking that down: you look up the two companies you care about via the key index on companyName and get them into a pipeline with _(). Then you traverse out to the companies those two companies sold to. You use select to grab the tx (transaction edge) and the buyer vertex, executing a closure on each of them to transform them into the fields you want. That yields something like the following (this is a single result; with your full dataset, your Gremlin would likely return several of these):

[[1,100,2/3/2012],Pepsi_USA]
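
To make the shape of that result concrete, here is a minimal sketch in plain Python (not Gremlin) that mimics the traversal over the question's four rows: look up each seller by name, follow its outgoing sellsTo edges, and emit the transaction fields paired with the buyer's name. The SELLS_TO list and the function name buyers_of are hypothetical stand-ins for the graph and the pipeline.

```python
# Plain-Python stand-in for the sellsTo edges; one dict per row of the question's dataset.
# "out" is the selling company (s_cName), "in" is the buying company (b_cName).
SELLS_TO = [
    {"out": "IBM_USA", "in": "Pepsi_USA", "id": 1, "amount": 100, "date": "2/3/2012"},
    {"out": "IBM_USA", "in": "Coke_UK", "id": 2, "amount": 200, "date": "21/3/2012"},
    {"out": "IBM_USA", "in": "Pepsi_USA", "id": 3, "amount": 300, "date": "12/3/2012"},
    {"out": "IBM_Aus", "in": "IBM_USA", "id": 4, "amount": 1100, "date": "22/3/2012"},
]


def buyers_of(seller_names, edges):
    """For each seller, in order, pair every outgoing transaction with its buyer."""
    results = []
    for name in seller_names:
        for edge in edges:
            if edge["out"] == name:  # exact, case-sensitive match on the company name
                tx_fields = [edge["id"], edge["amount"], edge["date"]]
                results.append([tx_fields, edge["in"]])
    return results


results = buyers_of(["IBM_USA", "IBM_Aus"], SELLS_TO)
for row in results:
    print(row)
```

The name comparison here is exact, so the names passed in have to match the stored spelling. The sample data spells the Australian company IBM_Aus, whereas the query in the question uses IBM_AUS.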

If that's not the final format you need, you could use some Groovy JDK ( http://groovy.codehaus.org/groovy-jdk/ ) operations to transform it further from there.
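
As one example of such a follow-up transformation, the sketch below flattens each nested [[id, amount, date], buyer] result into a flat record whose keys match the columns selected in the question. It is written in Python rather than with the Groovy JDK, purely to show the shape of the transformation. The RESULTS list is derived from the question's rows for IBM_USA, and the name to_records is hypothetical.

```python
# Nested results in the [[id, amount, date], buyerName] shape shown above,
# derived from the question's rows where the seller is IBM_USA.
RESULTS = [
    [[1, 100, "2/3/2012"], "Pepsi_USA"],
    [[2, 200, "21/3/2012"], "Coke_UK"],
    [[3, 300, "12/3/2012"], "Pepsi_USA"],
]


def to_records(results):
    """Flatten each nested result into a dict keyed by the requested column names."""
    return [
        {"b_cName": buyer, "id": tx_id, "amount": amount, "date": date}
        for (tx_id, amount, date), buyer in results
    ]


records = to_records(RESULTS)
for record in records:
    print(record)
```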
