Storing wide-form dataframes in a DataJoint table
Say I have an analysis that outputs a wide-form pandas DataFrame with a MultiIndex on both the index and the columns. Depending on the analysis parameters, the number of columns may change. What is the best design pattern for storing the outputs in a DataJoint table? The following come to mind, each with pros and cons
Are there any designs or pros/cons I haven't thought of?
Before providing a more specific answer, let's establish a few basics (also known as normal forms).
DataJoint implements the relational data model. Under the relational model, complex dataframes of the type you described require normalization into multiple tables that are related to each other through their primary keys and foreign keys.
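As a rough illustration of what normalization means for a wide-form frame, the sketch below moves a column level into the rows so that the set of columns stays fixed no matter how many units an analysis produces. The index levels (`session`, `trial`), column levels (`unit`, `metric`), and all values are hypothetical, chosen only to mirror the kind of dataframe described in the question.

```python
import pandas as pd

# Hypothetical wide-form output: rows are (session, trial),
# columns are (unit, metric). Names and values are made up.
index = pd.MultiIndex.from_product([[1], [1, 2]], names=["session", "trial"])
columns = pd.MultiIndex.from_product(
    [[10, 11], ["rate", "latency"]], names=["unit", "metric"]
)
wide = pd.DataFrame(
    [[5.0, 0.12, 7.5, 0.20],
     [6.0, 0.10, 8.0, 0.18]],
    index=index,
    columns=columns,
)

# Move the "unit" column level into the row index. Adding more units
# now adds rows, while the columns stay the same.
long = wide.stack("unit").reset_index()
long.columns.name = None  # drop the leftover "metric" label on the columns

# One dict per (session, trial, unit) row, each with the same attributes.
rows = long.to_dict("records")
print(long)
```

Each row of `long` is identified by `(session, trial, unit)`, which is the shape a table with a composite primary key expects.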
Each table will represent a single entity class: Units and Trials will be represented in separate tables.
All entities in a given table will have the same attributes (columns). They will be uniquely identified by the same attribute(s) comprising the primary key.
In addition to the primary key, tables may have additional secondary indexes to accelerate queries.
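The points above can be sketched in plain SQL. DataJoint needs a live database connection, so this example uses Python's built-in `sqlite3` module instead to show the same relational ideas; it is not DataJoint syntax, and the table and column names (`unit`, `trial`, `unit_response`, `rate`, `latency`) are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
-- One table per entity class, each with its own primary key.
CREATE TABLE unit (
    unit_id INTEGER PRIMARY KEY
);
CREATE TABLE trial (
    session_id INTEGER NOT NULL,
    trial_id   INTEGER NOT NULL,
    PRIMARY KEY (session_id, trial_id)
);
-- Results for a unit on a trial: the primary key combines both parents,
-- and foreign keys tie every row to an existing unit and trial.
CREATE TABLE unit_response (
    session_id INTEGER NOT NULL,
    trial_id   INTEGER NOT NULL,
    unit_id    INTEGER NOT NULL,
    rate       REAL,
    latency    REAL,
    PRIMARY KEY (session_id, trial_id, unit_id),
    FOREIGN KEY (session_id, trial_id) REFERENCES trial (session_id, trial_id),
    FOREIGN KEY (unit_id) REFERENCES unit (unit_id)
);
-- Secondary index to speed up lookups by unit.
CREATE INDEX unit_response_by_unit ON unit_response (unit_id);
""")

conn.executemany("INSERT INTO unit VALUES (?)", [(10,), (11,)])
conn.executemany("INSERT INTO trial VALUES (?, ?)", [(1, 1), (1, 2)])
conn.executemany(
    "INSERT INTO unit_response VALUES (?, ?, ?, ?, ?)",
    [(1, 1, 10, 5.0, 0.12), (1, 1, 11, 7.5, 0.20),
     (1, 2, 10, 6.0, 0.10), (1, 2, 11, 8.0, 0.18)],
)

# A response that refers to a unit that does not exist is rejected.
try:
    conn.execute("INSERT INTO unit_response VALUES (1, 1, 99, 0.0, 0.0)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

rates = conn.execute(
    "SELECT trial_id, rate FROM unit_response WHERE unit_id = ? ORDER BY trial_id",
    (11,),
).fetchall()
print(rejected, rates)
```

Here every row of `unit_response` has the same attributes, and a change in the number of units changes only the number of rows.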
If you already know about normalization, we can talk about how to normalize your design. If not, we can refer you to a quick tutorial.