
How does one determine which columns to set as an index in a Pandas DataFrame?

Original post: 2016-12-20 17:08:37 · Tags: python, pandas, indexing, dataframe

Let's say I have a DataFrame of financial securities, which often have multiple identifiers:

Should I choose only one column to set as the index? Should I set all potential identifiers as the index? Should I set all text data as an index, and leave all numeric data as columns? What is the best practice?

2 Answers

This is more about database design than pandas.

The decision should be based on the business meaning of the DataFrame (a table, in relational database terms) and its columns. E.g., if 'Internal Security ID' is what the business uses to identify this kind of data, then it should be set as the index.
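As a minimal sketch of that advice: the question's actual table is not shown, so the `Ticker` and `Price` columns and all values below are hypothetical. Only the 'Internal Security ID' name comes from the answer.

```python
import pandas as pd

# Hypothetical sample data; only 'Internal Security ID' is named in the answer.
df = pd.DataFrame({
    "Internal Security ID": [1001, 1002, 1003],
    "Ticker": ["AAPL", "MSFT", "GOOG"],
    "Price": [150.0, 300.0, 2800.0],
})

# Promote the business key to the index; other identifiers stay as columns.
by_id = df.set_index("Internal Security ID")

# Label-based lookup by the business key returns that security's row.
row = by_id.loc[1002]
print(row["Ticker"])
```

Keeping the other identifiers as ordinary columns means they remain available for filtering and merging without a MultiIndex.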

However, if you are not sure, just stick with the default integer index.
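A small sketch of what sticking with the default looks like, assuming hypothetical `Ticker` and `Price` columns: rows are selected by filtering on a column rather than by index label, and `reset_index` returns to the default if an index was set earlier.

```python
import pandas as pd

# Hypothetical columns and values, for illustration only.
df = pd.DataFrame({
    "Ticker": ["AAPL", "MSFT", "GOOG"],
    "Price": [150.0, 300.0, 2800.0],
})

# With no index specified, pandas assigns a RangeIndex (0, 1, 2, ...).
print(type(df.index).__name__)

# Any identifier column can still be used for lookups via a boolean mask.
msft = df[df["Ticker"] == "MSFT"]

# If a column was promoted to the index, reset_index moves it back
# into the columns and restores the default integer index.
restored = df.set_index("Ticker").reset_index()
```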

I tend to stick with the default index unless I need one of the columns as the index. If you do, I strongly recommend using a column with unique values. Duplicate values in the index will cause you a lot of headaches.
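One way to check for that problem before it bites, sketched with hypothetical data in which the `Ticker` value "AAPL" is deliberately duplicated:

```python
import pandas as pd

# Hypothetical data; "AAPL" appears twice on purpose.
df = pd.DataFrame({
    "Ticker": ["AAPL", "AAPL", "MSFT"],
    "Price": [150.0, 151.0, 300.0],
})

# Check uniqueness of the candidate column before using it as the index.
candidate_is_unique = df["Ticker"].is_unique
print(candidate_is_unique)

# set_index accepts duplicates by default, so a single-label lookup
# can return several rows instead of one.
dup = df.set_index("Ticker")
aapl = dup.loc["AAPL"]

# verify_integrity=True makes set_index raise ValueError on duplicate keys.
try:
    df.set_index("Ticker", verify_integrity=True)
    raised = False
except ValueError:
    raised = True
print(raised)
```

Whether to fail fast with `verify_integrity=True` or to deduplicate first is a judgment call that depends on the data.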

Related questions:

How to set the index of a pandas Dataframe to that of the length of the Columns?

How to apply to one set of columns in a dataframe with multi-index columns

Pandas: How do I set index on the columns of an existing DataFrame?

Index a pandas dataframe with a value in one of the columns

Pandas convert columns of one dataframe to index in another dataframe

how to replace index of columns and rows in Pandas DataFRame

How to populate Pandas dataframe as function of index and columns

How to get the first index of a pandas DataFrame for which several undefined columns are not null?

How to get the first index of a pandas dataframe for which two columns are both not null?

How to set index values in a MultiIndex pandas DataFrame?


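Notice: Technical posts on this site follow the CC BY-SA 4.0 license. If you republish, please cite this site's URL or the original post's address. For any questions, contact yoyou2525@163.com.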
