What are the pros and cons of using pd.Index versus df.loc

Question

What is the difference between selecting with a pd.Index and using df.loc? Are they actually the same thing?

import pandas as pd

idx = pd.Index(('a', 'b'))
df = pd.DataFrame({'a': [0, 1], 'b': [2, 3], 'c': [0, 5]})

print(df.loc[:, ('a', 'b')])
print(df[idx])

   a  b
0  0  2
1  1  3

Answer 1

With loc you can select by row labels, by column labels, or by both at once, whereas indexing df[...] with a pd.Index can only select columns.

df.loc[[0]]
   a  b  c
0  0  2  0

df.loc[[0],['a','b']]
   a  b
0  0  2

IMO, loc is more flexible, and I would choose loc because the code will be clearer in the long run and when you come back to check it later.
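
To make the comparison concrete, here is a minimal sketch that reuses the DataFrame from the question. The variable names (by_getitem, by_loc, same_columns, first_row) are made up for illustration. It only checks that the two spellings agree for plain column selection and that loc can also restrict the rows in the same call.

```python
import pandas as pd

df = pd.DataFrame({'a': [0, 1], 'b': [2, 3], 'c': [0, 5]})
idx = pd.Index(['a', 'b'])

# Column selection: df[idx] and df.loc[:, idx] pick the same columns.
by_getitem = df[idx]
by_loc = df.loc[:, idx]
same_columns = by_getitem.equals(by_loc)
print(same_columns)

# loc can restrict rows and columns in a single call.
first_row = df.loc[[0], idx]
print(first_row)
```

So for pure column selection the two give the same result here; the difference shows up once rows need to be selected as well.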

Answer 2

The documentation describes loc as the preferred method. Using multiple slices (chained indexing) can lead to a SettingWithCopyWarning:

idx = ['a', 'b']
d = df[idx]
d.iloc[0,0] = 9

SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy

In contrast, using loc does not trigger the SettingWithCopyWarning:

idx = ['a', 'b']
d = df.loc[:,idx]
d.iloc[0,0] = 9
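
One thing worth spelling out: a subset selected with a list of columns is a separate object, so writing to d does not change df. The sketch below is not from the answer. The explicit .copy() call and the names original_value and updated_value are my own additions, assuming the goal is either an independent subset or an in-place update of df.

```python
import pandas as pd

df = pd.DataFrame({'a': [0, 1], 'b': [2, 3], 'c': [0, 5]})
idx = ['a', 'b']

# An explicit copy makes it clear that d is independent of df.
d = df.loc[:, idx].copy()
d.iloc[0, 0] = 9
original_value = df.loc[0, 'a']  # df itself is untouched

# To change df itself, assign through a single loc call on df.
df.loc[0, 'a'] = 9
updated_value = df.loc[0, 'a']

print(original_value, updated_value)
```

Writing through one loc call on df avoids the intermediate object entirely, which is the situation the warning is about.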

It is worth noting that loc also lets you pass a specific axis as an argument:

df.loc(axis=1)[idx]
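
As a rough check of what the axis argument does, the sketch below compares it with the usual two-part loc spelling on the DataFrame from the question. The result names (cols_axis, cols_plain, rows_axis, rows_plain) are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({'a': [0, 1], 'b': [2, 3], 'c': [0, 5]})
idx = ['a', 'b']

# axis=1 applies the indexer to the columns.
cols_axis = df.loc(axis=1)[idx]
cols_plain = df.loc[:, idx]

# axis=0 applies the indexer to the row labels.
rows_axis = df.loc(axis=0)[[0]]
rows_plain = df.loc[[0]]

print(cols_axis.equals(cols_plain), rows_axis.equals(rows_plain))
```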


Answer 1
1 2022-01-23 02:55:20

Answer 2
0 2022-01-23 04:32:05
