What are the pros/cons of using pd.Index vs df.loc?
What is the difference between using pd.Index vs df.loc? Is it effectively the same thing?
idx = pd.Index(('a', 'b'))
df = pd.DataFrame({'a': [0, 1], 'b': [2, 3], 'c': [0, 5]})
print(df.loc[:, ('a', 'b')])
print(df[idx])
   a  b
0  0  2
1  1  3
With loc, you can select by row labels, by column labels, or by both at once. Indexing with a pd.Index (as in df[idx]) can only select columns.
df.loc[[0]]
   a  b  c
0  0  2  0
df.loc[[0],['a','b']]
   a  b
0  0  2
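A minimal sketch of that difference, reusing the df and idx from the question (the names cols_only, rows_only, both and same are only illustrative):

```python
import pandas as pd

df = pd.DataFrame({'a': [0, 1], 'b': [2, 3], 'c': [0, 5]})
idx = pd.Index(['a', 'b'])

# Indexing the frame with a pd.Index selects columns only.
cols_only = df[idx]

# loc can select rows, columns, or both in a single call.
rows_only = df.loc[[0]]
both = df.loc[[0], ['a', 'b']]

# For a pure column selection, both approaches return the same data.
same = cols_only.equals(df.loc[:, idx])
print(same)
```

So for column-only selection the two are interchangeable here; loc only becomes necessary once rows are involved.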
IMO, loc is more flexible to use, and I would choose loc because it stays clearer in the long run and when you come back to review the code later.
The documentation describes why loc is the preferred method. Using multiple slices can lead to a SettingWithCopyWarning:
idx = ['a', 'b']
d = df[idx]
d.iloc[0,0] = 9
SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
In contrast, using loc doesn't trigger the SettingWithCopyWarning:
idx = ['a', 'b']
d = df.loc[:,idx]
d.iloc[0,0] = 9
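Note that in both variants the edit lands in d, not in df. A minimal sketch of what that means in practice, assuming the same df as above (the name unchanged is only illustrative, and the final assignment is one common way to update the original frame, not something the answer itself prescribes):

```python
import pandas as pd

df = pd.DataFrame({'a': [0, 1], 'b': [2, 3], 'c': [0, 5]})
idx = ['a', 'b']

# Selecting with a list of labels gives a separate object,
# so this edit stays in d.
d = df.loc[:, idx]
d.iloc[0, 0] = 9
unchanged = df.loc[0, 'a']  # df still holds its original value here

# To change df itself, assign through a single loc call on df.
df.loc[0, 'a'] = 9
print(df)
```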
Of note, loc also lets you pass a specific axis as a parameter:
df.loc(axis=1)[idx]
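A small sketch comparing the two spellings, assuming the same df and idx as above (by_axis and by_slice are illustrative names):

```python
import pandas as pd

df = pd.DataFrame({'a': [0, 1], 'b': [2, 3], 'c': [0, 5]})
idx = ['a', 'b']

# Calling loc with axis=1 makes the indexer apply to the columns.
by_axis = df.loc(axis=1)[idx]

# The same selection written with an explicit "all rows" slice.
by_slice = df.loc[:, idx]

print(by_axis.equals(by_slice))
```

The axis form mostly saves writing the leading slice.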