[英]Python, Pandas Data frame syntax
I have a data frame(df) and I am trying to copy first line from df to df1 by the following syntax. 我有一个数据帧(df),我正在尝试通过以下语法将第一行从df复制到df1。
df1=df.iloc[0].copy()
and when I printed df1 the data frame look like this 当我打印df1时,数据框如下所示
device_consumption_key 2700848
eagle_id d8d5b9008b92
device_id 002446000006983b
component_id 0b
read_dt 2017-10-25 20:05:24
msg_dt 2017-09-15 14:09:19
delivered 0.86
received 0
unit kWh
creation_time 2017-10-25 20:06:00
etl_pid 20171102184518
Name: 0, dtype: object
if I used a different syntax liek below the frame looks like a da--> 如果我在框架下方使用其他语法,则看起来像是da->
df1=df.iloc[[0]].copy()
Can't able to show the full picture but I guess you got the idea 无法显示完整图片,但我想您已经明白了
Why is that? 这是为什么?
The two difference slicing methods, using double brackets and single brackets return a DataFrame
and a Series
respectively: 使用双括号和单括号的两种不同的切片方法分别返回
DataFrame
和Series
:
>>> type(df.iloc[[0]].copy())
<class 'pandas.core.frame.DataFrame'>
>>> type(df.iloc[0].copy())
<class 'pandas.core.series.Series'>
Series
are by default displayed with the column names as the index, and the values as what looks like a column , but should not be considered as such. 默认情况下,以列名称作为索引显示
Series
,以列的形式显示值,但不应将其视为列 。
More details : 更多细节 :
A Series
, based on the pandas
documentation is a 基于
pandas
文档的 Series
是
One-dimensional ndarray with axis labels (including time series).
具有轴标签(包括时间序列)的一维ndarray。
Whereas a DataFrame
而
DataFrame
Can be thought of as a dict-like container for Series objects
可以看作是Series对象的类似dict的容器
Take a look at the .loc documentation , specifically (highlights added by me) 看看.loc文档 ,特别是(我添加的突出显示)
Single label.
单标签。 Note this returns the row as a Series.
请注意,这会将行作为系列返回。
df.loc['viper']
And 和
List of labels.
标签列表。 Note using [[]] returns a DataFrame.
注意使用[[]]返回一个DataFrame。
df.loc[['viper', 'sidewinder']]
Basically when you slice using a single label, such as df.iloc[0]
, you get a pd.Series
object. 基本上,当您使用单个标签(例如
df.iloc[0]
切片时,会得到一个pd.Series
对象。 When you slice using a list of labels , such as [[0]]
, you get a pd.DataFrame
. 当使用标签列表(例如
[[0]]
切片时,会得到pd.DataFrame
。
The same logic applies to .iloc
相同的逻辑适用于
.iloc
As highlighted in comments, this is assuming unique indexes . 如注释中突出显示的那样,这是假设唯一索引 。 If you use
loc
in a data frame that has duplicated indexes, then unexpected results might arise (which is one strong argument for you not to use duplicated indexes !) 如果在具有重复索引的数据帧中使用
loc
,则可能会出现意外结果(这是一个不使用重复索引的强烈理由!)
声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.