Python, Pandas DataFrame syntax

I have a data frame (df) and I am trying to copy its first row into df1 with the following syntax:

df1=df.iloc[0].copy()

When I print df1, it looks like this:

device_consumption_key                2700848
eagle_id                         d8d5b9008b92
device_id                    002446000006983b
component_id                               0b
read_dt                   2017-10-25 20:05:24
msg_dt                    2017-09-15 14:09:19
delivered                                0.86
received                                    0
unit                                      kWh
creation_time             2017-10-25 20:06:00
etl_pid                        20171102184518
Name: 0, dtype: object

If I use a different syntax, like the one below, the result looks like a data frame:

df1=df.iloc[[0]].copy()

I can't show the full output, but I guess you get the idea.

Why is that?

The two different indexing forms, double brackets and single brackets, return a DataFrame and a Series respectively:

>>> type(df.iloc[[0]].copy())
<class 'pandas.core.frame.DataFrame'>
>>> type(df.iloc[0].copy())
<class 'pandas.core.series.Series'>

By default, a Series is displayed with the column names as its index and the values in what looks like a column, but it should not be considered one.
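
To make the difference concrete, here is a minimal sketch on a small made-up frame. The column names are borrowed from the question; the values and the second row are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical two-row frame; values are for illustration only.
df = pd.DataFrame({
    "device_id": ["002446000006983b", "002446000006983c"],
    "delivered": [0.86, 1.25],
    "unit": ["kWh", "kWh"],
})

row_as_series = df.iloc[0].copy()    # single position -> Series
row_as_frame = df.iloc[[0]].copy()   # list of positions -> DataFrame

# The Series is one-dimensional: its index holds the column names.
print(row_as_series.shape)           # (3,)
print(list(row_as_series.index))     # ['device_id', 'delivered', 'unit']

# The DataFrame stays two-dimensional: one row, three columns.
print(row_as_frame.shape)            # (1, 3)
print(list(row_as_frame.columns))    # ['device_id', 'delivered', 'unit']
```

The "Name: 0" line in the printed Series from the question is the row's index label, which the Series keeps as its name.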

More details:

A Series, according to the pandas documentation, is a

One-dimensional ndarray with axis labels (including time series).

Whereas a DataFrame

Can be thought of as a dict-like container for Series objects.
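
The "dict-like container" description can be illustrated with a short sketch. The two Series here are hypothetical and only serve to show that columns behave like dictionary entries.

```python
import pandas as pd

# Hypothetical data for illustration.
delivered = pd.Series([0.86, 1.25], name="delivered")
unit = pd.Series(["kWh", "kWh"], name="unit")

# Build a frame from a dict of Series, keyed by column name.
df = pd.DataFrame({"delivered": delivered, "unit": unit})

# Looking up a key hands back a Series, much like a dict lookup.
col = df["delivered"]
print(type(col).__name__)     # Series
print(list(df.keys()))        # ['delivered', 'unit']
print("unit" in df)           # True
```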

Take a look at the .loc documentation, specifically (highlights added by me):

Single label. Note this returns the row as a Series.

df.loc['viper']

And

List of labels. Note using [[]] returns a DataFrame.

df.loc[['viper', 'sidewinder']]

Basically, when you select with a single label (or a single integer position, as in df.iloc[0]), you get a pd.Series object. When you select with a list, such as [[0]], you get a pd.DataFrame.

The same logic applies to .iloc.
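
As a rough sketch of that parallel, the snippet below runs both accessors on a small frame. The index labels follow the documentation snippets quoted above; the column names and values are assumptions for illustration.

```python
import pandas as pd

# Illustrative frame; labels follow the quoted .loc examples.
df = pd.DataFrame(
    {"max_speed": [1, 4, 7], "shield": [2, 5, 8]},
    index=["cobra", "viper", "sidewinder"],
)

by_label = df.loc["viper"]        # single label        -> Series
by_labels = df.loc[["viper"]]     # list of labels      -> DataFrame
by_pos = df.iloc[1]               # single position     -> Series
by_positions = df.iloc[[1]]       # list of positions   -> DataFrame

for obj in (by_label, by_labels, by_pos, by_positions):
    print(type(obj).__name__)

# Converting between the two forms when you end up with the wrong one:
as_series = by_labels.iloc[0]       # one-row DataFrame -> Series
as_frame = by_label.to_frame().T    # Series -> one-row DataFrame
```

Note that going through a Series can change column dtypes when the columns have mixed types, so selecting with a list in the first place is usually the simpler route to a one-row DataFrame.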


As highlighted in the comments, this assumes a unique index. If you use .loc on a data frame that has duplicated index labels, unexpected results might arise: a single label that matches several rows returns a DataFrame rather than a Series (which is one strong argument for not using duplicated indexes!).
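
A minimal sketch of the duplicated-index case, using a hypothetical frame in which the label "viper" appears twice:

```python
import pandas as pd

# Hypothetical frame with a repeated index label.
df = pd.DataFrame(
    {"max_speed": [1, 4, 7]},
    index=["viper", "viper", "cobra"],
)

print(df.index.is_unique)    # False

dup = df.loc["viper"]        # label matches two rows -> DataFrame
uniq = df.loc["cobra"]       # label matches one row  -> Series

print(type(dup).__name__)    # DataFrame
print(type(uniq).__name__)   # Series

# Selecting with a list returns a DataFrame in both cases.
always_frame = df.loc[["cobra"]]
print(type(always_frame).__name__)   # DataFrame
```

Checking df.index.is_unique before relying on the return type of a single-label lookup is one way to guard against this.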
