
How to switch from pandas.Series.from_csv to pandas.read_csv for a CSV without a header row?

pandas.Series.from_csv is deprecated (since version 0.21). I want to change my code to use pandas.read_csv. However, I can't find a combination of arguments that loads the same data from a CSV file without a header line.

For example, assume the following CSV:

cntry,country
ctr,center
hts,heights
ft,fort
mt,mount
spg,springs
spgs,springs
st,saint
ter,terrace
e,east
w,west
s,south
n,north

The following deprecated code:

z1 = pd.Series.from_csv('file.csv')
type(z1)
z1.shape
z1

gives me (executed in a notebook):

pandas.core.series.Series
(13,)
cntry    country
ctr       center
hts      heights
ft          fort
mt         mount
spg      springs
spgs     springs
st         saint
ter      terrace
e           east
w           west
s          south
n          north
dtype: object

I can't get the same result using pandas.read_csv with any combination of:

  • index_col=0
  • header=None
  • parse_dates=True
  • squeeze=True

For instance:

z2 = pd.read_csv('file.csv',index_col=0,header=None,parse_dates=True,squeeze=True)
type(z2)
z2.shape
z2

gives me:

pandas.core.series.Series
(13,)
0
cntry    country
ctr       center
hts      heights
ft          fort
mt         mount
spg      springs
spgs     springs
st         saint
ter      terrace
e           east
w           west
s          south
n          north
Name: 1, dtype: object

The difference is the 0, which doesn't show up in .values or .iteritems(). Nonetheless, I don't understand what it is, or how to eliminate it using pandas.read_csv and its parameters.

UPDATE

0 is the index's name. It can be removed with .rename_axis(index=None).

1 is the series' name. It can be removed with .rename().

So far, I can't see how to do the same using pandas.read_csv alone.
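The two labels described in the update can be checked with a small sketch. It assumes a few rows of the question's data held in memory (io.StringIO stands in for 'file.csv') and uses DataFrame.squeeze("columns") rather than the squeeze=True keyword, since that keyword was removed from read_csv in pandas 2.0. The variable names are illustrative only.

```python
import io

import pandas as pd

# A few rows from the question's file, held in memory instead of 'file.csv'.
CSV_TEXT = "cntry,country\nctr,center\nhts,heights\n"

# With header=None, pandas labels the two columns 0 and 1.
df = pd.read_csv(io.StringIO(CSV_TEXT), header=None, index_col=0)
z2 = df.squeeze("columns")  # the single remaining column becomes a Series

print(z2.index.name)  # label of the column that became the index
print(z2.name)        # label of the remaining data column

# Clear both labels, as described in the update.
z3 = z2.rename_axis(index=None).rename(None)
print(z3)
```

Here z2 carries the labels 0 and 1 from the question's output, while z3 should print without the extra "0" line and without "Name: 1".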

You should be able to do something like:

z1 = pd.read_csv('file.csv', header=None, names=['', 'values'], index_col=0)['values']

This reads the file into a DataFrame, sets the first column as the index, then selects the second column (named "values") as a Series.

The only difference from your example is that the name of the series will be "values". You can always run z1.name = None if that is not desirable.
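As a rough check of this approach, the sketch below wraps the same call in a helper and runs it on in-memory data. The helper name read_headerless_series and the sample text are hypothetical, and clearing index.name is an extra precaution that is not part of the original answer.

```python
import io

import pandas as pd


def read_headerless_series(source):
    """Read a two-column, header-less CSV into an unnamed Series (hypothetical helper)."""
    z = pd.read_csv(source, header=None, names=["", "values"], index_col=0)["values"]
    z.name = None        # drop the "values" label, as suggested in the answer
    z.index.name = None  # assumption: also clear any label left on the index
    return z


# In-memory stand-in for 'file.csv', using rows from the question.
CSV_TEXT = "cntry,country\nctr,center\nhts,heights\n"
z1 = read_headerless_series(io.StringIO(CSV_TEXT))
print(z1)
```

The same helper should accept a file path in place of the StringIO object, since read_csv takes either.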

The provided solution almost worked, but in my case the index was a DatetimeIndex, and I had a NaT value at the beginning. The workaround:

def load_pandas_series(filename):
    df1 = pd.read_csv(filename, index_col=0, names=['', 'values'],
                    header=None, parse_dates=True)['values']
    df1.name = None
    df1 = df1[~df1.index.isnull()]
    return df1
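The filtering step in that workaround can be seen in isolation with a minimal sketch. The dates and values below are made up for illustration; the point is only that Index.isnull() marks the NaT entry so it can be masked out.

```python
import pandas as pd

# Hypothetical data: the first index entry is missing and becomes NaT.
idx = pd.to_datetime([None, "2020-01-01", "2020-01-02"])
s = pd.Series([0.0, 1.5, 2.5], index=idx)

# Boolean mask that is True where the index entry is NaT.
mask = s.index.isnull()

# Keep only the rows whose index is a real timestamp.
cleaned = s[~mask]
print(cleaned)
```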


 