
How to switch from pandas.Series.from_csv to pandas.read_csv for a CSV file without a header row?

pandas.Series.from_csv is deprecated (since version 0.21). I want to change my code to use pandas.read_csv. However, I can't find a combination of arguments that loads the same data from a CSV file without a header line.

For example, assume the following CSV:

cntry,country
ctr,center
hts,heights
ft,fort
mt,mount
spg,springs
spgs,springs
st,saint
ter,terrace
e,east
w,west
s,south
n,north

The following deprecated code:

z1 = pd.Series.from_csv('file.csv')
type(z1)
z1.shape
z1

gives me (executed in a notebook):

pandas.core.series.Series
(13,)
cntry    country
ctr       center
hts      heights
ft          fort
mt         mount
spg      springs
spgs     springs
st         saint
ter      terrace
e           east
w           west
s          south
n          north
dtype: object

I can't get the same result using pandas.read_csv with any combination of:

  • index_col=0;
  • header=None;
  • parse_dates=True;
  • squeeze=True.

For instance:

z2 = pd.read_csv('file.csv',index_col=0,header=None,parse_dates=True,squeeze=True)
type(z2)
z2.shape
z2

gives me:

pandas.core.series.Series
(13,)
0
cntry    country
ctr       center
hts      heights
ft          fort
mt         mount
spg      springs
spgs     springs
st         saint
ter      terrace
e           east
w           west
s          south
n          north
Name: 1, dtype: object

The difference is the 0, which does not appear in .values or .iteritems(). I don't understand what it is or how to eliminate it using pandas.read_csv and its parameters.

UPDATE

0 is the index's name. It can be removed with .rename_axis(index=None).

1 is the series name. It can be removed with .rename().

So far I can't see how to do this using pandas.read_csv alone.
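
The two names described in the update can be reproduced and removed in a short sketch. It assumes a pandas version where the squeeze keyword of read_csv is no longer available (it was deprecated in 1.4 and removed in 2.0), so DataFrame.squeeze("columns") is used instead. The in-memory csv_text is a hypothetical stand-in for the first rows of file.csv.

```python
import io

import pandas as pd

# Hypothetical stand-in for the first rows of file.csv.
csv_text = "cntry,country\nctr,center\nhts,heights\n"

# header=None labels the columns 0 and 1; index_col=0 turns column 0 into the index.
frame = pd.read_csv(io.StringIO(csv_text), header=None, index_col=0)

# Collapse the remaining single column into a Series.
z2 = frame.squeeze("columns")
print(z2.index.name, z2.name)  # the positional labels inherited from the frame

# Clear the series name and the index name, as described in the update.
z3 = z2.rename(None).rename_axis(None)
print(z3)
```

The names are simply the positional column labels that read_csv assigns when header=None, which is why they show up as 0 and 1.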

You should be able to do something like:

z1 = pd.read_csv('file.csv', header=None, names=['', 'values'], index_col=0)['values']

This will read the file into a dataframe, setting the first column as the index, then select the second column (named "values") as a series.

The only difference from your example is that the name of the series will be "values". You can always run z1.name = None if that is not desirable.
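
As a rough, self-contained check of this approach, the sketch below runs the same read_csv call against an in-memory sample and then clears both names. The csv_text sample and the io.StringIO wrapper are assumptions standing in for file.csv, and the extra rename_axis(None) call is an addition that guards against the index keeping a name.

```python
import io

import pandas as pd

# Hypothetical stand-in for the first rows of file.csv.
csv_text = "cntry,country\nctr,center\nhts,heights\nft,fort\n"

# Same call as above, reading from memory instead of from disk.
z1 = pd.read_csv(io.StringIO(csv_text), header=None,
                 names=['', 'values'], index_col=0)['values']
name_after_read = z1.name  # the series is named after the selected column

z1.name = None              # drop the series name
z1 = z1.rename_axis(None)   # make sure the index carries no name either
print(z1)
```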

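The provided solution almost worked, but in my case the index was a DatetimeIndex and I had a NaT value at the beginning. The workaround:

import pandas as pd

def load_pandas_series(filename):
    # Read the headerless two-column file, parsing the first column as dates.
    series = pd.read_csv(filename, index_col=0, names=['', 'values'],
                         header=None, parse_dates=True)['values']
    series.name = None  # drop the placeholder series name
    # Keep only the rows whose index entry is a valid date (not NaT).
    series = series[~series.index.isnull()]
    return series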
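
The filtering step can be seen in isolation in the sketch below. The series is built by hand with a hypothetical index whose first entry is NaT, rather than read from a file, so the example does not depend on how any particular CSV is parsed.

```python
import pandas as pd

# Hypothetical index with a missing timestamp in first position.
index = pd.DatetimeIndex([pd.NaT,
                          pd.Timestamp("2020-01-01"),
                          pd.Timestamp("2020-01-02")])
raw = pd.Series([0.0, 1.5, 2.5], index=index)

# Index.isnull() flags the NaT entries; negating the mask keeps the valid rows.
cleaned = raw[~raw.index.isnull()]
print(cleaned)
```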
