pandas.Series.from_csv is deprecated (since version 0.21). I want to change my code to use pandas.read_csv. However, I can't find a variant that loads the same data from a CSV file without a header line.
For example, assume the following CSV:
cntry,country
ctr,center
hts,heights
ft,fort
mt,mount
spg,springs
spgs,springs
st,saint
ter,terrace
e,east
w,west
s,south
n,north
The following deprecated code:
z1 = pd.Series.from_csv('file.csv')
type(z1)
z1.shape
z1
gives me (executed in a notebook):
pandas.core.series.Series
(13,)
cntry country
ctr center
hts heights
ft fort
mt mount
spg springs
spgs springs
st saint
ter terrace
e east
w west
s south
n north
dtype: object
And I can't get the same result using pandas.read_csv with combinations of index_col=0, header=None, parse_dates=True, and squeeze=True. For instance:
z2 = pd.read_csv('file.csv',index_col=0,header=None,parse_dates=True,squeeze=True)
type(z2)
z2.shape
z2
gives me:
pandas.core.series.Series
(13,)
0
cntry country
ctr center
hts heights
ft fort
mt mount
spg springs
spgs springs
st saint
ter terrace
e east
w west
s south
n north
Name: 1, dtype: object
The difference is the 0, which isn't shown among .values or .iteritems(). I don't understand what it is, or how to eliminate it using pandas.read_csv and its parameters.
UPDATE
0 is the index's name. It can be removed with .rename_axis(index=None).
1 is the series' name. It can be removed with .rename().
So far I can't see how to do either using pandas.read_csv alone.
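To make the update concrete, here is a minimal sketch that reads the data and then strips both labels after the fact. It assumes a recent pandas, where the squeeze keyword of read_csv is no longer available (it was deprecated in 1.4 and removed in 2.0), so DataFrame.squeeze("columns") is used in its place. The in-memory csv_text is a hypothetical stand-in for the first rows of file.csv.

```python
import io

import pandas as pd

# Hypothetical stand-in for 'file.csv': the first three rows of the sample data.
csv_text = "cntry,country\nctr,center\nhts,heights\n"

# No header row; column 0 becomes the index, leaving one data column labeled 1.
df = pd.read_csv(io.StringIO(csv_text), header=None, index_col=0)

# Collapse the single remaining column into a Series (replacement for squeeze=True).
z2 = df.squeeze("columns")

# Remove the series name (1) and the index name (0) described in the update.
z2 = z2.rename(None).rename_axis(index=None)

print(z2)
```

After these two calls both z2.name and z2.index.name are None, so the extra "0" line and the "Name: 1" label should no longer appear in the output.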
You should be able to do something like:
z1 = pd.read_csv('file.csv', header=None, names=['', 'values'], index_col=0)['values']
This will read the file into a dataframe, setting the first column as the index, then select the second column (named "values") as a series.
The only difference from your example is that the name of the series will be "values". You can always run z1.name = None if that is not desirable.
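As a rough sketch, the same idea can be wrapped in a small helper that also clears both names, so the result is closer to what Series.from_csv returned. The function name read_headerless_series is hypothetical, and clearing the index name explicitly is a precaution on my part, not something the approach above strictly requires.

```python
import io

import pandas as pd


def read_headerless_series(source):
    """Hypothetical helper: load a two-column, headerless CSV as an unnamed Series."""
    df = pd.read_csv(source, header=None, names=["", "values"], index_col=0)
    # Select the data column, then drop the series name and the index name.
    return df["values"].rename(None).rename_axis(index=None)


# Usage with an in-memory stand-in for the first rows of 'file.csv'.
z1 = read_headerless_series(io.StringIO("cntry,country\nctr,center\n"))
print(z1)
```

Passing a path such as 'file.csv' instead of the StringIO object should work the same way, since read_csv accepts both.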
The provided solution almost worked, but for me the index was a DatetimeIndex, and I had a NaT value at the beginning. The workaround:
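import pandas as pd

def load_pandas_series(filename):
    # Read a headerless two-column CSV; column 0 becomes a parsed date index.
    df1 = pd.read_csv(filename, index_col=0, names=['', 'values'],
                      header=None, parse_dates=True)['values']
    df1.name = None  # drop the placeholder series name
    # Discard rows whose index is missing (NaT).
    df1 = df1[~df1.index.isnull()]
    return df1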
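The filtering step of this workaround can be seen in isolation in the sketch below. The data is made up for illustration: a series whose DatetimeIndex starts with NaT, similar to the situation described above.

```python
import pandas as pd

# Made-up data: a DatetimeIndex whose first entry is missing (NaT).
index = pd.to_datetime([None, "2020-01-01", "2020-01-02"])
series = pd.Series([0.0, 1.5, 2.5], index=index)

# Index.isnull() marks NaT entries; ~ inverts the mask so only valid dates remain.
cleaned = series[~series.index.isnull()]

print(cleaned)
```

Only the two rows with valid dates are kept; the values themselves are left untouched.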