Pandas: get second character of the string, from every row

Question

I have an array of data in Pandas and I'm trying to print the second character of every string in col1. I can't figure out how to do it. I can easily print the second character of each string individually, for example:

array.col1[0][1]

However I'd like to print the second character from every row, so there would be a "list" of second characters.

I've tried

array.col1[0:][1]

but that just returns the whole second row of col1.

Any advice?

Answer 1

You can use str to access the string methods for the column/Series and then slice the strings as normal:

>>> import pandas as pd
>>> df = pd.DataFrame(['foo', 'bar', 'baz'], columns=['col1'])
>>> df
  col1
0  foo
1  bar
2  baz

>>> df.col1.str[1]
0    o
1    a
2    a

This str attribute also gives you access to a variety of very useful vectorised string methods, many of which are instantly recognisable from Python's own assortment of built-in string methods (split, replace, etc.).
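As a small sketch of how the str accessor behaves beyond the three-row example: the extra rows 'x' and None below are made up for illustration and are not part of the original question. Indexing through str gives a missing value instead of raising when a string is too short or absent, whereas str.slice gives an empty string for a too-short value.

```python
import pandas as pd

# 'x' and None are illustrative edge cases, not data from the question.
df = pd.DataFrame({"col1": ["foo", "bar", "baz", "x", None]})

# Second character of every row via the str accessor.
second = df["col1"].str[1]

# str.get(1) is the method form of the same positional lookup.
same_via_get = df["col1"].str.get(1)

# str.slice(1, 2) behaves like s[1:2] on each string, so a string
# that is too short produces '' rather than a missing value.
sliced = df["col1"].str.slice(1, 2)

print(second)
print(sliced)
```

Which form to prefer depends on whether a too-short string should show up as missing (str[1]) or as an empty string (str.slice).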

Answer 2

As of Pandas 0.23.0, if your data is clean, you will find that Pandas "vectorised" string methods via pd.Series.str generally underperform simple iteration via a list comprehension or map.

For example:

import pandas as pd
from operator import itemgetter

df = pd.DataFrame(['foo', 'bar', 'baz'], columns=['col1'])

df = pd.concat([df]*100000, ignore_index=True)

%timeit pd.Series([i[1] for i in df['col1']])            # 33.7 ms
%timeit pd.Series(list(map(itemgetter(1), df['col1'])))  # 42.2 ms
%timeit df['col1'].str[1]                                # 214 ms
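The %timeit lines above are IPython magics. Outside IPython, a rough equivalent using the standard library's timeit module might look like the sketch below. The function names and the smaller row count are assumptions chosen so it runs quickly, and the timings will differ by machine and Pandas version.

```python
import timeit
from operator import itemgetter

import pandas as pd

df = pd.DataFrame(["foo", "bar", "baz"], columns=["col1"])
# 30,000 rows here (an arbitrary choice) rather than the 300,000 above.
df = pd.concat([df] * 10_000, ignore_index=True)


def via_comprehension():
    # Plain Python iteration over the column values.
    return pd.Series([s[1] for s in df["col1"]])


def via_map():
    # itemgetter(1) fetches index 1 from each string.
    return pd.Series(list(map(itemgetter(1), df["col1"])))


def via_str_accessor():
    # Pandas string accessor.
    return df["col1"].str[1]


for fn in (via_comprehension, via_map, via_str_accessor):
    # Best of 3 repeats, 5 calls each, reported per call.
    seconds = min(timeit.repeat(fn, number=5, repeat=3)) / 5
    print(f"{fn.__name__}: {seconds * 1e3:.2f} ms per call")
```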

A special case is when you have a large number of repeated strings, in which case you can benefit from converting your series to a categorical:

df['col1'] = df['col1'].astype('category')

%timeit df['col1'].str[1]  # 4.9 ms
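The "if your data is clean" condition matters: a bare list comprehension raises on missing values or strings shorter than two characters, where str[1] would return a missing value. A minimal sketch of a guarded comprehension follows; second_chars is a hypothetical helper name and the messy sample data is made up for illustration.

```python
import pandas as pd


def second_chars(series):
    """Hypothetical helper: second character per row, None where unavailable."""
    return pd.Series(
        [s[1] if isinstance(s, str) and len(s) > 1 else None for s in series],
        index=series.index,  # keep the original index, unlike a bare pd.Series(list)
    )


# Illustrative data with a too-short string and a missing value.
messy = pd.Series(["foo", "bar", "x", None], index=[10, 11, 12, 13])

try:
    [s[1] for s in messy]
    unguarded_failed = False
except (IndexError, TypeError):
    # 'x'[1] raises IndexError; indexing a missing value raises TypeError.
    unguarded_failed = True

result = second_chars(messy)
print(result)
```

The extra checks cost some of the speed advantage, so whether this still beats the str accessor on a given dataset is something to measure rather than assume.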

solution1
11 ACCEPTED 2014-11-19 15:38:37

solution2
0 2018-10-06 21:45:32
