How to get the column, row, and value from a partial string efficiently with Pandas
I have a pandas DataFrame with about 150 rows and 8 columns. What I am looking to do is efficiently get the column and index for cells based on a partial string. What I came up with was as follows:
import pandas as pd
df = pd.DataFrame([["foo", "foo", "foo", "foo"], ["foo", "bar", "foo", "foo"],
                   ["bar", "foo", "foo", "bar"], ["foo", "foo", "foo", "bar"]])
Output:
     0    1    2    3
0  foo  foo  foo  foo
1  foo  bar  foo  foo
2  bar  foo  foo  bar
3  foo  foo  foo  bar
Here if I'm looking for just the entries that contain the sub-string "ar" I employ:
setup_mask = df.applymap(lambda x: "ar" in str(x))
values_hold = []
for x in df.index:
    for y in df.columns:
        if setup_mask.loc[x, y].any() == bool(True):
            if [x, y] not in values_hold:
                values_hold.append([x, y])
This works well and returns a list of [index, column] pairs: [[1, 1], [2, 0], [2, 3], [3, 3]].
This feels unpythonic and really just plain messy. Is there a more Pythonic way to do something like this?
PS: I know I could cut out the mask, but I felt that a more Pythonic approach would probably rely on a mask.
Pandas supports vectorized string operations, but only on one column at a time. So:
df.apply(lambda ser: ser.str.contains('ar'))
Will give you this:
       0      1      2      3
0  False  False  False  False
1  False   True  False  False
2   True  False  False   True
3  False  False  False   True
And it's pretty efficient so long as you have fewer columns than rows (which you do).
If you store the above in mask, then:
np.transpose(np.where(mask))
Gives you your answer:
array([[1, 1],
       [2, 0],
       [2, 3],
       [3, 3]])
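For completeness, here is a minimal end-to-end sketch of the mask plus np.where approach that also pulls out the matched values, since the question title asks for them. The names rows, cols and hits are just illustrative, and regex=False is an assumption on my part that you want a literal substring match rather than a regular expression. One thing to keep in mind: np.where returns positions, not labels. With the default integer index and columns used here the two coincide, so the sketch maps positions back to labels explicitly in case your real DataFrame is labeled differently.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([
    ["foo", "foo", "foo", "foo"],
    ["foo", "bar", "foo", "foo"],
    ["bar", "foo", "foo", "bar"],
    ["foo", "foo", "foo", "bar"],
])

# Boolean mask: True where the cell contains the literal substring "ar".
mask = df.apply(lambda ser: ser.str.contains("ar", regex=False))

# np.where gives the positional (row, column) indices of the True cells.
rows, cols = np.where(mask.to_numpy())

# Translate positions to labels and look up the matching cell values.
hits = [
    (df.index[r], df.columns[c], df.iat[r, c])
    for r, c in zip(rows, cols)
]
```

Each entry of hits is an (index label, column label, value) tuple, in row-major order.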
You can use transform with str.contains and stack:
In [5352]: s = df.transform(lambda x: x.str.contains('ar')).stack()
In [5353]: s.index[s].tolist()
Out[5353]: [(1L, 1L), (2L, 0L), (2L, 3L), (3L, 3L)]
Or, as a list of lists:
In [5366]: [list(map(int, x)) for x in s.index[s]]
Out[5366]: [[1, 1], [2, 0], [2, 3], [3, 3]]
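A slight variation on the same idea, shown as a self-contained sketch rather than the exact code above: stack first, then call str.contains once on the resulting Series. This assumes every cell is a string (missing values would need extra handling), and the names stacked, matches and pairs are only illustrative. Because the stacked Series is indexed by (row label, column label), the result is label-based and the matched values come along for free.

```python
import pandas as pd

df = pd.DataFrame([
    ["foo", "foo", "foo", "foo"],
    ["foo", "bar", "foo", "foo"],
    ["bar", "foo", "foo", "bar"],
    ["foo", "foo", "foo", "bar"],
])

# Flatten to a Series whose MultiIndex holds (row label, column label).
stacked = df.stack()

# Keep only the cells containing the literal substring "ar".
matches = stacked[stacked.str.contains("ar", regex=False)]

# The index entries are the [row, column] pairs; the values are the cells.
pairs = [list(idx) for idx in matches.index]
values = matches.tolist()
```

Here pairs holds the row and column labels of the matching cells and values holds the corresponding cell contents, in the same order.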