Replacing specific values in numpy 2D Array

Question

I have a numpy 2D array like so:

[['a', '(junk, b)', '(junk, c)'],
 ['d', '(junk, e)', '(junk, f)'],
 ['g', '(junk, h)', '(junk, i)']]

As you can see, some of the values are wrapped in parentheses along with extra text. I'd like to remove that extra text so that my new array is:

[['a', 'b', 'c'],
 ['d', 'e', 'f'],
 ['g', 'h', 'i']]

I have a regex whose match group captures the data I want. Is there a clean way within numpy to apply a regex to the values at certain positions and return a new array with the unwanted text removed?

Answer 1

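If the values are only wrapped in parentheses (for example '(b)'), you can use a nested list comprehension that strips each item with the str.strip() method, where l is your nested list: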

>>> np.array([[x.strip('()') for x in i] for i in l])
array([['a', 'b', 'c'],
       ['d', 'e', 'f'],
       ['g', 'h', 'i']], 
      dtype='|S1')
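
If you would rather stay inside NumPy for this simple parentheses-only case, np.char.strip applies str.strip element-wise. A minimal sketch, assuming the values look like '(b)' with no extra words (the sample array below is made up for illustration):

```python
import numpy as np

# Hypothetical input: values wrapped in parentheses only, no extra words.
arr = np.array([['a', '(b)', '(c)'],
                ['d', '(e)', '(f)'],
                ['g', '(h)', '(i)']])

# np.char.strip calls str.strip on every element and returns a new array.
stripped = np.char.strip(arr, '()')
print(stripped.tolist())
```

This only trims the listed characters from both ends of each string, so it will not remove text such as "junk, " inside the parentheses.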

Based on your edit: if the values also contain extra words, you can use a regex to match the single characters instead:

>>> import re
>>> import numpy as np
>>> l=[['a', '(junk, b)', '(junk, c)'],
...  ['d', '(junk, e)', '(junk, f)'],
...  ['g', '(junk, h)', '(junk, i)']]
>>> # \b[a-z]\b matches a single lowercase letter standing on its own
>>> np.array([[re.search(r'\b[a-z]\b',x).group() for x in i] for i in l])
array([['a', 'b', 'c'],
       ['d', 'e', 'f'],
       ['g', 'h', 'i']], 
      dtype='|S1')
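
To apply the regex only at certain positions of an existing array, one option is to wrap the search in np.vectorize and assign the result back into a slice. This is a sketch rather than a definitive approach: the helper name extract_letter is hypothetical, the choice of columns 1 onward is an assumption based on the sample data, and np.vectorize is a convenience wrapper around a Python loop, not a speed-up.

```python
import re
import numpy as np

arr = np.array([['a', '(junk, b)', '(junk, c)'],
                ['d', '(junk, e)', '(junk, f)'],
                ['g', '(junk, h)', '(junk, i)']])

pattern = re.compile(r'\b[a-z]\b')

def extract_letter(value):
    """Return the first standalone lowercase letter, or the value unchanged."""
    match = pattern.search(value)
    return match.group() if match else value

# Work on a copy so the original array is left untouched.
cleaned = arr.copy()
# Assumption: only columns 1 and onward hold the "(junk, x)" values.
cleaned[:, 1:] = np.vectorize(extract_letter)(arr[:, 1:])
print(cleaned.tolist())
```

Falling back to the original value when nothing matches avoids the AttributeError that calling .group() on None would raise.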
