Adding leading zeros to strings in NumPy array

I have a 3-dimensional array filled with strings, mostly numbers but also some text. If a string contains only one digit (e.g. 1, 5), I want to add a leading zero to it, so that the values look like 01, 05, 14, etc. I can't get it to work for my NumPy array though.

I tried (among others):

strlist = ['1','2','3','4','5','6','7','8','9']
arr[np.isin(arr, strlist)] = '0' + arr[np.isin(arr, strlist)] 

But this doesn't work. Does anyone have any tips?
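
For reference, a minimal sketch of how the attempt above could be made to work, assuming the goal is to prepend '0' only to the strings listed in strlist. The small array here is hypothetical. Two things change: the array is widened to a two-character dtype first, because a fixed-width string array silently truncates longer values on assignment, and np.char.add is used for the element-wise concatenation.

```python
import numpy as np

# Hypothetical small array; dtype is '<U1' because every entry has one character.
arr = np.array([['1', 'T'], ['5', '9']])
strlist = ['1', '2', '3', '4', '5', '6', '7', '8', '9']  # note: '0' is not listed

# Widen the dtype so two-character results are not truncated on assignment.
arr = arr.astype('<U2')
mask = np.isin(arr, strlist)

# np.char.add concatenates element-wise; the scalar '0' is broadcast.
arr[mask] = np.char.add('0', arr[mask])
print(arr)
```

Computing the mask once and reusing it also avoids calling np.isin twice.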

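NumPy has several useful functions for arrays of strings; see the NumPy docs on String operations. The function you are looking for is np.char.zfill, which is an alias of np.core.defchararray.zfill. Prefer the np.char spelling: the np.core namespace was made private in NumPy 2.0, and accessing it emits a deprecation warning.

Taking an example array from David Buck's answer: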

>>> import numpy as np
>>> arr = np.array([[['3', '6', '12'],
                     ['0', '1', '3'],
                     ['5', 'T', '8'],
                     ['19', '15', '11']],
                    [['6', '3', '1'],
                     ['10', '10', 'QR'],
                     ['7', '11', '9'],
                     ['12', '13', '11']],
                    [['1', 'G', '3'],
                     ['10', '9', '2'],
                     ['18', '12', '17'],
                     ['6', '1', '10']]])
>>> np.char.zfill(arr, 2)
array([[['03', '06', '12'],
        ['00', '01', '03'],
        ['05', '0T', '08'],
        ['19', '15', '11']],

       [['06', '03', '01'],
        ['10', '10', 'QR'],
        ['07', '11', '09'],
        ['12', '13', '11']],

       [['01', '0G', '03'],
        ['10', '09', '02'],
        ['18', '12', '17'],
        ['06', '01', '10']]], dtype='<U2')

If you want to avoid adding zeros to elements that are not digits, you can use boolean array indexing together with np.char.isdigit (an alias of np.core.defchararray.isdigit):

>>> mask = np.char.isdigit(arr)
>>> mask
array([[[ True,  True,  True],
        [ True,  True,  True],
        [ True, False,  True],
        [ True,  True,  True]],

       [[ True,  True,  True],
        [ True,  True, False],
        [ True,  True,  True],
        [ True,  True,  True]],

       [[ True, False,  True],
        [ True,  True,  True],
        [ True,  True,  True],
        [ True,  True,  True]]])
>>> arr[mask] = np.char.zfill(arr[mask], 2)
>>> arr
array([[['03', '06', '12'],
        ['00', '01', '03'],
        ['05', 'T', '08'],
        ['19', '15', '11']],

       [['06', '03', '01'],
        ['10', '10', 'QR'],
        ['07', '11', '09'],
        ['12', '13', '11']],

       [['01', 'G', '03'],
        ['10', '09', '02'],
        ['18', '12', '17'],
        ['06', '01', '10']]], dtype='<U2')
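
One caveat with the in-place assignment: it relies on arr already having a dtype wide enough for the padded strings ('<U2' in the example, because of entries like '12'). If every element were a single character, the dtype would be '<U1' and the assignment would silently truncate '07' to '0'. A hedged sketch of a variant that builds a new array with np.where instead, using a hypothetical 1-D array:

```python
import numpy as np

arr = np.array(['7', 'T', '3'])   # dtype '<U1': every entry is one character
mask = np.char.isdigit(arr)

# In-place assignment into a '<U1' array truncates the padded strings.
truncated = arr.copy()
truncated[mask] = np.char.zfill(truncated[mask], 2)

# np.where builds a new array whose dtype is wide enough for both inputs.
padded = np.where(mask, np.char.zfill(arr, 2), arr)

print(truncated)
print(padded)
```

Here padded keeps 'T' unchanged and holds the full two-character strings, while arr itself is left untouched.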

You can define a function that zero-pads values that parse as integers and returns everything else unchanged, then use np.vectorize to apply it to the whole array.

import numpy as np

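def pad(value):
    # Pad values that parse as integers to two characters with leading zeros.
    try:
        return '{0:0>2}'.format(int(value))
    except (TypeError, ValueError):
        # Not an integer (e.g. 'T', 'QR'): return it unchanged.
        return value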

vfunc = np.vectorize(pad)
arr = vfunc(arr)
print(arr)

Applying that to an input of:

arr = np.array([[['3', '6', '12'],
                 ['0', '1', '3'],
                 ['5', 'T', '8'],
                 ['19', '15', '11']],
                [['6', '3', '1'],
                 ['10', '10', 'QR'],
                 ['7', '11', '9'],
                 ['12', '13', '11']],
                [['1', 'G', '3'],
                 ['10', '9', '2'],
                 ['18', '12', '17'],
                 ['6', '1', '10']]])

returns

[[['03' '06' '12']
  ['00' '01' '03']
  ['05' 'T' '08']
  ['19' '15' '11']]
 [['06' '03' '01']
  ['10' '10' 'QR']
  ['07' '11' '09']
  ['12' '13' '11']]
 [['01' 'G' '03']
  ['10' '09' '02']
  ['18' '12' '17']
  ['06' '01' '10']]]
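
Note that going through int() also normalizes the value: for example, '007' would come back as '07', and strings with surrounding whitespace or a sign would be parsed too. If that is not wanted, a possible variant (a sketch, not part of the original answer) uses the str methods isdigit and zfill instead. The name pad_digits and its width parameter are hypothetical. Passing otypes=[object] sidesteps np.vectorize inferring the output type from the first element; the result is converted back to a string dtype afterwards.

```python
import numpy as np

def pad_digits(value, width=2):
    """Zero-pad strings made only of digits; return anything else unchanged."""
    return value.zfill(width) if value.isdigit() else value

arr = np.array([['3', 'T'], ['12', 'QR']])

# otypes=[object] avoids inferring the output type from the first result.
vfunc = np.vectorize(pad_digits, otypes=[object])
result = vfunc(arr).astype(str)
print(result)
```

Like the original, this still calls a Python function per element, so np.char.zfill is likely the faster choice for large arrays.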
