DataFrame with array of Letters

Question

data['Ln']
Out[46]: 
0        [C, C, C, C, C, C, G, I, O, P, P, P, R, R, R, ...
1        [C, C, C, C, C, C, G, I, O, P, P, P, R, R, R, ...
2        [C, C, C, C, C, C, G, I, O, P, P, R, R, R, R, ...
3        [C, C, C, C, C, C, G, I, O, P, P, R, R, R, R, ...
4        [C, C, C, C, C, C, G, I, O, P, P, P, R, R, R, ...
                               ...                        
43244                       [G, I, O, P, P, P, R, R, R, R]
43245                       [G, I, O, P, P, P, R, R, R, R]
43246                             [G, I, O, P, P, R, R, R]
43247                             [G, I, O, P, P, R, R, R]
43248                                   [G, I, O, P, R, R]
Name: Ln, Length: 43249, dtype: object

How can I structure a for loop to iterate over every row and every letter, using either sklearn.preprocessing.LabelEncoder or ord()?

For instance, I want every 'C' in every row to be the same number, as well as G, I, etc.

Answer 1

Create a dict that maps each letter to a number, then apply it to every row with map:

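# Only two letters are shown here; add an entry for every other letter in your data,
# otherwise .get() returns None for the letters that are missing.
alphabet_dict = {'C': 0, 'G': 1}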

data['Ln'].map(lambda x: [alphabet_dict.get(i) for i in x])

0    [0, 0, 0, 0, 0]
1    [1, 1, 1, 1, 1]
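
If typing the dict by hand is impractical, it could be built from the data instead. The following is only a sketch, not part of the original answer: the small `data` frame is a made-up stand-in for the real one, the column names `Ln_encoded` and `Ln_ord` are hypothetical, and it assumes every row is a non-empty list of single-character strings. It also shows the `ord()` route mentioned in the question.

```python
import pandas as pd

# Hypothetical sample standing in for data['Ln'] from the question.
data = pd.DataFrame(
    {"Ln": [["C", "C", "G", "I"], ["G", "I", "O", "P", "R"], ["G", "P", "R"]]}
)

# Collect every distinct letter across all rows; sorting keeps the codes reproducible.
letters = sorted(set(data["Ln"].explode()))

# Assign each letter one integer code that is shared by every row.
alphabet_dict = {letter: code for code, letter in enumerate(letters)}

# Encode row by row. Indexing with [] raises KeyError on an unmapped letter
# instead of silently producing None as .get() would.
data["Ln_encoded"] = data["Ln"].map(lambda row: [alphabet_dict[ch] for ch in row])

# Alternative with the built-in ord(): no dict needed, codes are Unicode code points.
data["Ln_ord"] = data["Ln"].map(lambda row: [ord(ch) for ch in row])
```

With the dict approach the codes are small consecutive integers (0, 1, 2, ...), while `ord()` gives fixed but non-consecutive values (67 for 'C', 71 for 'G'). Either way the same letter gets the same number in every row.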


1 answer

solution1
1 ACCEPTED 2021-02-11 16:14:49
