Convert string-encapsulated arrays to arrays with pandas

I am working with a DataFrame that contains arrays. After read_csv, pandas seems to store my vectors as strings, like this:

df['column'].iloc[3]
>>>'[50.6402809, 4.6667145]'

type(df['column'].iloc[3])
>>> str

How can I convert the entire column to array? Like so:

df['column'].iloc[3]
>>>[50.6402809, 4.6667145]

type(df['column'].iloc[3])
>>> array

If you want NumPy arrays, use a lambda function with ast.literal_eval and convert each result to an array:

import ast
import numpy as np

df['column'] = df['column'].apply(lambda x: np.array(ast.literal_eval(x)))

And if you need lists, either of these works:

df['column'] = df['column'].apply(ast.literal_eval)

# equivalent, using a list comprehension
df['column'] = [ast.literal_eval(x) for x in df['column']]
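
Since the strings originate from read_csv, the same parsing can also be done while the file is read, through the converters parameter of read_csv. The following is only a minimal sketch: csv_text is hypothetical data standing in for the real file, and it assumes the list column is quoted in the CSV so that its inner comma is not read as a field separator.

```python
import ast
import io

import numpy as np
import pandas as pd

# Hypothetical CSV content standing in for the real file; the list column is
# quoted so the embedded comma is not treated as a field separator.
csv_text = 'id,column\n0,"[50.6402809, 4.6667145]"\n1,"[48.8566, 2.3522]"\n'

# Without a converter, the column comes back as plain strings.
raw = pd.read_csv(io.StringIO(csv_text))

# With a converter, each cell is parsed into a list while the file is read.
as_lists = pd.read_csv(
    io.StringIO(csv_text),
    converters={"column": ast.literal_eval},
)

# Same idea, but wrapping each parsed list in a NumPy array.
as_arrays = pd.read_csv(
    io.StringIO(csv_text),
    converters={"column": lambda s: np.array(ast.literal_eval(s))},
)
```

This avoids a second pass over the column, at the cost of tying the parsing step to the read_csv call.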

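You could use the ast module to interpret the strings literally. However, this can be risky, especially when the data comes from a file or, worse, from an online source: ast.literal_eval only evaluates literals and does not run arbitrary code, but sufficiently large or deeply nested input can still exhaust memory or crash the interpreter.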
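
If the worry is handing untrusted text to a Python-literal parser, one possible option (not part of the original answers) is json.loads, because strings of the form [1.0, 2.0] are also valid JSON. This is a sketch under the assumption that every cell holds only numbers inside square brackets; the sample DataFrame and the helper name parse_vector are hypothetical.

```python
import json

import numpy as np
import pandas as pd

# Hypothetical sample data in the same shape as the question's column.
df = pd.DataFrame({"column": ["[50.6402809, 4.6667145]", "[48.8566, 2.3522]"]})


def parse_vector(text):
    """Parse a '[x, y, ...]' string into a 1-D float array, or raise ValueError."""
    values = json.loads(text)  # raises a ValueError subclass on invalid JSON
    if not isinstance(values, list):
        raise ValueError(f"expected a list, got {type(values).__name__}")
    return np.array(values, dtype=float)


df["column"] = df["column"].apply(parse_vector)
```

Cells that use Python-only syntax, such as tuples or single-quoted strings, would be rejected by this approach rather than parsed.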

An alternative would be to parse the strings directly using the Series.str methods:

In [19]: parsed = (
    ...:     df['column']
    ...:     .str.strip('[]')
    ...:     .str.split(', ')
    ...:     .apply(lambda x: np.array(x).astype(float)))
    ...:

In [20]: parsed
Out[20]:
0    [0.45482146988492345, 0.40132331304489344]
1      [0.4820128044982769, 0.6930103661982894]
2      [0.15845986027370507, 0.825879918750825]
3      [0.08389109330674027, 0.031864037778777]
Name: column, dtype: object
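
When every row has the same number of components, a variation on the string-based approach is to pass expand=True to str.split, which yields one float column per component and, from there, a single 2-D array. This is a sketch assuming equal-length rows and a ", " separator; the sample DataFrame is hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; every row is assumed to have the same length.
df = pd.DataFrame({"column": ["[50.6402809, 4.6667145]", "[48.8566, 2.3522]"]})

# expand=True spreads the split pieces over separate columns,
# which can then be cast to float in one step.
components = (
    df["column"]
    .str.strip("[]")
    .str.split(", ", expand=True)
    .astype(float)
)

# One 2-D array of shape (n_rows, n_components) instead of a column of objects.
matrix = components.to_numpy()
```

A 2-D float array is usually easier to use in vectorized computations than an object column holding one small array per row.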

