I am working with a DataFrame that contains arrays. After read_csv, pandas seems to store my vectors as str. Like this:
df['column'].iloc[3]
>>>'[50.6402809, 4.6667145]'
type(df['column'].iloc[3])
>>> str
How can I convert the entire column to array? Like so:
df['column'].iloc[3]
>>>[50.6402809, 4.6667145]
type(df['column'].iloc[3])
>>> array
If you want NumPy arrays, use a lambda function with ast.literal_eval and convert each result to an array:
import ast
import numpy as np
df['column'] = df['column'].apply(lambda x: np.array(ast.literal_eval(x)))
And if you need lists, either of these works:
df['column'] = df['column'].apply(ast.literal_eval)
df['column'] = [ast.literal_eval(x) for x in df['column']]
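For reference, here is a minimal self-contained sketch of both variants. The CSV text and the `id` column are made up for illustration, and the vectors are assumed to have been written out as quoted strings, as in the question.

```python
import ast
import io

import numpy as np
import pandas as pd

# Hypothetical CSV content: each vector was saved as a quoted string.
csv_text = 'id,column\n0,"[50.6402809, 4.6667145]"\n1,"[1.5, 2.5]"\n'
df = pd.read_csv(io.StringIO(csv_text))

# Variant 1: turn every cell into a Python list.
as_lists = df['column'].apply(ast.literal_eval)

# Variant 2: turn every cell into a NumPy array.
as_arrays = df['column'].apply(lambda x: np.array(ast.literal_eval(x)))

print(type(as_lists.iloc[0]))
print(type(as_arrays.iloc[0]))
```

Both results are Series of Python objects, so assigning either one back to df['column'] gives the behaviour asked for in the question.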
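You could use the ast module to interpret the strings literally. However, this can be risky, especially when the data comes from a file or, worse, from an online source: ast.literal_eval does not run arbitrary code, but a sufficiently large or deeply nested string can still crash the interpreter.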
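If you do stay with ast, one way to limit the risk is to validate what comes back. The sketch below is only an illustration: `parse_vector` is a hypothetical helper, not part of pandas or ast, and it assumes each cell should be a flat list of numbers.

```python
import ast


def parse_vector(text):
    """Parse a string such as '[1.0, 2.0]' into a list of floats.

    Hypothetical helper: raises ValueError for anything that is not
    a flat list of numbers.
    """
    try:
        value = ast.literal_eval(text)
    except (ValueError, TypeError, SyntaxError, MemoryError, RecursionError) as exc:
        raise ValueError(f"not a valid literal: {text!r}") from exc

    if not isinstance(value, list):
        raise ValueError(f"expected a list, got {type(value).__name__}")

    for item in value:
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(item, bool) or not isinstance(item, (int, float)):
            raise ValueError(f"non-numeric element: {item!r}")

    return [float(item) for item in value]
```

It can be used as df['column'].apply(parse_vector). Expressions that are not literals, such as a function call, are rejected by ast.literal_eval itself and surface here as ValueError.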
An alternative would be to parse the strings directly using the Series.str methods:
In [19]: parsed = (
    ...:     df['column']
    ...:     .str.strip('[]')
    ...:     .str.split(', ')
    ...:     .apply(lambda x: np.array(x).astype(float)))
    ...:
In [20]: parsed
Out[20]:
0 [0.45482146988492345, 0.40132331304489344]
1 [0.4820128044982769, 0.6930103661982894]
2 [0.15845986027370507, 0.825879918750825]
3 [0.08389109330674027, 0.031864037778777]
Name: column, dtype: object
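The same idea as a self-contained script, in case it helps. This is a sketch under a few assumptions: the CSV text is made up, every cell is a flat bracketed list, and the split is on ',' alone (with float() absorbing the surrounding spaces) so that inconsistent spacing does not break the parse.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical CSV content with vectors stored as quoted strings.
csv_text = 'column\n"[0.5, 0.25]"\n"[1.0,2.0]"\n'
df = pd.read_csv(io.StringIO(csv_text))

parsed = (
    df['column']
    .str.strip('[]')   # drop the surrounding brackets
    .str.split(',')    # one string per component
    .apply(lambda parts: np.array([float(p) for p in parts]))
)

# If all vectors have the same length, they can be stacked into
# a single 2-D float array for numerical work.
matrix = np.vstack(parsed.tolist())
print(matrix.shape)
```

Stacking is optional, but a 2-D array is usually easier to work with than a column of separate array objects.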