
How to convert a string back to a 3D numpy array after a 3D image was saved to CSV

I have a CSV file with one column of image data. Before saving to CSV, each image was a 3D numpy array, so each cell of this column held a 3D array. After saving to CSV and reading the file back with pandas, the arrays became strings. Now I want to recreate the arrays from those strings. Below is a sample of the kind of string I want to convert to a 3D numpy array.

import numpy as np

my_string_array = str(np.random.randint(0, high=255, size=(51, 52, 3)))

I tried the approach described in "how to read numpy 2D array from string?", but it seems I need something different, since I have a 3D array.

I know that if the arrays had been converted to lists before saving to CSV, then

import ast
my_array = np.array(ast.literal_eval(my_string_array))

would work, but unfortunately this is not the case. After running this I got an error:

Traceback (most recent call last):

  File "/opt/lyp-venv/lib/python3.7/site-packages/IPython/core/interactiveshell.py", line 3319, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)

  File "<ipython-input-25-3e5a6dae7682>", line 2, in <module>
    my_array = np.array(ast.literal_eval(my_string_array))

  File "/usr/lib/python3.7/ast.py", line 46, in literal_eval
    node_or_string = parse(node_or_string, mode='eval')

  File "/usr/lib/python3.7/ast.py", line 35, in parse
    return compile(source, filename, mode, PyCF_ONLY_AST)

  File "<unknown>", line 1
    [[[205  60 145]
             ^  
SyntaxError: invalid syntax

Regarding the error that you added:

ast.literal_eval(my_string_array)
....
[[[205  60 145]
         ^  
SyntaxError: invalid syntax

literal_eval works on a limited subset of Python syntax. For example, it works on a valid list input, e.g. "[[205, 60, 145]]". But the string in the error message does not match that; it is missing the commas. str(an_array) omits the commas; str(an_array.tolist()) does not.

Most of the answers that deal with loading csv files like this stress that you will need to replace the spaces (or blank delimiters) with commas.

So in this case the error has nothing to do with the array being 3d.

Let me illustrate:

Make a 3d array:

In [720]: arr = np.arange(24).reshape(2,3,4)                                                     

In [722]: arr                                                                                    
Out[722]: 
array([[[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]],

       [[12, 13, 14, 15],
        [16, 17, 18, 19],
        [20, 21, 22, 23]]])

Its str representation, which is probably what pandas writes to the csv:

In [723]: str(arr)                                                                               
Out[723]: '[[[ 0  1  2  3]\n  [ 4  5  6  7]\n  [ 8  9 10 11]]\n\n [[12 13 14 15]\n  [16 17 18 19]\n  [20 21 22 23]]]'

Compare that with what a list str looks like:

In [724]: arr.tolist()                                                                           
Out[724]: 
[[[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]],
 [[12, 13, 14, 15], [16, 17, 18, 19], [20, 21, 22, 23]]]

In [725]: str(arr.tolist())
Out[725]: '[[[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]], [[12, 13, 14, 15], [16, 17, 18, 19], [20, 21, 22, 23]]]'

literal_eval has no problem with this triple nested list string:

In [726]: ast.literal_eval(_)                                                                    
Out[726]: 
[[[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]],
 [[12, 13, 14, 15], [16, 17, 18, 19], [20, 21, 22, 23]]]

literal_eval applied to the array string (the same text as the str(arr) output above) produces your error:

In [727]: ast.literal_eval(Out[721])                                                             
Traceback (most recent call last):

  File "/usr/local/lib/python3.6/dist-packages/IPython/core/interactiveshell.py", line 3319, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)

  File "<ipython-input-727-700e3f960e29>", line 1, in <module>
    ast.literal_eval(Out[721])

  File "/usr/lib/python3.6/ast.py", line 48, in literal_eval
    node_or_string = parse(node_or_string, mode='eval')

  File "/usr/lib/python3.6/ast.py", line 35, in parse
    return compile(source, filename, mode, PyCF_ONLY_AST)

  File "<unknown>", line 1
    [[[ 0  1  2  3]
           ^
SyntaxError: invalid syntax

I might be able to fix that with a couple of string substitutions, effectively converting the array string (the str(arr) output, Out[721] in my session) into the list string of Out[725].
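As a rough sketch of such a substitution (not necessarily what the linked answers do), a single regular expression can insert the missing commas before handing the string to literal_eval. The helper name array_str_to_array is made up for this illustration, and the sketch assumes an integer array whose string was not truncated with '...':

```python
import ast
import re

import numpy as np


def array_str_to_array(text):
    """Hypothetical helper: parse the str() of an untruncated integer ndarray."""
    # Put a comma wherever whitespace (spaces or newlines) separates a digit
    # or a closing bracket from the next number or opening bracket.
    with_commas = re.sub(r"(?<=[\d\]])\s+(?=[\d\[\-])", ", ", text)
    # The result is a valid nested-list literal, which literal_eval accepts.
    return np.array(ast.literal_eval(with_commas))


arr = np.arange(24).reshape(2, 3, 4)
restored = array_str_to_array(str(arr))
```

The padding space that str leaves after an opening bracket (as in "[ 0") is harmless, because literal_eval ignores whitespace inside brackets. Float arrays would need a wider pattern, since values such as "2." or "1e-05" are not covered here.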

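@Mad pointed out that if the array is large enough (over 1000 elements with numpy's default print options), str will produce a condensed version, replacing a lot of the values with '...'. The 51x52x3 sample in the question has 7956 elements, so it falls into this category. You can verify that yourself. If that is the case, no amount of string editing will fix the problem: the omitted values are simply not in the string, so that string is useless.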
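A small check along these lines shows the truncation, plus two ways around it. Both workarounds assume you still control the code that writes the CSV; neither helps with a file that was already saved with '...'. The array and the variable names are only for illustration:

```python
import numpy as np

# 2000 elements, which is above numpy's default summarization threshold of 1000.
big = np.arange(2000).reshape(20, 10, 10)

# Default conversion: most values are replaced by '...'.
truncated = str(big)
has_ellipsis = "..." in truncated

# Option 1: raise the threshold for this one conversion so nothing is dropped.
full = np.array2string(big, threshold=big.size + 1)

# Option 2: store the nested-list string, which is never summarized and
# already contains the commas that ast.literal_eval needs.
as_list = str(big.tolist())
```

Option 2 is usually the simpler choice, because the resulting cell can be read back with np.array(ast.literal_eval(cell)) without any string editing.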

In "how to read numpy 2D array from string?", my answer is of limited value since you already have a string. https://stackoverflow.com/a/44323021/901925 is better. There are also SO questions that deal specifically with the strings that appear in pandas csv files. In any case you need to pay attention to the details of the string, especially delimiters and special characters.
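One alternative that sidesteps the bracket and delimiter details is to ignore the nesting entirely: pull out the numbers in order and reshape them. This is only a sketch under two assumptions that do not come from the linked answers: the values are integers, and the original shape is known (for example, every image is 51x52x3). The helper name parse_flat is invented here:

```python
import re

import numpy as np


def parse_flat(text, shape):
    """Hypothetical helper: rebuild an integer array from its str() and a known shape."""
    # A summarized string has lost data, so refuse to guess.
    if "..." in text:
        raise ValueError("array string was truncated with '...'; values are lost")
    # Brackets, spaces and newlines are skipped; only the integer tokens are kept.
    values = [int(token) for token in re.findall(r"-?\d+", text)]
    # reshape raises ValueError if the number of values does not fit the shape.
    return np.array(values).reshape(shape)


arr = np.arange(-12, 12).reshape(2, 3, 4)
restored = parse_flat(str(arr), arr.shape)
```

Because str writes the elements in row-major order, reshaping the flat list of values with the original shape restores the original layout.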
