
How can I recover my original data type of Array{Tuple{Int64,Int64},1} after writing to a “.csv” - Julia

I have a list of lists of tuples that I am writing out to a file so that it can be read in elsewhere for post-processing. If it helps, it is a list of cycles in a graph, so each cycle is itself a list of tuples. I refer to this list as cycle_basis.

Currently, this is how I am writing it out:

using CSV, DataFrames

df = DataFrame()
df.Path = cycle_basis
df.Size = cycle_size
CSV.write("cycle_basis.csv", df)

cycle_size holds the number of edges in each cycle. The data is stored in the CSV as follows (generated from a very small graph for the sake of this post; normally the file would be much longer):

Path,Size
"[(1, 3), (2, 3), (1, 2)]",3
"[(4, 5), (3, 4), (1, 3), (2, 5), (1, 2)]",5

I have tried casting each string back to its original data type as follows: convert(Array{Tuple{Int64,Int64},1},cycle.Path[1]) , but that just gave me the following error:

ERROR: MethodError: Cannot `convert` an object of type String to an object of type Array{Tuple{Int64,Int64},1}
Closest candidates are:
  convert(::Type{Array{S,N}}, ::PooledArrays.PooledArray{T,R,N,RA} where RA) where {S, T, R, N} at /home/charper/.julia/packages/PooledArrays/yiLq3/src/PooledArrays.jl:294
  convert(::Type{T}, ::AbstractArray) where T<:Array at array.jl:533
  convert(::Type{T}, ::T) where T<:AbstractArray at abstractarray.jl:14
  ...
Stacktrace:
 [1] top-level scope at none:0

I tried the same thing with parse() and got a similar error.

I would not recommend using CSV storage for such data: CSV cells are plain text, so a nested column is written out as its string representation. Probably the easiest thing to do is to use JSONTables.jl:

julia> df = DataFrame(a=[1,2], b=[[(1,2),(3,4)], [(5,6),(7,8)]])
2×2 DataFrame
│ Row │ a     │ b                │
│     │ Int64 │ Array…           │
├─────┼───────┼──────────────────┤
│ 1   │ 1     │ [(1, 2), (3, 4)] │
│ 2   │ 2     │ [(5, 6), (7, 8)] │

julia> s = arraytable(df) 
"[{\"a\":1,\"b\":[[1,2],[3,4]]},{\"a\":2,\"b\":[[5,6],[7,8]]}]"

julia> DataFrame(jsontable(s))
2×2 DataFrame
│ Row │ a     │ b                │
│     │ Int64 │ Array…           │
├─────┼───────┼──────────────────┤
│ 1   │ 1     │ [[1, 2], [3, 4]] │
│ 2   │ 2     │ [[5, 6], [7, 8]] │

julia> DataFrame(jsontable(objecttable(df))) # objecttable gives you a different layout of data
2×2 DataFrame
│ Row │ a     │ b                │
│     │ Int64 │ Array…           │
├─────┼───────┼──────────────────┤
│ 1   │ 1     │ [[1, 2], [3, 4]] │
│ 2   │ 2     │ [[5, 6], [7, 8]] │

I have shown here how to store it in a String and read it back, but you can use IO instead.

Reading the data back does not recover the tuples (each pair comes back as a two-element array), but it does recover the nested structure.
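If the tuples themselves are needed after the JSON round trip, the inner two-element arrays can be converted back by hand. The following is only a minimal sketch: `to_tuples` is a hypothetical helper name, not part of JSONTables.jl or DataFrames.jl, and it is demonstrated on plain nested vectors, assuming that the arrays read back from JSON can be indexed the same way.

```julia
# Hypothetical helper: turn one cycle stored as [[a, b], [c, d], ...]
# into a Vector{Tuple{Int,Int}}.
to_tuples(cycle) = Tuple{Int,Int}[(Int(p[1]), Int(p[2])) for p in cycle]

# Stand-in for the column read back from JSON (nested arrays, no tuples).
nested = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]

# Broadcast the helper over every cycle in the column.
restored = to_tuples.(nested)
```

On a DataFrame read back from JSON, the same idea would be applied to the column, for example `df2.b = to_tuples.(df2.b)`.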

Now - this is only one option. A comparison of different load/save options for DataFrames.jl is given in this tutorial: https://github.com/bkamins/Julia-DataFrames-Tutorial/blob/master/04_loadsave.ipynb .


If you want to stay with CSV, you can do:

julia> CSV.write("tmp.csv", df)
"tmp.csv"

julia> df2 = CSV.File("tmp.csv") |> DataFrame
2×2 DataFrame
│ Row │ a     │ b                │
│     │ Int64 │ String           │
├─────┼───────┼──────────────────┤
│ 1   │ 1     │ [(1, 2), (3, 4)] │
│ 2   │ 2     │ [(5, 6), (7, 8)] │

julia> df2.b = eval.(Meta.parse.(df2.b))
2-element Array{Array{Tuple{Int64,Int64},1},1}:
 [(1, 2), (3, 4)]
 [(5, 6), (7, 8)]

julia> df2
2×2 DataFrame
│ Row │ a     │ b                │
│     │ Int64 │ Array…           │
├─────┼───────┼──────────────────┤
│ 1   │ 1     │ [(1, 2), (3, 4)] │
│ 2   │ 2     │ [(5, 6), (7, 8)] │

However, this is not a safe way to do it: eval executes whatever code the string contains, so a malicious or corrupted file can inject arbitrary code for execution.
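One way to avoid eval is to parse the strings directly. The sketch below is not from the original answer; it assumes every cell has exactly the shape shown in the question, i.e. a bracketed list of `(Int, Int)` pairs, and `parse_tuple_list` is a hypothetical helper name. Anything in the string that does not look like an integer pair is simply ignored rather than executed.

```julia
# Hypothetical helper: parse a string such as "[(1, 2), (3, 4)]"
# into a Vector{Tuple{Int,Int}} without calling eval.
function parse_tuple_list(s::AbstractString)
    # Each match captures the two integers of one "(a, b)" pair.
    pattern = r"\(\s*(-?\d+)\s*,\s*(-?\d+)\s*\)"
    return Tuple{Int,Int}[
        (parse(Int, m.captures[1]), parse(Int, m.captures[2]))
        for m in eachmatch(pattern, s)
    ]
end

# The two rows from the question's CSV file, as they are read back (strings).
paths = ["[(1, 3), (2, 3), (1, 2)]", "[(4, 5), (3, 4), (1, 3), (2, 5), (1, 2)]"]

# Broadcast over the column to recover the original nested type.
cycles = parse_tuple_list.(paths)
```

With the DataFrame from the question, this would be used as `df2.Path = parse_tuple_list.(df2.Path)`.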
