Using a vector to subset elements within a string vector in Julia

I am attempting to subset a Vector{String} in Julia using a combination of Integer and Vector{Integer} subset values. I want to write a function that generalizes "asdf"[1:3], where each of the three arguments in x[y:z] can be either a vector or a scalar.

This is what I have attempted so far:

function substring(x::Array{String}, y::Integer, z::Integer)
  y = fill(y, length(x))
  z = fill(z, length(x))
  substring(x, y, z)
end

function substring(x::Vector{String}, y::Vector{Integer}, z::Integer)
  z = fill(z, length(x))
  substring(x, y, z)
end

function substring(x::Vector{String}, y::Integer, z::Vector{Integer})
  y = fill(y, length(x))
  substring(x, y, z)
end

function substring(x::Vector{String}, y::Vector{Integer}, z::Vector{Integer})
  for i = 1:length(x)
    x[i] = x[i][y[i]:min(z[i], length(x[i]))]
    # If z[i] is greater than the length of x[i] 
    # return the end of the string
  end
  x
end

Attempting to use it:

v = string.('a':'z')
x = rand(v, 100) .* rand(v, 100) .* rand(v, 100)

substring(x, 1, 2)
# or
substring(x, 1, s)

I get the error:

MethodError: no method matching substring(::Array{String,1}, ::Int64, ::Array{Int64,1})
Closest candidates are:
  substring(::Array{String,N}, ::Integer, !Matched::Integer) at untitled-e3b9271a972031e628a35deeeb23c4a8:2
  substring(::Array{String,1}, ::Integer, !Matched::Array{Integer,1}) at untitled-e3b9271a972031e628a35deeeb23c4a8:13
  substring(::Array{String,N}, ::Integer, !Matched::Array{Integer,N}) at untitled-e3b9271a972031e628a35deeeb23c4a8:13
  ...
 in include_string(::String, ::String, ::Int64) at eval.jl:28
 in include_string(::Module, ::String, ::String, ::Int64, ::Vararg{Int64,N}) at eval.jl:32
 in (::Atom.##53#56{String,Int64,String})() at eval.jl:50
 in withpath(::Atom.##53#56{String,Int64,String}, ::Void) at utils.jl:30
 in withpath(::Function, ::String) at eval.jl:38
 in macro expansion at eval.jl:49 [inlined]
 in (::Atom.##52#55{Dict{String,Any}})() at task.jl:60

I see that there is another post addressing a similar error with the type Vector{String}. My post also seeks a response to the error associated with Vector{Integer}. I believe the responses might be helpful for others like me who find the implementation of abstract types novel and difficult.

If you're on Julia 0.6, this is pretty easy to do using SubString.(strs, starts, ends):

julia> SubString.("asdf", 2, 3)
"sd"

julia> SubString.(["asdf", "cdef"], 2, 3)
2-element Array{SubString{String},1}:
 "sd"
 "de"

julia> SubString.("asdf", 2, [3, 4])
2-element Array{SubString{String},1}:
 "sd" 
 "sdf"

On Julia 0.5, you can do the same thing, but you must wrap the string in a vector (i.e. it cannot be left as a single scalar):

julia> SubString.(["asdf"], [1, 2, 3], [2, 3, 4])
3-element Array{SubString{String},1}:
 "as"
 "sd"
 "df"

The main difference between Julia and R is that in R, functions are typically made to work on vectors by default (broadcasted), whereas in Julia you explicitly request broadcasting by using a so-called "dot-call", i.e. f.(x, y, z).
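To illustrate the dot-call with the clamping behavior from the question, here is a minimal sketch. The name clamped_substring is hypothetical (it is not from the original post), and the sketch assumes Julia 1.x and ASCII strings, where character positions and byte indices coincide. The idea is to write the function once for scalars and let broadcasting handle every scalar/vector combination:

```julia
# Hypothetical helper (not from the original post): scalar substring that
# clamps the end index to the end of the string, as in the question's loop.
# Assumes Julia 1.x and ASCII strings, where character and byte indices agree.
clamped_substring(s::AbstractString, i::Integer, j::Integer) =
    SubString(s, i, min(j, lastindex(s)))

# Scalar start and end, applied to every string.
a = clamped_substring.(["asdf", "cdef"], 1, 2)

# Vector of end indices; the out-of-range 10 is clamped to the string's end.
b = clamped_substring.(["asdf", "cdef"], 1, [2, 10])

# Scalar string, vector of start indices.
c = clamped_substring.("asdf", [1, 2], 10)
```

With this approach, no separate method per argument combination should be needed, since the dot-call expands scalars automatically.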

Just to make this explicit, since I think it's a very common misunderstanding:

Even though Int64 <: Integer is true

Array{Int64,1} <: Array{Integer,1} is not!


The docs on parametric composite types explain why in detail. To paraphrase: it's because Array{Int64,1} has a specific representation in memory (i.e. many contiguous 64-bit values), while Array{Integer,1} has to be an array of pointers to separately allocated values that may or may not be 64 bits.
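The subtype relations can be checked directly at the REPL. A small sketch, assuming Julia 0.6 or later for the `<:` shorthand inside the type parameter (the variable names are only for illustration):

```julia
# Element types are related by subtyping...
elem_ok = Int64 <: Integer                       # true

# ...but the parametric array types are not (type parameters are invariant).
vec_invariant = Vector{Int64} <: Vector{Integer} # false

# Vector{<:Integer} means "a Vector of some subtype of Integer".
vec_bounded = Vector{Int64} <: Vector{<:Integer} # true

# A literal of Int64 values is not a Vector{Integer} unless declared as one.
literal_is_abstract = Int64[1, 2] isa Vector{Integer}    # false
declared_is_abstract = Integer[1, 2] isa Vector{Integer} # true
```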

See the similar Q&A for the new syntax you can use in Julia 0.6 to declare functions that handle this: Vector{AbstractString} function parameter won't accept Vector{String} input in julia
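As a rough sketch of how that syntax could apply to the signatures in the question: the function name substring_loose is hypothetical, the code assumes Julia 1.x (the `<:` shorthand itself needs 0.6 or later) and ASCII strings, and unlike the original it returns a new vector instead of overwriting x.

```julia
# Hypothetical rewrite of the question's vector method with loosened
# signatures, so Vector{Int64} and Vector{String} arguments are accepted.
function substring_loose(x::AbstractVector{<:AbstractString},
                         y::AbstractVector{<:Integer},
                         z::AbstractVector{<:Integer})
    # Clamp each end index to the end of its string, as in the question.
    [s[a:min(b, lastindex(s))] for (s, a, b) in zip(x, y, z)]
end

# Scalar start, vector end: expand the scalar and forward to the method above.
substring_loose(x::AbstractVector{<:AbstractString}, y::Integer,
                z::AbstractVector{<:Integer}) =
    substring_loose(x, fill(y, length(x)), z)

# This argument combination is the one that raised the MethodError.
r = substring_loose(["asdf", "cdef"], 1, [2, 10])
```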
