
APL - How can I find the longest word in a string vector?

I want to find the longest word in a string vector. In APL, I know that the shape function returns the length of a string, e.g.:

⍴ 'string' ⍝ returns 6

The reduce function allows me to map dyadic functions along a vector, but since shape is monadic this will not work. How can I map the shape function in this case? For example:

If the vector is defined as:

lst ← 'this is a string'

I want to do this:

⍴'this' ⍴'is' ⍴'a' ⍴'string'

The "typical" approach would be to treat it as a segmented (or separated) string, prefix it with the separator (a blank), and pass it to a dfn for further analysis:

{}' ',lst

The fn then looks for the separator and uses it to build the vectors of words:

      {(⍵=' ')⊂⍵}' ',lst
┌─────┬───┬──┬───────┐
│ this│ is│ a│ string│
└─────┴───┴──┴───────┘

Let's remove the blanks:

      {1↓¨(⍵=' ')⊂⍵}' ',lst
┌────┬──┬─┬──────┐
│this│is│a│string│
└────┴──┴─┴──────┘

And then you "just" need to compute the length of each vector:

      {≢¨1↓¨(⍵=' ')⊂⍵}' ',lst
4 2 1 6

This is a direct implementation of your request. However, if you're not interested in the substrings themselves but only in the lengths of the "non-blank segments", a more "APLy" solution might be to work with booleans (usually the most efficient option):

      lst=' '
0 0 0 0 1 0 0 1 0 1 0 0 0 0 0 0

So the ones are the positions of the separators - where do they occur?

      ⍸lst=' '
5 8 10

But we need a trailing blank, too - otherwise we're missing the end of text:

      ⍸' '=lst,' '
5 8 10 17

So these (minus the position of the preceding blank, and minus one for the blank itself) should give the lengths of the segments:

      {¯1+⍵-0,¯1↓⍵}⍸' '=lst,' '
4 2 1 6

This is still somewhat naive and can be expressed in a more advanced way - I leave that as an "exercise for the reader" ;-)
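One possible way to tighten the last step, offered as a sketch and not necessarily what the answer's author had in mind, is a windowed (N-wise) reduction. It assumes Dyalog APL: 2 f/ applies f between each adjacent pair, and -⍨ subtracts the left item from the right one.

```apl
      lst ← 'this is a string'
      ⍝ differences between neighbouring separator positions (with a leading 0), minus 1
      ¯1 + 2 -⍨/ 0 , ⍸ ' ' = lst , ' '
4 2 1 6
```

This drops the dfn and the explicit ¯1↓⍵ shift, at the cost of being a little more cryptic.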

While MBaas has already answered thoroughly, I thought it might be interesting to learn the idiomatic Dyalog "train" ≠⊆⊢, derived from Paul Mansour's comment. It forms a dyadic function which splits its right argument on occurrences of the left argument:

      Split ← ≠⊆⊢
      ' ' Split 'this is a string'
┌────┬──┬─┬──────┐
│this│is│a│string│
└────┴──┴─┴──────┘
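To see why this works, it may help to unfold the train by hand. A dyadic fork A (f g h) B evaluates as (A f B) g (A h B), so ' ' (≠⊆⊢) lst becomes (' '≠lst) ⊆ lst. The sketch below assumes Dyalog APL with boxed display turned on, and reuses the name lst from the question:

```apl
      lst ← 'this is a string'
      ' ' ≠ lst            ⍝ mask: 1 for non-blanks, 0 for blanks
1 1 1 1 0 1 1 0 1 0 1 1 1 1 1 1
      (' ' ≠ lst) ⊆ lst    ⍝ partition: keep runs of 1s, drop the 0s
┌────┬──┬─┬──────┐
│this│is│a│string│
└────┴──┴─┴──────┘
```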

You can extend this function train to do the whole job:

      SegmentLengths ← ≢¨Split
      ' ' SegmentLengths 'this is a string'
4 2 1 6

Or even combine the definitions in one go:

      SegmentLengths ← ≢¨≠⊆⊢
      ' ' SegmentLengths 'this is a string'
4 2 1 6

If you are used to the idiomatic expression ≠⊆⊢ then it may actually read clearer than any well-fitting name you can give for the function, so you might as well just use the expression in-line:

      ' ' (≢¨≠⊆⊢) 'this is a string'
4 2 1 6
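One behavioural difference is worth knowing before choosing between this train and the position-based boolean approach from the earlier answer: partition (⊆) silently drops empty segments, while the position arithmetic reports them as zero-length words. A small sketch, where the variable two is a hypothetical input with a double blank:

```apl
      two ← 'this  is'
      ' ' (≢¨≠⊆⊢) two                  ⍝ consecutive blanks act as one separator
4 2
      {¯1+⍵-0,¯1↓⍵}⍸' '=two,' '        ⍝ the empty segment shows up as 0
4 0 2
```

Which behaviour is preferable depends on whether empty fields carry meaning in your data.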

To find the longest word in a string, I would use the following function in NARS APL:

f←{v/⍨k=⌈/k←≢¨v←(⍵≠' ')⊂⍵}

Example use:

  f  'this is a string thesam'
string thesam 

Explanation (read from right to left):

{v/⍨k=⌈/k←≢¨v←(⍵≠' ')⊂⍵}
            v←(⍵≠' ')⊂⍵  split the string at the spaces and assign the result to v
        k←≢¨v             find the length of each element of v; the result, saved in k,
                          is a vector with the same length as v
      ⌈/k                 find the maximum of k
    k=                    for each element of k, give 1 if it equals the maximum, else 0
 v/⍨                      keep the elements of v whose length is the maximum
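Note that the function above relies on dyadic ⊂ meaning "partition", as it does in NARS. In Dyalog APL with the default migration level, dyadic ⊂ is partitioned enclose and behaves differently, so a Dyalog port would presumably use ⊆ instead. The following is a sketch under that assumption (Dyalog 16.0 or later); the name g is hypothetical:

```apl
      ⍝ same logic as f, with ⊆ (partition) in place of NARS's dyadic ⊂
      g ← {v/⍨k=⌈/k←≢¨v←(⍵≠' ')⊆⍵}
      ≢¨ g 'this is a string thesam'    ⍝ two words tie for the maximum length
6 6
      ⊃ g 'this is a string thesam'     ⍝ pick the first of the longest words
string
```

Because ties are all returned, disclose (⊃) is used here when a single word is wanted.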


 