
How to split a camelCase string into an array in awk?

How can I split a camelCase string into an array in awk using the split function?

Input:

STRING="camelCasedExample"

Desired result:

WORDS[1]="camel"
WORDS[2]="Cased"
WORDS[3]="Example"

Bad attempt:

split(STRING, WORDS, /([a-z])([A-Z])/);

Bad result:

WORDS[1]="came"
WORDS[2]="ase"
WORDS[3]="xample"

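You can't do it with split() alone, because split() discards whatever the separator regexp matches (here, the letters on either side of each word boundary). That is why GNU awk has patsplit(), which matches the fields rather than the separators: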

$ awk 'BEGIN {
    n = patsplit("camelCasedExample",words,/(^|[[:upper:]])[[:lower:]]+/)
    for ( i=1; i<=n; i++ ) print words[i]
}'
camel
Cased
Example
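
To see what split() throws away with the separator from the question, the matched substrings can be printed one by one. This is only an illustrative sketch: it uses `[a-z][A-Z]` (the question's regexp without the grouping parentheses), and the variable names `s` and `result` are assumptions made for the demonstration.

```shell
result=$(awk 'BEGIN {
    s = "camelCasedExample"
    # Print each substring the separator regexp matches; these are
    # exactly the characters split() would discard.
    while (match(s, /[a-z][A-Z]/)) {
        print substr(s, RSTART, RLENGTH)
        # Continue searching in the text after the current match.
        s = substr(s, RSTART + RLENGTH)
    }
}')
printf '%s\n' "$result"
# prints:
# lC
# dE
```

Each match swallows the last letter of one word and the first letter of the next, which is why the bad result comes out as "came", "ase" and "xample".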

With your shown samples, please try the following. It was written and tested in GNU awk and should work in any awk. It creates an array named words whose values can be accessed by index, starting at 1, 2, 3 and so on. I print the array as output here; you can also use it later however you wish.

awk -F'=|"' -v s1="\"" '
{
  gsub(/[A-Z]/,"\n&",$3)
  val=(val?val ORS:"")$3
}
END{
  num=split(val,words,ORS)
  for(i=1;i<=num;i++){
    if(words[i]!=""){
      print "WORDS[" ++count "]=" s1 words[i] s1
    }
  }
}
' Input_file

Explanation: a detailed explanation of the above awk code.

awk -F'=|"' -v s1="\"" '                     ##Start the awk program, setting the field separator to = OR " and setting s1 to ".
{
  gsub(/[A-Z]/,"\n&",$3)                     ##Use gsub to globally prefix each capital letter in the 3rd field with a newline.
  val=(val?val ORS:"") $3                    ##Append $3 to val, separating successive values with ORS.
}
END{                                         ##Start the END block of this program.
  num=split(val,words,ORS)                   ##Split val into the array words, using ORS as the delimiter.
  for(i=1;i<=num;i++){                       ##Loop from 1 to num.
    if(words[i]!=""){                        ##Proceed only if the array item is not empty.
      print "WORDS[" ++count "]=" s1 words[i] s1    ##Print WORDS[, the incremented count and ]=, then words[i] wrapped in s1 (double quotes).
    }
  }
}
'  Input_file                                ##Mention the Input_file name here.
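
The program above reads the string from `$3`, which only makes sense if each input line looks like the assignment in the question. As a small check of that assumption, the sketch below feeds one such line (a hypothetical sample, not a file from the original post) through the same field separator and prints every field with its number.

```shell
# Hypothetical input line in the format shown in the question.
result=$(printf '%s\n' 'STRING="camelCasedExample"' |
    awk -F'=|"' '{
        # Show how the separator (= or ") carves the line into fields.
        for (i = 1; i <= NF; i++) print i ": [" $i "]"
    }')
printf '%s\n' "$result"
```

Field 1 is the variable name, field 2 is the empty string between `=` and the opening quote, and field 3 is the camelCase string itself, which is why the gsub() call targets `$3`.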

Here is an awk solution that would work with any version of awk:

s='camelCasedExample'
awk '{
   while (match($0, /(^|[[:upper:]])[[:lower:]]+/)) {
      wrd = substr($0,RSTART,RLENGTH)
      print wrd
      # you can also store it in array
      arr[++n] = wrd
      $0 = substr($0,RSTART+RLENGTH)
   }
}' <<< "$s"

camel
Cased
Example

 echo 'camelCasedExample' | mawk '{ for (_=(____=split($((_=_<_) * gsub("[>-[]", (___)"&")), __, ___) )^_; _<=____; _++) { print "","__["(_)"]",__[_] } }' OFS=':: ' FS='^$' ___='\20\22'
 :: __[1] :: camel
 :: __[2] :: Cased
 :: __[3] :: Example
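
If the split is needed in more than one place, the match() loop from the portable answer above could be wrapped in a function that fills an array and returns the word count, much like split() does. This is a minimal sketch rather than anything from the original answers: the name `split_camel` is hypothetical, and `[A-Z]`/`[a-z]` are used in place of the POSIX character classes on the assumption that the input is plain ASCII.

```shell
result=$(printf '%s\n' 'camelCasedExample' | awk '
# split_camel (hypothetical helper): fills arr[1..n] with the words
# of str and returns n. The extra parameter n is a local variable.
function split_camel(str, arr,    n) {
    split("", arr)      # portable way to empty the array
    while (match(str, /(^|[A-Z])[a-z]+/)) {
        arr[++n] = substr(str, RSTART, RLENGTH)
        str = substr(str, RSTART + RLENGTH)
    }
    return n + 0        # force a number even when nothing matched
}
{
    count = split_camel($0, words)
    for (i = 1; i <= count; i++) print i ": " words[i]
}')
printf '%s\n' "$result"
# prints:
# 1: camel
# 2: Cased
# 3: Example
```

Like the loop it is based on, this skips runs of consecutive capitals (for example the "HTTP" in "parseHTTPResponse"), so it suits strictly camelCased input.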
