Multiple values for a key in ksh

Question

I'm trying to read a file that contains pairs in the following format:

V1#K1.@
V2#K1.@
V3#K2.@,V4#K1.@,V5#K2
V1#K3.@

My aim is to store the data as key/value pairs, using '#' as the delimiter, after stripping the '.@'. In the example file the value comes before the '#' and the key comes after it.

I couldn't get the answer from "associate multiple values for one key in array in bash" to work, so I tried the following in ksh:

#!/usr/bin/ksh

typeset -A arr

while IFS= read -r line;do
    STRIPPED=`echo $line|sed 's/.@//g'`
    OIFS="$IFS"
    IFS=','
    read -A TOKENS <<< "${STRIPPED}"
    IFS="$OIFS"

    for key in ${TOKENS[@]};do
        echo "Token is $key"    
        arr[${key##*#}]=${key%%#*}
        echo "Key: ${key##*#}, Value: ${arr[${key##*#}]}"
    done
done <MYFILE

# Printing key and its values
for i in ${!arr[@]};do
    echo "key: ${i}, value: ${arr[$i]}"
done

But this overwrites the previous value for a key, so it doesn't keep multiple values per key. Is there a way to do this in ksh (not bash)?

Answer 1

I would do this, which stores multiple values as a comma-separated string:

#!/usr/bin/env ksh

# The `exec` line tells ksh to read from MYFILE _if_ stdin has _not_ been redirected
# This allows you to do:
#    ./script.ksh
#    ./script.ksh < some_other_file
#    some_process | ./script.ksh

[[ -t 0 ]] && exec 0<MYFILE

typeset -A arr

while IFS= read -r line; do
    # greatly simplified tokenization
    # strip every ".@" marker, then split the line on commas
    IFS=',' read -rA tokens <<< "${line//.@/}"

    for t in "${tokens[@]}"; do
        key=${t%#*}
        val=${t#*#}
        [[ -n ${arr[$key]} ]] && arr[$key]+=,
        arr[$key]+=$val
    done
done

# Printing key and its values
for i in "${!arr[@]}"; do
    echo "key: ${i}, value: ${arr[$i]}"
done

which outputs

key: V1, value: K1,K3
key: V2, value: K1
key: V3, value: K2
key: V4, value: K1
key: V5, value: K2
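
The question describes the text after '#' as the key, whereas the script above groups by the text before it. Here is a minimal sketch of the reversed grouping, under a few assumptions: group_by_suffix is a hypothetical helper name, the input arrives on stdin, and the tokens are split with plain parameter expansion instead of read -A so that the same code also runs in bash 4+. The output is piped through sort only to make the order predictable.

```shell
# Sketch only: group by the text AFTER '#', as the question describes.
# group_by_suffix is a hypothetical name, not part of the answer above.
function group_by_suffix {
    typeset -A groups
    typeset line rest pair key val k
    while IFS= read -r line || [[ -n $line ]]; do
        rest=${line//.@/}                  # drop every ".@" marker
        while [[ -n $rest ]]; do
            pair=${rest%%,*}               # first comma-separated token
            if [[ $rest == *,* ]]; then rest=${rest#*,}; else rest=; fi
            [[ $pair == *#* ]] || continue # skip tokens without a '#'
            key=${pair#*#}                 # text after '#'
            val=${pair%%#*}                # text before '#'
            [[ -n $key ]] || continue      # an empty subscript is not allowed
            # append with a comma only when the key already holds a value
            groups[$key]+=${groups[$key]:+,}$val
        done
    done
    for k in "${!groups[@]}"; do
        printf 'key: %s, value: %s\n' "$k" "${groups[$k]}"
    done | sort
}

group_by_suffix <<'EOF'
V1#K1.@
V2#K1.@
V3#K2.@,V4#K1.@,V5#K2
V1#K3.@
EOF
# prints:
# key: K1, value: V1,V2,V4
# key: K2, value: V3,V5
# key: K3, value: V1
```

The `${groups[$key]:+,}` expansion yields a comma only when the key already has a value, which replaces the separate `[[ -n ... ]]` test.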

Answer 2

Assumptions:

the input data is formatted exactly as displayed in the question (ie, no need to worry about other/extraneous text)
line 3 of the example input is missing a '.@' on the end of the 3rd attribute/value pair
to demonstrate duplicate processing I'll just copy the last input line a couple times
the question has no example of the desired output so I'll use glenn's example output
there is no explicit mention of any sorting preference (for the output) so I'll skip attempting to do any type of sorting at this point

Input file:

$ cat kdat
V1#K1.@
V2#K1.@
V3#K2.@,V4#K1.@,V5#K2.@
V1#K3.@
V1#K3.@
V1#K3.@

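One solution is based on sed and awk (both can be called from bash or ksh), where we use the attribute/value pair as the indices of a 2-dimensional array. By assigning an arbitrary value ('1' in this case) as the array value we can eliminate duplicate values. Note that the myarray[$1][$2] syntax (arrays of arrays) requires GNU awk 4.0 or later, and that '\n' in the sed replacement is a GNU sed feature.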

the first time we see a (new) attribute/value pair we create the array element
the next time we see the (same) attribute/value pair we simply overwrite the array element
when we're done processing the input we find that each attribute/value pair is associated with a single array element (ie, there are no duplicates)

Now the actual code:

$ sed 's/,/\n/g;s/\.@//g' kdat | awk -F"#" '
{ myarray[$1][$2]=1 }
END { for (i in myarray)
      { delim=""
        printf "key: %s, value: ",i
        for (j in myarray[i])
            { printf "%s%s",delim,j
              delim=","
            }
        printf "\n"
      }
    }
'

key: V1, value: K1,K3
key: V2, value: K1
key: V3, value: K2
key: V4, value: K1
key: V5, value: K2

Where:

sed ... : replace each comma with a newline so that each attribute/value pair is on its own line (this awk solution assumes one attribute/value pair per line), then remove '.@'
awk -F"#" ... : use '#' as the input delimiter for separating our attribute ($1) and value ($2) pairs
myarray[$1][$2]=1 : create/overwrite array($1,$2) with '1'; this is where duplicates are discarded
for / printf : loop through array indices, using printf to pretty print our output
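
The same "array index as a set" idea can also be sketched directly in the shell, without sed or awk. This is only an illustration under stated assumptions: dedupe_pairs is a hypothetical helper name, the input arrives on stdin, and the syntax is limited to what ksh93 and bash 4+ share. The seen array plays the role of myarray[$1][$2]=1: a pair is appended to its attribute's list only the first time it is encountered.

```shell
# Sketch only: dedupe_pairs is a hypothetical name, not part of the answer.
function dedupe_pairs {
    typeset -A seen values
    typeset line rest pair attr val k
    while IFS= read -r line || [[ -n $line ]]; do
        rest=${line//.@/}                         # drop every ".@" marker
        while [[ -n $rest ]]; do
            pair=${rest%%,*}                      # first comma-separated token
            if [[ $rest == *,* ]]; then rest=${rest#*,}; else rest=; fi
            [[ $pair == *#* ]] || continue        # skip tokens without a '#'
            [[ -n ${seen[$pair]} ]] && continue   # duplicate pair: ignore it
            seen[$pair]=1                         # remember this attribute/value pair
            attr=${pair%%#*}                      # text before '#'
            val=${pair#*#}                        # text after '#'
            [[ -n $attr ]] || continue            # an empty subscript is not allowed
            values[$attr]+=${values[$attr]:+,}$val
        done
    done
    for k in "${!values[@]}"; do
        printf 'key: %s, value: %s\n' "$k" "${values[$k]}"
    done | sort
}
```

Calling it as `dedupe_pairs < kdat` should list each attribute once, with the repeated V1#K3 lines collapsed into a single K3 entry.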

A couple fiddles: ksh and bash

2 answers

solution1
1 2019-07-12 10:51:08

solution2
1 ACCEPTED 2019-07-12 12:49:28
