I'm trying to read a file whose contents are pairs, as follows:
V1#K1.@
V2#K1.@
V3#K2.@,V4#K1.@,V5#K2
V1#K3.@
My aim is to store it as key/value pairs, with # as the delimiter, after removing '.@'. In the example file, the value is placed before the # and the key after it.
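To make the layout concrete, here is a minimal sketch of taking one token apart with plain parameter expansion (it should work in ksh, bash and POSIX sh). The variable names `token`, `value` and `key` are only illustrative.

```shell
token='V1#K1.@'
token=${token%.@}     # drop the trailing ".@"      -> V1#K1
value=${token%%#*}    # everything before the "#"   -> V1
key=${token##*#}      # everything after the "#"    -> K1
printf 'key=%s value=%s\n' "$key" "$value"
```

For the sample token this prints `key=K1 value=V1`.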
I couldn't get the answer from "associate multiple values for one key in array in bash" to work, so I tried it the following way in ksh:
#!/usr/bin/ksh
typeset -A arr
while IFS= read -r line;do
    STRIPPED=`echo $line|sed 's/.@//g'`
    OIFS="$IFS"
    IFS=','
    read -A TOKENS <<< "${STRIPPED}"
    IFS="$OIFS"
    for key in ${TOKENS[@]};do
        echo "Token is $key"
        arr[${key##*#}]=${key%%#*}
        echo "Key: ${key##*#}, Value: ${arr[${key##*#}]}"
    done
done <MYFILE
# Printing key and its values
for i in ${!arr[@]};do
    echo "key: ${i}, value: ${arr[$i]}"
done
But this overwrites the previous value for a key; it doesn't keep multiple values per key. Is there a way to do this in ksh (not bash)?
I would do this, which stores multiple values as a comma-separated string:
#!/usr/bin/env ksh
# The `exec` line tells ksh to read from MYFILE _if_ stdin has _not_ been redirected
# This allows you to do:
# ./script.ksh
# ./script.ksh < some_other_file
# some_process | ./script.ksh
[[ -t 0 ]] && exec 0<MYFILE
typeset -A arr
while IFS= read -r line; do
    # greatly simplified tokenization: strip every ".@", then split on commas
    IFS=',' read -rA tokens <<< "${line//.@/}"
    for t in "${tokens[@]}"; do
        key=${t%#*}
        val=${t#*#}
        [[ -n ${arr[$key]} ]] && arr[$key]+=,
        arr[$key]+=$val
    done
done
# Printing key and its values
for i in "${!arr[@]}"; do
    echo "key: ${i}, value: ${arr[$i]}"
done
which outputs
key: V1, value: K1,K3
key: V2, value: K1
key: V3, value: K2
key: V4, value: K1
key: V5, value: K2
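The loop above appends every value it reads, so a pair that occurs more than once in the input would be stored more than once. If that matters, one option is to check the comma-separated string before appending. This is only a sketch: `append_unique` is a hypothetical helper name, and it assumes values never contain a comma.

```shell
# Hypothetical helper: print the comma-separated list $1 with $2 appended,
# unless $2 is already one of its items.
append_unique() {
    list=$1
    item=$2
    case ",$list," in
        *",$item,"*) ;;                     # already present: leave list alone
        *) list=${list:+$list,}$item ;;     # add a comma only if list is non-empty
    esac
    printf '%s\n' "$list"
}

vals=''
for v in K1 K3 K3 K1; do
    vals=$(append_unique "$vals" "$v")
done
printf '%s\n' "$vals"
```

Here the final `printf` prints `K1,K3`. In the script, the two `arr[$key]+=` lines could be replaced with something like `arr[$key]=$(append_unique "${arr[$key]}" "$val")`, at the cost of one subshell per token.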
Assumptions:
Input file:
$ cat kdat
V1#K1.@
V2#K1.@
V3#K2.@,V4#K1.@,V5#K2.@
V1#K3.@
V1#K3.@
V1#K3.@
One solution based on sed and awk (both usable from bash and ksh), where we use the attribute/value pair as the indices of a 2-dimensional array. By assigning an arbitrary value ('1' in this case) as the array value we can eliminate duplicate values. Note that the myarray[$1][$2] syntax (an array of arrays) requires GNU awk.
Now the actual code:
$ sed 's/,/\n/g;s/.@//g' kdat | awk -F"#" '
{ myarray[$1][$2]=1 }
END { for (i in myarray)
          { delim=""
            printf "key: %s, value: ",i
            for (j in myarray[i])
                { printf "%s%s",delim,j
                  delim=","
                }
            printf "\n"
          }
    }
'
key: V1, value: K1,K3
key: V2, value: K1
key: V3, value: K2
key: V4, value: K1
key: V5, value: K2
Where:
sed ... : replace each comma with a newline (so each attribute/value pair is on its own line; this awk solution assumes one attribute/value pair per line); remove '.@'
awk -F"#" ... : use '#' as the input delimiter for separating our attribute ($1) and value ($2) pairs
myarray[$1][$2]=1 : create/overwrite array($1,$2) with '1'; this is where duplicates are discarded
for / printf : loop through the array indices, using printf to pretty print our output
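If GNU awk is not available, the same idea can be sketched with a one-dimensional array indexed by the pair `($1, $2)`, which POSIX awk supports. The function name `group_pairs` and the array names `seen`, `vals` and `order` are hypothetical. The sketch also uses `tr` for the comma-to-newline step, since `\n` in a sed replacement is not portable. Because the order of `for (i in array)` is unspecified in awk, it records keys in first-seen order instead.

```shell
# Hypothetical function: read "attr#value.@" pairs on stdin, print grouped values.
group_pairs() {
    tr ',' '\n' |          # one pair per line
    sed 's/\.@//g' |       # remove the literal ".@"
    awk -F'#' '
        NF == 2 && !seen[$1, $2]++ {       # skip pairs we have already stored
            if ($1 in vals) {
                vals[$1] = vals[$1] "," $2
            } else {
                order[++n] = $1            # remember first-seen order of keys
                vals[$1] = $2
            }
        }
        END {
            for (i = 1; i <= n; i++)
                printf "key: %s, value: %s\n", order[i], vals[order[i]]
        }'
}
```

Run as `group_pairs < kdat`, this should give the same five lines as the output shown above for the sample file.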