
Search for keywords in a master CSV; if a keyword exists, update the 2nd column of the input CSV with true or false

Input CSV: new_param.csv

It has values like:

ID
Identity
as-uid
cp_cus_id
evs
k_n

master.csv has values like:

A, xyz, id, abc
n, xyz, as-uid, abc, B, xyz, ne, abc
q, xyz, id evs, abc
3, xyz, k_n, abc, C, xyz, ad, abc
1, xyz, zd, abc
z, xyz, ID, abc

Required output: updated new_param.csv with true or false in the 2nd column

ID,true
Identity,false
as-uid,true
cp_cus_id,false
evs,true
k_n,true

I tried the code below, but it produces no output:

#!/bin/bash

declare -a keywords=(`cat new_param.csv`)
 
length=${#keywords[@]}

for (( j=0; j<length; j++ ));
do
 a= LC_ALL=C awk -v kw="${keywords[$j]}" -F, '{for (i=1;i<=NF;i++) if ($i ~ kw) {print i}}' master.csv
b=0
if [ $a -gt $b ]
then
  echo true $2 >> new_param.csv
else
  echo false $2 >> new_param.csv
fi
done

Could someone please help? The code above does not work for me; I get errors like:

test.sh: line 29: [: -gt: unary operator expected
test.sh: line 33: -f2: command not found

awk -v RS=', |\n' 'NR == FNR { a[$0] = 1; next }
        { gsub(/,.*/, ""); b = "" b $0 (a[$0] ? ",true" : ",false") "\n" }
        END { if (FILENAME == "new_param.csv") printf "%s", b > FILENAME }' master.csv new_param.csv

Try this Shellcheck-clean Bash code:

#! /bin/bash -p

outputs=()

while read -r kw; do
    if grep -q -E "(^|[[:space:],])$kw([[:space:],]|\$)" master.csv; then
        outputs+=( "$kw,true" )
    else
        outputs+=( "$kw,false" )
    fi
done <new_param.csv

printf '%s\n' "${outputs[@]}" >new_param.csv
  • You may need to tweak the regular expression used with grep -E depending on what exactly you want to count as a match.
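One possible tweak, as a rough sketch rather than a drop-in replacement: match each keyword as a literal whole word with grep -F -w instead of interpolating it into a regular expression, so characters such as . or * in a keyword are not treated as regex syntax. The function name mark_keywords and the temporary-file handling are assumptions made for illustration; they do not come from the answer above.

```shell
#!/bin/bash
# Sketch: flag each keyword as true/false using literal, whole-word matching.
# mark_keywords is a hypothetical helper name, not part of the original answer.

mark_keywords() {
    local params=$1 master=$2 kw tmp
    tmp=$(mktemp) || return 1
    # "|| [[ -n $kw ]]" also handles a last line without a trailing newline
    while IFS= read -r kw || [[ -n $kw ]]; do
        kw=${kw%%,*}                 # drop a true/false column left by an earlier run
        [[ -z $kw ]] && continue     # skip blank lines
        # -F: keyword is a fixed string, -w: whole words only, -q: no output
        if grep -qwF -e "$kw" "$master"; then
            printf '%s,true\n' "$kw"
        else
            printf '%s,false\n' "$kw"
        fi
    done <"$params" >"$tmp"
    # Write back only after the input has been read completely
    cat "$tmp" >"$params"
    rm -f "$tmp"
}
```

It would be called as mark_keywords new_param.csv master.csv. Note that -w treats - as a word boundary, so a keyword such as uid would still match as-uid; if that is not wanted, a stricter delimiter-based pattern is needed.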

Using grep to find exact word matches:

$ grep -owf new_param.csv master.csv | sort -u
ID
as-uid
evs
k_n

Then feed this to awk to match against new_param.csv entries:

awk '
BEGIN   { OFS="," }
FNR==NR { a[$1]; next }
        { print $1, ($1 in a) ? "true" : "false" }
' <(grep -owf new_param.csv master.csv | sort -u) new_param.csv

This generates:

ID,true
Identity,false
as-uid,true
cp_cus_id,false
evs,true
k_n,true

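Once the results are confirmed as correct, OP can write the output to a temporary file and then move it over new_param.csv. Redirecting straight to new_param.csv would truncate the file before awk reads it. E.g.:

awk 'BEGIN { OFS="," } FNR==NR ....' <(grep -owf ...) new_param.csv > new_param.tmp && mv new_param.tmp new_param.csv
                                                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^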

Alternative awk option:

Use ", " (a comma followed by a space) as the field separator and concatenate the 3rd field of each record of master.csv onto the variable m. Then read each record from new_param.csv and use the index function to determine whether that record occurs in the string m. Note that index does a substring search and that only the 3rd field of each master.csv line is considered.

awk -F", " '
FNR==NR{m=m$3}
FNR<NR{print $0 (index(m,$0) ? ",true" : ",false")}
' master.csv new_param.csv

Output:

ID,true
Identity,false
as-uid,true
cp_cus_id,false
evs,true
k_n,true
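Because index does a substring search, a keyword such as uid would be reported as true merely because as-uid occurs in master.csv. If exact matches are wanted, one possible variation is sketched below. It assumes that tokens in master.csv are separated by commas and/or whitespace and that master.csv is not empty; the function name exact_match is made up for this example.

```shell
#!/bin/bash
# Sketch: exact-token variant of the index() approach.
# exact_match is a hypothetical helper name; usage: exact_match master.csv new_param.csv

exact_match() {
    awk -F'[, \t]+' '
        # First file (master): remember every token as an array key
        FNR == NR { for (i = 1; i <= NF; i++) seen[$i]; next }
        # Second file (keywords): field 1 is the keyword, any old flag is ignored
        $1 != ""  { print $1 "," (($1 in seen) ? "true" : "false") }
    ' "$1" "$2"
}
```

The function prints to standard output, so the result can be inspected first and then saved to a temporary file and moved over new_param.csv.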
