I need to extract values from an XML file in a shell script. Sample file, test.xml:
<wtc-import>
<name>WTCImportedService-288-rap04</name>
<resource-name>CAC040F</resource-name>
<local-access-point>lap01</local-access-point>
<remote-access-point-list>rap04</remote-access-point-list>
<remote-name>CAC040F</remote-name>
</wtc-import>
<wtc-import>
<name>WTCImportedService-289-rap04</name>
<resource-name>CAD040F</resource-name>
<local-access-point>lap01</local-access-point>
<remote-access-point-list>rap04</remote-access-point-list>
<remote-name>CAD040F</remote-name>
</wtc-import>
<wtc-import>
<name>WTCImportedService-290-rap04</name>
<resource-name>CAE040F</resource-name>
<local-access-point>lap01</local-access-point>
<remote-access-point-list>rap04</remote-access-point-list>
<remote-name>CAE040F</remote-name>
</wtc-import>
<wtc-import>
<name>WTCImportedService-289-rap04</name>
<resource-name>CAD040F</resource-name>
<local-access-point>lap01</local-access-point>
<remote-access-point-list>rap04</remote-access-point-list>
<remote-name>CAD040F</remote-name>
</wtc-import>
I have to extract all values of the <resource-name> elements in the file and, if any resource name is duplicated, remove the duplicates from the output.
Expected output:
CAC040F
CAD040F
CAE040F
The resource CAD040F is a duplicate, so it appears only once in the expected output.
Tried:
grep 'resource-name' test.xml | awk -F">" '{print $2}' | awk -F"<" '{print $1}'
This works fine. How can I filter out the duplicates after that?
You can do it with a single awk command:
awk -F"[<>]" '/resource-name/ && !seen[$3]++ { print $3 } ' test.xml
With your sample XML file:
$ awk -F"[<>]" '/resource-name/ && !seen[$3]++ { print $3 } ' test.xml
CAC040F
CAD040F
CAE040F
$
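For readers unfamiliar with the `!seen[$3]++` idiom, here is a minimal sketch of how it filters duplicates. The `sample` and `dedup_output` variables and the three-line input are made up for illustration and are not part of the original answer.

```shell
# Illustrative input only: three resource-name lines, one of them repeated.
sample='<resource-name>CAC040F</resource-name>
<resource-name>CAD040F</resource-name>
<resource-name>CAD040F</resource-name>'

# With -F"[<>]" each line is split on < and >, so:
#   $1 = text before the first "<", $2 = "resource-name", $3 = the value.
# seen[$3]++ evaluates to the previous count (0 the first time), so
# !seen[$3]++ is true only on the first occurrence of each value.
dedup_output=$(printf '%s\n' "$sample" |
  awk -F"[<>]" '/resource-name/ && !seen[$3]++ { print $3 }')

printf '%s\n' "$dedup_output"
```

Because the check happens as each line is read, the values come out in first-seen order and no sorting step is needed.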
Just a speed optimization of @stack0114106's answer, which already works:
awk -F '[<>]' '$2 == "resource-name" && ! ( $3 in List) { print $3; List[$3] } ' test.xml
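If you already have the output and just want to remove the duplicates, the simplest way is to pipe the output through sort and then uniq, so your command would look like this: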
grep 'resource-name' test.xml | awk -F">" '{print $2}' | awk -F"<" '{print $1}' | sort | uniq
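One thing to keep in mind with the sort-based approach: the result is in sorted order, not in the order the values appear in the file. The sketch below shows this on a made-up three-line input, using `sort -u` as a shorter equivalent of `sort | uniq`. The `sample` and `sorted_unique` names are illustrative assumptions.

```shell
# Illustrative input only: CAE040F appears first and is repeated.
sample='<resource-name>CAE040F</resource-name>
<resource-name>CAC040F</resource-name>
<resource-name>CAE040F</resource-name>'

# sort -u sorts the lines and drops repeated ones in a single step.
sorted_unique=$(printf '%s\n' "$sample" |
  grep 'resource-name' |
  awk -F'[<>]' '{ print $3 }' |
  sort -u)

# CAC040F is printed before CAE040F, although CAE040F came first in the input.
printf '%s\n' "$sorted_unique"
```

If the original order matters, an approach that tracks already-seen values while reading the file is probably the better fit.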
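If a bash regex is an option, try the following (associative arrays need bash 4 or later, and the output order is not guaranteed):
declare -A name
# Capture the text between the <resource-name> tags.
regex="<resource-name>([^<]+)</resource-name>"
while read -r line; do
    if [[ $line =~ $regex ]]; then
        # Using the value as an array key removes duplicates.
        name["${BASH_REMATCH[1]}"]=1
    fi
done < "test.xml"
# Print each unique value (the keys of the array).
for i in "${!name[@]}"; do
    echo "$i"
done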