I'm trying to write a regex that captures both the HTTP status code and the body of a curl response. The pattern below matches on several online regex testers, but it won't match in a shell if-statement on my Mac's command line. Is my regex off, or is something else going on?
RESPONSE=$(curl -s -i -X GET http://www.google.com/)
# Match and capture the status code, match the headers, match two new lines, match and capture an optional body
re="^HTTP\/\d\.\d\s([\d]{3})[\w\d\s\W\D\S]*[\r\n]{2}([\w\d\s\W\D\S]*)?$"
if [[ "${RESPONSE}" =~ $re ]]; then
    echo "match"
    # Now do stuff with the captured groups, "${BASH_REMATCH[...]}"
else
    echo "no match"
fi
I'm also open to other ways of doing this (I'm targeting a machine running CentOS 5).
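A likely cause of the mismatch: bash's =~ operator uses POSIX extended regular expressions, which do not define Perl-style escapes such as \d, \s or \w, and "\r\n" inside double quotes is not turned into real CR/LF characters. Below is a minimal sketch that uses bracket expressions for the status code and parameter expansion for the body. The canned response is made up so the snippet runs offline, and the variable names are only illustrative.

```bash
#!/usr/bin/env bash
# Canned response so the sketch runs offline; in practice this would be
#   response=$(curl -s -i -X GET http://www.google.com/)
response=$'HTTP/1.1 302 Found\r\nLocation: http://www.google.com/\r\nContent-Length: 18\r\n\r\n<html>moved</html>'

# POSIX ERE has no \d, \s or \w, so use bracket expressions instead.
re='^HTTP/[0-9.]+ ([0-9]{3})'

if [[ $response =~ $re ]]; then
    status=${BASH_REMATCH[1]}
    # Headers end at the first blank line (CRLF CRLF). Drop the shortest
    # prefix up to and including it; what remains is the body.
    if [[ $response == *$'\r\n\r\n'* ]]; then
        body=${response#*$'\r\n\r\n'}
    else
        body=''   # no blank line found, so treat the body as empty
    fi
    echo "status: $status"
    echo "body: $body"
else
    echo "no match"
fi
```

Splitting the body off with parameter expansion avoids a greedy multi-line regex, which could otherwise run past the header block if the body itself contains a blank line.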
Same idea as @delarsschneider's answer, but slightly less complicated:
RESPONSE=$(curl -s -i -X GET http://www.google.com/)
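# Quote "$RESPONSE" so the line breaks survive; read the code from the status line only
CODE=$(printf '%s\n' "$RESPONSE" | sed -n '1s/^HTTP[^ ]* \([0-9]*\).*/\1/p')
# Drop carriage returns, then delete everything up to the first blank line (the headers)
BODY=$(printf '%s\n' "$RESPONSE" | tr -d '\r' | sed '1,/^$/d')
echo "$CODE"
echo "$BODY"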
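If the headers themselves are not needed, another option is to let curl report the status code through its -w/--write-out option and split the output afterwards. This is only a sketch: split_response is a hypothetical helper, and the canned string stands in for real curl output so the snippet runs offline.

```bash
#!/usr/bin/env bash
# Hypothetical helper: expects the body, a newline, and then the status
# code, which is the shape produced by
#   curl -s -w '\n%{http_code}' http://www.google.com/
split_response() {
    local out=$1
    status=${out##*$'\n'}   # text after the last newline
    body=${out%$'\n'*}      # everything before the last newline
}

# Canned output standing in for the curl call above
split_response $'<html>\nmoved\n</html>\n302'
echo "status: $status"
echo "body: $body"
```

Because the code is appended after the body, the split depends only on the last newline and not on the header layout.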
Since you're open to other solutions, you could try this:
RESPONSE=$(curl -s -i -X GET http://www.google.com/)
# Quote "$RESPONSE" so the line breaks reach sed intact
HTTP_STATUS_CODE=$(echo "$RESPONSE" | sed '
/HTTP/ {
    s/^HTTP[^ ]* //
    s/ .*$//
    q
}
D')
BODY=$(echo "$RESPONSE" | sed '
/^.$/ {
    :body
    n
    b body
}
D')
echo "$HTTP_STATUS_CODE"
echo "$BODY"
HTTP_STATUS_CODE is taken from the first line containing HTTP (the status line). The first substitution removes everything up to and including the first space, which leaves for example '302 Found'. The second substitution removes everything from the first remaining space to the end of the line, which leaves '302'.
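The two substitutions can be traced one at a time on a sample status line. This is just an illustration; the variable names line, step1 and step2 are made up for the walkthrough.

```bash
#!/usr/bin/env bash
# Sample status line (without the trailing carriage return)
line='HTTP/1.1 302 Found'

# Step 1: remove the protocol token and the space that follows it
step1=$(printf '%s\n' "$line" | sed 's/^HTTP[^ ]* //')

# Step 2: remove everything from the first remaining space onwards
step2=$(printf '%s\n' "$step1" | sed 's/ .*$//')

echo "$step1"   # 302 Found
echo "$step2"   # 302
```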
BODY starts at the first line consisting of a single character, which is the carriage return on the blank line that ends the headers. All lines before it are deleted with 'D'. From there, every line is printed until the end of the input.