I am compressing a PDF file with Ghostscript, which throws an error when the file is password protected. I need to handle that case.
Shell script
GS_RES=`gs -sDEVICE=pdfwrite -sOutputFile=$gsoutputfile -dNOPAUSE -dBATCH $2 2>&1`
if [ "$GS_RES" != "" ]
then
gspassmsg="This file requires a password for access"
echo "Error message is :::::: "$GS_RES
gspassworddoc=`awk -v a="$GS_RES" -v b="$gspassmsg" 'BEGIN{print index(a,b)}'`
if [ $gspassworddoc -ne 0 ]
then
exit 3 #error code - password protected pdf
fi
fi
The value of GS_RES after executing the command looks like the following:
Error message 1:
GPL Ghostscript 9.19 (2016-03-23) Copyright (C) 2016 Artifex Software, Inc. All
rights reserved. This software comes with NO WARRANTY: see the file PUBLIC for d
etails. Error: /syntaxerror in -file- Operand stack: Execution stack: %interp_ex
it .runexec2 --nostringval-- --nostringval-- --nostringval-- 2 %stopped_push --n
ostringval-- --nostringval-- --nostringval-- false 1 %stopped_push 1967 1 3 %opa
rray_pop 1966 1 3 %oparray_pop 1950 1 3 %oparray_pop 1836 1 3 %oparray_pop --nos
tringval-- %errorexec_pop .runexec2 --nostringval-- --nostringval-- --nostringva
l-- 2 %stopped_push Dictionary stack: --dict:1196/1684(ro)(G)-- --dict:0/20(G)--
--dict:78/200(L)-- Current allocation mode is local Current file position is 1
Error message 2:
GPL Ghostscript 9.19 (2016-03-23) Copyright (C) 2016 Artifex Software, Inc. All rights reserved. This software comes with NO WARRANTY: see the file PUBLIC for details. gs.pdf gsempty.pdf new_sathishks_protected.html sathishks_protected.html Error: Cannot find a 'startxref' anywhere in the file. Output may be incorrect. gs.pdf gsempty.pdf new_sathishks_protected.html sathishks_protected.html Error: An error occurred while reading an XREF table. gs.pdf gsempty.pdf new_sathishks_protected.html sathishks_protected.html The file has been damaged. This may have been caused gs.pdf gsempty.pdf new_sathishks_protected.html sathishks_protected.html by a problem while converting or transfering the file. gs.pdf gsempty.pdf new_sathishks_protected.html sathishks_protected.html Ghostscript will attempt to recover the data. gs.pdf gsempty.pdf new_sathishks_protected.html sathishks_protected.html However, the output may be incorrect. gs.pdf gsempty.pdf new_sathishks_protected.html sathishks_protected.html Error: Trailer dictionary not found. Output may be incorrect. No pages will be processed (FirstPage > LastPage). gs.pdf gsempty.pdf new_sathishks_protected.html sathishks_protected.html This file had errors that were repaired or ignored. gs.pdf gsempty.pdf new_sathishks_protected.html sathishks_protected.html Please notify the author of the software that produced this gs.pdf gsempty.pdf new_sathishks_protected.html sathishks_protected.html file that it does not conform to Adobe's published PDF gs.pdf gsempty.pdf new_sathishks_protected.html sathishks_protected.html specification. gs.pdf gsempty.pdf new_sathishks_protected.html sathishks_protected.html The rendered output from this file may be incorrect.
On running awk on Error message 2
gspassmsg="This file requires a password for access"
gspassworddoc=`awk -v a="$GS_RES" -v b="$gspassmsg" 'BEGIN{print index(a,b)}'`
it fails with the following error:
Error : awk: newline in string GPL Ghostscript 9.19... at source line 1
Error message 3
**** Error: Cannot find a 'startxref' anywhere in the file.
**** Warning: An error occurred while reading an XREF table.
**** The file has been damaged. This may have been caused
**** by a problem while converting or transfering the file.
**** Ghostscript will attempt to recover the data.
**** Error: Trailer is not found.
**** This file had errors that were repaired or ignored.
**** Please notify the author of the software that produced this
**** file that it does not conform to Adobe's published PDF
**** specification.
I couldn't capture this error with the snippet from the answer below:
if ! gs_res=$(gs -sDEVICE=pdfwrite -sOutputFile="$gsoutputfile" -dNOPAUSE -dBATCH "$2" 2>&1 1>/dev/null); then
  echo "Error message is :::::: $gs_res" >&2
  gspassmsg='This file requires a password for access'
  [[ $gs_res == *"$gspassmsg"* ]] && exit 3 # password protected pdf
  echo "Some other error !"
fi
Please clarify the following: why does awk behave strangely here? What am I missing? I am quite new to shell scripting, so any help is appreciated.
PS: I have edited my question with additional details. If anything else is needed, I'll add it.
Ghostscript's error messages all follow the same pattern; however, there are some gotchas:
Part of the output is a dump of the operand stack at the time of the error. Since PostScript is a programming language, the contents of the stack depend on the program and are entirely unpredictable. Even though you are dealing with PDF files, not PostScript programs, the interpreter is itself written in PostScript, so the same still applies.
The 'Error: /syntaxerror...' part is limited to a small number of possible errors, which the PostScript Language Reference Manual defines.
PostScript (but not PDF) programs can install an error handler, which can totally alter the error output, and even swallow the error altogether.
As regards 'compressing PDF files', that is absolutely not what you are doing. Please have a read here which explains what's actually happening. In short though, you are producing a new PDF file, not compressing an old one.
You can, of course, process a password protected PDF file with Ghostscript, as long as you know the password. Look for PDFPassword in the documentation here
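As a rough sketch of what supplying the password might look like: -sPDFPassword is the Ghostscript switch referred to above, while the function name rewrite_pdf_with_password, the GS_BIN variable, and the argument order are hypothetical and exist only for illustration.

```shell
# Hypothetical wrapper around gs; not part of the original script.
# $1 = input PDF, $2 = output PDF, $3 = password (all supplied by the caller).
# GS_BIN is an assumed override for the gs executable; it defaults to "gs".
rewrite_pdf_with_password() {
  "${GS_BIN:-gs}" -sDEVICE=pdfwrite -sOutputFile="$2" \
    -sPDFPassword="$3" -dNOPAUSE -dBATCH "$1"
}
```

Keep in mind that a password passed on the command line is visible to other users in process listings, so this is best suited to controlled environments.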
Now the error message you quote above is not due to the file being encrypted (password protected), there's something else wrong with it. In fact given the simple command line you are using, I'd say there's something quite seriously wrong with it. Of course without seeing the file I can't tell for certain.
Now if a file is encrypted, the output from Ghostscript should read something like:
GPL Ghostscript GIT PRERELEASE 9.21 (2016-09-14)
Copyright (C) 2016 Artifex Software, Inc. All rights reserved.
This software comes with NO WARRANTY: see the file PUBLIC for details.
**** This file requires a password for access.
Error: /invalidfileaccess in pdf_process_Encrypt
Operand stack:
Execution stack:
%interp_exit .runexec2 --nostringval-- --nostringval-- --nostringval-- 2 %stopped_push --nostringval-- --nostringval-- --nostringval-- false 1 %stopped_push 1983 1 3 %oparray_pop 1982 1 3 %oparray_pop 1966 1 3 %oparray_pop --nostringval-- --nostringval-- --nostringval-- --nostringval-- false 1 %stopped_push
Dictionary stack:
--dict:1199/1684(ro)(G)-- --dict:1/20(G)-- --dict:83/200(L)-- --dict:83/200(L)-- --dict:135/256(ro)(G)-- --dict:291/300(ro)(G)-- --dict:26/32(L)--
Current allocation mode is local
GPL Ghostscript GIT PRERELEASE 9.21: Unrecoverable error, exit code 1
So simply grepping for "This file requires a password" should be enough to identify encrypted files.
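A minimal sketch of that check, assuming the Ghostscript output has already been captured into a shell variable; the helper name is_password_protected is made up for illustration.

```shell
# Hypothetical helper: succeeds (exit status 0) if the captured Ghostscript
# output passed as $1 contains the password notice, and fails otherwise.
is_password_protected() {
  # -q: print nothing, just set the exit status; -F: match a fixed string.
  printf '%s\n' "$1" | grep -qF 'This file requires a password'
}
```

A caller could then write something like `is_password_protected "$gs_res" && exit 3`, reusing the exit code from the question's script.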
Now, as noted by mklement0, if you'd like to explain what it is about your actual script which is causing a problem, perhaps we can help with that too. You haven't shown the output of your script, or explained what is not working as you expect.
KenS's helpful answer addresses your questions about Ghostscript itself.
Here's a streamlined version of your code that should work:
# Run `gs` and capture its stderr output.
gs_res=$(gs -sDEVICE=pdfwrite -sOutputFile="$gsoutputfile" -dNOPAUSE -dBATCH "$2" 2>&1 1>/dev/null)
ec=$? # Save gs's exit code.
# Assume that something went wrong IF:
# - gs reported a nonzero exit code, or
# - any stderr output was produced, as not all problems
#   may be reflected in a nonzero exit code.
if [[ $ec -ne 0 || -n $gs_res ]]; then
  echo "Error message is :::::: $gs_res" >&2
  gspassmsg='This file requires a password for access'
  [[ $gs_res == *"$gspassmsg"* ]] && exit 3 # password protected pdf
fi
I've double-quoted the variable and parameter references in your gs command.
I've changed your redirection from just 2>&1 to 2>&1 1>/dev/null so as to capture only stderr output. 2>&1 redirects stderr (2) to the (still-original) stdout (1), so that error messages are sent to stdout and can be captured as part of the command substitution ($(...)); 1>/dev/null then redirects stdout to the null device, effectively silencing all stdout output. Note that the earlier redirection of stderr to the original stdout is not affected by this, so in effect what the overall command sends to stdout is the original stderr output only.
I'm using the more modern and flexible $(...) command-substitution syntax instead of the legacy `...` form (for background information, see here).
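The effect of the redirection order can be seen without Ghostscript. The following is a small stand-alone demonstration; emit_both is a made-up stand-in for gs that writes one line to each stream.

```shell
# Hypothetical stand-in for gs: one line to stdout, one to stderr.
emit_both() {
  echo "to stdout"
  echo "to stderr" >&2
}

# 2>&1 first points stderr at the command substitution's pipe;
# 1>/dev/null then discards what the function writes to stdout.
captured=$(emit_both 2>&1 1>/dev/null)
echo "captured: $captured"
```

Reversing the order (1>/dev/null 2>&1) would send both streams to the null device and capture nothing.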
I've renamed GS_RES to gs_res, because it is better not to use all-uppercase shell-variable names, in order to avoid conflicts with environment variables and special shell variables.
I'm using simple pattern matching to find the desired substring in gs's stderr output. Given that you already have the input to test against in a variable, Bash's own string-matching features (which are quite varied) will do, and there is no need to use an external utility such as awk.
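If the script has to run under a plain POSIX sh rather than Bash, where [[ ... ]] is unavailable, the same substring test can be sketched with a case statement. The function name contains is an assumption made for this example.

```shell
# Hypothetical helper: succeeds if $1 contains $2 as a literal substring.
# Quoting "$2" inside the pattern keeps any glob characters in it literal.
contains() {
  case $1 in
    *"$2"*) return 0 ;;
    *)      return 1 ;;
  esac
}
```

This works on multi-line values as well, since the * in a case pattern also matches newlines.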
As for why your awk command failed:
It sounds like you're using BSD awk, such as the one that comes with macOS as of 10.12 (your question is tagged linux, however):
BSD awk doesn't support newlines in variable values passed via -v unless you \-escape the newlines.
With unescaped multi-line strings, your awk call fails fundamentally, before index() is ever called.
By contrast, GNU Awk and Mawk do support multi-line strings passed as-is via -v.
Read on for optional background information.
To determine which awk implementation you're using, run awk --version and examine the output:
awk version 20070501 -> BSD Awk
GNU Awk 4.1.3, API: 1.1 ... -> GNU Awk
mawk: not an option: --version -> Mawk
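That check could be scripted roughly as follows. The function name classify_awk and its patterns are assumptions derived only from the three sample outputs above; any other implementation or version-string format falls through to "unknown".

```shell
# Hypothetical classifier for the first line printed by `awk --version`.
classify_awk() {
  case $1 in
    *"GNU Awk"*)     echo "GNU Awk" ;;
    *mawk*)          echo "Mawk" ;;
    *"awk version"*) echo "BSD Awk" ;;
    *)               echo "unknown" ;;
  esac
}

# 2>&1 is needed because some implementations report on stderr.
classify_awk "$(awk --version 2>&1 | head -n 1)"
```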
Here's a simple test to try with your Awk version:
awk -v a=$'1\n2' -v b=2 'BEGIN { print index(a, b) }'
GNU Awk and Mawk output 3, as expected, whereas BSD Awk fails with awk: newline in string 1.
Also note that \-escaping newlines works ONLY in BSD Awk (e.g., awk -v var=$'1\\\n2' 'BEGIN { print var }'), which unfortunately means that there is no portable way to pass multi-line variable values to Awk via -v.
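One workaround that sidesteps -v altogether, and is not part of the original answer, is to hand the values to awk through the environment and read them from the POSIX-specified ENVIRON array. A minimal sketch, with the function name index_in_text and the variable names HAYSTACK and NEEDLE chosen purely for illustration:

```shell
# Hypothetical helper: prints the 1-based position of $2 within $1
# (0 if absent). The values travel via the environment, so awk does not
# process them the way it processes -v assignments.
index_in_text() {
  HAYSTACK=$1 NEEDLE=$2 awk 'BEGIN {
    print index(ENVIRON["HAYSTACK"], ENVIRON["NEEDLE"])
  }'
}

index_in_text "$(printf '1\n2')" 2
```

Very large values may run into the operating system's limit on environment size, in which case feeding the text to awk on stdin is the safer route.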