
System call time out?

I'm using unix system() calls to gunzip and gzip files. With very large files these sometimes get aborted (i.e. on the cluster compute nodes), while other times they go through (i.e. on the login nodes). Is there some soft limit on the time a system call may take? What else could it be?

The calling thread should block indefinitely until the task you initiated with system() completes. If what you are observing is that the call returns and the file operation has not completed, that is an indication that the spawned operation failed for some reason.

What does the return value indicate?

Almost certainly not a problem with your use of system(), but with the operation you're performing. Always check the return value, and even more importantly, capture the output of the command you're calling. For non-interactive use, it's often best to write stdout and stderr to log files. One way to do this is to write a wrapper script that checks for the underlying command, logs the command line, redirects stdout and stderr (and closes stdin if you want to be careful), then execs the command line. Run this via system() rather than the OS command directly.

My bet is that the failing machines have limited disk space, or are missing either the target file or the actual gzip/gunzip commands.

I'm using unix system() calls to gunzip and gzip files.

Probably silly question: why not use zlib directly from your application?

And system() isn't a system call. It is a wrapper around fork()/exec()/wait(); check the system() man page. If it doesn't unblock, it might be that your application somehow interferes with wait(), e.g. do you have a SIGCHLD handler?

If it's a Linux system I would recommend using strace to see what's going on and which syscall blocks.

You can even attach strace to already running processes: # strace -p $PID

Sounds like I'm running into the same intermittent issue indicating a timeout of some kind. My script runs every day. I'm starting to believe GZIP has a timeout.

  1. gzip -vd filename.txt.gz 2>> tmp/errorcatch.txt 1>> logfile.log
  2. stderr: Error for filename.txt.gz
  3. The script moves on to the next command, 'cp filename* new/directory/', which leaves a still-zipped copy of the file in the new directory
  4. stdout from the earlier gzip shows a successful unzip of the SAME file: filename.txt.gz: 95.7% -- replaced with filename.txt
  5. The unzipped output file from gzip is present in neither the source nor the new directory.
  6. Following alerts, manual run of 'gzip -vd filename.txt.gz' never fails.

Details:

  • Only one call in script to unzip that file
  • Call for unzip is inside a function (for more robust logging and alerting)
  • Unable to strace in production
  • Unable to replicate locally
  • In occurrences over the last month, found no consistency among file size, only

I'll simply work around it with retry logic and general scripting improvements, but I want the next google-er to know they're not crazy. This is happening to other people!
