Tags: fridayqna gcd kqueue signal
Happy April Fool's Day to all my readers, and welcome to one web site which won't irritate you all day with bizarre practical jokes. Instead, I bring you another edition of Friday Q&A. In this edition, I will discuss various ways of handling signals in Mac programs, a topic suggested by friend of the blog Landon Fuller.
Signals
Signals are one of the most primitive forms of interprocess communication imaginable. A signal is just a small integer sent to a process. You can send a signal using the kill command, which also has a corresponding function available from C.
When a signal is delivered, it can terminate the process, pause/resume the process, be ignored, or invoke some custom code. That last option is called signal handling, and that is what I want to discuss today.
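As a rough sketch of the C side of this, the function below forks a child, sends it a signal with kill(), and reports which signal terminated it. The name send_and_reap is hypothetical and exists only for this illustration.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks a child that just waits for a signal, sends it `signum` with kill(),
 * and returns the signal that terminated the child. Returns -1 if fork, kill,
 * or waitpid failed, or if the child did not die from a signal.
 * send_and_reap is a hypothetical helper name used only for illustration. */
int send_and_reap(int signum)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: sleep until some signal arrives. */
        for (;;)
            pause();
    }

    /* Parent: kill() is the C counterpart of the kill command. */
    if (kill(pid, signum) != 0)
        return -1;

    int status = 0;
    if (waitpid(pid, &status, 0) != pid)
        return -1;

    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

Calling send_and_reap(SIGKILL) exercises the "terminate the process" outcome described above. A signal whose disposition in the child is "ignore" would leave the parent waiting, so this sketch is only suitable for signals that terminate by default.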
The list of defined signals can be seen in the header sys/signal.h. Many of these are used for familiar purposes. SIGINT is the signal generated when you press control-C in the shell. SIGABRT is used to kill your program when you call abort(), and SIGSEGV is the infamous segmentation fault, which pops up when you dereference a bad pointer.
Signal handling is esoteric and most programs don't need to worry about it at all. However, there are cases where it can be useful. For terminal and server programs, it's handy to catch SIGHUP, SIGINT, and other similar signals to do cleanup before exiting, as a sort of low-level version of Cocoa's applicationWillTerminate:. The SIGWINCH signal is handy for sophisticated terminal applications. SIGUSR1 and SIGUSR2 are user-defined signals which you can use for your own purposes.
sigaction
The lowest level interface for signal handling is the sigaction function. It provides some sophisticated and arcane options, but the important part is that it allows you to specify a function which is called when the signal in question is delivered:
static void Handler(int signal)
{
    // signal came in!
}
struct sigaction action = { 0 };
action.sa_handler = Handler;
sigaction(SIGUSR1, &action, NULL);
It looks like you can now write whatever code you like in Handler and be done. Wrong.
Reentrancy
The problem is that signals are delivered asynchronously, and the function registered here is also invoked asynchronously. Code always has to run on a thread somewhere. Depending on how the signal is generated, the handler is either run on the thread that the signal is associated with (for example, a SIGSEGV handler will run on the thread that segfaulted) or it will run on an arbitrary thread in the process. The problem is that it's essentially an interrupt in userland, and whatever code was running when it came in will be paused until the handler is done.
As anyone who was around in the classic Mac days knows, writing code that runs in an interrupt is hard. The problem is reentrancy. Many people confuse reentrancy with thread safety, but they are not the same concept, although they are somewhat similar.
Thread safety means that a particular piece of code can run on multiple threads at the same time safely. Thread safety is most commonly accomplished by using locks. A call acquires a lock, does work, releases the lock. A second thread that comes along in the middle will block until the first thread is done.
If code is reentrant, that means it can safely be called again on the same thread while an earlier call to it is still in progress. This is different and considerably harder.
What if you take the thread safety approach of locking and apply it to reentrancy? The first call acquires the lock. While it's active, the code is called again. It tries to acquire the lock, but the lock is already taken, so it blocks. However, the first call can't run until the second call is done. The second call can't run until the first call is done. The result is a frozen program.
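The scenario above can be sketched with an ordinary pthread mutex. This is an illustration under stated assumptions, not code from any real library: OuterCall and InnerCall are hypothetical names, and the second call uses pthread_mutex_trylock so that the program reports the conflict instead of freezing the way a real pthread_mutex_lock would.

```c
#include <errno.h>
#include <pthread.h>

static pthread_mutex_t gLock = PTHREAD_MUTEX_INITIALIZER;
static int gInnerResult = -1;

/* Stands in for the second, interrupting call. A real one would use
 * pthread_mutex_lock() and block forever; trylock only reports. */
static void InnerCall(void)
{
    gInnerResult = pthread_mutex_trylock(&gLock);
    if (gInnerResult == 0)
        pthread_mutex_unlock(&gLock);
}

/* Takes the lock, then is "interrupted" by a second call on the same thread.
 * Returns what the second call saw when it tried to take the lock. */
int OuterCall(void)
{
    pthread_mutex_lock(&gLock);
    InnerCall();              /* runs while the lock is still held */
    pthread_mutex_unlock(&gLock);
    return gInnerResult;
}
```

With a default (non-recursive) mutex, OuterCall returns EBUSY: the lock that makes the code thread safe is exactly what stops it from being reentered.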
Writing reentrant code is hard, and as a result very few system functions are reentrant. Because a signal handler functions as an interrupt, it can only call reentrant code. You can't call something as simple as printf safely, because printf could take a lock, and if there's already an active call to printf on the thread where the handler runs, you'll deadlock.
The sigaction man page gives a list of functions you are allowed to call from a signal handler. It's pretty limited.
The complete list is: _exit(), access(), alarm(), cfgetispeed(), cfgetospeed(), cfsetispeed(), cfsetospeed(), chdir(), chmod(), chown(), close(), creat(), dup(), dup2(), execle(), execve(), fcntl(), fork(), fpathconf(), fstat(), fsync(), getegid(), geteuid(), getgid(), getgroups(), getpgrp(), getpid(), getppid(), getuid(), kill(), link(), lseek(), mkdir(), mkfifo(), open(), pathconf(), pause(), pipe(), raise(), read(), rename(), rmdir(), setgid(), setpgid(), setsid(), setuid(), sigaction(), sigaddset(), sigdelset(), sigemptyset(), sigfillset(), sigismember(), signal(), sigpending(), sigprocmask(), sigsuspend(), sleep(), stat(), sysconf(), tcdrain(), tcflow(), tcflush(), tcgetattr(), tcgetpgrp(), tcsendbreak(), tcsetattr(), tcsetpgrp(), time(), times(), umask(), uname(), unlink(), utime(), wait(), waitpid(), write(), aio_error(), sigpause(), aio_return(), aio_suspend(), sem_post(), sigset(), strcpy(), strcat(), strncpy(), strncat(), strlcpy(), strlcat().
Finally, the list ends with this amusing note: "...and perhaps some others." "Perhaps" is not a nice word to run into in this sort of documentation.
You can call your own reentrant code, but you probably don't have any, because it's hard to write, it can't call any system functions except from the above list, and you never had any reason to write it before. For the Objective-C types, note that objc_msgSend is not reentrant, so you cannot use any Objective-C from a signal handler.
There is very little that you can do safely. There is so little that I'm not even going to discuss how to get anything done, because it's so impractical to do so, and instead will simply tell you to avoid using signal handlers unless you really know what you're doing and you enjoy pain.
Fortunately, there are better ways to do these things.
kqueue
One of those better ways is to use kqueue. This is a low-level operating system service which allows a program to monitor many different events, and one of the events it can monitor is signals. You can create a kqueue just for signal handling, or you can add a signal handling event to an existing kqueue you already have within your program.
Setting things up is a bit more involved, but all in all not too hard. First, the kqueue is created:
int fd = kqueue();
struct kevent event = { SIGUSR1, EVFILT_SIGNAL, EV_ADD, 0, 0 };
kevent(fd, &event, 1, NULL, 0, NULL);
The kevent call tells the kqueue to watch for SIGUSR1 being delivered to the process. Note that kqueue exists separately from the lower level sigaction handling. Because we don't want the program to terminate when the signal is delivered, which is the default behavior, we also have to tell sigaction to ignore it:
struct sigaction action = { 0 };
action.sa_handler = SIG_IGN;
sigaction(SIGUSR1, &action, NULL);
The kqueue is now ready. We can wait for it to receive an event by calling kevent again, this time not adding anything, but having it give us an event:
struct kevent event;
int count = kevent(fd, NULL, 0, &event, 1, NULL);
if(count == 1)
{
    if(event.filter == EVFILT_SIGNAL)
        printf("got signal %d\n", (int)event.ident);
}
Because this is ordinary code running on an ordinary thread, it's safe to call printf or any other code when handling the signal. Convenient!
kqueue isn't always all that convenient to use in real programs, though. There are two reasonable ways to do it. One way is to have a dedicated signal handling thread which sits in a loop calling kevent repeatedly. Another way is to add the kqueue file descriptor to your Cocoa runloop using something like CFFileDescriptor. However, neither of these is particularly great.
GCD
Finally we reach a signal handling solution which is extremely easy to use: Grand Central Dispatch. In addition to the better-known multiprocessing capabilities, GCD also includes a full suite of event monitoring abilities which match those of kqueue. (And in fact, GCD implements them using kqueue internally.)
To handle a signal with GCD, we create a dispatch source to monitor the signal:
dispatch_source_t source = dispatch_source_create(DISPATCH_SOURCE_TYPE_SIGNAL, SIGUSR1, 0, dispatch_get_global_queue(0, 0));
dispatch_source_set_event_handler(source, ^{
    printf("got SIGUSR1\n");
});
dispatch_resume(source);
Like with kqueue, this exists separately from sigaction, so we have to tell sigaction to ignore the signal:
struct sigaction action = { 0 };
action.sa_handler = SIG_IGN;
sigaction(SIGUSR1, &action, NULL);
That's it! Every time a SIGUSR1 comes in, the handler is called. Because the source targets a global queue, the handler automatically runs in a background thread without interfering with anything else. If you prefer, you can give GCD a custom queue, or even the main queue, to control where the handler runs. Like with kqueue, because the handler runs normally on a normal thread, it's safe to do anything in it that you would do in any other piece of code. GCD makes signal handling convenient, easy, and safe.
Conclusion
Signal handling is a rare requirement, but sometimes useful. Using the low level sigaction to handle signals makes life unbelievably hard, as the signal handler is called in such a way as to place extreme restrictions on the code it contains. This makes it almost impossible to do anything useful in such a signal handler.
The best way to handle a signal in almost every case is to use GCD. Signal handling with GCD is easy and safe. On the rare occasions where you need to handle signals, GCD lets you do it with just a few lines of code.
If you can't or don't want to use GCD but still want to avoid sigaction, kqueue provides a good middle ground. While it's more complicated to set up and manage than the GCD approach, it still works well to handle signals in a reasonable manner.
That wraps up today's April Fool's edition of Friday Q&A. Come back in two weeks for the next one. Until then, as always, keep sending me your ideas for topics. Friday Q&A is driven by reader suggestions, so if you have something you would like to see covered, send it in!
Comments:
2) Always remember to back up and restore the previous signal mask if one uses pthread_sigmask() or sigprocmask(). As a general rule, one cannot assume that one's caller hasn't also fiddled with the mask.
3) Library writers should never install real signal handlers via signal() or sigaction(). The kernel only supports one handler and therefore that is the right of the app, not libraries. Libraries should use technologies like GCD or kqueues directly (if one must).
4) Keep in mind that setting SIGCHLD to SIG_IGN has standards-defined side effects. Namely, one cannot call the wait*() family of APIs against child processes when SIGCHLD is ignored.
Reentrancy doesn't necessarily have to apply to interrupts. For example, you'll find a note in the Cocoa documentation that NSNotificationCenter is reentrant. That most certainly does not mean that you can call it from a signal handler! Instead, what this means is that you can safely reenter NSNotificationCenter by calling into it from code which is in turn being called by NSNotificationCenter because of a posted notification.
That sort of reentrancy is much more useful (it's a good idea for almost any code with callbacks) and much easier to achieve (just make sure you're in a clean state and not holding any locks when you call the callback).
In the context of signal handling they're really the same, but in general not entirely.
1) Race conditions. For example:
if (bit) do_something();
// signal fires
r = select();
// select doesn't return -1 with errno == EINTR in this case like one would expect. Therefore: the bit isn't noticed and acted upon until the next FD becomes readable/writable, which may be a long time
This is why pselect() was later invented, so that one might control when the signals fire. The availability of pselect() doesn't help a developer though if they're using a system provided event loop technology rather than rolling their own. This is one of many reasons why facilities like GCD exist.
2) The vast majority of app and library code doesn't check for errno being equal to EINTR after an error and retry. That is why when we were designing GCD, we blocked all of the maskable signals from being delivered on GCD threads.
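For reference, the EINTR check that point 2 says most code omits looks roughly like this for read(). ReadRetry is a hypothetical wrapper name, and this is a sketch of the usual retry idiom rather than anything GCD itself does.

```c
#include <errno.h>
#include <unistd.h>

/* Like read(), but restarts the call when a signal handler interrupts it
 * before any data has been transferred. ReadRetry is a hypothetical name. */
ssize_t ReadRetry(int fd, void *buffer, size_t length)
{
    ssize_t n;
    do {
        n = read(fd, buffer, length);
    } while (n == -1 && errno == EINTR);   /* interrupted: just try again */
    return n;
}
```

Every blocking call in a program needs the same treatment if handlers can run on its thread, which gives a sense of why blocking signals on those threads is the more practical choice.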
write() is a safe call to make from a signal handler. Create a pipe, stick the read end into your event system, and write a byte to the write end to signal. Make sure the pipe is nonblocking, though, otherwise you could be in serious trouble.
Of course there's no real reason to do that rather than using the built-in facilities which take care of the difficult parts for you.
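signal handler:
// CAS(new, old, ptr) is pseudocode for an atomic compare-and-swap:
// it stores new into *ptr only if *ptr still equals old
do {
    old_bits = gSignalBits;
    new_bits = old_bits | (1 << signum);
    if (old_bits == new_bits)
        return;    // this signal is already pending
} while(!CAS(new_bits,old_bits,&gSignalBits));
nonblocking_write(fd,1);
threaded signal dispatcher:
while(1)
{
    // atomically fetch the pending bits and reset them to zero
    do {
        bits = gSignalBits;
    } while(!CAS(0,bits,&gSignalBits));
    if (bits == 0)
        blocking_read(fd,1);
    else
    {
        if ((bits & DO_SOMETHING_BIT) != 0)
            do_something();
    }
}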
At best they provide absolutely no protection whatsoever because they are run on the same thread and just work. Which is typically what the misguided programmer thinks they wanted.
At worst they hang in the spin lock portion of the locking/unlocking primitive, unless it's entirely atomic, implemented via a single CAS (most aren't). This is just like with a regular lock, but with such a small deadlock window that most programmers don't catch it for years.
I think this is referring to memcpy(), memset(), and similar functions that are simple enough that they are pretty much always async signal safe.
We are trying to detect memory leaks when quitting our OS X application by calling 'leaks' on our own process. In some cases 'leaks' crashes or hangs, which stalls our app (in a fread/wait4). For some reason I can't catch the SIGCHLD signal in our process.
The leak detection process is called in the termination function, registered with atexit(). The leak detection in its simplest form uses popen/fread/pclose, but I have tried also other approaches, from kevent, GCD, pthreads, and NSPipe. None of them seems to work, I always hang somewhere (wait4, kevent, sigsuspend_nocancel).
I'd appreciate any pointers where I might go wrong; I'm sure it's my inability to capture some important point.
Thanks a lot,
Akos (asomorjai at graphisoft.com)
I think that leaks will pause your process, at least occasionally, while doing its analysis in order to capture a consistent snapshot of it.
If that's the case, then you end up with a deadlock. Leaks is writing into a pipe that's drained by your process. But your process is paused by leaks, so it can't drain the pipe. If the pipe fills up, leaks will block waiting for you to drain it, but if your process is still paused, you'll never drain it.
A simple workaround would be to have leaks write its data into a file instead of a pipe. Your app can then read out of the file when it's done. You can do this with popen by just tacking on > somefile to the command, or with NSTask by setting its stdout to an NSFileHandle.
#0 0x00007fff83fb96ac in wait4 ()
#1 0x00007fff86afd894 in pclose ()
#2 0x00000001149a409d in DetectLeaks at /Users/asomorjai/Work/DevMain.8/Sources/GSRoot/GSRootDLL/LeakDetectorMac.mm:1442
#3 0x000000011476a25f in GSTermImage at /Users/asomorjai/Work/DevMain.8/Sources/GSRoot/GSRootDLL/GSRootMain.cpp:161
#4 0x00007fff86af1525 in __cxa_finalize ()
#5 0x00007fff86af368b in exit ()
#6 0x000000010000403b in start ()
Here's the actual code; it runs on the main thread:
extern "C" void DetectLeaks (void)
{
    if (!NeedsLeaks ())
        return;
    signal (SIGCHLD, SIG_DFL);
    pid_t myPid = getpid ();
    char s[2048], fn[128];
    sprintf (fn, "/tmp/leaks_%d.txt", myPid);
    sprintf (s, "leaks -exclude \"-[NSApplication(NSWindowsMenu) setWindowsMenu:]\" %d > %s\n", myPid, fn);
    printf ("%s\n", s);
    FILE *fp = popen (s, "r");
    if (fp != nullptr) {
        pclose (fp);
        fp = fopen (fn, "r");
        if (fp != nullptr) {
            SInt32 bytesRead = 1;
            while (bytesRead > 0) {
                char buffer [1024];
                bytesRead = fread (buffer, sizeof (char), sizeof (buffer) - 1, fp);
                buffer[bytesRead] = '\0';
                for (char *p = buffer; *p != '\0'; p++) {
                    if (*p == '|')
                        *p = '\n';
                }
                printf ("%s", buffer);
            }
            fclose (fp);
        }
    }
}
How can I detect that 'leaks' has finished/crashed, if —for some reason— I don't get the SIGCHLD signal? Or how can I detect where the SIGCHLD is delivered?
Thanks, Akos
I do it using system(). Here's the code:
#define LEAKS_LOG "~/Library/Logs/DiagnosticReports/JMP-LeaksReport.txt"
void hostDebugLeaks()
{
    /*
     * This function is called from JSL via the
     * Debug( Leaks );
     * command.
     *
     * It works best if you have launched JMP with the symbol MallocStackLogging set to 1,
     * like so:
     * MallocStackLogging=1 ./JMP.app/Contents/MacOS/JMP
     *
     * If leaks are found, this function writes a leaks report to ~/Library/Logs/
     * DiagnosticReports/JMP-leaks.log. JMP is then exited.
     *
     * This can be used with the TestBot in leaks detection mode. In this mode, TestBot will
     * launch JMP with MallocStackLogging enabled, then run the UT Framework with
     * _utDailyBuild=1 and _utLeaks=1. This causes UT Framework to check for leaks after
     * running each unit test. If a leak is found, JMP exits and TestBot captures the leaks
     * report for the leaking test. Then JMP is restarted and the test stream resumes on the
     * following test.
     */
#if 0
    // Debugging - force a leak
    int * leaky = new int[42];
    leaky = 0;
#endif
    // Run the 'leaks' command on our process; capture output; look for magic "no leaks" string
    JString leaksCmd( "leaks ^PID 2>&1 | tee ^LOG | grep \": 0 leaks for 0 total leaked bytes.\"" );
    leaksCmd.replace( "^PID", getpid()).replace( "^LOG", LEAKS_LOG );
    int status = system( leaksCmd.value());
    // If status is 0, it means the magic "no leaks" string was found, so we don't have any leaks.
    // Delete the log file and return.
    if( status == 0 ) {
        JString rmLogCmd( "rm -f ^LOG" );
        rmLogCmd.replace( "^LOG", LEAKS_LOG );
        system( rmLogCmd.value());
        return;
    }
    // We have a leak; leave the log file intact and force JMP to exit
    NSLog( @"Leaks detected; exiting" );
    exit( EXIT_FAILURE );
}
I'm sure you can ignore/interpret the parts that are specific to our internal libraries.
Prior to Mavericks, the system() call would hang for us occasionally, because of a crash in leaks as you describe. I had the opportunity to discuss it with an Apple engineer at WWDC a few years back and he thought he knew what the problem was. I submitted a radar and it is apparently now fixed.
Rather than having our app clean up the call stack in the captured leak report as your code does, we have the testing driver ("TestBot" in the above comment) do it. It's a perl script and the cleanup code looks like this:
sub process_leaks_report {
    my( $file ) = @_;
    return unless -e $file;
    my $modified = 0;
    my $contents = '';
    open( LOG_FILE_IN, '<', $file ) || die "Cannot open $file: $!";
    while( <LOG_FILE_IN> ) {
        chomp;
        if( s/^(\s*Call stack: .*?:) \| // ) {
            $contents .= "$1\n\t\t";
            $contents .= join( "\n\t\t", reverse split( / \| /, $_ ));
            $contents .= "\n";
            ++$modified;
            next;
        }
        $contents .= "$_\n";
    }
    close( LOG_FILE_IN );
    if( $modified ) {
        open( LOG_FILE_OUT, '>', $file ) || die "Cannot create $file: $!";
        print LOG_FILE_OUT $contents;
        close( LOG_FILE_OUT );
    }
}
What interests me is your exclusion of setWindowsMenu:. I see that leak too. Have you reported it? Have you done any further investigation on it?
Is there any way to obtain siginfo_t when handling a signal with kqueue?
With epoll on a signalfd (Linux), it's as simple as reading a signalfd_siginfo from the fd on which the signal is received.
[[UIApplication sharedApplication] openURL:[NSURL URLWithString:@"tel://1111111"]];, is that at all possible?