Tags: osbug security strnstr
The strnstr function is broken on Mac OS X 10.4 (and presumably earlier) and should be avoided.
I've probably told most of the people who read this blog about this bug already, but just in case, I thought I'd make a post. I don't usually post when I find bugs in Apple's stuff, but this is a pretty fundamental string function so it's a bit more important.
The quick version is this: strnstr on 10.4 will sometimes read one byte beyond the length which you specify. Under special circumstances, this will lead to a segmentation fault in correct code.
The following test program illustrates the bug:
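#include <stdlib.h>
#include <stdio.h>
#include <string.h>

int main(int argc, char **argv)
{
    /* 20480 bytes is exactly five 4096-byte pages. */
    int size = 20480;
    char *str = malloc(size);
    /* Fill the whole buffer, leaving no terminating NUL. */
    memset(str, 'x', size);
    /* "aa" never occurs, so the search runs to the end of the buffer. */
    strnstr(str, "aa", size);
    return 0;
}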
This program crashes on the strnstr line.
In order to trigger the crash, several conditions must be met. First, the buffer must be lacking a terminating NUL byte. (Note that this is allowed according to how strnstr is documented to operate.) The search string must not exist in the buffer. The end of the buffer must terminate on a page boundary. And finally, the page following the buffer must be unreadable.
This confluence of circumstances is pretty difficult to produce. I've been using strnstr without incident for a long time. I only discovered the problem when stress-testing some HTTP parsing code, and my stress test happened to cause reads which were nicely lined up with the page size, and it happened to generate enough data to push the malloced block past the malloc memory pool limit and into its own dedicated pages.
This rarity is all the more reason to avoid the function, though, since it just means that when it crashes, it'll likely be on a user's machine, and it'll likely be extremely difficult to reproduce.
For my friends out there in Apple land, this bug has been filed as rdar://5504733.
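For code that has to run on an affected system, one option is to skip the system function entirely. What follows is a hedged sketch of a replacement, not Apple's or BSD's implementation. The name bounded_strnstr is hypothetical. It is built only on the standard memchr and memcmp, and it assumes the documented behaviour: search at most len characters of the haystack, stop at a NUL if one appears earlier, and return the haystack itself for an empty needle.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical replacement for strnstr. It never asks memchr or memcmp
 * to examine anything beyond haystack[len - 1]. */
char *bounded_strnstr(const char *haystack, const char *needle, size_t len)
{
    size_t needle_len = strlen(needle);
    if (needle_len == 0)
        return (char *)haystack;

    /* Characters after an early NUL terminator are not searched. */
    const char *nul = memchr(haystack, '\0', len);
    size_t hay_len = nul ? (size_t)(nul - haystack) : len;

    while (hay_len >= needle_len) {
        /* Only positions where the whole needle still fits are candidates. */
        const char *p = memchr(haystack, needle[0], hay_len - needle_len + 1);
        if (p == NULL)
            return NULL;
        if (memcmp(p, needle, needle_len) == 0)
            return (char *)p;

        /* Resume the search one byte past the failed candidate. */
        hay_len -= (size_t)(p - haystack) + 1;
        haystack = p + 1;
    }
    return NULL;
}
```

Because the candidate range is computed from the full length, a match that occupies the final bytes of the buffer is still found, which is not the case if the length is simply shortened before calling the system function.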
Comments:
strnstr( str, "aa", size - strlen("aa") );
The man page is a little ambiguous with how “not more than len characters are searched” is phrased or intended. The implementation may very well define “search” as matching the first character and overlook the length of the search string when determining when the internal loop stops. (strnstr is from BSD, so I wouldn’t be surprised if the bug is present or independently fixed there.)
I should add, by the way, that it’s quite common for implementations of the string functions to read over the “end” of the string by a small amount. Usually they do this so that they can read words (or even larger elements with SSE/Altivec), and normally this won’t cause a problem because they won’t issue a read that straddles a page boundary (i.e. they’ll read, at most, up to the end of the page where your NUL terminator is).
echo ab | sed -nE s/((a)|(b)){0,}//p
echo ab | sed -nE s/((a)|(b)){0,2}//p
Hint: They should both have returned [b::b]
Wormwood, it is not more correct. It does avoid this bug but it causes a new one. Subtracting the length of the string to be searched will cause it to miss matches at the very end of the string.
alastair, I don’t care about portability to other OSes but I absolutely need my code to work on Tiger, so features in Leopard are useless to me. What’s more, I was not able to find any POSIX/SUSv3 equivalent to strnstr. If you know of one, I’d appreciate hearing about it. And of course reading off the end of the string is legal as long as you don’t cross page boundaries, but the whole problem is that it does read past page boundaries, causing a segfault.
Andrew, I dispute your dispute! The function is quite clear that it will not read beyond the length you give it. That is really the entire reason it exists. Whether or not the string is terminated after the length you give it is irrelevant, because it has no way of legally finding out whether there’s a terminator or not. You will also notice that the man page goes out of its way to explicitly state when strings are terminated. The description of strstr says that it, “locates the first occurrence of the null-terminated string s2 in the null-terminated string s1.” (Emphasis mine.) The description of strnstr states, “locates the first occurrence of the null-terminated string s2 in the string s1….” (Emphasis mine, again.) Note how it neglects to say that s1 is null-terminated. I cannot imagine that this is anything other than intentional.
Else this function would be called strlstr().
The strn…() functions were not created to avoid buffer overflows, but to work on substrings.
They all expect to have a valid C string as an argument.