mikeash.com pyblog/friday-qa-2013-05-31-c-quiz.html comments
http://www.mikeash.com/?page=pyblog/friday-qa-2013-05-31-c-quiz.html#comments
mikeash.com Recent Comments
Thu, 28 Mar 2024 22:21:53 GMT

mikeash - 2013-06-28 15:22:23
I'm always impressed with the quality of the comments on my blog. Don't know if I had anything to do with it, but they're almost always great. One of my favorite things about it.

j b - 2013-06-19 05:19:51
a) I got them all right. I suppose it helped to have been a member of X3J11. <br /> <br />b) I'm impressed by the quality of the comments, with the exception of the one from OldETC, which is largely wrong, starting with the first sentence (there is no ANSII, and these answers apply to standard C and to Microsoft's implementation of it).

verec - 2013-06-16 21:05:34
1. My guess is that to allow all possible 256 values and still have space for an extra one meaning "nothing there" the type had to be int, otherwise the <i>EOF</i> symbol would have had to hijack an otherwise very valid character slot.

David Thornley - 2013-06-12 20:53:49
OldETC: The quiz is on ANSI/ISO standard C. It doesn't appear to have anything in it from C99. Any differences between what Microsoft C does and the standard requires are Microsoft's fault. <br /> <br />A character constant is indeed an int in C. It's a char in C++, but not in C. 
A char must have at least 8 bits and is permitted to have more (although there are very few systems any more where that matters). <br /> <br />short and int are indeed implementation-defined, the idea being that C should run efficiently on various architectures. This does cause confusion, and people need to be careful when transferring data values around. I am unaware that ANSI ever did try to straighten that out, since standardization of data type sizes would make C less useful for systems programming. In C99, there are standard definitions you can #include and use for fixed-length integral types. <br /> <br />Char does indeed become more of a problem when going beyond 7-bit ASCII. You'd have no problem storing a UTF-8 string in an array of char, but the standard C library functions won't handle it well. There's the wchar_t type for wide characters, but that's not normally 32 bits, and a Unicode character won't necessarily fit in 16 bits. C's string handling is sufficiently primitive that it wouldn't be hard to make it work just as well with UTF-8. <br /> <br />BOOL is not a type in C. It isn't possible to restrict the values that can be assigned to an int. <br /> <br />Signed to unsigned comparison is perfectly well defined: convert to unsigned. It can be confusing, and can easily cause surprising results, so it isn't a good idea. <br /> <br />It really doesn't matter what free(x) does; all that's necessary is to reclaim the memory somehow at some time. Doing a double free is undefined behavior, and so is using data after it has been freed. (If you want to make sure it isn't used later, write zeros to it and then free it.) Assigning NULL to a pointer when you free it means that double frees are harmless, will bite you more certainly if you try to use that pointer again, and can be a good idea. 
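<br /> <br />A minimal sketch of the NULL-after-free habit described above. The macro name FREE_AND_NULL and the function release_twice are hypothetical, made up for this illustration. The only library facts relied on are that free(NULL) does nothing and that freeing the same non-null pointer twice is undefined.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper macro (not a standard facility): free the block
   and reset the pointer so that a later free() sees NULL. */
#define FREE_AND_NULL(p) do { free(p); (p) = NULL; } while (0)

/* Returns 1 if the pointer is NULL after being released twice,
   or 0 if the allocation failed in the first place. */
int release_twice(void)
{
    char *buf = malloc(16);
    if (buf == NULL)
        return 0;
    strcpy(buf, "hello");

    FREE_AND_NULL(buf);   /* real release; buf is now NULL */
    FREE_AND_NULL(buf);   /* free(NULL): defined to do nothing */

    return buf == NULL;
}
```

Without the reset, the second release would be a double free. The reset only protects this one variable, though: any copy of the pointer held elsewhere still dangles.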
<br /> <br />C does have a lot of dark corners in the language, and it really helps to know what the limits of defined behavior are.

OldETC - 2013-06-11 16:06:43
Looks like this is the Microsoft C version, and not ANSII. There are some differences in Microsoft and ANSII. <br /> <br />For instance, a char is never an int. C will convert it dynamically when used, but it is 8 bits and the int is system dependent in C. <br /> <br />Ditto for short. Short is system dependent. It causes some confusion when compilers are implemented. ANSII tried to straighten some of these discrepancies out, but there are issues that make it very difficult (legacy code, differing standards, etc.) <br /> <br />Char also becomes more convoluted when using modern syntax, due to the extended character sets. <br />I haven't done much programming in recent years, but I realize the issues that will arise. ASCII, which is the 8 bit set, is not international. <br /> <br />BOOL has the value of 0 or -1 and is an int, but only accepts these values. Basically 0 or non zero. <br /> <br />Signed to unsigned comparison is also implementation dependent. Best to avoid this. Your compiler should emit a warning. <br /> <br />There are several other areas that are implementation dependent. For example, free(x) may result in garbage collection immediately, or set a marker which will ultimately result in garbage collection on most operating systems. The difference means that on some systems the data will still be present although you have nulled the pointer. Using free a second time on that pointer will result in the same as free(null). The result will depend a lot on the compiler, the library and the OS. 
There are lots of corners in any language that can drive you bonkers, and C, because it is so close to the machine, has greater power to wreak havoc when misused.

Alex - 2013-06-11 14:41:34
@Ricky Bennett <br />That looks weird. Does dereferencing 0 on ARM mean dereferencing the beginning of stack? <br /> <br />How does it compare to this? <br /> <br />int main() { // assuming constant size for argv[0] <br />&nbsp;&nbsp;int a; <br />&nbsp;&nbsp;int stack_addr = (int)&amp;a; <br /> <br />&nbsp;&nbsp;// ... <br />}

David Thornley - 2013-06-11 14:31:23
Rainer: char constants like that are ints with the value implementation-defined (meaning that the Standard isn't going to tell you how it's handled, but the documentation for the compiler has to). MacOS files (before Mac OSX) had two four-character literals assigned as file and document type (I don't remember the details for applications), which made this very convenient.

Ricky Bennett - 2013-06-11 13:47:30
(17) On an ARM Cortex-M3 processor, I use *(uint32_t *) 0 to access the initial stack pointer. This is almost the same as *(char *) NULL. It's completely legal with the IAR compiler.

Landon Fuller - 2013-06-02 14:37:24
Nikolai/Jordan: I believe the union type punning behavior was defined in a C99 update, and in C11 proper. 
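<br /> <br />For reference, the two approaches debated in this thread (reading a different union member, and copying with memcpy) can be sketched side by side. The function names are hypothetical, and the sketch assumes that float and uint32_t are both 32 bits wide, which the standard does not guarantee.

```c
#include <stdint.h>
#include <string.h>

/* Stores a float in the union, then reads the uint32_t member.
   Assumes sizeof(float) == sizeof(uint32_t). */
uint32_t bits_via_union(float f)
{
    union { float f; uint32_t u; } pun;
    pun.f = f;
    return pun.u;
}

/* Copies the object representation byte by byte instead of
   reading through a different union member. */
uint32_t bits_via_memcpy(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}
```

On an implementation where the sizes match, both functions return the same bit pattern. The memcpy form is the one nobody in the thread disputes.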
<br /> <br />The issue was reported in Defect Report #283 (<a href="http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_283.htm">http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_283.htm</a>), and C99 Technical Corrigendum 3 was issued with the following footnote: <br /> <br />"If the member used to access the contents of a union object is not the same as the member last used to store a value in the object, the appropriate part of the object representation of the value is reinterpreted as an object representation in the new type as described in 6.2.6 (a process sometimes called "type punning"). This might be a trap representation." <br /> <br />C99 TC3: <a href="http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1235.pdf">http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1235.pdf</a>

Nick Barkas - 2013-06-02 12:58:37
With regards to <code>free(NULL)</code> being commonly thought to crash or do something else undefined, I wonder if this might be because calling <code>free()</code> twice on a non-NULL pointer causes undefined behaviour. Since <code>free()</code> doesn't NULL out the pointer passed in, I wonder how many bugs might be out there because someone wrote <code>if (ptr) free(ptr)</code> incorrectly thinking they were defending against double <code>free()</code>.

Jordan - 2013-06-01 17:31:32
Nikolai: Yes, of course you're right. Both Clang and GCC will DWYM in this case, but you would never want to use this in production code. <code>reinterpret_cast</code> in C++ might have been a more compliant choice. 
(IIRC, behavior there is merely implementation-defined and not fully undefined.)

Nikolai Ruhe - 2013-06-01 10:22:52
Jordan: Please note that using a union like you propose is actually another case of undefined behavior according to the C standard. <br /> <br />You can't write one element and then read another. This is a common misconception of the use of unions. The safe way to do this would be to use memcpy.

Marcel - 2013-06-01 07:56:28
On 19: It turns out that the parens are not necessary in sizeof("abcde"); sizeof "abcde" works just as well. <br /> <br />sizeof is a unary operator; the parens are only necessary if the argument is a type.

Rainer Brockerhoff - 2013-05-31 21:34:36
On #1, at least on Mac C compilers, constants like 'ABCD', 'XYZ' and 'MN' have always been ints, too, and 4-character literals were extremely common in the Carbon and Classic days.

Ken Keenan - 2013-05-31 20:26:09
Regarding free(NULL), I remember using a C compiler for the Macintosh in the late 80s/ early 90s (I'm pretty sure it was ThinkC but I could be wrong) that did very strange things if you freed a null pointer...

Dennis - 2013-05-31 17:28:55
Re 20: the same also applies for delete in C++... I removed a bunch of 'if(ptr) delete ptr' from a project at work recently. It's just unnecessary clutter. 
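<br /> <br />The C counterpart of that cleanup might look like the sketch below. The function allocate_pair is a hypothetical example, not code from the project mentioned. It relies only on the standard's guarantee that free(NULL) performs no action.

```c
#include <stdlib.h>

/* Hypothetical example: allocate two buffers, report whether both
   allocations succeeded, and release whatever was obtained. */
int allocate_pair(size_t n)
{
    char *a = malloc(n);
    char *b = malloc(n);
    int ok = (a != NULL && b != NULL);

    /* No "if (a)" or "if (b)" guards: free(NULL) performs no action,
       so the same two calls work whether or not malloc succeeded. */
    free(a);
    free(b);
    return ok;
}
```

The guards would add nothing here, and they would not protect against a double free either, since a pointer that has already been freed is still non-null.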
<br /> <br />And if you overload new in C++, it's easy enough to remove the 'if(!ptr)' checks as well -- just have the new operator call abort() instead, since if you are really out of memory on today's machines, that's probably one of the sanest ways to handle it. (On iOS, I would do something different, this was more for desktop machines).

Jordan - 2013-05-31 16:11:19
Tadpol: It's worth noting that the numeric value of a null pointer does not, in fact, have to be 0 -- the only rule is that when an integral 0 is converted to a pointer, you get a null pointer. That is, this code might not leave 0 in u.i: <br /> <br /><code>union { <br />&nbsp;&nbsp;intptr_t i; <br />&nbsp;&nbsp;void *p; <br />} u; <br />u.p = 0; <i>// or NULL</i> <br />printf("%" PRIdPTR "\n", u.i);</code> <br /> <br />This is one way in which a system that has to be able to access address 0 can still have a null pointer representation that's distinct from all valid pointers. On Apple systems, though, the representation of a null pointer is 0, presumably because most CPUs have instructions that special-case 0 anyway.

mikeash - 2013-05-31 15:12:55
You're the first person to explain the existence of <code>auto</code> to me, thanks very much for that. 
I knew it was there but never could figure out why.

Jens Ayton - 2013-05-31 14:31:29
Oh, yeah: B was designed for 36-bit systems, hence the “convenient” syntax for octal numbers that plagues us to this day.

Jens Ayton - 2013-05-31 14:29:46
The reason for #1 is the same as for many oddities about C types: the type system started as a hack to make B programs portable to platforms where types were needed. (In B, every value was a word that could be treated as an integer or a pointer.) The goal was to require minimal changes for existing B code, hence the various "implicit int" rules and free casting between integer and pointer types. Character literals are ints because character literals existed in B. <br /> <br />This is also why "auto" exists: in B, in the absence of types, every variable had to be introduced with a storage qualifier. <br /> <br />Regarding free(), I’ve always felt the misconception might be more common among old Mac programmers because DisposePtr() wasn’t null-safe.

Tadpol - 2013-05-31 14:18:13
On 17. Another thing that the compiler can do is to dereference the memory at location 0; thus working like any other memory pointer. I've only seen this on a few very small embedded processors. (small as in less than 1K of RAM.) <br /> <br />On 20. I've seen at least one implementation where calling free(NULL) does in fact fail in strange ways. The detail here is to make sure you know how well your tools follow the standard. (it was a bug that did get fixed; but not before I had to ship, so if() checks abounded.) 