Reading his comment I've also realized that my post might be interpreted as a criticism on clang and this is not the case - I think clang is a great addition to the existing tools on Mac OS X. It's just that, unlike leaks, static checkers should be used before the testing stage.
clang and the static analyzer that's built on top of it work by analyzing your sources while being compiled and making suggestions. Sort of like the warnings in GCC but on a deeper level, e.g.:
NSString *string;
string = [[NSString alloc] initWithData: data encoding: NSUTF8StringEncoding];
'autorelease' should be called right after alloc & init. Do not leave the responsibility of calling 'release' until later stages.
But clang goes wrong as the code that follows the lines above is:
if(string != nil)
[textBuffer appendString:string];
[string autorelease];
Trusting clang's warning without realizing that two lines later the string variable was already being auto-released results in a crasher due to over-releasing.
Sure, clang is right and it doesn't really make sense to not auto-release the string immediately on but from a functional point of view not acting on clang's warning produces leak-free workable code, while following it blindly (i.e. without re-reading the entire context) makes clang happy but yields a crasher.
Which is exactly what happened with releases 1.5 and 1.6 of BucharestApp.
The moral of the story? never ship untested code.

7 comments:
Thanks for the blog post on clang and the static analyzer.
I'm the principal architect of the clang static analyzer, and I wanted to clarify that the diagnostic mentioned in this blog post is not actually emitted by the static analyzer at all. In fact, the text of that diagnostic does not appear anywhere in the clang source tree, meaning that it is neither emitted as a frontend message (i.e., during parsing) or as a result of doing deeper code analysis.
Further, the diagnostic mentioned in this blog post is something that we actually want to try to avoid in the static analyzer. Since the analyzer has more time to analyze code than a regular compiler does, our goal is to try and provide only useful diagnostics that people will use to actually fix code; vacuous diagnostics like the one you mentioned are far less interesting as people tend to ignore them and (as you pointed out in your blog posting) can lead to misinformation when it comes to coding suggestions. That said, the analyzer isn't perfect (more on this point later).
I believe that the diagnostic reported in this blog post is most likely being emitted by the employed compiler. If you use scan-build to run the analyzer over your project, it will use your regular build setup to compile your code so that the analyzer sees the same source that the compiler does. This means that the normal compiler is also executed along with the analyzer. This means that any output you see to your terminal/console from your build will include an interlacing of (1) diagnostics from your regular compiler, (2) diagnostics from the clang frontend (which are type checking and parsing warnings) and (3) "ANALYZE" diagnostics emitted by the analyzer to indicate what functions it is analyzing and (4) any other diagnostics that your build system emits.
All of the warnings from the static analyzer itself (at least when you use scan-build) are emitted as HTML reports. You will never see text diagnostics on the console from the static analyzer unless you directly invoke the clang executable and specify the necessary command-line options to run specific checkers. As I mentioned before, you may also see text diagnostics from the clang frontend, but these are either parsing or type-checking warnings or typical run-of-the-mill compiler warnings (but hopefully better formatted than conventional compilers).
That said, your blog post has a good point. Static analysis warnings represent a "best effort" approach. We want to provide you meaningful information, but the analyzer warnings are not guaranteed to be sound nor complete. This is a strategy that I decided from the onset to try to aim for a sweet spot of providing valuable results to users with a tolerable amount of noise. The diagnostics provided by the analyzer (e.g., information on what path of control-flow a bug occurs on) are meant to help a user both diagnose a bug as well as verify that the bug is feasible. Over time the diagnostics will continue to improve so that this task becomes easier. Moreover, we plan to gradually increase the precision of the analyzer. This will reduce false positives as well as enable the analyzer to find more bugs.
Another thing I take away from this post is that the workflow of the analyzer might be a little confusing. Because it is essentially interposing on an existing process (i.e., a project build) people may get confused on what entity is emitting specific diagnostics. Any suggestions on how to improve the workflow to make this clearer and easier to use are definitely welcome.
Great tools take time to create, and we greatly appreciate feedback on the analyzer, either by writing about the analyzer in blogs or by filing bug reports. The static analyzer is also an open source project; anyone who wants to can get involved and work on either clang (the frontend) or the static analyzer are welcome! Further, suggestions and patches for new checkers are how to improve existing ones are definitely appreciated. My goal is to make this tool indispensable; if you ever feel that warning from the static analyzer is as vacuous as the one written about in this blog post please do not hesitate to file a bug report.
Thank you again for posting about Clang and the static analyzer! It's great to hear about individual experiences with these tools, and we will endeavor to make them better.
Hi Ted
I'm sorry for placing the blame on clang - I assumed, without checking, that AnalysisTool.app (http://www.karppinen.fi/analysistool/) was running vanilla clang.
In fact, it seems it is not:
cd:AnalysisTool.app diciu$ pwd
/Users/diciu/Downloads/AnalysisTool.app
cd:AnalysisTool.app diciu$ grep -r "Do not leave the responsibility of calling" .
Binary file ./Contents/Resources/llvm/bin/clang matches
Binary file ./Contents/Resources/llvm/lib/libclangAnalysis.a matches
The latest checker's clang does not contain this warning indeed:
cd:checker-137 diciu$ pwd
/Users/diciu/Downloads/checker-137
cd:checker-137 diciu$ grep -r "Do not leave the responsibility of calling" .
cd:checker-137 diciu$
I'll update the post to reflect the actual nature of the problem I was facing.
And thanks for the great work you've put in clang - it's very much appreciated.
Hi,
I'm the author of AnalysisTool. The tool was originally developed for our in-house use at MK&C, and we decided to release it for free since other Mac developers could find it useful. As the home page says, it includes some custom analyses developed by me, so yes, it's a fork from the official clang. The clang revision on which each AnalysisTool build is based on is mentioned each time in release notes of AnalysisTool.
Calling 'autorelease' right after alloc & init is indeed our internal convention, and you can choose to ignore it if you like. The reason for this convention is that we like to avoid coding patterns just like in your example, where autorelease happens later and not on the same line. It's a style issue, but it has minor effect on maintainability of the code.
I acknowledge that AnalysisTool's product page could be clearer about it including a forked version of clang. It should also document all custom analyses it introduces, and enable users to be able to turn them on and off. These tasks are on my todo list.
Hi Nikita and thanks for AnalysisTool - it makes running clang so much easier...
As I said in the post, the autorelease following init makes sense, it's just that I'm used to a different style of maintaining ownership.
Off the top of my head I think my reluctance in auto-releasing comes from the fact that my first Objective-C project was multi-threaded.
So when I've moved from malloc/free to Objective-C memory management rules I was confused by the multiple auto release pools "competing" for the life cycle of an object -> I think this is why I tend to think of init and release as two different steps.
Yes, getting used to autorelease pools takes time and it's a mental shift from malloc/free pattern. However, I think that maintaining a coding convention where objects are nearly always autoreleased (and not released immediately by sending them a 'release' message) prepares developers mindset to garbage collection, where you cannot directly control when your objects are released.
If efficiency of memory usage is important, it can always be controlled by creating new autorelease pools and releasing them manually, just as you can force garbage collector to run more often in GC-enabled app. Emptying autorelease pool (in manually controller memory app) and manually instructing garbage collector to collect objects (in GC-enabled app) is even done with the same method call - 'drain' on NSAutoreleasePool.
But this discussion is a bit off topic :)
The new version of AnalysisTool (build 99, released 15th Jan, 2009) now prefixes all non-standard warnings (not produced by the official clang) with a "AnalysisTool :" prefix, so these kinds of confusions shouldn't happen in the future.
Post a Comment