Tom Lane pinged me a couple of days ago about why, when there's a build failure, we remove the ccache. The answer is that a long time ago (about 8 years), we had some failures that weren't completely explained but where suspicion arose that ccache was returning stale compilations when it shouldn't have been. I didn't have a smoking gun then, and I certainly don't have one now. Eight years ago we just used this rather elephant-gun approach and moved on.
But now I've started looking at the whole use of ccache, and the thing I find most surprising is that the hit rate is so low. Here, for example, are the stats from my FreeBSD animal nightjar, a week after its last failure:

cache directory                     HEAD
cache hit (direct)                   2540
cache hit (preprocessed)               45
cache miss                          32781
called for link                      5571
called for preprocessing             1532
compile failed                        899
preprocessor error                    248
bad compiler arguments                 31
autoconf compile/link                3990
no input file                         155
files in cache                      25114
cache size                          940.9 Mbytes
max cache size                        1.0 Gbytes

So I'm a bit puzzled. Most changes that trigger a build leave most of the files intact, so surely we should see a higher hit rate than 7.3%. If that really is the best we can do, there seems to be little value in using ccache for the buildfarm at all. If it's not, I need to find out what to change to get a better rate. But so far nothing stands out.
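(For the record, that 7.3% is just hits divided by hits plus misses. Here is a minimal sketch of the arithmetic in Python, using the counters above; it's purely illustrative and not part of the buildfarm client.)

# Counters copied from the ccache stats shown above.
hits_direct = 2540        # cache hit (direct)
hits_preprocessed = 45    # cache hit (preprocessed)
misses = 32781            # cache miss

hits = hits_direct + hits_preprocessed
hit_rate = hits / (hits + misses)
print(f"hit rate: {hit_rate:.1%}")    # roughly 7.3%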
Tom also complained that we keep a separate cache per branch. The original theory was that we would be trading disk space for a higher hit rate, but that seems less tenable now, with some hindsight.
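To make that trade-off concrete, here is a hypothetical Python sketch, not the buildfarm client's actual code (the base directory and helper function are made up for illustration), of what the two schemes boil down to: which directory CCACHE_DIR points at before the compiler runs.

import os

CACHE_BASE = "/path/to/ccache"    # hypothetical location

def set_cache_dir(branch, per_branch=True):
    # Per-branch: each branch (HEAD, REL9_2_STABLE, ...) gets its own cache,
    # trading disk space for whatever extra hits that buys.
    # Shared: all branches use one cache, so an unchanged file compiled for
    # one branch can produce a hit when another branch compiles it identically.
    if per_branch:
        os.environ["CCACHE_DIR"] = os.path.join(CACHE_BASE, branch)
    else:
        os.environ["CCACHE_DIR"] = os.path.join(CACHE_BASE, "shared")

# e.g. before building HEAD:
set_cache_dir("HEAD")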
Ccache misses mostly come from changes in a central header file. It's included everywhere and thus invalidates many compilation units. The CVS/SVN magic variables or __TIME__ don't play well with ccache either.
See a later post about what we found the real problem to be.