EWONTFIX - Breakincludes

04 Jul 2013 16:46:23 GMT

A little-known part of GCC's build process is a script called "fixincludes", or fixinc.sh. Purportedly, the purpose of this script is to fix "non-ANSI system header files" which GCC "cannot compile". This description seems to correspond roughly to the original intended purpose of fixincludes, but the scope of what it does has since ballooned into all sorts of unrelated changes. Let's look at the first few rules in fixincludes' inclhack.def:

Changing AIX's _LARGE_FILES redirection of open to open64, etc. to use GCC's __asm__ keyword rather than #define, as the latter breaks C++.
Exposing the long double math functions in math.h on Mac OS X 10.3.9, which inexplicably omitted declarations for them.
Adding a workaround to the FD_ZERO macros for a direction-flag bug in Linux 2.2 and earlier kernels.
Doing something inexplicable with Solaris's nonstandard header sys/varargs.h.
Removing incorrect char *-based (rather than void *-based) prototypes for memcpy, etc. on SunOS 4.x, and replacing them with the correct prototypes.
Replacing VxWorks' assert.h with the GCC developers' version. No reason is given.
Modifying some nonstandard VxWorks regs.h header to include another header whose definitions it depends on.
Replacing VxWorks' stdint.h, which is claimed to be broken, with one full of incorrect definitions, for example #define UINT8_MAX (~(uint8_t)0) (which, contrary to the requirements of C, is not valid for use in preprocessor conditionals because it contains a cast).
etc.

(Source: fixincludes/inclhack.def)

Of the first 8 hacks (using GCC's terminology here) cited above, only one deals with fixing pre-ANSI-C headers. One more is fixing serious C++ breakage that would probably make it impossible to use C++ at all with the system headers, but the rest seem to be fixing, or attempting to fix, unrelated bugs that have nothing to do with making the compiler or compilation environment usable. And at least one has introduced major header breakage that might or might not be worse than what was in the vendor's original header.

In other words, what fixincludes evolved into is the GCC developers forcibly applying their own, often arguably incorrect or buggy, bug fix patches to system headers (and in some cases, non-system headers that happen to be in /usr/include) with no outside oversight or input from the maintainers of the software they're patching.

So how does fixincludes work? Basically, it iterates over each header file it finds under /usr/include (or whatever the configured include directory is), applies a set of heuristics based on the filename, machine tuple, and a regular expression matched against the file contents, and if these rules match, it applies a sequence of sed commands to the file. As of this writing, there are 228 such "hacks" that may be applied. The output is then stored in GCC's private include-fixed directory (roughly /usr/lib/gcc/$MACH/$VER/include-fixed), which GCC searches before the system include directory, thus "replacing" the original header when the new GCC is used.
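
The loop described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the real implementation: the HACKS table, its single va_list entry, and the fix_includes function are hypothetical, the machine-tuple check is omitted, and re.sub stands in for the sed commands.

```python
import re
import tempfile
from pathlib import Path

# Hypothetical hack table: (filename pattern, "select" regex, search regex, replacement).
# The single entry is illustrative only and is not copied from inclhack.def.
HACKS = [
    ("stdio.h", re.compile(r"\bva_list\b"), re.compile(r"\bva_list\b"), "__gnuc_va_list"),
]


def fix_includes(include_dir, fixed_dir):
    """Write a patched copy of every header that matches a hack into fixed_dir.

    Returns the relative paths of the headers that were patched.
    """
    include_dir, fixed_dir = Path(include_dir), Path(fixed_dir)
    patched = []
    for header in sorted(include_dir.rglob("*.h")):
        original = header.read_text(errors="replace")
        text = original
        for name_pattern, select, search, replacement in HACKS:
            # A hack fires when the filename matches and the contents match the regex.
            if header.match(name_pattern) and select.search(text):
                text = search.sub(replacement, text)
        if text != original:
            out = fixed_dir / header.relative_to(include_dir)
            out.parent.mkdir(parents=True, exist_ok=True)
            out.write_text(text)
            patched.append(str(header.relative_to(include_dir)))
    return patched


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        inc = Path(tmp) / "include"
        inc.mkdir()
        (inc / "stdio.h").write_text("int vprintf(const char *, va_list);\n")
        (inc / "foo.h").write_text("int foo(void);\n")
        print(fix_includes(inc, Path(tmp) / "include-fixed"))
```

Note that nothing in this loop checks which vendor or version a header came from. Any file whose name and contents happen to match gets copied and rewritten.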

In case it's not already obvious what a bad idea this whole concept is, here are a few serious flaws:

Fixincludes prevents library upgrades from working. Suppose for example you have libfoo version 1.0 installed at the time GCC is built and installed. The fixincludes script decides to patch foo.h, and puts its patched version in GCC's include-fixed directory. Now suppose you install libfoo version 2.0, which comes with a new foo.h and which is incompatible with the definitions in the old version of foo.h. Due to GCC's include path order, the new version of the header will be silently ignored and GCC will keep using the old header from the version of libfoo that was present when GCC was installed. Moreover, since fixincludes does not take any precautions to avoid applying its changes to files other than the original broken file they were intended to fix, library authors who want to avoid the danger of having their users get stuck with old headers must take on the burden of ensuring that their header files don't match any of the patterns in fixincludes.
Fixincludes can lead to unintended copyright infringement or leakage of private data. Unless you are fully aware of fixincludes, when building GCC, you would not expect an unbounded number of local header files, some of which may be part of proprietary programs or site-local private headers, to end up in the GCC directory. Now, if you package up the GCC directory (think of people building cross compiler binaries and such), you could inadvertently ship copies of these headers in a public release.
Many of the fixes are actually incorrect or fail to achieve what they're trying to achieve. For example, the VxWorks stdint.h "fix" creates a badly broken stdint.h. Another example, which came up in our development of musl, is the fix for va_list exposure in the prototypes in stdio.h and wchar.h. Per ANSI/ISO C, va_list is not defined in these headers (POSIX, on the other hand, requires it to be defined), so GCC uses bad heuristic regex matches to find such exposure and change it to __gnuc_va_list. Somehow (we never determined the reason), the resulting headers were interfering with the definition of mbstate_t and preventing libstdc++ from compiling successfully. In addition, we found that, while attempting to remedy an extremely minor "namespace pollution" issue in these headers, fixincludes was making a new namespace violation: for its double-inclusion guard macro, it used FIXINC_WRAP_STDIO_H_STDIO_STDARG_H, a name that happens to be in the namespace reserved for the application, not the implementation.
The rules for whether and how to apply the "hacks" are poor heuristics, and no effort is made to avoid false positives. The README for fixincludes even states (line 118) their policy of applying hacks even when they might be a false positive, with no consideration for how incorrectly applying them (after all, they are hackish sed replacements, not anything robust or sophisticated) might break proper, working headers.
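
To make the first flaw concrete, here is a small simulation of the header search order. It is a sketch under stated assumptions: resolve_header and demo_stale_header are hypothetical helpers, the directories are temporary stand-ins for include-fixed and /usr/include, and foo.h with its FOO_VERSION macro is the made-up libfoo header from the example above.

```python
import tempfile
from pathlib import Path


def resolve_header(name, search_dirs):
    """Return the first file called `name`, scanning search_dirs in order."""
    for directory in search_dirs:
        candidate = Path(directory) / name
        if candidate.is_file():
            return candidate
    return None


def demo_stale_header():
    """Return the contents of the foo.h the compiler would see after an upgrade."""
    with tempfile.TemporaryDirectory() as tmp:
        include_fixed = Path(tmp) / "include-fixed"
        system_include = Path(tmp) / "usr-include"
        include_fixed.mkdir()
        system_include.mkdir()

        # At GCC build time: libfoo 1.0 is installed and a patched copy is saved.
        (system_include / "foo.h").write_text("#define FOO_VERSION 1\n")
        (include_fixed / "foo.h").write_text("#define FOO_VERSION 1 /* patched */\n")

        # Later: libfoo 2.0 overwrites the system header only.
        (system_include / "foo.h").write_text("#define FOO_VERSION 2\n")

        # include-fixed is searched first, so the stale copy wins.
        found = resolve_header("foo.h", [include_fixed, system_include])
        return found.read_text()


if __name__ == "__main__":
    print(demo_stale_header())
```

The upgraded header is never consulted, and nothing warns the user that it was shadowed.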

How could this situation be fixed? The GCC developers claim fixincludes is still needed (see also here), and while I'm fairly skeptical of this claim, I don't think it's a matter where they'll be convinced otherwise in the near future, so I'd like to look for other more constructive approaches. Here are the steps I think would be needed to fix fixincludes:

Remove all outdated hacks, i.e. hacks for systems which GCC no longer supports. While not strictly necessary, cleaning up the list of hacks in this manner should make the next steps more practical.
Remove all hacks for files that are none-of-GCC's-business. That means anything that doesn't absolutely need to be fixed to successfully compile GCC or provide a working (not necessarily fully conforming, if the underlying system was non-conforming, but "working") build environment after installation.
Eliminate false positives and buggy sed replacements by adding to the hack definitions in inclhack.def a list of hashes for known-bad files the hack is meant to be applied to. If necessary, include a configure option, off by default, that would ignore the hashes.
Where some of the "fixes" made by fixincludes themselves have bugs like namespace violations or macros that do not meet the requirements for being usable in the preprocessor, they should be changed to output something more correct. There is no justification for replacing one broken header with another, potentially worse, broken header.
Add a --disable-fixincludes option to configure so that fixincludes can be completely turned off. This would be ideal for system integrators, packagers, and basically anyone installing GCC from source on a modern system. It's especially important for the case where the user is installing GCC on a system that already has many third-party library headers in /usr/include, some of which may be "broken" in the eyes of fixincludes, where "fixing" them would have the dangerous consequence of preventing future library upgrades from working properly.
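
The hash-list idea from the third step could look roughly like the following. Everything here is an assumption made for illustration: the dictionary layout of HACK, the apply_hack function, the ignore_hashes flag (standing in for the proposed configure option), and the one-line "broken" header, which is invented and not taken from any real SunOS file.

```python
import hashlib
import re

# Invented stand-in for a known-bad vendor header.
BROKEN_HEADER = "extern char *memcpy(char *, char *, int);\n"

# Hypothetical hack definition that carries hashes of the exact files it targets.
HACK = {
    "search": re.compile(r"extern char \*memcpy\(char \*, char \*, int\);"),
    "replacement": "extern void *memcpy(void *, const void *, size_t);",
    "known_bad_sha256": {hashlib.sha256(BROKEN_HEADER.encode("utf-8")).hexdigest()},
}


def apply_hack(text, hack, ignore_hashes=False):
    """Apply the hack only if the file's hash is on the known-bad list."""
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
    if not ignore_hashes and digest not in hack["known_bad_sha256"]:
        return text  # Unknown file: leave it alone instead of guessing.
    return hack["search"].sub(hack["replacement"], text)


if __name__ == "__main__":
    # A working header that merely mentions the old prototype in a comment.
    other = "/* old form: extern char *memcpy(char *, char *, int); */\n"
    print(apply_hack(BROKEN_HEADER, HACK))      # patched
    print(apply_hack(other, HACK) == other)     # left untouched
```

With the hash check in place, a regex false positive can no longer modify a file the hack was never written for. Passing ignore_hashes=True restores the current match-anything behaviour.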

Finally, I suppose one might wonder why something that seems so broken, as I've described fixincludes, might go undetected for so long. The explanation is simple: distros. Most users of GCC use binary packages prepared for a particular OS distribution, where the packager has already cleaned up most of the mess, either by building GCC in a sterile environment where it can't find any headers to pick up and hack up, or by pruning the resulting include-fixed directory. Thus, the only people who have to deal with fixincludes are people who build GCC from the source packages, or who are setting up build scripts for their own deployment/distribution.

For the curious, here are some links to the tricks distros use to overcome fixincludes:

Gentoo toolchain.eclass
Arch Linux gcc package
Sabotage Linux gcc package
Linux From Scratch 7.1
musl-cross cross compiler build
OpenSDE gcc package, whose comments claim there is another problem in fixincludes related to cross compiling, of which I am not aware.

It's unclear to me exactly what Debian does, but as their installed include-fixed directory is very minimal, they must also be doing something. I have not inspected the other major binary distributions with complex build and package systems, but casual experience suggests they are taking some measures to contain the breakage of fixincludes.