EWONTFIX

Incorrect configure checks for availability of functions

14 Aug 2013 00:35:06 GMT

The short version: using functions without prototypes is dangerous, and bad configure script recipes directly encourage this practice.

One of the most basic and important checks configure scripts perform is checking for the availability of library functions (either in the standard library or third-party libraries) that are optional, present only in certain versions, wrongly missing on some systems, and so on. In a sense, this is the main purpose of having a configure script, and one would think this kind of check would be hard to get wrong. Unfortunately, these checks are not only possible to get wrong, they're usually wrong, especially in GNU software or other software using gnulib.

The basic problem is that most configure scripts check only for the existence of a symbol whose name matches that of the interface they want to use. The basic formula for this test is to make a C program which does nothing but call the desired function. There are several general strategies that are used:

  1. Including the headers needed to use the function, then calling the function from main.
  2. Omitting all headers, prototyping the function correctly, then calling it from main.
  3. Omitting all headers, declaring the function with a bogus (prototype or non-prototype) declaration, then calling it from main.
  4. Omitting all headers and calling the function from main with no declaration whatsoever.

All of the above are wrong, with varying degrees of wrongness, but the basic common problems they all share are that:

In addition, the various approaches are likely to give false positives and/or false negatives depending on factors like the version of C the compiler is providing, and user-provided CFLAGS such as -Werror=implicit-function-declaration.

Here is an excerpt of the latest configure check breakage we encountered building GNU coreutils on musl 0.9.12:

#undef strtod_l

/* Override any GCC internal prototype to avoid an error.
   Use char because int might match the return type of a GCC
   builtin and then its argument prototype would still apply.  */
#ifdef __cplusplus
extern "C"
#endif
char strtod_l ();
/* The GNU C library defines this for functions which it implements
    to always fail with ENOSYS.  Some functions are actually named
    something starting with __ and the normal name is an alias.  */
#if defined __stub_strtod_l || defined __stub___strtod_l
choke me
#endif

int
main ()
{
return strtod_l ();
  ;
  return 0;
}

Note the use of an incorrect declaration; this could actually fail to link under an advanced linker performing LTO, resulting in a false negative. And, as mentioned above, it can result in a false positive if there is no proper declaration of strtod_l for the application to use. This issue affected musl 0.9.12 because we added strtod_l purely as a compatibility symbol for loading binary libraries which use it, but did not intend for it to be part of the API for linking new programs. GNU coreutils happily attempted to use strtod_l with no prototype, resulting in treatment of the (junk) contents of the integer return-value register as the result of the conversion.

One legitimate question to ask would be: Is this musl's fault? In other words, is it a bug to lack a public declaration for a symbol which exists and which is a public interface on some systems? I believe the answer is no, for multiple reasons:

What it comes down to is that the only public interfaces are those which are documented as public, and short of human-readable documentation indicating otherwise, the documentation of public interfaces is the headers associated with a library. Assuming the existence of a public interface based on the existence of a symbol with a particular name is not reasonable, especially when the whole point of having a configure script is to detect rather than assume.

So what would a correct check for strtod_l look like? A good start would be:

#include <stdlib.h>
#include <locale.h>
int main()
{
    char *p;
    return (strtod_l)("", &p, (locale_t)0);
}

Enclosing the function name in parentheses serves two purposes: it suppresses any function-like macro, and it forces the compiler to produce an error when there is no declaration for the function.
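The macro-suppression half of that claim can be seen in isolation with a small sketch. The names scale, call_plain, call_parenthesized and check_scale are hypothetical and exist only for illustration; scale stands in for a library function that a header also shadows with a function-like macro.

```c
#include <assert.h>

/* Hypothetical function standing in for a real library interface. */
static int scale(int x)
{
    return x * 2;
}

/* A function-like macro of the same name, as a header might supply. */
#define scale(x) ((x) * 10)

/* Plain call syntax: the macro is expanded, so the function is bypassed. */
int call_plain(void)
{
    return scale(3);
}

/* Parenthesized name: the identifier is not directly followed by '(',
   so the macro is not expanded and the real function is called. */
int call_parenthesized(void)
{
    return (scale)(3);
}

/* Sanity check of both behaviours. */
void check_scale(void)
{
    assert(call_plain() == 30);
    assert(call_parenthesized() == 6);
}
```

The other half cannot be shown in a program that compiles: if scale had no declaration at all, (scale)(3) would be a use of an undeclared identifier and thus an error, whereas scale(3) would be accepted as an implicit declaration by a compiler in C89 mode.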

Unfortunately, the argument check logic above seems difficult to handle in an automated check generator. So a better approach might be something like:

#include <stdlib.h>
#include <locale.h>
int main()
{
    double (*p)(const char *, char **, locale_t) = strtod_l;
}

This version has even stronger protection against implicit function declarations, since strtod_l is never called; instead, its address is taken. The pointer assignment should then yield an error if the types are not compatible, giving us a clean formulaic check for types.
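As a rough illustration of the same address-of pattern on an interface every hosted C implementation provides, here is a sketch using strtod rather than strtod_l, so that it builds anywhere. The names checked_strtod, parse_checked and check_parse are made up for this example.

```c
#include <assert.h>
#include <stdlib.h>

/* Compiles cleanly only if <stdlib.h> declares strtod with a compatible
   type. Changing the pointer type to, say, int (*)(const char *) would
   make the initialization a constraint violation that the compiler
   must diagnose. */
static double (*const checked_strtod)(const char *, char **) = strtod;

/* Call through the checked pointer, just to show it is usable. */
double parse_checked(const char *s)
{
    char *end;
    return checked_strtod(s, &end);
}

void check_parse(void)
{
    assert(parse_checked("1.5") == 1.5);
}
```

A check generator would only need the function name and its expected type to emit the initializer line; no plausible argument values have to be invented.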

Unfortunately, these types of configure bugs are endemic, and for musl, the solution is going to be providing public prototypes (under the appropriate feature tests). The alternative, taking explicit action to prevent linking against the symbols, involves hacks that are even worse and more fragile than what configure is doing.
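To make "public prototypes under the appropriate feature tests" concrete, here is a minimal single-file sketch. It is not musl's actual header: DEMO_EXTENSIONS, demo_strtod_l, demo_check and check_demo are hypothetical names, the locale parameter is reduced to void *, and the library side simply forwards to strtod.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical feature-test macro; an application would define it before
   including the header, much as it would define _GNU_SOURCE. */
#define DEMO_EXTENSIONS 1

/* What the library's public header might contain. */
#ifdef DEMO_EXTENSIONS
double demo_strtod_l(const char *s, char **end, void *locale);
#endif

/* Library-side definition; this sketch ignores the locale argument. */
double demo_strtod_l(const char *s, char **end, void *locale)
{
    (void)locale;
    return strtod(s, end);
}

/* The address-of check now has a declaration to test against. */
static double (*const demo_check)(const char *, char **, void *) = demo_strtod_l;

void check_demo(void)
{
    char *end;
    assert(demo_check("2.5", &end, 0) == 2.5);
}
```

With the prototype visible, even a symbol-only configure check that goes on to use the function no longer leads to a call through an implicit declaration.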