Bobulate


Archive for the ‘hacking’ Category

A little bit Elfish

Friday, October 22nd, 2010

Q: how many different STL implementations are there available on OpenSolaris?
A: at least five by my count: Cstd, stlport4, stlport5, stdcxx4 (either as package from Oracle or from our KDE4-Solaris packages) and g++.
Q: can you mix and match them?
A: only if you like interesting segfaults. Take ‘Hello World’ and compile as follows: CC -lstdcxx4 hello.cc. Ignore the (dire and in this case serious) linker warnings. See program. Run program. Run, program, run! See program dump core right on the new carpet. Bad program. The crash happens inside the static constructors of one of the two libraries as it calls methods in the other one which has a very different idea of how things work. LD_DEBUG=libs might help you spot this. Compare also CC -library=no%Cstd -lstdcxx4 -lCstd versus CC -library=no%Cstd -lCstd -lstdcxx4 (the former crashes, the latter doesn’t, for some reason).
Q: so why is this a problem?
A: plugins make this a problem. If you load a plugin (i.e. dlopen() a shared object) which uses a different C++ STL library from the one you’re already linked to, then you hit the problem described above:
Q: so how can that happen?
A: different parts of an OpenSolaris system are built with different C++ STLs. This is a historical problem; at some point there were plans to slowly migrate all the desktop-relevant stuff to stdcxx4 (that’s the Apache STL, by the way).but it’s a big effort and takes time. One component that is still built against Cstd is Firefox. So all the Firefox plugins that are written in C++ or that use system components in C++, link to Cstd. Like, for instance, libtotem plugins.
Q: so just don’t load them, right?
A: how do you identify “them” then? In Qt 4.7.0 the demo browser (WebKit based) goes and loads all the Firefox plugins, pretty much on startup. And that crashes in a slow and strange way — it dlopen()s a libtotem plugin, loads in libCstd, runs the initializers on libCstd and boom. It’s probably related to the order in which the libraries are loaded. In any case, we don’t know beforehand what libraries a given plugin uses nor can we say what plugins there are.

If you like, you can pretend this was a conversation between Legolas and Haldir and not a run-in to looking at library files in ELF format.

So, having established that plugin loading can kill a Qt application in a gory fashion and that it happens in practice, the KDE4-Solaris team (a kind of Fellowship of the Ring, committed to binding themselves in the darkness that is OSOL) went looking for a solution. With /usr/bin/dump -Lv you can get a listing of dynamic items in an ELF object, including the libraries it’s going to load (e.g. grep for NEEDED). We decided that a likely place to insert a check was in QLibrary::load_sys(), which has some code to search for libraries and also already has some checks in place for strange platforms. So we had in mind something like bool is_library_acceptable(const char *filename) which would do the necessary check. If the library that’s about to be loaded uses Cstd, then don’t load it at all and return failure. That way, we avoid the mysterious crashes caused by STL incompatibility.

When mucking about with ELF files, the obvious resource to use is libelf. However, the libelf available on OpenSolaris is incompatible with largefile support. Why? Because it uses off_t in some structures instead of a 64-bit offset which could accomodate but 32- (non-largefile) and 64-bit (largefile) offsets. It does have the decency to #error out.

I suppose gelf would have been a solution. But thinking about the problem of finding which libraries are NEEDED by a loadable object, I came up with the following algorithm (which applies regardless of which ELF reading approach I’m using):

  • Read the ELF header
  • Find the ELF .dynamic section.
  • Scan the .dynamic section for NEEDED tags, look them up in the .dynstr section.
  • Bail out if libCstd.so is mentioned.

The precise details — sizes of headers, layouts of structures — depends on whether it’s a 32-bit or a 64-bit ELF file, but the alogrithm remains the same.

I initially coded this in C, with macros. (Bear with me here — I realize now that I tend to iterate over my own code as I understand the problem and solution space better, so I often start with something horrible that gets improved later). Then the code ended up looking like this:

#define ALGORITHM_STEP(d0,d1) \
  /* a step using values from d0 and d1 */
if (is32) { 32bit_t d0,d1; ALGORITHM_STEP(d0,d1); }
else { 64bit_t d0,d1; ALGORITHM_STEP(d0,d1); }

Ugh. From there I went to C++ and template functions. Each macro became a template, but I could also put the types in there, albeit a little awkwardly:

template <typename t0, typename t1> ALGORITHM_STEP() 
{ t0 d0; t1 d1; /* a step using values from d0 and d1 */ }
if (is32) { ALGORITHM_STEP<32bit_t0,32bit_t1>() } 
else { ALGORITHM_STEP<64bit_t0, 64bit_t1>() }

Feel free to go “Ugh” again, especially if you’re an experienced C++ template hacker. Oddly enough, I haven’t written much in the way of neat template code, just usually debugged what others have done. So by now my program was a bunch of template functions and then a main function with a sequence of if (is32) {} statements to string together those template functions. I could actually collapse it down to just one if() statement, but then the template would have 8 or 9 parameters and it’d be even more ugly than it already is.

From here I remembered non-type template parameters and specializations. It came down to this: define a template class with an integer parameter that can be specialized to hold the 32- or 64-bit types that I need. This looked like so:

template <int> class elftype;
template<> elftype<32> {
  typedef 32bit_t0 t0; typedef 32bit_t1 t1; } ;
template<> elftype<64> {
  typedef 64bit_t0 t0; typedef 64bit_t1 t1; } ;

Not much change in the program itself, except that instead of writing 32bit_t0 I’d write elftype<32>::t0. But this specialization approach applies to the functions themselves, too, and they became:

template<int n> ALGORITHM_STEP() {
  elftype<n>::t0 d0; elftype<n>::t1 d1;
  /* actual algorithm step */ } ;

Now it makes sense to collapse the whole thing into a single function parameterized by 32- or 64-bits, and then the main function I was looking for becomes something like this:

/* read ELF header */
if (is32) { return is_library_acceptable<32>(); }
else { return is_library_acceptable<64>(); }

The algorithm needs to be written only once, it’s specialized for the two cases, the types are neatly squirreled away in their respective class templates. Although functionally it’s no different from the original C version, I find it much more satisfying — for one thing, it’s extensible in a way that wouldn’t be easy with macros (say if there’s a 48-bit ELF format).

And the original purpose, of checking whether a plugin is usable before loading it? We’ve got that patched in for the next package release of Qt 4.7.0 on OpenSolaris, and the demo browser won’t crash anymore (but it won’t play whatever libtotem supports, either).