Embedding Binary Blobs With GCC

For a long time I've wanted to know how to embed binary blobs into executables. This would be most useful for files like Glade and UI Manager definitions, which a program needs in order to work at all but which either cannot be embedded as a string literal (Glade) or can be, but only awkwardly (UI Manager). I finally asked the Interweb, and Daniel Jacobowitz replied with some pointers. It turns out that doing this is remarkably simple.

One caveat up front: this probably requires GNU ld, which may or may not be a deal breaker for many people.

First, create a data file. Let's call it foo.txt, and put some text in it.

Hello, World!

Using ld, this file can be read in as a plain binary blob and then written out as a standard relocatable ELF object.

ld -r -b binary -o foo.o foo.txt

Now we have a standard ELF object with the data and some useful symbols defined. objdump will show you the contents.

$ objdump -x foo.o 
foo.o:     file format elf32-i386

Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .data         0000000d  00000000  00000000  00000034  2**0
                  CONTENTS, ALLOC, LOAD, DATA
SYMBOL TABLE:
00000000 l    d  .data  00000000 .data
0000000d g       .data  00000000 _binary_foo_txt_end
0000000d g       *ABS*  00000000 _binary_foo_txt_size
00000000 g       .data  00000000 _binary_foo_txt_start

Here we see 13 bytes of data (size 0000000d), symbols marking the start and end of that data, and an absolute symbol whose value is its size. This is all we need to access it from a C program.

#include <stdio.h>
extern char _binary_foo_txt_start[];

int main (void) {
  puts (_binary_foo_txt_start);
  return 0;
}

Now if we compile this and link it against the generated object, we'll have a binary.

$ gcc -o test test.c foo.o
$ ./test
Hello, World!

Hooray! One small problem which alert people should have noticed: the string itself is in the .data section, which is read/write. For my use, I want it to be read-only data in the .rodata section so that it isn't copied for every instance of the application. As far as I know, this isn't possible with ld, but objcopy will let us rename sections on the fly.

$ objcopy --rename-section .data=.rodata,alloc,load,readonly,data,contents foo.o foo.o
$ objdump  -h foo.o
...
  0 .rodata       0000000d  00000000  00000000  00000034  2**0

Excellent, problem solved. If you want to download this sample, I have a tarball. Many thanks to Daniel Jacobowitz for pointing out how to achieve this.

Update: note that any data embedded in the binary like this won't be terminated with a NUL byte. This is obvious in hindsight, but due to luck my example still worked. There might be a way of asking objcopy to append a 0 to the end of the data, but if not, always remember to use the start and end pointers or the size instead of just the start, or append a NUL yourself before converting to an ELF object.

NP: (), Sigur Rós

15:50 Friday, 13 Jul 2007 [computers]