Decompiling the ESP8266 boot loader v1.3(b3)

So having looked at the standard boot process on the ESP8266, let’s look at the boot loader and how it extends it.

The boot loader is written to the first sector of the SPI flash and is executed like any other program. The built-in first stage boot loader does not know it is loading a second stage loader rather than any other program.

So what happens next? Well, the second stage boot loader isn’t open source; it’s provided to us as a binary blob to use blindly. Of course we can work out roughly what must happen by examining the structure of the rom files used with the boot loader, but that’s not really good enough. Instead we must decompile the boot loader to see what’s really going on inside. This requires IDA and the Xtensa plugin, which together give you an assembly listing for the boot loader. I can read this listing, but it’s slow going and difficult to follow in this form. To really get an understanding I converted it to C code, which I have attached below. The nice thing about this is that we can then potentially recompile it. Converting this to C and getting it to a point where it would recompile took me about 2.5 days! It’s a painfully slow process, but I’m sure someone who regularly programs embedded devices in assembler could have done it in a fraction of that time. The code below isn’t perfect and there are errors in my conversion, but it gives you a pretty good idea of what’s going on.

The basic boot loader process:

  • The boot loader is loaded like any other program.
  • It reads its config from the last sector of the flash (to know which of the two roms to boot).
  • It loads a second config structure from the second to last (for rom 2) or third to last (for rom 1) sector of the flash.
  • It finds the flash address of the rom to boot. Rom 1 is always at 0x1000 (the next sector after the boot loader). Rom 2 is at half the chip size + 0x1000 (unless the chip is larger than 1MB, in which case its position is kept down to 0x81000).
  • It copies some compiled code, from the .rodata section of the boot loader, to the top of iram and executes that code, passing it the flash address. I’ll call this the stage 2a loader.
  • That stage 2a loader actually performs the same basic functions as the first stage loader – it copies the iram elf segments and calls the entry point.
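
The address arithmetic in the steps above can be sketched in a few lines of C. This is only my reading of the rules as described here, not code lifted from the boot loader: the function names and the SECTOR_SIZE constant are made up for illustration, and chip_size is assumed to be the flash size in bytes.

```c
#include <stdint.h>

#define SECTOR_SIZE 0x1000u /* 4KB flash sector */

/* Flash address of the rom to boot (rom is 1 or 2). */
uint32_t rom_flash_addr(int rom, uint32_t chip_size)
{
    if (rom == 1)
        return SECTOR_SIZE;          /* sector right after the boot loader */

    if (chip_size > 0x100000u)
        return 0x81000u;             /* chips above 1MB: kept down to 0x81000 */

    return chip_size / 2 + SECTOR_SIZE;
}

/* Boot config lives in the last sector of the flash. */
uint32_t boot_config_addr(uint32_t chip_size)
{
    return chip_size - SECTOR_SIZE;
}

/* Second config structure: second to last sector for rom 2,
 * third to last sector for rom 1. */
uint32_t rom_config_addr(int rom, uint32_t chip_size)
{
    uint32_t sectors_from_end = (rom == 2) ? 2u : 3u;
    return chip_size - sectors_from_end * SECTOR_SIZE;
}
```

For a 512KB chip (0x80000) this puts rom 2 at 0x41000, and for a 1MB chip at 0x81000, which is the same address larger chips are held to.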

The new rom header format

The roms loaded by the boot loader can be of the standard 0xe9 format or of a new type. The new type is basically a normal 0xe9 rom preceded by a new header and the .irom.text section. The new header is as follows:

typedef struct {
	uint8 magic1;
	uint8 magic2;
	uint8 config[2];
	uint32 entry;
	uint8 unused[4];
	uint32 length;
} rom_header;

magic1 is 0xea. magic2 is 0x04. entry is the entry point of user code. length is the length of the .irom.text segment. This header is then followed by the .irom.text segment, then a standard 0xe9 header and elf segments. The boot loader skips all the new stuff and loads from the standard 0xe9 part as normal.
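
To make the "skips all the new stuff" part concrete, here is a rough sketch of how the offset of the standard 0xe9 header could be found from the start of a rom image. It is an illustration based on the layout described above, not the boot loader's actual code: standard_header_offset and the macro names are hypothetical, the 16 byte header size assumes the struct is laid out without extra padding, and length is assumed to be stored little-endian at byte offset 12.

```c
#include <stddef.h>
#include <stdint.h>

#define ROM_MAGIC_STD   0xe9u /* standard rom format */
#define ROM_MAGIC_NEW1  0xeau /* new format, magic1 */
#define ROM_MAGIC_NEW2  0x04u /* new format, magic2 */
#define NEW_HEADER_SIZE 16u   /* assumed size of rom_header */

/* Returns the offset of the standard 0xe9 header within the image,
 * or -1 if the image starts with neither known header. */
int64_t standard_header_offset(const uint8_t *image, size_t image_len)
{
    if (image_len < 1)
        return -1;

    if (image[0] == ROM_MAGIC_STD)
        return 0; /* plain 0xe9 rom, nothing to skip */

    if (image_len < NEW_HEADER_SIZE ||
        image[0] != ROM_MAGIC_NEW1 || image[1] != ROM_MAGIC_NEW2)
        return -1;

    /* length field: bytes 12..15 of the new header, little-endian */
    uint32_t length = (uint32_t)image[12]
                    | ((uint32_t)image[13] << 8)
                    | ((uint32_t)image[14] << 16)
                    | ((uint32_t)image[15] << 24);

    /* skip the new header and the .irom.text segment that follows it */
    return (int64_t)NEW_HEADER_SIZE + (int64_t)length;
}
```

A real loader would read the header from flash rather than from a buffer in memory, and would then check for the 0xe9 magic at the returned offset before loading anything.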

So why so complicated?

  • Some of it may be the compilation and decompilation process – that can change the structure quite a bit from the original. I don’t know if it was originally written in C or ASM.
  • Why memcpy code into iram rather than just running it in the normal way? The boot loader is already running from iram where the user code needs to be copied to. That would break it, so the extra loader stage is deployed to the top of iram, which is assumed to be spare and safe (as long as the user code doesn’t try to load a section there).
  • So why not just load the boot loader to the top of iram in the first place? The first stage loader will not load sections to an address that high in iram. I don’t know why, but I tested it and it simply doesn’t work.
  • Why not run the loader entirely from rom? The flash isn’t memory mapped at this stage, so that’s not an option. Pity.
  • Why the new rom header? By putting the .irom.text section first it can have a known address on the flash (so will be mapped to a known address in memory) and all the space after it is available to store your iram sections. The original format had the .irom.text after the iram sections, so you needed to adjust the linker script and position on the flash if you wanted to rebalance your sections.
  • Why does it need 3 x 4kb sectors of the flash to store only a handful of bytes of config? I can’t see a good reason for that.
  • Why do they have so many copy routines depending on the size of data to be copied? I can only assume someone thought it was more efficient. Maybe it is, but I’m sure the performance benefit would be negligible and it certainly makes the code a lot more complicated than it needs to be.

There are various other details, like extended mode and switching to backup rom, have a look at the code if you want to know more.

Problems with the loader

  • Not open source – can’t modify it.
  • Only two rom slots.
  • Uses 144 bytes of stack space, which cannot then be used by user code.
  • Image is validated by the stage 2a loader. If there is a problem with the image (e.g. bad checksum) the code returns to the 2nd stage, which might have been overwritten already.
  • No checksum on .irom.text section.
  • Overly complicated code, possibly buggy (I’ve often OTA updated my device and found it won’t boot afterwards without clearing the loader config sector, but this could equally be bugs in the OTA update code).
  • Trying the backup rom requires a reboot, not a big deal but also not necessary.

C source for boot loader 1.3(b3)

Should just about compile, but don’t expect it to work properly as-is. It’s just for your education. Stage 2a needs to be compiled, and the compiled code then needs to be extracted and put as data into the stage 2 code; see the memcpy at line 254 for where it’s used.

