Memory map limitation affecting rBoot

I’ve become aware of a serious limitation that testing should have found, if I’d done a bit more of it! The ESP8266 only memory maps the first 8 Mbit of the SPI flash. rBoot doesn’t use memory mapped flash, it uses SPI reads, so rBoot itself is fine with any size of flash. The problem comes if you try to put an irom section beyond 8 Mbit (i.e. from address 0x40300000) – code there won’t be accessible at run time!

I’m quite disappointed about that. It’s something that rBoot can’t overcome, it’s a limitation of the ESP8266 design. You can still use the extra space on larger chips for something, like logging data, or for storing a filesystem for your resources. You can also still use rBoot OTA to flash these resources by dedicating a non-bootable rom slot to them. Your code would need to access them with SPI reads rather than through the memory mapping though.
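As a rough illustration of the limit, here is a minimal sketch that decides whether a flash address can be reached through the memory mapping or needs SPI reads instead. The helper names and constants are made up for this example (they are not part of rBoot or the SDK); the numbers come from the base address 0x40200000 and the 8 Mbit window described above.

```c
#include <stdbool.h>
#include <stdint.h>

#define FLASH_MAP_BASE 0x40200000u  /* start of the memory mapped flash */
#define FLASH_MAP_SIZE 0x00100000u  /* only the first 8 Mbit (1 MB) is mapped */

/* True if the whole range [flash_addr, flash_addr + len) lies inside the
 * mapped window, i.e. it can be read (or executed) through a pointer. */
static bool flash_range_is_mapped(uint32_t flash_addr, uint32_t len)
{
	return flash_addr < FLASH_MAP_SIZE && len <= FLASH_MAP_SIZE - flash_addr;
}

/* Memory mapped address for a flash address, or 0 if it is not mapped
 * (in which case the data has to be fetched with SPI reads). */
static uint32_t flash_to_mapped(uint32_t flash_addr)
{
	if (!flash_range_is_mapped(flash_addr, 1)) return 0;
	return FLASH_MAP_BASE + flash_addr;
}
```

So flash_to_mapped(0x100000) returns 0, which corresponds to the 0x40300000 boundary mentioned above: anything stored from there on is data only, not code.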

Update: this problem has been largely mitigated, see here.

esptool2 – a rom creation tool for ESP8266

Time to write something about esptool2, as I’ve been using it in the last few posts. The SDK way of creating rom images is a mess involving a shell script / batch file and Makefile. I did find a windows version of esptool by mamalala, but it didn’t create the newer rom format used by the SDK boot loader. So I wrote my own. I used a simpler design and command line compared to the version by mamalala (which I think is designed to be a drop in replacement with compatible command line interface). Although I prefer it this way I didn’t call it esptool2 to imply it was better or a replacement and I hadn’t originally intended to release it.

Features

  • Create old style rom images, e.g. eagle.flash.bin & eagle.irom0text.bin (for use without a boot loader).
  • Create rom images for the first type SDK boot loader.
  • Create rom images for the newer (v1.2+) SDK boot loader.
  • Create rom images for rBoot (because it uses the same format as SDK v1.2 roms).
  • Export elf sections as bytes in a C header file (used when building rBoot itself).
  • Single output file per command, perfect for use in a Makefile.
  • Open source.
  • Compiles under Visual Studio on Windows and GCC on Linux.

It doesn’t flash though, so you’ll need something else to do that, sorry. I’ve never done any serial programming in C and I suspect it’s probably difficult to do in a nice platform independent way. If you can do this please get in touch.

Get the source from github: https://github.com/raburton/esptool2 Or get the compiled version for windows: https://dl.dropboxusercontent.com/u/5500141/esptool2.exe

Run it to see full usage instructions or look at the rBoot sample project for usage examples.

rBoot tutorial for ESP8266 – OTA updates

Ok, hope you’re still with me after my previous massive post. Now I’m going to show you how to perform an over-the-air (OTA) update with rBoot. I’ve covered all the background already, so this should be pretty straight forward as long as you have a simple two rom rBoot setup running.

Now add rboot.h, rboot-ota.h and rboot-ota.c to your project and call the rboot_ota_start function. How you invoke the OTA update code is up to you, the sample project on GitHub (now updated) has a simple command line interface over the UART allowing the user to enter the command “ota”.

rboot_ota_start takes an rboot_ota struct with the options for the update.

typedef struct {
	uint8 ip[4];
	uint16 port;
	uint8 *request;
	uint8 rom_slot;
	ota_callback callback;
} rboot_ota;
  • ip is the ip address of the web server to download the new rom from.
  • port is the web server port (usually 80).
  • request is a complete http request which will be sent to the web server, this may seem like a slightly odd way to do it, but it gives you full control over what is sent and it’s the same way the SDK OTA update works.
  • rom_slot is the number of the rom slot on the flash to update, starting at zero. In our two rom example that will be either 0 or 1 (and the opposite to the one we are currently running).
  • callback is a user function that will be called when the update is completed (either success or failure), it is passed a pointer to the rboot_ota structure and a bool to indicate success or failure. This is where you will then switch to the new rom (using rboot_set_current_rom) and restart the device.

Example

static void ICACHE_FLASH_ATTR OtaUpdate_CallBack(void *arg, bool result) {

	char msg[48]; // the success message below is longer than 40 bytes once formatted
	rboot_ota *upServer = (rboot_ota*)arg;

	if(result == true) {
		// success, reboot
		os_sprintf(msg, "Firmware updated, rebooting to rom %d...\r\n", upServer->rom_slot);
		uart0_send(msg);
		rboot_set_current_rom(upServer->rom_slot);
		system_restart();
	} else {
		// fail, cleanup
		uart0_send("Firmware update failed!\r\n");
		os_free(upServer->request);
		os_free(upServer);
	}
}

static const uint8 ota_ip[] = {192,168,7,5};
#define HTTP_HEADER "Connection: keep-alive\r\nCache-Control: no-cache\r\nUser-Agent: rBoot-Sample/1.0\r\nAccept: */*\r\n\r\n"

static void ICACHE_FLASH_ATTR OtaUpdate() {

	uint8 slot;
	rboot_ota *ota;

	// create the update structure
	ota = (rboot_ota*)os_zalloc(sizeof(rboot_ota));
	os_memcpy(ota->ip, ota_ip, 4);
	ota->port = 80;
	ota->callback = (ota_callback)OtaUpdate_CallBack;
	ota->request = (uint8 *)os_zalloc(512);

	// select rom slot to flash
	slot = rboot_get_current_rom();
	if (slot == 0) slot = 1; else slot = 0;
	ota->rom_slot = slot;

	// actual http request
	os_sprintf((char*)ota->request,
		"GET /%s HTTP/1.1\r\nHost: "IPSTR"\r\n" HTTP_HEADER,
		(slot == 0 ? "rom0.bin" : "rom1.bin"),
		IP2STR(ota->ip));

	// start the upgrade process
	if (rboot_ota_start(ota)) {
		uart0_send("Updating...\r\n");
	} else {
		uart0_send("Updating failed!\r\n\r\n");
		os_free(ota->request);
		os_free(ota);
	}

}

It’s really that simple, that’s all you need to add to your application to be able to perform OTA updates. You might want to put in a version check, so it only updates if there is a new version, but that’s up to you.

Web Server

All that remains is to drop rom0.bin and rom1.bin in the root of your web server. Obviously you can change where it looks for the files with a small tweak to the code above.

A sample rBoot project

To help illustrate my previous explanation of getting started with rBoot I’ve produced a simple sample project. It doesn’t do anything very exciting itself, but it shows you how to compile, link and build roms that will work with rBoot. When I’ve written my next post on OTA updating with rBoot I’ll add an implementation to the project to demonstrate that too.

https://github.com/raburton/rboot-sample

I realise my last post was also very long (actually most of my posts are), so this code should serve as a TL;DR version of the previous post for those who prefer to just get stuck in.

If you wanted to use the SDK boot loader instead of rBoot this project would still serve as a useful example.

A boot loader tutorial for ESP8266 using rBoot

From recent reading and discussions it seems that lots of people aren’t using the Espressif boot loader (or my own rBoot, but that’s less of a surprise) on the ESP8266. Why? Maybe people aren’t aware of the reasons why you might want to. Or maybe they can’t figure out how to – when I started playing with the boot loader it was poorly documented (probably still is) and I had to work it out myself.

I talk about “rom” or “roms” here to mean full compiled user apps, that might traditionally have been deployed on their own but with a boot loader you can have several on the flash with just one operating at a time. To avoid confusion (hopefully) I’ll refer to the rom section of code (the code that is run from rom, usually just the .irom0.text elf section) as irom from now on.

Why use a boot loader?

The main reason is to allow you to have multiple “roms” and to be able to switch between them. That may not sound quite as useful as being able to dual boot your computer between Windows and Linux, but there are uses for it. For example, if you want to update your device over the air (OTA) you’ll need to have at least two rom slots on your flash, a running one and one that’s getting flashed with the new version (which you then reboot into). There are work-arounds you could do to OTA update a device from the running rom, but it wouldn’t be very safe. OTA updates are probably the main reason for wanting to use a boot loader. However, you might have a need to deploy a device with two completely different functions and not want to combine them into a single application. With a boot loader you could put both separate apps on the device and switch between them remotely or with a GPIO etc.

Another reason is that the boot loader can load your application differently to the built in loader and add features not present in the original loader. For example the built in loader checks a checksum for the elf sections it loads into iram, but not for the .irom0.text section that is run from rom (this is often referred to as the SDK lib, but that’s also where your code goes if you mark it with ICACHE_FLASH_ATTR). A boot loader could add this extra check (rBoot can, it’s coded but not currently pushed to git; the Espressif loader doesn’t).

The Espressif 1.2 loader introduced a new format for the roms that puts the irom section (aka SDK lib) first and the iram sections immediately after. This means you don’t need to work with the arbitrary split of iram going before 0x40000 and irom after (you can change this split, but you may need to keep changing it as your sections change). Now the iram sections are still limited in size (because there is a finite amount of iram) but the irom section can be pretty much as large as the flash chip (minus the space needed for the iram sections, boot loader and SDK config (last 4 sectors of flash)), without needing to play with the linker scripts. With rBoot your options for laying out the flash are unlimited, so at present I’m not supplying sample linker files, but I do explain how to make them (it’s not hard).

How do you use the boot loader?

1) Think about how you want to lay out your flash, particularly how many roms you want and if they should be the same sizes. See below for a worked example.
2) Compile your code in the normal way.
3) Link your code slightly differently. For each rom slot on the flash you need a linker script – a copy of the standard eagle.app.v6.ld with one simple change. You need to link your object files against each one, to produce multiple elf files. See below for examples and an explanation.
4) Use esptool to build a rom image from each of the linked elf files, this time using the ‘boot_v1.2+’ option.
5) Write the boot loader to 0x00000 on the flash.
6) If you want something more than the simple 2 rom default create and write a boot loader config sector to the flash at 0x01000.
7) Write the roms to the flash at the appropriate addresses (see below).
8) Enjoy.

Linker Files

Why do you need new linker files and why do you need to link several times? I’ll assume if you are programming in C you understand the concepts of compiling and linking (but if you don’t it doesn’t really matter). The boot loader copies most sections to iram, this means they will always end up where you want them to be in memory, regardless of where they get placed on the flash. However the irom sections (usually just .irom0.text) aren’t copied, access to that code is via the memory mapped SPI flash. The whole chip is mapped at a base address of 0x40200000 so when you have multiple roms on your flash the irom section for each rom will be at a different place on the flash and so a different place in memory. When the code is linked the linker needs to know where that code will be in memory and the way to tell it is via the linker script. Short version: each copy of the rom on the flash needs to have been linked differently depending on where it will be flashed.

Example

I’ll make this one easy – a two rom setup, both the same size. A simple but common scenario (all you would need for an app with OTA updates) and rBoot will self configure for this when you run it.

We want two roms on the flash, so the sensible thing to do is place one at the beginning and the other half way along. We need to leave space for the boot loader and the config so we can’t put the first one at the very beginning, so we’ll start it at the third sector, flash address 0x02000. There is no reason that the second can’t start exactly half way into the flash, but for symmetry we’ll start that at half+0x02000 (and rBoot’s default config expects this). Let’s say you have an ESP8266 board with a 4Mbit SPI flash, that second rom is going to be written to 0x42000, for 8Mbit it will be 0x82000. If you have a flash larger than 8Mbit the default position for the second rom will remain at 0x82000 due to the memory mapping limitation.
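The arithmetic for the second rom’s position can be sketched as below. This is only an illustration of the rule just described, not rBoot’s actual code, and the function and macro names are invented for the example.

```c
#include <stdint.h>

#define SECTOR_SIZE     0x1000u
#define ROM0_ADDR       (2u * SECTOR_SIZE)  /* 0x02000, after the boot loader and config */
#define MAX_MAPPED_SIZE 0x100000u           /* only the first 8 Mbit is memory mapped */

/* Default flash address of the second rom for a flash of flash_mbit megabits. */
static uint32_t rom1_flash_addr(uint32_t flash_mbit)
{
	uint32_t flash_bytes = flash_mbit * (1024u * 1024u / 8u);
	uint32_t half = flash_bytes / 2u;

	/* On larger chips the second rom stays inside the mapped 8 Mbit. */
	if (flash_bytes > MAX_MAPPED_SIZE) half = MAX_MAPPED_SIZE / 2u;

	return half + ROM0_ADDR;
}
```

Calling rom1_flash_addr(4) gives 0x42000, and both rom1_flash_addr(8) and rom1_flash_addr(32) give 0x82000, matching the addresses quoted above.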

Now we need to make two appropriate linker files. Copy eagle.app.v6.ld to rom0.ld and rom1.ld. Edit rom0.ld and change the value of ‘irom0_0_seg org = ‘ from 0x40240000 to 0x40202010. This is the base memory mapped flash address (0x40200000) + our chosen flash address (0x02000) + the length of an extra header (0x10). Edit rom1.ld and set the value to 0x40282010. You can increase the len parameter in both of these too if you need more space for your irom section. Just don’t make it so big that it will overflow into the flash space of the next rom (or push the iram sections into it) i.e. the rom must come out at <512KB if you are going to fit two of them on a 1MB flash.
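If you want to double check the values you put in the linker scripts, the sum can be written out as a small helper. This is just a sketch: irom0_org and rom_fits are names I’ve made up here, and the fit check is a simple reading of the “don’t overflow into the next rom” rule rather than anything the tools enforce.

```c
#include <stdbool.h>
#include <stdint.h>

#define FLASH_MAP_BASE 0x40200000u  /* base memory mapped flash address */
#define ROM_HEADER_LEN 0x10u        /* extra header in front of the irom section */

/* Value for 'irom0_0_seg org' for a rom flashed at rom_flash_addr. */
static uint32_t irom0_org(uint32_t rom_flash_addr)
{
	return FLASH_MAP_BASE + rom_flash_addr + ROM_HEADER_LEN;
}

/* True if a rom image of image_len bytes, flashed at slot_addr, ends
 * at or before next_addr (the start of whatever comes next on the flash). */
static bool rom_fits(uint32_t slot_addr, uint32_t image_len, uint32_t next_addr)
{
	return slot_addr <= next_addr && image_len <= next_addr - slot_addr;
}
```

irom0_org(0x02000) comes out as 0x40202010 and irom0_org(0x82000) as 0x40282010, the two values used in rom0.ld and rom1.ld.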

Now edit your Makefile and find the linker line, which is likely to start $(LD). You need to duplicate this and make one of them use rom0.ld, instead of eagle.app.v6.ld, and the other to use rom1.ld. Also make sure they output two different files, don’t have the second one write over the output of the first!

Now you should have two elf files, previously you would have just had one. What normally happens next appears to involve black magic, a Makefile, a shell script/batch file (gen_misc) and a python script (gen_appbin). This produces the flashable rom from the elf file. The whole thing is a mess and can be greatly simplified by using my esptool2 to build roms (other simplified tools also exist, I called mine esptool2 not to imply it is better or supersedes other tools, but just to distinguish it on my own system, way before I intended to release it). The key thing here is that you need to run it twice now, to produce two rom files from the two different elf files. You also need to instruct it to produce roms for what it describes as “boot_v1.2+” (the SDK boot loader v1.2+). However you are currently calling this will need to be updated, but I’d suggest switching to esptool2 if you’re using the original SDK code to do this.

Now just flash the three files (two roms and rBoot itself):
e.g. esptool.py --port COM7 write_flash -fs 8m 0x00000 rboot.bin 0x02000 rom0.bin 0x82000 rom1.bin

Using ‘-fs 8m’ here is important, it ensures the flash size is stored in the first few bytes of the flash, this will be read by rBoot to determine the flash size so it can work out where the half way point is.

On first boot you’ll see a message that a default config is being created and all being well the first rom will start. Assuming you got this far, hold on for part two where I’ll show you how to switch rom and/or OTA update from your app…

rBoot – A new boot loader for ESP8266

As promised here is my new boot loader for the ESP8266 – rBoot.

Advantages over SDK supplied bootloader:

  • Open source (written in C) – this is the big one.
  • Supports any number of roms.
  • Roms can be different sizes.
  • Rom slots can be used for resource storage as well as bootable apps (and benefit from the OTA update system).
  • Can use the full size of the SPI flash (see below).
  • Rom slots can be altered after deployment (with care!).
  • Earlier rom validation (less prone to errors).
  • Can try multiple backup roms (without needing to reboot).
  • Rom selection by GPIO (e.g. hold down a button when powering on to start a recovery rom).
  • Wastes no stack space  (SDK boot loader uses 144 bytes).
  • Documented config structure (easy to configure from user code).

Disadvantages over SDK supplied bootloader:

  • Not compatible with sdk libupgrade (but equivalent source included, based on open source copy shipped with earlier SDKs, so you can easily update your existing OTA app to use this new code).
  • Requires you to think slightly more about your linker scripts, rather than just using the pair supplied with the SDK (but it’s not really that difficult – if you’re programming in C it’ll be well within your capabilities).

Problems common to both:

  • You still need to relink user code against multiple different linker scripts depending where you intend to place it on the flash, because the memory mapped position of the .irom0.text section needs to be known at link time. This also prevents you moving roms around at will once they have been compiled.
  • Only 8MBit of flash can be memory mapped at a time (the SDK bootloader allows at most the first 2 x 8Mbit chunks to be used for roms, rBoot doesn’t have this limit, on a 32MBit flash you can have 4 x 8MBit roms), see memory mapping limitation for more details.

Source code

I’ve decided to start putting my source code on GitHub, it’ll be easier to maintain and keep my blog tidier.

https://github.com/raburton

Decompiling the ESP8266 boot loader v1.3(b3)

So having looked at the standard boot process on the ESP8266, let’s look at the boot loader and how it extends it.

The boot loader is written to the first sector of the SPI flash and is executed like any other program, the built in first stage boot loader does not know it is loading a second stage loader rather than any other program.

So what happens next? Well the second stage boot loader isn’t open source, it’s provided to us as a binary blob to use blindly. Of course we can work out roughly what must happen by examining the structure of the rom files used with the boot loader, but that’s not really good enough. Instead we must decompile the boot loader to see what’s really going on inside. This requires IDA and the Xtensa plugin and will give you an assembly listing for the boot loader. I can read this listing but it’s slow going and difficult to follow in this form. To really get an understanding I converted it to C code, which I have attached below. The nice thing about this is we can then potentially recompile it. Converting this to C and getting it to a point where it would recompile took me about 2.5 days! It’s a painfully slow process, but I’m sure someone regularly programming for embedded devices in assembler could have done it in a fraction of that time. The code below isn’t perfect, there are errors in my conversion process but it gives you a pretty good idea of what’s going on.

The basic boot loader process:

  • The boot loader is loaded like any other program.
  • It reads its config from the last sector of the flash (to know which of the two roms to boot).
  • It loads a second config structure from the second to last (for rom 2) or third to last (for rom 1) sector of the flash.
  • It finds the flash address of the rom to boot. Rom 1 is always at 0x1000 (next sector after boot loader). Rom 2 is at half the chip size + 0x1000 (unless the chip is larger than 1MB, in which case its position is kept down to 0x81000).
  • It copies some compiled code, from the .rodata section of the boot loader, to the top of iram and executes that code, passing it the flash address. I’ll call this the stage 2a loader.
  • That stage 2a loader actually performs the same basic functions as the first stage loader – it copies the iram elf segments and calls the entry point.

The new rom header format

The roms loaded by the boot loader can be of the standard 0xe9 format or of a new type. The new type is basically a normal 0xe9 rom preceded by a new header and the .irom.text section. The new header is as follows:

typedef struct {
	uint8 magic1;
	uint8 magic2;
	uint8 config[2];
	uint32 entry;
	uint8 unused[4];
	uint32 length;
} rom_header;

magic1 is 0xea. magic2 is 0x04. entry is the entry point of user code. length is the length of the .irom.text segment. This header is then followed by the .irom.text segment, then a standard 0xe9 header and elf segments. The boot loader skips all the new stuff and loads from the standard 0xe9 part as normal.
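To make the “skips all the new stuff” step concrete, here is a minimal sketch that finds the standard 0xe9 header inside a rom image held in a buffer. It is my own illustration of the layout described above, not the decompiled loader code; find_std_header and read_le32 are hypothetical names, and the byte offsets are taken from the struct (16 bytes, with length in the last four, little-endian as on the ESP8266).

```c
#include <stddef.h>
#include <stdint.h>

#define NEW_HEADER_LEN 16u  /* size of the new rom header shown above */

/* Read a 32-bit little-endian value from a byte buffer. */
static uint32_t read_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns the offset of the standard 0xe9 header within the image,
 * or -1 if the image is not in either of the two formats. */
static long find_std_header(const uint8_t *image, size_t image_len)
{
	if (image_len < 1) return -1;
	if (image[0] == 0xe9) return 0;              /* old format, nothing to skip */

	if (image_len < NEW_HEADER_LEN) return -1;
	if (image[0] != 0xea || image[1] != 0x04) return -1;

	uint32_t irom_len = read_le32(image + 12);   /* the 'length' field */
	if (irom_len > image_len - NEW_HEADER_LEN) return -1;

	size_t off = NEW_HEADER_LEN + irom_len;      /* skip header + irom section */
	if (off >= image_len || image[off] != 0xe9) return -1;
	return (long)off;
}
```

On the device the loader works from the flash with SPI reads rather than a buffer, but the offsets work out the same way.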

So why so complicated?

  • Some of it may be the compilation and decompilation process – that can change the structure quite a bit from the original. I don’t know if it was originally written in C or ASM.
  • Why memcpy code into iram rather than just running it in the normal way? The boot loader is already running from iram where the user code needs to be copied to. That would break it, so the extra loader stage is deployed to the top of iram, which is assumed to be spare and safe (as long as the user code doesn’t try to load a section there).
  • So why not just load the boot loader to the top of iram in the first place? The first stage loader will not load sections to an address that high in iram, I don’t know why but I tested it and it simply doesn’t work.
  • Why not run the loader entirely from rom? The flash isn’t memory mapped at this stage, so that’s not an option, pity.
  • Why the new rom header? By putting the .irom.text section first it can have a known address on the flash (so will be mapped to a known address in memory) and all the space after it is available to store your iram sections. The original format had the .irom.text after the iram sections, so you needed to adjust the linker script and position on the flash if you wanted to rebalance your sections.
  • Why does it need 3 x 4kb sectors of the flash to store only a handful of bytes of config? I can’t see a good reason for that.
  • Why do they have so many copy routines depending on the size of data to be copied? I can only assume someone thought it was more efficient, maybe it is but I’m sure the performance benefit would be negligible and it certainly makes the code a lot more complicated than it needs to be.

There are various other details, like extended mode and switching to backup rom, have a look at the code if you want to know more.

Problems with the loader

  • Not open source – can’t modify it.
  • Only two rom slots.
  • Uses 144 bytes of stack space, which cannot then be used by user code.
  • Image is validated by the stage 2a loader, if there is a problem with the image (e.g. bad checksum) the code returns to the 2nd stage which might have been overwritten already.
  • No checksum on .irom.text section.
  • Overly complicated code, possibly buggy (I’ve often OTA updated my device and found it won’t boot afterwards without clearing the loader config sector, but this could equally be bugs in the OTA update code).
  • Trying the backup rom requires a reboot, not a big deal but also not necessary.

C source for boot loader 1.3(b3)

Should just about compile, but don’t expect it to work properly as-is. Just for your education. Stage 2a needs to be compiled and the compiled code needs to be extracted and put as data into stage2 code, see the memcpy at line 254 for where it’s used.

 

ESP8266 boot process

I decided to write my own version of esptool for windows to create rom images. Although there is already a windows version available it can’t create new type firmware images for use with the latest versions of the boot loader from the espressif sdk (e.g v1.2). I could have just used the python version, but as with all this playing it was as much for my education and entertainment as for any practical purpose. In the process I ended up learning more about the boot process than I expected and writing my own boot loader.

As I haven’t seen a lot of info about it online I thought it might be useful to document the normal boot process here. The built in first stage bootloader reads the start of the SPI flash where it expects to find a simple 8 byte structure:

typedef struct {
	uint8 magic;
	uint8 sect_count;
	uint8 flags1;
	uint8 flags2;
	uint32 entry_addr;
} rom_header;

The magic value should be 0xe9. sect_count contains the number (may be zero) of elf sections to load to iram (this does not include the .irom.text section). flags1 & flags2 control the flash size, clock rate and IO mode. entry_addr contains the entry point to start executing user code from.

After the header come the actual elf sections. Each is headed by another 8 byte structure (followed immediately by the data itself):

typedef struct {
	uint32 address;
	uint32 length;
} sect_header;

The first stage boot loader verifies the magic and sets the flash mode according to the flags. Then it copies each section to the corresponding address from the header (which should be within the iram section starting at 0x40100000). As the sections are loaded a single checksum is created of all the data (headers are not included). If the final checksum matches the one stored at the end of the elf section on the flash it will call the function found at entry_addr.
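The walk over the sections can be sketched as follows, working on an image held in a buffer. Note the assumptions: the checksum algorithm isn’t described above, so this uses an XOR of every data byte seeded with 0xef, which is what esptool.py does when it builds images; the function names are mine, and the copy to iram is left out so the code can run anywhere.

```c
#include <stddef.h>
#include <stdint.h>

#define ROM_MAGIC     0xe9
#define CHECKSUM_INIT 0xef  /* assumption: seed value used by esptool.py */

/* Read a 32-bit little-endian value from a byte buffer. */
static uint32_t read_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Walks the rom header and each section header, accumulating a checksum
 * over the section data only (headers are not included).
 * Returns 0 and stores the checksum in *out, or -1 if the image is malformed. */
static int rom_checksum(const uint8_t *image, size_t len, uint8_t *out)
{
	if (len < 8 || image[0] != ROM_MAGIC) return -1;

	uint8_t sect_count = image[1];
	uint8_t chk = CHECKSUM_INIT;
	size_t pos = 8;                       /* first section header */

	for (uint8_t i = 0; i < sect_count; i++) {
		if (len - pos < 8) return -1;
		uint32_t sect_len = read_le32(image + pos + 4);
		pos += 8;                         /* skip address and length */
		if (sect_len > len - pos) return -1;
		/* a real loader would copy the data to its address here */
		for (uint32_t j = 0; j < sect_len; j++) chk ^= image[pos + j];
		pos += sect_len;
	}

	*out = chk;
	return 0;
}
```

The result would then be compared with the checksum byte stored on the flash before jumping to entry_addr.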

The whole of the flash is also mapped to an area of memory from 0x40200000. The .irom.text elf section just sits somewhere on the flash after the other elf sections and does not have a header like those destined for iram. The default linker script eagle.app.v6.ld bases the section at 0x40240000 so it should be written to 0x40000. This mapping does not occur until later (presumably by sdk library code), so you can’t access the flash directly in memory in the boot loader – it must be accessed through spi read calls.

A simple NTP client for ESP8266

Once you have a real time clock working on the ESP8266, you might actually want to set it. As they have a backup battery you may just set it before you connect it to the ESP8266 and forget about it, but that’s not ideal. These cheap RTCs probably aren’t perfectly accurate and if it stops for any reason (e.g. dead backup battery) you’ll need to reset it. The DS3231 has a flag to indicate it’s been stopped – ideally this should be checked on startup and the clock set via NTP if there has been any interruption.

I did find one other simple NTP implementation but it’s incomplete, there is no timeout and it doesn’t clean up the connection when it’s finished (so it’ll leak memory). I think my version should work a little better, but I can’t guarantee it’s bug free so please let me know if you find any. As well as getting the time, the code is a nice simple example of a UDP client.

To use simply call ntp_get_time( ). The NTP request is asynchronous so you get the time in the ntp_udp_recv callback function, have a look there for two simple examples of what you could do with your newly received NTP time (print it out or set an RTC).
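For anyone curious what actually goes over the wire, here is a minimal sketch of the two ends of the exchange: building the 48 byte request and pulling a Unix timestamp out of the reply. It is not the code from the repository; ntp_build_request and ntp_reply_to_unix are names invented for this example, and the UDP send/receive and the timeout are left out.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NTP_PACKET_LEN  48
#define NTP_UNIX_OFFSET 2208988800u  /* seconds between 1900-01-01 and 1970-01-01 */

/* Fill in a minimal client request. */
static void ntp_build_request(uint8_t packet[NTP_PACKET_LEN])
{
	memset(packet, 0, NTP_PACKET_LEN);
	packet[0] = 0x1b;  /* LI = 0, version = 3, mode = 3 (client) */
}

/* Extract the server's transmit time (whole seconds) as a Unix timestamp.
 * Returns 0 if the reply is too short to contain it. */
static uint32_t ntp_reply_to_unix(const uint8_t *reply, size_t len)
{
	if (len < NTP_PACKET_LEN) return 0;

	/* The transmit timestamp starts at byte 40, in network (big-endian) order. */
	uint32_t secs = ((uint32_t)reply[40] << 24) | ((uint32_t)reply[41] << 16) |
	                ((uint32_t)reply[42] << 8)  |  (uint32_t)reply[43];

	return secs - NTP_UNIX_OFFSET;
}
```

In a callback like ntp_udp_recv you would pass the received data and length to something like ntp_reply_to_unix, then print the result or use it to set the RTC.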

Code now on GitHub: https://github.com/raburton/esp8266