Flash write bug fix

Looks like I finally have a fix for the flash write issue, thanks to some help from Espressif. Embarrassingly it turns out to be my fault, but I’m not too proud to admit that. So what was the problem? Since SDK v1.5.1 people have been reporting intermittent failed OTA updates and I’ve been able to reproduce it easily myself. This only seemed to occur when using wifi for long sustained writes.

So what was happening? Occasionally the network receive would get a much larger packet than normal. In my latest testing they are usually around 1436 bytes, but occasionally one would arrive that was 5744 (this is the only large value I have seen myself, but presumably other values could occur). The flash is erased in 4k sectors and the rBoot code erases them as required, checking if the current data block will fit in the current sector and if not erasing the next one too. Where I went wrong was to assume that a receive would always be less than the 4k sector size, having never seen one before SDK v1.5.1 that was anywhere near that large. When one of these very large packets arrives it could span 3 sectors. The second sector would get erased correctly but the third sector would not. The flash write command would not return an error when it tried to write to the non-erased sector, so no fault was noticed at this time. Then on the next write that third sector, now being the first sector for that next write, gets erased and the new data written part way through it (where it should be).
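
The erase bookkeeping described above can be sketched as a small pure function. This is not the rBoot code itself, just a minimal illustration assuming 4k sectors and a counter tracking the first sector not yet erased; the names `erase_range` and `sectors_to_erase` are made up for this example, and `uint32_t` stands in for the SDK's `uint32`.

```c
#include <stdint.h>

#define SECTOR_SIZE 0x1000u /* 4k flash sector */

/* Hypothetical result type: which sectors still need erasing. */
typedef struct {
    uint32_t first; /* first sector number to erase */
    uint32_t count; /* how many consecutive sectors to erase */
} erase_range;

/*
 * Work out every sector that must be erased before writing `len` bytes
 * at flash address `write_addr`. `next_unerased` is the number of the
 * first sector that has not been erased yet. A chunk of any size is
 * handled, so a 5744 byte receive spanning 3 sectors is covered.
 */
erase_range sectors_to_erase(uint32_t write_addr, uint32_t len,
                             uint32_t next_unerased) {
    erase_range r = { next_unerased, 0 };
    if (len == 0) return r;
    /* sector containing the last byte of this write */
    uint32_t last = (write_addr + len - 1) / SECTOR_SIZE;
    if (last >= next_unerased) {
        r.count = last - next_unerased + 1;
    }
    return r;
}
```

For a 5744 byte chunk written at 0x2F00, with sector 2 already erased, this reports two sectors to erase (3 and 4), where the old logic only erased sector 3.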

Moral of the story: don’t make assumptions and don’t ignore the edge cases – I’m usually pretty good at this second point, but occasionally I seem to need reminding. In this case I had thought about it and added code that detected a chunk over 4k that would need more erasing. As I didn’t think this could ever happen I merely put a comment in the if statement to say what would need to be done in that scenario. If I’d thrown an error at that point instead, this problem would have been easily diagnosed. However, I did also assume that the flash write would fail if it tried to write to flash that was not erased, so I expected to see an error of some kind.

Why did it suddenly happen at v1.5.1? I don’t know, but presumably Espressif made some change in the SDK that makes this more likely to occur. While playing with some code for Pete Scargill I did manage to reproduce the problem with v1.4.0, so it wasn’t impossible for it to happen there, but I never had any reports of it there previously. I also found that the timers in Pete’s code made it more likely to happen in my testing, so I suspect the extra processing these caused was impacting the performance of the network stack and causing more packets to be bunched together and delivered as larger chunks to the application. Further testing showed RF interference could also cause the same result.

The fix is available in the rBoot GitHub repo, and has also been updated in Sming.

Flash write bug in Espressif SDK v1.5.1+

Users of rBoot have reported OTA problems when using Espressif SDK v1.5.1, which I have been able to reproduce on this version and v1.5.2. The problem appears to be in the SDK flash write functions. While most writes work fine, some will occasionally fail to write fully, leaving small areas of blank data (0xff). To make matters worse no error is reported from the write function and attempts to read the flash back for verification immediately afterwards return the correct data (presumably they are serviced by the rom cache). This is very similar to the problem seen when trying to perform an OTA with insufficient power. I once had a similar report from a user using just a 100mA power supply and the problem was fixed by using a proper power supply. This makes me wonder if the new SDK causes the device to draw more power, although supplying extra power does not fix the problem in this case.

I have created a test case that doesn’t use rBoot but simply downloads a file and writes it to flash (which is of course very similar to what rBoot does) to demonstrate the problem to Espressif. I have yet to hear back from them. If you encounter this problem please add your support to my bug report. I have no reason to believe this bug will be limited to rBoot users and I assume SDK bootloader users and anyone else writing to the flash from an application (possibly only in long sustained writes) will also have this problem.

This is a good opportunity to encourage rBoot users to enable the irom checksum option, to help detect badly flashed roms at boot time. See the readme for more information.
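
To show why a checksum catches this kind of damage, here is a minimal sketch. It is not taken from rBoot; it assumes a one-byte XOR checksum seeded with 0xef (the scheme the ESP8266 image format is generally understood to use), and the function name `xor_checksum` is made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define CHKSUM_INIT 0xef /* assumed seed value for the running checksum */

/*
 * Fold `len` bytes into a running one-byte XOR checksum. Pass
 * CHKSUM_INIT for the first chunk and the previous result for later
 * chunks, so a rom can be checked while it is read in pieces.
 */
uint8_t xor_checksum(const uint8_t *data, size_t len, uint8_t chksum) {
    for (size_t i = 0; i < len; i++) {
        chksum ^= data[i];
    }
    return chksum;
}
```

A byte left blank (0xff) by a failed write changes the result, so a comparison against the stored checksum would normally fail at boot. A single XOR byte is a weak check though, and some combinations of errors will cancel out.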

New rBoot version allows temporary boot

I’ve just pushed an update to rBoot that allows 2-way communication between rBoot and the running user application. This is something I had thought about previously, and I mentioned it in a previous post, but nobody had actually asked for it until a couple of weeks ago. The main use of this is to allow the application to request rBoot to perform a temporary boot to a different rom (i.e. not the one identified in the config, which would normally be booted). This helps to make updating safer, because you can perform an OTA and only temporarily switch to the new rom, until you are happy to update the config and make this the standard. rBoot is already safer for updating than the SDK bootloader, if you enable the irom checksum option, but this new functionality also guards against valid but buggy roms that simply don’t work properly once booted.

How you decide that your rom is good enough to switch to booting it by default is up to you of course. Perhaps if the rom is able to boot, connect to wifi and stays up for 5 minutes, that would be deemed sufficient. Another option would be to have the user manually initiate the change of default rom once they are happy with the way it runs.
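
As one possible shape for the "boots, connects to wifi and stays up for 5 minutes" rule, here is a minimal sketch. Everything in it is hypothetical: the `rom_health` structure, the `should_make_default` function and the threshold are illustrative only, and the application would still have to gather these values and update the rBoot config itself.

```c
#include <stdbool.h>
#include <stdint.h>

#define MIN_UPTIME_SECS (5 * 60) /* example threshold: 5 minutes */

/* Hypothetical snapshot of how the running rom is doing. */
typedef struct {
    bool temp_boot;       /* true if this was a temporary boot */
    bool wifi_connected;  /* true once wifi has connected */
    uint32_t uptime_secs; /* seconds since boot */
} rom_health;

/*
 * Decide whether the running rom has proven itself well enough to be
 * written to the config as the default rom.
 */
bool should_make_default(const rom_health *h) {
    if (!h->temp_boot) {
        return false; /* already the default rom, nothing to change */
    }
    return h->wifi_connected && h->uptime_secs >= MIN_UPTIME_SECS;
}
```

Calling something like this from a periodic timer keeps the policy in one place, so it is easy to swap for a manual confirmation instead.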

You can also get information about the boot from your application such as the boot mode (standard, temporary or GPIO), and the currently running rom. Previously the running rom could be determined by reading the config, but that would not work for a temporary boot.

This new functionality makes use of the ESP8266’s RTC data area and to use it uncomment the #define BOOT_RTC_ENABLED line in rboot.h. See the documentation in the GitHub rBoot repository and the updated sample project for example use.

One other change made at the same time is that GPIO-rom selection is now an optional feature and not compiled in by default. Please enable the appropriate #define in rboot.h if you wish to use this feature.

rBoot now in Sming

rBoot has now been integrated into Sming. This includes rBoot itself (allowing the bootloader to be built alongside a user app) and a new Sming specific rBoot OTA class. The sample app from my GitHub repo has been improved and included under the name Basic_rBoot; it demonstrates OTA updates, big flash support and multiple spiffs images.

The old sample app will be removed from my repo shortly. If you want to make use of rBoot in Sming it’s now easier than ever – just use release v1.3.0 onwards, or clone the Sming master branch, and take a look at the sample project.

C Version of system_rtc_mem_read

Here are C versions of system_rtc_mem_read & system_rtc_mem_write. You don’t need these when your app is compiled against the SDK but if you are running bare-metal code you might find them handy. You could add this code to rBoot so you can communicate with it between boots, e.g. setting a flag from your app to alter the behaviour of rBoot on next boot.

Note: in the SDK version if you specify a length that is not a multiple of 4 the actual length read will be rounded up and so it may overflow the supplied buffer (although alignment/packing of memory may mean this isn’t a problem), my version requires a read in multiples of 4 bytes, but you can easily remove that check if you wish.

uint32 system_rtc_mem_read(int32 addr, void *buff, int32 length) {

	int32 blocks;
	
	// validate reading a user block
	//if (addr < 64) return 0;
	if (buff == 0) return 0;
	// validate 4 byte aligned
	if (((uint32)buff & 0x3) != 0) return 0;
	// validate length is multiple of 4
	if ((length & 0x3) != 0) return 0;
	
	// check valid length from specified starting point
	if (length > (0x300 - (addr * 4))) return 0;

	// copy the data
	for (blocks = (length >> 2) - 1; blocks >= 0; blocks--) {
		volatile uint32 *ram = ((uint32*)buff) + blocks;
		volatile uint32 *rtc = ((uint32*)0x60001100) + addr + blocks;
		*ram = *rtc;
	}

	return 1;
}

You’ll notice how similar these two functions are, if you need both you could easily combine them into a single function with a parameter to indicate read/write mode (which would save rom space).

uint32 system_rtc_mem_write(int32 addr, void *buff, int32 length) {

	int32 blocks;
	
	// validate writing a user block
	if (addr < 64) return 0;
	if (buff == 0) return 0;
	// validate 4 byte aligned
	if (((uint32)buff & 0x3) != 0) return 0;
	// validate length is multiple of 4
	if ((length & 0x3) != 0) return 0;
	
	// check valid length from specified starting point
	if (length > (0x300 - (addr * 4))) return 0;

	// copy the data
	for (blocks = (length >> 2) - 1; blocks >= 0; blocks--) {
		volatile uint32 *ram = ((uint32*)buff) + blocks;
		volatile uint32 *rtc = ((uint32*)0x60001100) + addr + blocks;
		*rtc = *ram;
	}

	return 1;
}
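
The combined read/write function suggested above might look something like the following. This is a sketch rather than tested firmware: the name `rtc_mem_access` and the `rtc_base` parameter are additions for this example, and standard `stdint.h` types replace the SDK's `uint32`/`int32`. Passing the base address in means the logic can be tried against an ordinary array on a PC; on the device you would pass `(volatile uint32_t *)0x60001100`.

```c
#include <stdint.h>

#define RTC_MEM_BYTES  0x300 /* size limit used by the length checks above */
#define RTC_USER_BLOCK 64    /* first block of the user area */

/*
 * Copy `length` bytes between `buff` and RTC memory, starting at 4-byte
 * block `addr`. `write` non-zero copies buff -> RTC, zero copies
 * RTC -> buff. Returns 1 on success, 0 if any check fails.
 */
uint32_t rtc_mem_access(volatile uint32_t *rtc_base, int32_t addr,
                        void *buff, int32_t length, int write) {
    int32_t blocks;

    /* only writes are restricted to the user area, as in the originals */
    if (write && addr < RTC_USER_BLOCK) return 0;
    /* extra sanity checks, not present in the original functions */
    if (addr < 0 || addr > (RTC_MEM_BYTES / 4) || length < 0) return 0;
    if (buff == 0) return 0;
    /* validate 4 byte aligned */
    if (((uintptr_t)buff & 0x3) != 0) return 0;
    /* validate length is multiple of 4 */
    if ((length & 0x3) != 0) return 0;
    /* check valid length from specified starting point */
    if (length > (RTC_MEM_BYTES - (addr * 4))) return 0;

    /* copy the data one 32-bit word at a time */
    for (blocks = (length >> 2) - 1; blocks >= 0; blocks--) {
        volatile uint32_t *ram = ((uint32_t *)buff) + blocks;
        volatile uint32_t *rtc = rtc_base + addr + blocks;
        if (write) {
            *rtc = *ram;
        } else {
            *ram = *rtc;
        }
    }
    return 1;
}
```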

C Version of system_get_rst_info

People keep asking me how to use SDK functions (e.g. system_get_rst_info) in rBoot so they can add interesting new features. The simplest answer is you can’t – if you want to use the full set of SDK features you would need to link rBoot against the SDK; its size would go from <4k to >200k and it wouldn’t actually be possible to chain load a user rom. The less simple answer is if the SDK functions are simple enough and can be reverse engineered then you can replicate them in rBoot, or any other bare-metal app. So some simple things will be possible, with work, but you’ll never be able to use more complex features like wifi from rBoot.

Here is C code to replicate the system_get_rst_info. Note that it passes back a structure rather than a pointer to one (the SDK creates and stores this structure at boot and later just passes a pointer to it when requested, but the code below creates it when needed). Also note that you cannot use this code after the SDK has started because the SDK resets this information when it boots, but it will work just fine in rBoot.

struct rst_info {
	uint32 reason;
	uint32 exccause;
	uint32 epc1;
	uint32 epc2;
	uint32 epc3;
	uint32 excvaddr;
	uint32 depc;
};

struct rst_info system_get_rst_info() {
	struct rst_info rst;
	system_rtc_mem_read(0, &rst, sizeof(struct rst_info));
	if (rst.reason >= 7) {
		ets_memset(&rst, 0, sizeof(struct rst_info));
	}
	if (rtc_get_reset_reason() == 2) {
		ets_memset(&rst, 0, sizeof(struct rst_info));
		rst.reason = 6;
	}
	return rst;
}
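
If you want to act on the result, a small lookup helper can make the reason code readable. This is only a sketch: it assumes the reason values follow the SDK's rst_reason numbering (0 for a default/power-on reset through to 6 for an external reset, which matches the checks in the function above), and the helper name and strings are illustrative.

```c
#include <stdint.h>

/*
 * Map a reset reason code to a short description. The numbering is an
 * assumption based on the SDK's rst_reason enum; codes of 7 and above
 * are treated as invalid, as in system_get_rst_info above.
 */
const char *rst_reason_name(uint32_t reason) {
    static const char *const names[] = {
        "default (power on)", /* 0 */
        "hardware watchdog",  /* 1 */
        "exception",          /* 2 */
        "software watchdog",  /* 3 */
        "software restart",   /* 4 */
        "deep sleep wake",    /* 5 */
        "external reset"      /* 6 */
    };
    if (reason >= 7) {
        return "unknown";
    }
    return names[reason];
}
```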

rBoot now supports Sming for ESP8266

Although it’s always been possible to use Sming compiled apps with rBoot it wasn’t easy. I’ve shared Makefiles and talked a few people through it previously on the esp8266.com forum, but now there is a new sample project on GitHub to help everyone do it.

The sample demonstrates:

  • Compiling a basic app (similar to the rBoot sample for the regular sdk).
  • Big flash support, allowing up to 4 roms each up to 1mb in size on an ESP12.
  • Over-the-air (OTA) updates.
  • Spiffs support, with a different filesystem per app rom.

Spiffs support depends on a patch to Sming, for which there is a pull request pending to have it included properly. Probably the most common way I envisage this being used is a pair of app roms (to allow for easy OTA updates) with a separate spiffs file system each, but rBoot is flexible enough to let you lay out your flash however you want to.

Suggested layout for 4mb flash:

0x000000 rboot
0x001000 rboot config
0x002000 rom0
0x100000 spiffs0
0x1fc000 (4 unused sectors*)
0x200000 (2 unused sectors†)
0x202000 rom1
0x300000 spiffs1
0x3fc000 sdk config (last 4 sectors)

* The small unused section at the top of the second mb means the same size spiffs can be used for spiffs0 and spiffs1. The top of the fourth mb (where spiffs1 sits) is reserved for the sdk to store config.
† The small unused section at the start of the third mb mirrors the space used by rBoot at the start of the first mb. This means only one rom needs to be produced, that can be used in either slot, because it will be of the same size and have the same linker rom address.
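
The two footnotes can be checked with a little arithmetic. The sketch below just encodes the suggested layout as constants and adds two hypothetical helper functions; none of these names come from rBoot or Sming.

```c
#include <stdint.h>

#define MB 0x100000u /* 1MB, the size of one mappable chunk of flash */

/* Addresses from the suggested 4MB layout above. */
enum {
    ROM0_ADDR       = 0x002000,
    SPIFFS0_ADDR    = 0x100000,
    SPIFFS0_END     = 0x1fc000, /* start of the 4 unused sectors */
    ROM1_ADDR       = 0x202000,
    SPIFFS1_ADDR    = 0x300000,
    SDK_CONFIG_ADDR = 0x3fc000  /* last 4 sectors, used by the SDK */
};

/* Offset of a flash address within its own 1MB chunk. */
uint32_t offset_in_mb(uint32_t flash_addr) {
    return flash_addr % MB;
}

/* Size in bytes of a region running from start up to (not including) end. */
uint32_t region_size(uint32_t start, uint32_t end) {
    return end - start;
}
```

Both roms sit at offset 0x2000 within their megabyte, which is why one linked image can serve either slot, and both spiffs areas come out at 0xfc000 bytes.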

C Version of Cache_Read_Enable for ESP8266

Just a quick post with a C version of the decompiled ESP8266 rom function Cache_Read_Enable. This function is responsible for memory mapping the SPI flash. I’ve previously discussed it, but a couple of people have wanted the code so it seemed worth posting here. This compiles to quite a few bytes which must stay in iram so, if you want to tamper with the parameters (like rBoot big flash support does), it’s best to write a wrapper to the original rom function rather than use this code to replace it.

void Cache_Read_Enable(uint8 odd_even, uint8 mb_count, uint8 no_idea) {
	
	uint32 base1 = 0x3FEFFE00;
	volatile uint32 *r20c = (uint32*)(base1 + 0x20c);
	volatile uint32 *r224 = (uint32*)(base1 + 0x224);
	
	uint32 base2 = 0x60000200;
	volatile uint32 *r008 = (uint32*)(base2 + 8);
	
	while (*r20c & 0x100) {
		*r20c &= 0xeff;
	}

	*r008 &= 0xFFFDFFFF;
	*r20c &= 0x7e;
	*r20c |= 0x1;
	
	while ((*r20c & 0x2) == 0) {
	}

	*r20c &= 0x7e;
	*r008 |= 0x20000;

	if (odd_even == 0) {
		*r20c &= 0xFCFFFFFF;  // clear bits 24 & 25
	} else if (odd_even == 1) {
		*r20c &= 0xFEFFFFFF;  // clear bit 24
		*r20c |= 0x2000000;   // set bit 25
	} else {
		*r20c &= 0xFDFFFFFF;  // clear bit 25
		*r20c |= 0x1000000;   // set bit 24
	}

	*r20c &= 0xFBF8FFFF; // clear bits 16, 17, 18, 26
	*r20c |= ((no_idea << 0x1a) | (mb_count << 0x10)); // set bits 26 & 18/17/16
	// no_idea should be 0-1 (1 bit), mb_count 0-7 (3 bits)

	if (no_idea == 0) {
		*r224 |= 0x08; // set bit 3
	} else {
		*r224 |= 0x18; // set bits 3 & 4
	}
	
	while ((*r20c & 0x100) == 0) {
		*r20c |= 0x100;
	}

	return;
}

Important bug in esptool2

Over the weekend I found a bug in the checksum calculation in esptool2. It caused some images to have a bad checksum, which would then not be bootable as rBoot would think they were corrupt.

If you haven’t already please pull the latest source and rebuild it. Or if you are on windows and using an old pre-compiled copy you can get an updated version here.

Memory map limitation – workaround

I was a little hasty in my judgement that this problem could not be solved. While it still appears to be true that only 1MB can be mapped at a time, it is possible to choose which 1MB is mapped. How the mapping was performed was a mystery, to me at least, and I can’t find any info about it on the internet. It was obvious that it was performed by the SDK code, but I hadn’t worked out where.

Since I released rBoot, Espressif have released a new SDK with a new version of the boot loader. This allows you to have two 1MB roms, which is clearly working around the limitation. The nice people at Espressif obviously know the internals of the hardware and have documentation for it, so they can do things that the rest of us wouldn’t know how to (or even know if it was possible).

How does the SDK memory map the flash?

Decompiling the new version of the SDK and comparing it to an older version made it easier to find where the magic happens. The function is Cache_Read_Enable (not well named!) and does not appear to be documented anywhere on the internet. I’ve decompiled it and so I know what the function does, but it communicates with other hardware through memory mapped I/O. Without documentation for that hardware it is not easy to really know what is going on beyond this function. As a result some trial and error was required.

The SDK uses new flash size options in the flash header to indicate flash layout as well as size. This method is limited to what the SDK supports and isn’t in a place you want to be rewriting when the config changes (e.g. on an OTA update). So how can rBoot replicate, and improve on, this functionality? Cache_Read_Enable is called from several places in the SDK, because the flash has to be unmapped before normal SPI reads and writes can take place. The SDK SPI read, write and various other functions handle this unmapping (and remapping afterwards) for you. These functions in older versions of the SDK called Cache_Read_Enable directly, but now they all call a wrapper method called Cache_Read_Enable_New, which handles the extra logic involved with rom selection. This gives us a single point which, if we can replace it, would allow us to control the mapping ourselves.

Replacing Cache_Read_Enable_New

So how do we replace it? I first tried using the linker’s --wrap option (passed through gcc), but it didn’t work. Most references to Cache_Read_Enable_New were replaced with my own code, except those in user_init. It seems that --wrap doesn’t work well when the function you are wrapping is called within the same compilation unit (.o file). Another option is to just define a new method of the same name to override the original. Normally having two matching methods would cause an error at link time; to avoid this we mark the original as ‘weak’ to allow it to be overridden. This isn’t quite as neat, because it requires a small modification to the original libmain.a, but it works!

So, after writing a suitable replacement for Cache_Read_Enable_New I have a working solution. It’s a pity we still can’t map more than 8Mbit at a time, but at least we can now use the whole of larger flash chips in chunks. The new code is now on GitHub. See the readme file for explanation of how to use big flash support.
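
To give an idea of what a replacement has to work out, here is a rough sketch of choosing the mapping for a rom at a given flash address. It rests on assumptions: that the first Cache_Read_Enable parameter picks the odd or even megabyte and the second picks the 2MB pair (a reading of the parameter names `odd_even` and `mb_count` in the decompiled listing elsewhere on this blog), and that mapped flash appears at 0x40200000. The type and function names are made up for this example.

```c
#include <stdint.h>

#define MB             0x100000u   /* only 1MB can be mapped at a time */
#define FLASH_MAP_BASE 0x40200000u /* assumed address where flash is mapped */

/* Hypothetical holder for the first two Cache_Read_Enable parameters. */
typedef struct {
    uint8_t odd_even; /* assumed: 0 = even MB of the pair, 1 = odd MB */
    uint8_t mb_count; /* assumed: which 2MB pair to map from */
} map_params;

/* Pick the mapping parameters for the 1MB chunk containing rom_addr. */
map_params map_params_for(uint32_t rom_addr) {
    uint32_t mb = rom_addr / MB; /* which megabyte of flash holds the rom */
    map_params p = { (uint8_t)(mb % 2), (uint8_t)(mb / 2) };
    return p;
}

/* Address the rom appears at once its megabyte has been mapped. */
uint32_t mapped_addr(uint32_t rom_addr) {
    return FLASH_MAP_BASE + (rom_addr % MB);
}
```

Under these assumptions a rom at 0x202000 gives odd_even 0 and mb_count 1, and appears at the same mapped address (0x40202000) as a rom at 0x002000.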