Flash write bug in Espressif SDK v1.5.1+

Users of rBoot have reported OTA problems when using Espressif SDK v1.5.1, which I have been able to reproduce on this version and on v1.5.2. The problem appears to be in the SDK flash write functions. While most writes work fine, some occasionally fail to write fully, leaving small areas of blank data (0xff). To make matters worse, no error is reported by the write function, and attempts to read the flash back for verification immediately afterwards return the correct data (presumably they are serviced by the rom cache). This is very similar to the problem seen when trying to perform an OTA with insufficient power. I once had a similar report from a user with just a 100mA power supply, and the problem was fixed by using a proper power supply. This makes me wonder if the new SDK causes the device to draw more power, although supplying extra power does not fix the problem in this case.
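Because an immediate read-back on the device appears to return good data, one way to look for the blank areas is to compare a flash dump taken later against the image that was supposed to be written. The following is only a minimal sketch of that comparison, meant to run on a PC rather than on the ESP8266. The names `blank_run_t` and `find_blank_run` are hypothetical and are not part of rBoot or the SDK.

```c
#include <stddef.h>
#include <stdint.h>

/* Location and size of a run of 0xff bytes found in the dump. */
typedef struct {
    size_t offset;  /* byte offset of the run within the image */
    size_t length;  /* number of consecutive 0xff bytes */
} blank_run_t;

/*
 * Scan `dump` for runs of 0xff and report the first run that is at least
 * `min_len` bytes long and differs from `expected` somewhere inside it.
 * Returns 1 and fills *run if such a run is found, otherwise returns 0.
 */
int find_blank_run(const uint8_t *expected, const uint8_t *dump,
                   size_t len, size_t min_len, blank_run_t *run)
{
    size_t i = 0;

    while (i < len) {
        if (dump[i] != 0xff) {
            i++;
            continue;
        }

        /* Start of a 0xff run: walk to its end, noting any mismatch. */
        size_t start = i;
        int differs = 0;
        while (i < len && dump[i] == 0xff) {
            if (expected[i] != 0xff) {
                differs = 1;
            }
            i++;
        }

        if (differs && (i - start) >= min_len) {
            run->offset = start;
            run->length = i - start;
            return 1;
        }
    }
    return 0;
}
```

Runs of 0xff that also appear in the expected image are ignored, since padding in a legitimate image would otherwise be reported as a failed write.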

I have created a test case that doesn’t use rBoot but simply downloads a file and writes it to flash (which is of course very similar to what rBoot does) to demonstrate the problem to Espressif. I have yet to hear back from them. If you encounter this problem, please add your support to my bug report. I have no reason to believe this bug is limited to rBoot users. I assume SDK bootloader users, and anyone else writing to the flash from an application (possibly only in long sustained writes), will also have this problem.

This is a good opportunity to encourage rBoot users to enable the irom checksum option, to help detect badly flashed roms at boot time. See the readme for more information.
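To illustrate why a checksum helps here, the sketch below shows the general idea of a single-byte XOR checksum over a section of data. It is not rBoot's actual code. The function names are hypothetical, and the seed value 0xef is an assumption borrowed from the standard ESP8266 image checksum.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: same seed as the standard ESP8266 image checksum. */
#define CHECKSUM_SEED 0xef

/* XOR every byte of the section into a single checksum byte. */
uint8_t xor_checksum(const uint8_t *data, size_t len)
{
    uint8_t sum = CHECKSUM_SEED;
    for (size_t i = 0; i < len; i++) {
        sum ^= data[i];
    }
    return sum;
}

/* Returns 1 if the section still matches the checksum stored with it. */
int section_is_intact(const uint8_t *data, size_t len, uint8_t stored)
{
    return xor_checksum(data, len) == stored;
}
```

A one-byte checksum like this will not catch every possible corruption, but it is cheap enough to run at boot and should catch most partially written roms.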

4 comments

  1. I can reproduce the problem with your test app, using my d1-mini which shouldn’t be underpowered, and have commented on espressif’s forum (they’ve finally approved my post!).

    In all the cases I’ve seen, the unwritten (0xff) section starts at the sector boundary. Do you see this as well? Any chance you’re accidentally erasing the old sector (meaning you erase a portion of what you just wrote) instead of the new one? (Reviewing your test app code I can’t see an error like this.)

    I can’t repro this with a simpler test app I’ve written myself – but a key difference there is that while I have wifi running I’m not actually passing any ip traffic.

    1. Thanks for the feedback. Looking at my saved tests, they do appear to be on sector boundaries. I had a feeling they weren’t always, but perhaps I’m just remembering that incorrectly from earlier tests. I suspect that using wifi at the same time is important, but whether that is power related or something else I can only guess at.
