GithubHelp home page GithubHelp logo

Comments (15)

JF002 avatar JF002 commented on June 7, 2024 1

Thanks for this torough analysis @lupyuen !

Here are some more details: In InfiniTime 0.8.0, there are race conditions on between the Display task (which writes on the SPI bus to the display and reads on the TWI bus from the touch controller) and the System task (which put the watch to sleep and wake it up when the button in pressed) : sometimes, system task disables the SPI and TWI devices while it is still in use by the Display task. When that happens, Display task waits for the end of a transaction, and that will never happen because the peripheral is disabled.

When I debugged this issue, I noticed 2 behaviors:

  • System task is locked into an infinite loop waiting for the SPI event, but System task is still running. InfiniTime seems to be frozen, but in fact, only the Display task is locked. System Task is still running and feeds the watchdog. In that case, a long press on the button should prevent the watchdog from beeing fed, and it should reset the CPU.
  • System task is frozen because of the race condition on the SPI/TWI bus, it doesn't feed the watchdog, and the watchdog resets the CPU.

In both cases, I'm pretty sure that the watchdog did reset the MCU, but nothing appears on the display, as if the bootloader was not running properly. That's why I created this issue.

In your case when you waited for 1 minute before attaching GDB, I think you were in the first case: Display task is locked into an infinite loop waiting for the SPI/TWI peripheral. That is the bug I fixed in InfiniTime 0.8.1.
When that happen, could you try to push the button for 7-10s, confirm that nothing happens on the screen and attach the debugger. If I'm right, GDB should tell you it is locked somewhere in the bootloader code.

Using InfiniTime 0.8.1, where this race condition is fixed, I noticed that the bootloader would not run properly when the watchdog reset occurs when the display is OFF (meaning that everything is disabled/put to sleep).

I saw in the chat that you were wondering what happens when the button is pushed:

  • An IRQ is triggered and the function nrfx_gpiote_evt_handler() is called. It starts the debounce timer.
  • DebounceTimerCallback() is called when the debouncing is over.
  • It calls SystemTask::OnButtonPushed()
  • It enables/disabled (Wakeup()/Sleep()) all devices and sends message to display task and system task to let them know they should restart processing or stop them to go to sleep.

Sorry for this long post, but I doubt I'll be able to give it a lot of time before the week-end, but hopefully some of these information will help your analysis. If not, I'll do my best to help you during the evenings in my timezone or during the weekend :)

Thanks again!

from pinetime-rust-mynewt.

JF002 avatar JF002 commented on June 7, 2024

Using InfiniTime 0.8.1 RC, I noticed that the bootloader won't run properly if the watchdog reset is triggered while the display is off (meaning that the cpu is in deep sleep, all devices are disabled,...).

I don't know if this is caused by a bug in InfiniTime in sleep mode or in the bootloader... Anyway, I think the bootloader should be able to run no matter what happens in the application firmware!

from pinetime-rust-mynewt.

lupyuen avatar lupyuen commented on June 7, 2024

Looks like I need to do a full reset or something. Lemme do some research.

How can I reproduce this?

Once I can reproduce this, I'll isolate the problem...

  1. Disable SPI access to display controller

  2. Display SPI access to external flash

  3. Check whether we need to power on the external flash

Also is there some way for me to test whether SPI is in incoherent state?

Once I detect that SPI is incoherent, I can do a full reset maybe.

I can debug the bootloader without SPI by toggling the GPIO pins to change the screen brightness, so that it flashes various patterns to show the bootloader status.

from pinetime-rust-mynewt.

lupyuen avatar lupyuen commented on June 7, 2024

I'm seeing this intermittent problem with the SPI flash, maybe because it's powered down. Will investigate...

target halted due to breakpoint, current mode: Thread 
xPSR: 0x21000000 pc: 0x00000b5e msp: 0x2000ff90, semihosting

That's an assertion failure...

int
spiflash_identify(struct spiflash_dev *dev)
{
    int i;
    int j;
    uint8_t manufacturer = 0;
    uint8_t memory_type = 0;
    uint8_t capacity = 0;
    int release_power_down_count = 0;
    /* Table with unique release power down functions */
    void (*rpd[3])(struct spiflash_dev *);
    int rc = 0;

    /* List of supported spi flash chips can be found in:
     * hw/drivers/flash/spiflash/chips/sysconfig.yml
     */
    static_assert((sizeof(supported_chips) / sizeof(supported_chips[0])) > 1,
        "At lease one spiflash chip must be specified in sysconfig with SPIFLASH_<chipid>:1");

    spiflash_lock(dev);

    /* Only one chip specified, no need for search*/
    if ((sizeof(supported_chips) / sizeof(supported_chips[0])) == 2) {
        supported_chips[0].fc_release_power_down(dev);
        spiflash_read_jedec_id(dev, &manufacturer, &memory_type, &capacity);
        /* If BSP defined SpiFlash manufacturer or memory type does not
         * match SpiFlash is most likely not connected, connected to
         * different pins, or of different type.
         * It is unlikely that flash depended packaged will work correctly.
         */
        assert(manufacturer == supported_chips[0].fc_jedec_id.ji_manufacturer &&
               memory_type == supported_chips[0].fc_jedec_id.ji_type &&
               capacity == supported_chips[0].fc_jedec_id.ji_capacity);  //  Stops here

from pinetime-rust-mynewt.

lupyuen avatar lupyuen commented on June 7, 2024

On 0.8.0: When pressing the buttons to make it hang, we don't get the the assertion failure above. It just hangs.

With the new bootloader that blinks, nothing blinks during the hang.

When I run with gdb and force break during the hang, gdb says that device is stuck at address 0xc488

According to the .map file, 0xc488 is in Pinetime::Drivers::SpiMaster::Write...

 .text._ZN8Pinetime7Drivers9SpiMaster5WriteEhPKhj
                0x000000000000c3b0      0x184 CMakeFiles/pinetime-mcuboot-app.dir/drivers/SpiMaster.cpp.o
                0x000000000000c3b0                Pinetime::Drivers::SpiMaster::Write(unsigned char, unsigned char const*, unsigned int)
 .text._ZN8Pinetime7Drivers9SpiMaster4ReadEhPhjS2_j
                0x000000000000c534      0x134 CMakeFiles/pinetime-mcuboot-app.dir/drivers/SpiMaster.cpp.o
                0x000000000000c534                Pinetime::Drivers::SpiMaster::Read(unsigned char, unsigned char*, unsigned int, unsigned char*, unsigned int)

So it's hanging while writing to SPI?

Screenshot 2020-09-16 at 8 03 03 PM

from pinetime-rust-mynewt.

lupyuen avatar lupyuen commented on June 7, 2024

While the watch was hanging, I waited 1 minute, forced break in gdb. It stopped in a different location: 0x10f7a, which is in Pinetime::Drivers::TwiMaster::Write...

 .text._ZN8Pinetime7Drivers9TwiMaster5WriteEhPKhjb
                0x0000000000010f50       0xa4 CMakeFiles/pinetime-mcuboot-app.dir/drivers/TwiMaster.cpp.o
                0x0000000000010f50                Pinetime::Drivers::TwiMaster::Write(unsigned char, unsigned char const*, unsigned int, bool)

According to the call stack, the caller of Pinetime::Drivers::TwiMaster::Write is 0x11020, which is Pinetime::Drivers::TwiMaster::Wakeup...

 .text._ZN8Pinetime7Drivers9TwiMaster6WakeupEv
                0x0000000000011118       0xa4 CMakeFiles/pinetime-mcuboot-app.dir/drivers/TwiMaster.cpp.o
                0x0000000000011118                Pinetime::Drivers::TwiMaster::Wakeup()

Screenshot 2020-09-16 at 9 35 41 PM

from pinetime-rust-mynewt.

lupyuen avatar lupyuen commented on June 7, 2024

Cool thanks and congrats :-) Here's what happened when I tried 0.8.1...

You're right I don't see any more hanging in InfiniTime. The bootloader is now hanging because of the assertion failure in spiflash_identify I mentioned earlier. So lemme now fix this.

The SPI bus is probably messed up, causing this assertion failure.

If I bypass the assertion failure and call NVIC_SystemReset, it seems to fix the problem.

Right now the assertion failure is stuck in an NMI Handler that loops forever. Lemme change the NMI Handler to call NVIC_SystemReset...

Screenshot 2020-09-17 at 11 52 03 AM

from pinetime-rust-mynewt.

lupyuen avatar lupyuen commented on June 7, 2024

OK NMI Handler now calls NVIC_SystemReset, seems to fix the hanging problem! The test binaries are here (with fixed manual firmware rollback checking): https://github.com/lupyuen/pinetime-rust-mynewt/releases/tag/v5.0.3

from pinetime-rust-mynewt.

JF002 avatar JF002 commented on June 7, 2024

Wow, thanks for this fix, I'll try it asap!

I admit that, at first, I was a bit surprised by your fix : override NMI_Handler to call NVIC_SystemReset(). But yeah, what else can you do? It looks like that many default implementation just enter into an infinite loop and does nothing (probably what happens here).

I searched how the NRF SDK handles that case :

  • A failed assert calls assert_nrf_callback()
  • which calls app_error_fault_handler()
  • If it is running in DEBUG mode, it logs a lot of useful info about the failed assert and enter into an infinite loop, and if it not running in DEBUG, it calls... NVIC_SystemReset();

So, yeah, congratulations, you did as good as NRF engineers can do 👍

image
image

Now, I'm still wondering why the SPI bus does not work correctly after a watchdog reset... Reset should put all the registers in their default state and everything should work reliably after the reset...
Or, maybe the SPI flash is "frozen" and needs a bit of time or a different init sequence to run properly after InfinTime messed the spi bus?

from pinetime-rust-mynewt.

JF002 avatar JF002 commented on June 7, 2024

I've just done some tests :

  • using InfiniTime 0.8.0 : quickly push the button or wdt reset while in sleep mode : the bootloader reboots correctly, with 4 rapid flashes on the backlight.
  • using InfiniTime 0.8.1 : quickly push the button does not crash the firmware, and reset while in sleep mode reset to the bootloader with 4 rapid flashes.
  • OTA from 0.8.0 to 0.8.1 succeeded 3 times
  • auto revert when the image is not validated works
  • manual revert when the image is validated but the button is pushed while the bootloader is running works too!

Everything looks fine to me! I think we can do some other test (like OTA a not working firmware) to ensure that edge cases are covered too, but I won't have the time to do that this evening :)

Thanks a lot for your analysis, work and fix, I think the bootloader is much more reliable now!

from pinetime-rust-mynewt.

lupyuen avatar lupyuen commented on June 7, 2024

Cool thanks for testing! I have merged to master branch and released it here:

https://github.com/lupyuen/pinetime-rust-mynewt/releases/tag/v5.0.4

from pinetime-rust-mynewt.

rafacouto avatar rafacouto commented on June 7, 2024

To review:

  • syscfg.MCU_DEBUG_IGNORE_BKPT (default 0): When enabled, asm(bkpt) will be ignored. If not set, it will hit the breakpoint wherever it gets called, For example, reset and crash.
  • syscfg.HAL_SYSTEM_RESET_CB (default 0): If set, hal system reset callback gets called inside hal_system_reset().

from pinetime-rust-mynewt.

rafacouto avatar rafacouto commented on June 7, 2024

What about disabling SPI master on bootloader at early stage (before newt's hal_init_*)? SPIM.ENABLE register <-- 0

from pinetime-rust-mynewt.

rafacouto avatar rafacouto commented on June 7, 2024

See InfiniTimeOrg/InfiniTime#79 (comment)

In this case, example in 3.8.1 SPIM - TX workaround subsection of Nordic's Anomaly 109 for nRF52832 could be good.

from pinetime-rust-mynewt.

lupyuen avatar lupyuen commented on June 7, 2024

Sorry I'm not planning to make any more changes to MCUBoot, please submit a tested PR for updates.

from pinetime-rust-mynewt.

Related Issues (15)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.