In the previous three exercises, we have worked with local debugging where direct debug access to devices is available. In this exercise, we will use Memfault to remotely debug a device, an important feature for debugging deployed in-field devices. We will cover signing up for an account on the Memfault platform, setting up an example project, sending diagnostic data to the Memfault cloud, and using the Memfault web app to explore this data and debug our example application.
Memfault provides an observability and OTA update management platform that is purpose-built for embedded devices. It is optimized for constrained microcontroller-based devices and scales to complex Linux and Android systems. Memfault helps embedded development teams understand exactly how their devices perform in the field and find and fix faults fast.
The Memfault platform not only serves as a powerful tool for monitoring large device fleets at scale, but it also provides tangible benefits from the earliest stages of development. With first-class support integrated into the nRF Connect SDK, embedded development teams can start reaping these benefits from revision 0.0.1!
The exercise is divided into four parts:
Part 1 – Setting up a Memfault account
Nordic Developer Academy students are eligible for a free Memfault account for use with all supported Nordic development kits. You can use this account to explore everything the platform has to offer as you try out any of the nRF Connect SDK sample applications on your devices.
1. Create an account.
Go to Memfault Nordic Registration and fill out the information to create an account.
2. Create a new project in the Memfault platform.
Once you’ve completed the signup process, you will be prompted to create a new Project. Give your project a name. In this example, we will use hello-memfault. Next, select the template corresponding to the development kit you are using. At this point, you’ll be greeted with an empty project just waiting to be populated with rich observability and debugging data.
For the screenshot below, we are using an nRF5340 DK.
3. Note the Project Key value.
Navigate to the Project Settings view using the left-hand navigation menu and take note of the Project Key value. We’ll use this in a later step to tie our example application to this newly created project.
Great job! You’ve completed Part 1. Now let’s spin up our firmware project to begin populating this project with useful debugging data.
Part 2 – Enabling Memfault in an nRF Connect SDK sample
4. Open up the Hello World application in nRF Connect SDK.
To keep it simple, we’re going to start with the “Hello World” application found under <nRF Connect SDK Install Path>/<version>/zephyr/samples/hello_world.
From the Welcome view in nRF Connect for VS Code, select Create a new application -> Copy a sample -> (SDK Version), then search for and select the “Hello World” app found under zephyr/samples/hello_world. Lastly, select a directory to store the example. Save the application to wherever you cloned the code base for this course, as <ncs_inter_repo>\<version_directory>\l2\l2_e4.
5. Build and flash the application to your device.
At this point, you can build and flash the unmodified sample to your board to confirm it works.
For our next steps, we will be using Zephyr’s UART shell capabilities, so you can also go ahead and fire up your favorite serial terminal emulator.
6. Add Memfault to the application.
Since Memfault support is baked into nRF Connect SDK, adding Memfault to the mix is a straightforward affair.
Simply add the following lines to your prj.conf, being sure to include the Project Key collected earlier.
CONFIG_SHELL=y
CONFIG_MEMFAULT=y
CONFIG_MEMFAULT_NCS_PROJECT_KEY="<INSERT_YOUR_PROJECT_KEY_HERE>"
CONFIG_MEMFAULT_NCS_FW_TYPE="app"
CONFIG_MEMFAULT_NCS_DEVICE_ID="test-device"
CONFIG_MEMFAULT: Enables integration with the Memfault SDK
CONFIG_MEMFAULT_NCS_PROJECT_KEY: Memfault project key
CONFIG_MEMFAULT_NCS_FW_TYPE: Firmware type running on the board
CONFIG_MEMFAULT_NCS_DEVICE_ID: Memfault device ID
Note that CONFIG_SHELL is not strictly necessary for Memfault integration, but it provides a convenient means to test our basic integration.
These Kconfig options will enable the Memfault integration present in the nRF Connect SDK with a minimal set of device details.
7. Build and flash the application to your device.
Upon booting, you should see the following output on the console.
*** Booting nRF Connect SDK ***
<inf> mflt: Reset Reason, RESETREAS=0x1
<inf> mflt: Reset Causes:
<inf> mflt: Pin Reset
<inf> mflt: GNU Build ID: 15efedceaa4e9a4b068c9cac91118eb3e386f61b
Hello World! nrf5340dk/nrf5340/cpuapp
uart:~$
8. Upload the project’s .elf file to the Memfault backend.
Now that we have a compiled and running firmware, we will equip the Memfault backend with the debug info necessary for the metrics collection and advanced debugging workflows ahead. All that is required is our project’s .elf symbol file, which is securely uploaded to and stored in the Memfault backend.
Navigate to the Symbol Files View and select Upload Symbol File in the upper right-hand corner. Then upload your project’s compiled .elf file. This is the zephyr.elf file located in l2/l2_e4/build/l2_e4/zephyr of the course repository.
After uploading, you can return to the serial console of your attached device.
9. Trigger a heartbeat through the shell command.
Memfault’s observability solution begins with tracking key device vitals derived from individual metrics, which are collected automatically at regular “Heartbeat” intervals. By default, these Heartbeat Metrics are collected at hourly intervals. We’ll begin by triggering a Heartbeat and uploading the data to verify metrics are working.
Run the mflt test heartbeat shell command:
uart:~$ mflt test heartbeat
Triggering Heartbeat
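To picture how an hourly collection and a manual trigger can coexist, here is a minimal sketch of a heartbeat aggregator. It is not the Memfault SDK's implementation: the struct names, the button_presses counter, and heartbeat_poll() are all hypothetical, and the sketch assumes that counters are snapshotted and then reset at each heartbeat.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hourly interval, matching the default described in this exercise. */
#define HEARTBEAT_INTERVAL_S 3600u

/* Running state between heartbeats (names are hypothetical). */
struct heartbeat_state {
    uint32_t last_heartbeat_s; /* uptime at the previous heartbeat */
    uint32_t button_presses;   /* example counter metric */
};

/* Values captured at the moment a heartbeat fires. */
struct heartbeat_snapshot {
    uint32_t interval_s;       /* seconds covered by this heartbeat */
    uint32_t button_presses;
};

/*
 * Returns true and fills `snap` when a heartbeat is due, either because
 * the interval has elapsed or because `force` is set (the analogue of a
 * manual trigger). Counters are reset after the snapshot is taken.
 */
bool heartbeat_poll(struct heartbeat_state *st, uint32_t now_s, bool force,
                    struct heartbeat_snapshot *snap)
{
    uint32_t elapsed = now_s - st->last_heartbeat_s;

    if (!force && elapsed < HEARTBEAT_INTERVAL_S) {
        return false;
    }

    snap->interval_s = elapsed;
    snap->button_presses = st->button_presses;

    /* Start a fresh interval. */
    st->button_presses = 0;
    st->last_heartbeat_s = now_s;
    return true;
}
```

Calling heartbeat_poll() with force set to true plays the role of the shell command in this sketch: it produces a snapshot immediately instead of waiting for the hour to elapse.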
10. Export the heartbeat data over UART.
Memfault diagnostic and metric data (and logs!) is stored on-device in an efficient “chunk” data format and is intended to be transmitted to the Memfault cloud using a variety of connectivity paths. Wi-Fi, cellular network, and Bluetooth LE are the most common. To keep things extra simple for our example, we will export our debug data over the UART and import it into the platform using the Chunks Debug feature of the Memfault web app.
To export our heartbeat metric data, issue the mflt export command in the serial terminal:
uart:~$ mflt export
MC:SP8MgQlDT1JFAgYAA6gKFAABTAYAAQEGACF0JQAgdCUAIAEAAGAA4ADgAQYACbMCAQD8Bv8ReCUAIMziAAACDgATswIBABglACDHSAoAAWA=:
MC:wE0GABNh4E0AIBglACAMDgABFAYAKRXv7c6qTppLBoycrJERjrPjhvYbAg4AAQsGABducmY1MzQwZGstMQoOAAEMBgAZMC4wLjErMTVlZmU=:
MC:wJsBZAsOAAEFBgALaGVsbG8EDgABCQYAE25yZjUzNDBkawcOAAEEBgABKAYAAQUOAAEECAAHkQAABg4AAQMMAAEBBgAJJO0A4BwGAAsCAAc=:
...
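Each exported line above appears to be a base64 payload wrapped between an MC: prefix and a closing colon. The following minimal sketch shows how a binary chunk could be turned into such a line. format_chunk_line() is a hypothetical helper written for illustration, not a Memfault SDK function, and the framing is inferred only from the output shown in this exercise.

```c
#include <stddef.h>
#include <stdint.h>

/* Standard base64 alphabet (RFC 4648). */
static const char kB64[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/*
 * Hypothetical helper: base64-encode `len` bytes of `chunk` and wrap the
 * result as "MC:<base64>:". Returns the length of the line written to
 * `out` (excluding the terminating NUL), or 0 if `out` is too small.
 */
size_t format_chunk_line(const uint8_t *chunk, size_t len,
                         char *out, size_t out_size)
{
    size_t b64_len = 4 * ((len + 2) / 3);
    size_t needed = 3 + b64_len + 1 + 1; /* "MC:" + data + ":" + NUL */

    if (out_size < needed) {
        return 0;
    }

    char *p = out;
    *p++ = 'M';
    *p++ = 'C';
    *p++ = ':';

    for (size_t i = 0; i < len; i += 3) {
        /* Pack up to three input bytes into one 24-bit group. */
        uint32_t group = (uint32_t)chunk[i] << 16;
        if (i + 1 < len) {
            group |= (uint32_t)chunk[i + 1] << 8;
        }
        if (i + 2 < len) {
            group |= chunk[i + 2];
        }

        /* Emit four 6-bit symbols, padding with '=' past the input end. */
        *p++ = kB64[(group >> 18) & 0x3F];
        *p++ = kB64[(group >> 12) & 0x3F];
        *p++ = (i + 1 < len) ? kB64[(group >> 6) & 0x3F] : '=';
        *p++ = (i + 2 < len) ? kB64[group & 0x3F] : '=';
    }

    *p++ = ':';
    *p = '\0';
    return (size_t)(p - out);
}
```

For example, passing the five bytes of "hello" produces the line MC:aGVsbG8=: in this sketch. Real chunks carry binary diagnostic data, so decoding them by hand is not expected to yield readable text throughout.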
11. Copy and paste the exported data into the Memfault project.
Now, we’ll copy this output (each line beginning with MC:) and paste it into the Chunks Debug view within our Memfault project.
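If your terminal capture mixes chunk lines with prompts and log messages, a small filter can pick out only the lines worth pasting. This is a minimal sketch under the assumption that a chunk line starts with MC: and ends with a colon, as in the output shown in this exercise. Both function names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: returns true if `line` looks like an exported
 * chunk, i.e. it starts with "MC:" and its last character before any
 * trailing CR/LF is ':'.
 */
bool is_chunk_line(const char *line)
{
    size_t len = strlen(line);

    /* Ignore trailing CR/LF left over from the serial capture. */
    while (len > 0 && (line[len - 1] == '\n' || line[len - 1] == '\r')) {
        len--;
    }

    /* Shortest accepted form is "MC::" (prefix plus closing colon). */
    if (len < 4) {
        return false;
    }
    return strncmp(line, "MC:", 3) == 0 && line[len - 1] == ':';
}

/*
 * Copies only chunk lines from `in` to `out` and returns how many were
 * copied. Lines longer than the buffer are read in pieces and will not
 * match, which is acceptable for a sketch but not for production use.
 */
size_t copy_chunk_lines(FILE *in, FILE *out)
{
    char line[512];
    size_t count = 0;

    while (fgets(line, sizeof(line), in) != NULL) {
        if (is_chunk_line(line)) {
            fputs(line, out);
            count++;
        }
    }
    return count;
}
```

Calling copy_chunk_lines(stdin, stdout) from a small program would let you pipe a saved terminal log through it and paste the result into the Chunks Debug view.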
When you’ve completed step 3 of the ‘import chunks’ process, navigate to the Devices view. And now, voila! You should see your device listed there.
12. Explore the imported data on the Memfault platform.
From here, you can begin to explore the data you’ve just imported.
Navigate to the Dashboards → Metrics page and inspect the aggregate metric charts. You may need to refresh the charts using the icon in the bottom left corner of each chart:
We have shown that we can successfully extract metrics from our device. Memfault allows you to see the metrics for a single device on a timeline view, or aggregated and visualized across all devices in charts as shown above.
A powerful feature of Memfault is how easy it is to add a new metric. Doing so is out of scope for this exercise, but it’s as simple as adding the following two lines of code:
// Add to file config/memfault_metrics_heartbeat_config.def
MEMFAULT_METRICS_KEY_DEFINE(temperature_c, kMemfaultMetricType_Signed)
// Add in code where the temperature reading is read
MEMFAULT_METRIC_SET_SIGNED(temperature_c, get_temperature());
Now that we’ve seen some metrics, let’s move on to advanced debugging.
Part 3 – Exploring crash data in Memfault
It’s time to have some fun driving our board into fault territory. Our Memfault shell commands will allow us to provoke those error conditions we hope never to see but are inevitable in our production devices.
We can trigger any of these fault conditions using the following commands:
uart:~$ mflt test
test - commands to verify memfault data collection
(https://mflt.io/mcu-test-commands)
Subcommands:
busfault : trigger a busfault
hardfault : trigger a hardfault
memmanage : trigger a memory management fault
usagefault : trigger a usage fault
hang : trigger a hang
zassert : trigger a zephyr assert
stack_overflow : trigger a stack overflow
assert : trigger memfault assert
loadaddr : test a 32 bit load from an address
double_free : trigger a double free error
badptr : trigger fault via store to a bad address
isr_badptr : trigger fault via store to a bad address from an ISR
reboot : trigger a reboot and record it using memfault
heartbeat : trigger an immediate capture of all heartbeat metrics
log_capture : trigger capture of current log buffer contents
logs : writes test logs to log buffer
trace : capture an example trace event
13. Trigger an assert through the Memfault shell commands.
For now, how about starting with a simple assert? Type mflt test assert at the shell prompt:
uart:~$ mflt test assert
You will be greeted with something resembling the following:
*** Booting nRF Connect SDK ***
<inf> mflt: Reset Reason, RESETREAS=0x8
<inf> mflt: Reset Causes:
<inf> mflt: Software
<inf> mflt: GNU Build ID: 15efedceaa4e9a4b068c9cac91118eb3e386f61b
Hello World! nrf5340dk/nrf5340/cpuapp
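The two boot logs in this exercise show RESETREAS=0x1 reported as a pin reset and RESETREAS=0x8 reported as a software reset. The sketch below decodes just those two bits to illustrate how a reset-reason bitmask maps to names. The bit assignments are taken only from the logs in this exercise, all other bits are deliberately left out, and describe_reset() is a hypothetical helper rather than an SDK function.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

struct reset_cause {
    uint32_t mask;
    const char *name;
};

/* Only the two bits observed in this exercise's boot logs. */
static const struct reset_cause kCauses[] = {
    { 1u << 0, "Pin Reset" }, /* RESETREAS=0x1 */
    { 1u << 3, "Software" },  /* RESETREAS=0x8 */
};

/*
 * Hypothetical helper: writes a comma-separated list of the known causes
 * set in `resetreas` into `out` and returns how many were written.
 * Stops early if `out` is too small to hold the next name.
 */
size_t describe_reset(uint32_t resetreas, char *out, size_t out_size)
{
    size_t matched = 0;
    size_t used = 0;

    if (out_size == 0) {
        return 0;
    }
    out[0] = '\0';

    for (size_t i = 0; i < sizeof(kCauses) / sizeof(kCauses[0]); i++) {
        if ((resetreas & kCauses[i].mask) == 0) {
            continue;
        }
        int n = snprintf(out + used, out_size - used, "%s%s",
                         matched ? ", " : "", kCauses[i].name);
        if (n < 0 || (size_t)n >= out_size - used) {
            break; /* output would be truncated */
        }
        used += (size_t)n;
        matched++;
    }
    return matched;
}
```

Because the register is a bitmask, more than one cause can be set at once, which is why the helper builds a list instead of returning a single name. Consult the product specification for your SoC for the full set of bits.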
14. Retrieve the coredump through the Memfault shell commands.
Now what? Well, let’s see if our test assert left a coredump behind:
uart:~$ mflt get_core
<inf> mflt: Has coredump with size: 2728
Sure enough, there it is: 2728 bytes worth of coredump data! The Memfault SDK component of our firmware has stored this data locally for eventual analysis using the Memfault web app.
15. Copy and paste the exported coredump into the Memfault project.
Once again, we’re going to export our coredump data as an encoded bunch of chunks, and upload it to Chunks Debug, just like we did in step 11.
uart:~$ mflt export
MC:SP8MgQlDT1JFAgYAA6gKFAABTAYAAQEGACF0JQAgdCUAIAEAAGAA4ADgAQYACbMCAQD8Bv8ReCUAIMziAAACDgATswIBABglACDHSAoAAWA=:
MC:wE0GABNh4E0AIBglACAMDgABFAYAKRXv7c6qTppLBoycrJERjrPjhvYbAg4AAQsGABducmY1MzQwZGstMQoOAAEMBgAZMC4wLjErMTVlZmU=:
MC:wJsBZAsOAAEFBgALaGVsbG8EDgABCQYAE25yZjUzNDBkawcOAAEEBgABKAYAAQUOAAEECAAHkQAABg4AAQMMAAEBBgAJJO0A4BwGAAsCAAc=:
...
16. Navigate to the Devices page to view the trace.
After you’ve completed the now-familiar chunks import process, navigate to the Devices page.
Clicking the device name in this list will bring up the timeline view. From this view, you can either find your coredump in the timeline view under the ‘Traces’ row, or you can navigate to the Traces tab. When you find your trace, click on it, and you will be greeted with a view of the trace that should be familiar to anyone who has stepped through a running firmware with the aid of a debugger and attached JTAG probe. And like your paused debugger, this view represents a complete snapshot of the state of the device at the time of our provoked test crash.
17. Explore the diagnostics view in the Memfault platform.
That looks much better! Now we clearly see where this particular fault occurred in our code. And now that we have our ELF data, we can really explore this fully fleshed-out system state snapshot — starting with the full RTOS thread view that you’ve come to expect in your debugger. Go ahead and peel back the layers on these stack frames. From there, you might want to explore the register values and local variables. The Globals & Statics view has a host of useful Zephyr-related structs that can be invaluable when you need to untangle complicated OS-level behaviors. Lastly, the Memory Viewer exposes the difference between where you thought you put that data and where it actually ended up.
You can imagine how this might look for our naturally occurring firmware faults and how valuable this information will be as our firmware grows up and finds its way into production, far from our desktop debugger.
(Optional) Part 4 – Exploring remote debugging
18. (Optional) Transfer the diagnostic data to the Memfault platform over Bluetooth LE, Wi-Fi or cellular IoT.
Transferring the chunks (diagnostic data) manually from the serial terminal to the Memfault platform is convenient during development and while working with development kits. However, it is not feasible for devices deployed in the field. The Memfault SDK, which is fully integrated into the nRF Connect SDK, offers multiple transport options for transferring diagnostic data from remote devices, including Bluetooth LE, Wi-Fi, and cellular IoT.
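Whatever the transport, the underlying pattern is the same: pull stored chunks one at a time and hand each to the link until nothing is left. The sketch below illustrates that loop with stand-ins. The Memfault SDK has its own packetizer interface for this, and the callback types, CHUNK_BUF_SIZE, and the in-memory source and counting sink here are all hypothetical, not SDK signatures.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CHUNK_BUF_SIZE 64u /* hypothetical per-message payload limit */

/* Hypothetical chunk source: on entry *len is the buffer capacity, on
 * return it is the number of bytes written. Returns false when empty. */
typedef bool (*get_chunk_fn)(void *ctx, uint8_t *buf, size_t *len);

/* Hypothetical transport: returns false if the chunk was not sent. */
typedef bool (*send_chunk_fn)(void *ctx, const uint8_t *buf, size_t len);

/* Pull chunks from `get` and pass them to `send` until the source is
 * empty or a send fails. Returns the number of chunks sent. A real
 * integration would also need a retry policy for failed sends. */
size_t drain_chunks(get_chunk_fn get, void *get_ctx,
                    send_chunk_fn send, void *send_ctx)
{
    uint8_t buf[CHUNK_BUF_SIZE];
    size_t sent = 0;

    for (;;) {
        size_t len = sizeof(buf);
        if (!get(get_ctx, buf, &len)) {
            break;
        }
        if (!send(send_ctx, buf, len)) {
            break;
        }
        sent++;
    }
    return sent;
}

/* In-memory stand-in for stored diagnostic data. */
struct mem_source {
    const uint8_t *data;
    size_t len;
    size_t offset;
};

bool mem_source_get(void *ctx, uint8_t *buf, size_t *len)
{
    struct mem_source *src = ctx;
    size_t remaining = src->len - src->offset;

    if (remaining == 0) {
        return false;
    }
    size_t n = remaining < *len ? remaining : *len;
    memcpy(buf, src->data + src->offset, n);
    src->offset += n;
    *len = n;
    return true;
}

/* Stand-in transport that only counts what it was given. */
struct count_sink {
    size_t chunks;
    size_t bytes;
};

bool count_sink_send(void *ctx, const uint8_t *buf, size_t len)
{
    struct count_sink *sink = ctx;

    (void)buf; /* payload is not inspected in this sketch */
    sink->chunks++;
    sink->bytes += len;
    return true;
}
```

Swapping count_sink_send for a function that posts over HTTP or writes to a Bluetooth LE characteristic is where a real transport would plug in. The drain loop itself does not need to know which link is used.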
Depending on the development kit you are using, pick one of the supported transports below.
Follow this section if you are using one of the following development kits: nRF9161 DK, nRF9160 DK, nRF9151 DK or nRF7002 DK.
Setting up Memfault on a cellular IoT or Wi-Fi development kit is quite straightforward since the device already has a direct connection to the internet.
The Memfault sample in the nRF Connect SDK shows how to use Memfault to collect coredumps and metrics over a cellular or Wi-Fi connection.
The workflow is simple
Follow this section if you are using one of the following development kits: nRF54L15 DK, nRF5340 DK, nRF52840 DK, nRF52833 DK, or nRF52 DK.
It is possible to use Bluetooth LE to forward the diagnostic data collected by the firmware through a Bluetooth gateway, by using the GATT custom service: Memfault Diagnostic Service (MDS).
The Peripheral MDS sample in the nRF Connect SDK shows you how to use the Memfault Diagnostic Service to collect core dumps and metrics over Bluetooth LE using different Bluetooth gateway options. You can, for example, use the nRF Memfault mobile app (nRF Memfault for Android or nRF Memfault for iOS).
The workflow is simple:
Ok, let’s review:
In our short session, we’ve equipped our firmware with powerful remote debugging capabilities. Having this functionality enabled from the beginning of your project will allow you to observe the performance of the firmware over the releases that follow. As an embedded engineer, you will be equipped with new insights, confidently adding features to your firmware as it goes from development into production.
Ultimately, when remote debugging is combined with some kind of periodic data connectivity path, custom metrics, fleet segmentation, and OTA (over-the-air) functionality, embedded development teams can perform proactive release monitoring — equipped with the data to ensure firmware quality.