
Exercise 3

Workqueue creation and work item submission

Since higher-priority threads can starve lower-priority threads, it is good practice to offload any non-urgent work from them to lower-priority threads.

In this exercise, we will create and initialize a workqueue to offload work from a higher priority thread.

Exercise Steps:

1. Download the base exercise project and extract it in your exercise folder for this course.

Threads with different priorities

2. Define the three thread priorities used in this exercise, so that thread0 has a higher priority than thread1. The workqueue thread should have the lowest priority, since we want it to execute offloaded (non-urgent) work. Remember that a lower numerical value means a higher priority.

#define THREAD0_PRIORITY 2
#define THREAD1_PRIORITY 3
#define WORKQ_PRIORITY   4
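
Because the ordering of these three values matters, it can help to check it at build time. The following is a minimal sketch in plain C11 and is not part of the exercise project. The helper is_higher_priority() is a hypothetical name introduced here for illustration. In a Zephyr application, BUILD_ASSERT() could serve the same purpose as static_assert.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdbool.h>

#define THREAD0_PRIORITY 2
#define THREAD1_PRIORITY 3
#define WORKQ_PRIORITY   4

/* Hypothetical helper: a numerically lower value is a higher priority. */
static inline bool is_higher_priority(int a, int b)
{
    return a < b;
}

/* Fail the build if the intended ordering is ever broken. */
static_assert(THREAD0_PRIORITY < THREAD1_PRIORITY,
              "thread0 must have a higher priority than thread1");
static_assert(THREAD1_PRIORITY < WORKQ_PRIORITY,
              "the workqueue thread must have the lowest priority");
```

With a check like this, accidentally swapping two of the values stops the build instead of silently changing the scheduling behavior.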

3. thread0 is already provided in the codebase. It declares the local variables time_stamp and delta_time. Then, in a while-loop, it calls the kernel function k_uptime_get() to capture a time stamp, emulates some work, and uses k_uptime_delta() to get and print the time it took to finish this round of work. Finally, it sleeps for 20 ms and repeats forever.

void thread0(void)
{
    int64_t time_stamp; /* k_uptime_delta() expects an int64_t pointer */
    int64_t delta_time;

    while (1) {
        time_stamp = k_uptime_get();
        emulate_work();
        delta_time = k_uptime_delta(&time_stamp);

        printk("thread0 yielding this round in %lld ms\n", delta_time);
        k_msleep(20);
    }
}

4. thread1 should do the exact same thing. Add the following code for thread1:

void thread1(void)
{
    int64_t time_stamp; /* k_uptime_delta() expects an int64_t pointer */
    int64_t delta_time;

    while (1) {
        time_stamp = k_uptime_get();
        emulate_work();
        delta_time = k_uptime_delta(&time_stamp);

        printk("thread1 yielding this round in %lld ms\n", delta_time);
        k_msleep(20);
    }
}

Note that thread1 will get less CPU time to process emulate_work(), since it has a lower priority than thread0.

5. Before the thread entry functions, define an inline function, emulate_work(), that emulates work by running a busy loop without yielding or sleeping.

static inline void emulate_work(void)
{
    /* volatile keeps the compiler from optimizing the busy loop away */
    for (volatile int count_out = 0; count_out < 150000; count_out++) {
    }
}

On a 64 MHz nRF52840, this function should take about 24 ms to finish.

6. Build the application and flash it to your development kit. Using a serial terminal, you should now see the following output:

*** Booting Zephyr OS build v3.0.99-ncs1  ***
thread0 yielding this round in 26 ms
thread0 yielding this round in 26 ms
thread1 yielding this round in 55 ms
thread0 yielding this round in 26 ms
thread0 yielding this round in 26 ms
thread1 yielding this round in 55 ms
thread0 yielding this round in 26 ms
thread0 yielding this round in 26 ms
thread1 yielding this round in 57 ms
thread0 yielding this round in 26 ms
thread0 yielding this round in 26 ms
thread1 yielding this round in 56 ms
thread0 yielding this round in 26 ms
thread0 yielding this round in 26 ms
thread1 yielding this round in 57 ms
thread0 yielding this round in 26 ms
thread0 yielding this round in 26 ms
thread1 yielding this round in 55 ms
thread0 yielding this round in 26 ms
thread0 yielding this round in 26 ms
thread1 yielding this round in 57 ms

You can see that the higher-priority thread0 completes emulate_work() in about 25-26 ms, while thread1 takes more than twice as long. This is because thread0 preempts thread1 every time it wakes up from its 20 ms sleep, so thread1 cannot finish its round of work in one go.
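
The numbers in the log can be roughly reproduced with a small back-of-the-envelope model. The sketch below is plain C that runs on a host PC, not Zephyr code, and the function low_prio_round_ms() is a hypothetical name introduced here. It assumes that the lower-priority thread starts its round exactly when the higher-priority thread goes to sleep, that it only runs during those sleep windows, and that printing and context switches cost nothing.

```c
#include <assert.h>

/*
 * Simplified model (an assumption, not measured behavior):
 * the low-priority thread needs low_work_ms of CPU time, but can only run
 * in windows of high_sleep_ms, each followed by a burst of high_burst_ms
 * during which the high-priority thread owns the CPU.
 * Returns the wall-clock time for one round of the low-priority thread.
 */
static int low_prio_round_ms(int low_work_ms, int high_burst_ms, int high_sleep_ms)
{
    assert(low_work_ms >= 0 && high_burst_ms >= 0 && high_sleep_ms > 0);

    int remaining = low_work_ms;
    int elapsed = 0;

    /* Each full window is used up, then the thread waits out one burst. */
    while (remaining > high_sleep_ms) {
        remaining -= high_sleep_ms;
        elapsed += high_sleep_ms + high_burst_ms;
    }

    /* The last part of the work fits into the current window. */
    return elapsed + remaining;
}
```

With 26 ms of work per round and a 20 ms sleep, low_prio_round_ms(26, 26, 20) gives 20 + 26 + 6 = 52 ms. The measured 55-57 ms is somewhat higher, which is plausible given the overhead that this model ignores.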

The timeline of the threads should look something like this:

[Figure: Timeline of threads with different priority]

Offloading work from high priority task

Since thread0 is processing non-urgent work, it is not good practice to block other threads just to perform this work. Let’s offload the non-urgent emulate_work() into a lower priority workqueue thread.

7. To offload emulate_work(), we need to wrap it in a work item that can be pushed to a specific workqueue. Create a work_info structure that embeds a struct k_work, and a handler function, offload_function(), that only runs emulate_work().

struct work_info {
    struct k_work work;
    char name[25];
} my_work;

/* Work handler: runs in the context of the workqueue thread */
void offload_function(struct k_work *work_item)
{
    emulate_work();
}
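
Embedding struct k_work inside work_info makes it possible for a handler to get back to the surrounding structure, for example to read the name field. The following host-side sketch illustrates the idea with pointer arithmetic. The struct k_work shown here is only a stand-in so that the code builds outside Zephyr, and work_info_init(), work_info_from_work() and offload_function_name() are hypothetical helpers. In a Zephyr application, the CONTAINER_OF() macro does the same job.

```c
#include <assert.h>
#include <stddef.h>   /* offsetof */
#include <string.h>

/* Stand-in for Zephyr's struct k_work (assumption, not the real definition). */
struct k_work {
    void (*handler)(struct k_work *work);
};

struct work_info {
    struct k_work work;
    char name[25];
};

/* Hypothetical helper: zero the item and copy in a name, always NUL-terminated. */
static void work_info_init(struct work_info *info, const char *name)
{
    memset(info, 0, sizeof(*info));
    strncpy(info->name, name, sizeof(info->name) - 1);
}

/* Hypothetical helper: step back from the embedded member to its container. */
static struct work_info *work_info_from_work(struct k_work *work_item)
{
    assert(work_item != NULL);
    return (struct work_info *)((char *)work_item -
                                offsetof(struct work_info, work));
}

/* What a handler could do to find out which work item it was invoked for. */
static const char *offload_function_name(struct k_work *work_item)
{
    return work_info_from_work(work_item)->name;
}
```

The handler only receives a pointer to the embedded work member, so this kind of lookup is what makes per-item data such as name reachable from inside it.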

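8. In the entry function for thread0, before the while-loop, start the workqueue using k_work_queue_start(). Then initialize the work item using k_work_init(), which connects the work item to its handler, offload_function(). The snippet below assumes that the workqueue object offload_work_q (a struct k_work_q) and its stack my_stack_area (defined with K_THREAD_STACK_DEFINE()) already exist at file scope, and that <string.h> is included for strcpy().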

k_work_queue_start(&offload_work_q, my_stack_area,
                   K_THREAD_STACK_SIZEOF(my_stack_area), WORKQ_PRIORITY,
                   NULL);

strcpy(my_work.name, "Thread0 emulate_work()");
k_work_init(&my_work.work, offload_function);

9. Instead of calling emulate_work() directly in the while-loop, submit the work item to the workqueue using k_work_submit_to_queue():

k_work_submit_to_queue(&offload_work_q, &my_work.work);

thread0 now offloads the processing of emulate_work() to the lower-priority workqueue thread, so it spends much less time in its high-priority context before it goes to sleep (for 20 ms). This, in turn, should leave more processing time for thread1, since thread0 interrupts it only briefly.
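
The reason submitting is so cheap is that it only records a request, while the handler runs later in another context. The following toy model in plain C sketches that split. It is not Zephyr's implementation: all mini_* names, the fixed queue length, and the return values are conventions invented for this example, and handlers must not resubmit work in this simplified version.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Host-side toy model; every mini_* name is hypothetical, not a Zephyr API. */
struct mini_work {
    void (*handler)(struct mini_work *work);
    bool pending;
};

#define MINI_QUEUE_LEN 4

struct mini_queue {
    struct mini_work *items[MINI_QUEUE_LEN];
    size_t count;
};

/* Called from the "high-priority" context: only records the request. */
static int mini_submit(struct mini_queue *q, struct mini_work *w)
{
    assert(q != NULL && w != NULL && w->handler != NULL);

    if (w->pending) {
        return 0;   /* already queued, nothing to do */
    }
    if (q->count == MINI_QUEUE_LEN) {
        return -1;  /* queue full in this toy model */
    }
    w->pending = true;
    q->items[q->count++] = w;
    return 1;       /* newly queued */
}

/* Called from the "low-priority" context: runs handlers in FIFO order. */
static int mini_drain(struct mini_queue *q)
{
    int ran = 0;

    for (size_t i = 0; i < q->count; i++) {
        struct mini_work *w = q->items[i];

        w->pending = false;
        w->handler(w);
        ran++;
    }
    q->count = 0;
    return ran;
}

static int work_done; /* counts completed rounds of emulated work */

static void count_work(struct mini_work *w)
{
    (void)w;
    work_done++;
}
```

In this model, mini_submit() plays the role of thread0 and returns immediately, while mini_drain() plays the role of the workqueue thread that does the actual work when it gets CPU time.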

10. Build the application and flash it to your development kit. Using a serial terminal, you should see the following output:

*** Booting Zephyr OS build v3.0.99-ncs1  ***
thread0 yielding this round in 0 ms
thread0 yielding this round in 0 ms
thread1 yielding this round in 31 ms
thread0 yielding this round in 0 ms
thread0 yielding this round in 0 ms
thread1 yielding this round in 3thread0 yielding this round in 0 ms
3 ms
thread0 yielding this round in 0 ms
thread0 yielding this round in 0 ms
thread1 yielding this round in 30 ms
thread0 yielding this round in 0 ms
thread0 yielding this round in 0 ms
thread1 yielding this round in 3thread0 yielding this round in 0 ms
3 ms
thread0 yielding this round in 0 ms
thread0 yielding this round in 0 ms
thread1 yielding this round in 29 ms
thread0 yielding this round in 0 ms
thread0 yielding this round in 0 ms
thread1 yielding this round in 33 ms
thread0 yielding this round in 0 ms
thread0 yielding this round in 0 ms

The timeline of the threads looks something like this:

[Figure: Timeline of threads with different priority when offloading work]

As you can see, thread0 now completes its round in less than a millisecond before it sleeps, giving lower-priority threads more time to run. This is acceptable for thread0, since it can live with postponed execution of emulate_work(). Also notice that thread1 now takes much less time to finish its round of work than in the scenario where thread0 was not using the workqueue to offload work. The interleaved lines in the log (for example "thread1 yielding this round in 3thread0 yielding ...") most likely appear because thread0 preempts thread1 in the middle of a printk() call.

This is an example of good architecture: only urgent work is processed at higher priorities, and non-urgent work is offloaded to an appropriate lower priority. As an application designer working with an RTOS, you should be aware of the kernel services available to the application and make the best use of them to avoid unnecessary latencies.

You can download the solution for Exercise 3 below.