From the perspective of more traditional software programming languages, multi-tasking in SystemVerilog is weird. On one hand, the language includes special syntax (fork/join) that makes it really easy to write code that “runs” in parallel. On the other hand, SystemVerilog “processes” aren’t really processes, or even threads, in the traditional sense. Even if you “get” the SystemVerilog model of cooperative multi-tasking, there are a number of pitfalls you can run into. One such gotcha that I recently ran into was caused by what appeared to be an innocuous wait fork statement.
An Unexpected Infinite Wait
The issue was in an old (I mean really old, pre-UVM) test case that hadn’t been run in a long time, and appeared to hang in a task that issued a bus read transaction. From inspecting the simulation waves and logs, I could tell that the bus transaction was completing without issue, but the task was still never returning. The basic structure of the test code was simple: a process is forked off, then the do_read() task is called. This obviously calls the contents of do_read() into question. After digging through a few translation layers (did I mention this code was old?), I found that the code that actually did the bus transaction, which I’ll call do_read_internal(), looked something like this:
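A reconstruction of that structure (the helper task drive_read_cycle and the argument names are my illustrative stand-ins, not the original code):

```systemverilog
task do_read_internal(input bit [31:0] addr, output bit [31:0] data);
  fork
    // Run the actual bus cycle in the background.
    drive_read_cycle(addr, data);
  join_none
  // ... housekeeping between starting the read and needing the result ...
  // Wait for the read to finish -- or so the author thought.
  wait fork;
endtask
```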
Now, if you know what you’re looking for, you may already see the issue. But before I give it all away, suffice it to say the problem here is the use of the wait fork statement. So let’s talk about what that last wait fork statement is actually doing.
A Not So Brief Introduction to wait fork
Let’s start in the obvious place: what does the SystemVerilog LRM (Language Reference Manual) say about wait fork? This is the summary of the statement that we find in section 9.6.1 Wait fork statement:
The wait fork statement blocks process execution flow until all immediate child subprocesses (processes created by the current process, excluding their descendants) have completed their execution.
This seems fairly simple: it is a statement that makes a process wait until all sub-processes that were started from the current process have finished. The problem with this construct is that it’s not always obvious which processes “all immediate child subprocesses” refers to. Let’s start with a simple example:
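A minimal version of that first example (the delays and messages are my own choices, but the block names match the discussion):

```systemverilog
module top;
  initial begin : main_process
    fork
      begin : subprocess_1
        #10 $display("Subprocess 1 finished");
      end
      begin : subprocess_2
        #20 $display("Subprocess 2 finished");
      end
    join_none
    // Blocks until both subprocesses spawned above have completed.
    wait fork;
    $display("Done");
  end
endmodule
```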
In this case the fork/join_none block is going to spawn two subprocesses which wait for a period of time and then print a message. Because a join_none was used, execution of the main_process block will not wait for these subprocesses to finish before continuing. Then the wait fork statement will wait for both the subprocess_1 and subprocess_2 blocks to finish, and then print “Done”. Easy enough. Now, what if we make it a bit more complicated and move the wait fork into another task?
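Something like the following (again a reconstruction; the delays are arbitrary, chosen so subprocesses 1 and 2 outlive 3 and 4):

```systemverilog
module top;
  task task_1();
    fork
      begin : subprocess_3
        #10 $display("Subprocess 3 finished");
      end
      begin : subprocess_4
        #20 $display("Subprocess 4 finished");
      end
    join_none
    // Intended to wait for subprocesses 3 and 4 only...
    wait fork;
  endtask

  initial begin : main_process
    fork
      begin : subprocess_1
        #30 $display("Subprocess 1 finished");
      end
      begin : subprocess_2
        #40 $display("Subprocess 2 finished");
      end
    join_none
    task_1();
    $display("Done"); // not printed until all four have finished
  end
endmodule
```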
In this case we start subprocesses subprocess_1 and subprocess_2, and don’t wait for them to finish, just like the previous example. However, we now call a task task_1(), which also spawns two new processes, subprocess_3 and subprocess_4, then executes a wait fork statement before returning and printing “Done”. If wait fork followed normal scoping rules, one might naively expect it to wait for subprocesses 3 and 4 only, but if we look at the output from this example, we see this is not the case.
The reason that the wait fork waits for all four of the spawned subprocesses to finish is simply that they are all subprocesses of the process from which wait fork was called: the one started from the main_process initial block. This is because calling a task or entering another SystemVerilog scope doesn’t start a new process; in fact, the only way that a new process is spawned (from an existing one) is through a fork statement. Because the code running in task_1 is still part of the same process, “all immediate subprocesses” includes all four of them.
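One more example makes the danger concrete. This is a reconstruction with block names matching the discussion below; the forever loops stand in for any never-ending process:

```systemverilog
module top;
  task task_1();
    fork
      begin : infinite_subprocess
        forever #10; // never ends
      end
      begin : subprocess_1
        fork
          begin : infinite_sub_subprocess
            forever #10; // never ends either
          end
        join_none
        #20 $display("Subprocess 1 finished");
      end
    join_none
  endtask

  task task_2();
    fork
      begin : subprocess_2
        #10 $display("Subprocess 2 finished");
      end
      begin : subprocess_3
        #30 $display("Subprocess 3 finished");
      end
    join_none
    // Hangs forever: infinite_subprocess is an immediate child of the
    // same process, even though it was forked from a different task.
    wait fork;
    $display("Done"); // never printed
  endtask

  initial begin : main_process
    task_1();
    task_2();
  end
endmodule
```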
This final example demonstrates, in a more obvious way, the bug in my problem code above. In this case, the task task_1 is first called from main_process; it starts two subprocesses, infinite_subprocess and subprocess_1, then immediately returns because its fork block is terminated with join_none. Then task_2 is executed, which forks two more processes (subprocess_2 and subprocess_3) and finally executes a wait fork statement. In this case the wait fork statement is never actually going to finish, because it has to wait for the conspicuously named infinite_subprocess (which never ends), even though it was started from a different task.
Finally, note that the wait fork in task_2 does not have to wait for the process infinite_sub_subprocess, because it is not an immediate child of the main process. The infinite_sub_subprocess process is forked from the subprocess_1 process, but because it uses join_none, the subprocess_1 process does not wait for infinite_sub_subprocess, and leaves the sub-subprocess running, unattached to any parent process.
What Went Wrong?
Going back to my hanging test case: now that we’ve looked at some examples of how wait fork works, the problem is pretty obvious if you know where to look. The main test code starts an infinite process, and then calls a task which starts its own process and then uses wait fork to wait for that process, but unintentionally ends up waiting on an infinite loop to end. The do_read_internal task has no way to know that an infinite process has even been spawned from its process, but wait fork will wait for it anyway.
The thing I find devious about this bug is that it demonstrates how starting a subprocess in one place can cause a completely unrelated piece of code to hang, potentially forever. In fact, it is entirely possible for the code that starts the infinite-loop subprocess to be in one library and the wait fork statement in another. All that is required to cause havoc is that they are called from the same process. When I realized this, it made me reconsider the safety of using wait fork, and at this point it seems like it’s only really safe to use in very specific applications.
Unfortunately, there is no version of wait fork with a more limited scope (such as waiting only for subprocesses of the current task or scope). This leaves us with three, not great, alternatives to implement the functionality that our do_read_internal was going for: starting a non-blocking process, and then conditionally waiting for it to finish.
The first, and in my opinion best, option in this case is to simply declare an event that gets triggered by the started subprocess on completion. Then, to wait for the subprocess to end, the main process can just wait on the event instead of using wait fork. Aside from needing to declare a new event variable, this works fairly well if there is only one subprocess to wait on. This implementation would look something like the following:
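A sketch of that approach (drive_read_cycle is again an illustrative stand-in for the real bus work):

```systemverilog
task do_read_internal();
  event read_done;
  fork
    begin
      drive_read_cycle(); // the actual bus transaction
      -> read_done;       // signal completion
    end
  join_none
  // ... other work ...
  // Wait for just our subprocess, not everything forked by this process.
  // (Caution: if anything between the fork and here can block past the
  // moment the event fires, a level-sensitive flag checked with wait()
  // is safer than a one-shot event trigger.)
  @read_done;
endtask
```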
Another option is to use the SystemVerilog “fine-grain process control” features (see section 9.7 of the LRM). Essentially, this feature provides an API to get a reference to a “process” object associated with the current process, which can then be used to wait on or otherwise manage the process. In this case, we can have the subprocess store its “process” object in a variable, and then use the await() task of the process object to wait for the subprocess to end. Something like the following:
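A sketch using process::self() and await() (drive_read_cycle is an illustrative stand-in):

```systemverilog
task do_read_internal();
  process read_proc; // handle captured by the subprocess
  fork
    begin
      read_proc = process::self(); // grab a handle to this subprocess
      drive_read_cycle();          // the actual bus transaction
    end
  join_none
  // The subprocess may not have run yet, so wait for the handle first.
  wait (read_proc != null);
  // ... other work ...
  read_proc.await(); // wait for just this subprocess to finish
endtask
```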
The two options above work fine when you are starting a single subprocess and then want to conditionally wait for it to finish, but what if you are starting an indeterminate number of processes and want to wait for them all to finish? In this case, the behavior of wait fork waiting for all subprocesses is a really convenient feature. However, as we have seen, when calling wait fork you need to ensure you know which subprocesses it will actually wait for. The only way to really do this for sure is to fork a new process within which we carefully control which processes are started. This does have the overhead of creating an extra process, but it could be a reasonable trade-off in some cases. An example of this implementation is shown below.
In the end, I chose the first option and replaced the wait fork with a wait on a done event, and everything works fine. However, I am also now taking a much more critical look at all the other uses of wait fork in our repository, as I now have a better appreciation for some of the pitfalls of this statement.
There are certainly places where it is mostly safe. Use of wait fork directly within a fork/join block is generally safe, because it’s harder to get an unexpected subprocess there. Likewise, use within a UVM sequence body() task also appears to be fairly safe, because sequence bodies are always started in a new process. However, even in these fairly well-controlled uses, it is important to remember that any task or function that is called could start an unexpected infinite subprocess, and then we’re right back to the same problem.
I’m not saying don’t use wait fork, but I do think it needs to be used with caution. It is easy to forget, or not be aware of, everything it will wait for.