Parallel processing in Go

About the Author: Lammert Bies is a dad, husband and polyglot. He has been developing embedded systems since the eighties. Used machine learning before it had a name. Specializes in interconnecting computers, robots and humans. Was a Google Mapmaker Advocate and speaker at several international Google conferences from 2011 until the plug was pulled on Mapmaker in 2017. Bughunter with Google. Currently spreading artificial intelligence to the wildest locations in production environments. He never stops learning.

Goroutines are a unique feature in the Go programming language. I warned you, this will not be a beginner’s tutorial, but one of the aspects of modern programming education which does not get enough attention IMHO is the parallel execution of multiple tasks within one program. The clock speed of consumer computers has been limited to a handful of GHz for some years now, and physical limitations do not give hope that this will change anytime soon. To provide more processing power to end users, CPU manufacturers started to put multiple CPU cores in one package, providing more available compute cycles per second.

This is a nice approach if you are running many programs which are not CPU intensive. The CPU scheduler can distribute the individual tasks over the available CPU cores and everything runs very smoothly. But it becomes more difficult if there is only one task which must be distributed over multiple cores. This can only be done if the program and the algorithms in it have been designed for it. And that is often not the case.

The limits of sequential programming

Educational systems love a simple and deterministic approach. Exam questions typically have only one correct outcome, for example. Programs running multiple independent parallel tasks internally are not deterministic. Without special precautions, you don’t know which task completes first and whether dependencies between the tasks are properly met. Even important large-scale software like the database system MySQL runs most tasks internally in only one thread of execution. Multiple requests can be distributed over multiple cores, but if you happen to be the single user on a MySQL server running one large query, that query is not evenly distributed over the cores.

To be honest, it is not only the educational system which is to blame for this situation. Programming languages, with the notable exception of Ada, have traditionally not been designed for multi-tasking. Support has been sloppy at best, with hardware and operating system dependent libraries for atomic operations, locking and thread management. It takes many lines of code to spawn and control tasks in programming languages like C.

Go is different. Go was designed for multi-tasking. Routines can be started with the statement go. That is right, the statement to start an independent routine in Go is named after the language itself. That shows the importance of parallel processing in the mind of the designers of the language.

The Concept of Goroutines

Routines which run parallel tasks inside a Go program are called goroutines. Goroutines are just functions where the calling routine continues with the next statements without waiting for the goroutine to complete. This is a nice concept, but once the goroutine is started, it gets out of sight. We won’t know if the routine has completed, needs some input, sends some output or tries to access data which is also accessed by other routines. While starting a goroutine is simple, using it in the right way may not be that simple after all.

Our first Goroutine Example

Simple examples often say more than a thousand words. In this example, I will create a goroutine, start it twice, and then wait until both routines have completed.

package main

// Copyright (C) 2021 - Lammert Bies
// Distributed under terms of the MIT license.

import (
        "fmt"
        "sync"
        "time"
        "math/rand"
)

func athlete( name string, wg *sync.WaitGroup ) {

        defer wg.Done()

        fmt.Printf( "%s started\n",  name )

        period   := rand.Intn( 10000 )
        duration := time.Duration( period )
        time.Sleep( duration * time.Millisecond )

        fmt.Printf( "%s finished after %d milliseconds\n", name, period )

}  /* athlete */

func main() {

        var wg sync.WaitGroup

        rand.Seed( time.Now().UTC().UnixNano() )
        fmt.Printf( "On your mark, Get set, GO!\n" )

        wg.Add( 2 )
        go athlete( "John", &wg )
        go athlete( "Jane", &wg )

        wg.Wait()
        fmt.Printf( "All athletes have finished\n" )

}  /* main */

This Go program simulates two athletes competing in a running race. After the GO! signal of the referee (no pun intended) both athletes start, and each needs a random amount of time to reach the finish. When both athletes have reached the finish, the program exits.

Even though this is just a small example program, it contains a lot of important concepts and I will step through them one by one.

Waiting a random time

A lot of code is spent on giving each athlete a random time between start and finish. Random functionality is provided in the package math/rand, but we have to do some initialization and data conversion before random data can be used to feed a timer. Let’s first dig through the random generator code. We’ll tackle the parallel programming afterwards.

Like most software random generators shipped with a compiler, the generator in Go provided by the math/rand package is a pseudo-random generator where the result of the algorithm is not obvious, but still deterministic. Without seeding, running our program multiple times would result in the same outcome. Fortunately the package provides a rand.Seed() function which allows us to change the starting point of the generator. We’ll feed it with the current time in nanoseconds. There is not much chance this will result in the same outcome of the program when it is run multiple times. Note that this describes Go versions before 1.20. Since Go 1.20 the global generator is seeded randomly at program start and rand.Seed() is deprecated, although it still works.
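The deterministic behaviour can be made visible with a small sketch. It does not use the global generator of the example program, but two private generators created with rand.New() and rand.NewSource(). The helper name draw and the seed value 42 are assumptions chosen only for this illustration.

```go
package main

import (
	"fmt"
	"math/rand"
)

// draw returns n pseudo-random numbers below 10000 from a private
// generator which starts at the given seed. The name is hypothetical.
func draw(seed int64, n int) []int {
	gen := rand.New(rand.NewSource(seed))
	out := make([]int, n)
	for i := range out {
		out[i] = gen.Intn(10000)
	}
	return out
}

func main() {
	// Same seed, same sequence: both lines show identical numbers.
	fmt.Println(draw(42, 3))
	fmt.Println(draw(42, 3))
}
```

Feeding the generator with a value that differs on every run, like the current time, is what breaks this repetition.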

Inside the athlete() function, we will wait a random time between 0 and 10 seconds between start and finish. Now it is becoming interesting, because many languages allow us to feed the integer outcome of a random generator directly to a timer function. Not so in Go, which has very strict type checking. Duration values in Go are internally stored as an int64 representing an elapsed time in nanoseconds. But we cannot just assign an int variable to it. We have to convert it with time.Duration(), and then multiply it by the constant time.Millisecond to feed the time.Sleep() function. I admit, there are much shorter ways in C to do it, but when reading this code after 10 years, you will still be able to figure out what is going on. With a piece of C code, you will probably be lost re-reading your code again after some time has passed.

Note that timing is important in the type of programming I usually do, and the time functions are covered in other parts of this tutorial. So don’t panic if you still have doubts about how time functions in Go work.
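The conversion step can be isolated in a few lines. This is only a sketch, and the helper name millis is an assumption which does not appear in the example program.

```go
package main

import (
	"fmt"
	"time"
)

// millis converts a plain integer number of milliseconds into a
// time.Duration. The name millis is hypothetical.
func millis(ms int) time.Duration {
	// time.Duration(ms) only changes the type, the value still means
	// ms nanoseconds. Multiplying by time.Millisecond scales it.
	return time.Duration(ms) * time.Millisecond
}

func main() {
	d := millis(1500)
	fmt.Println(d)                // prints 1.5s
	fmt.Println(d.Milliseconds()) // prints 1500
}
```

Leaving out the multiplication would still compile, but the program would then sleep for nanoseconds instead of milliseconds.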

Calling parallel functions, waiting for completion

The main goal of the example program is explaining the parallel execution of functions. Two athletes are competing in our running race, John and Jane. They both start running at roughly the same time after the GO! command of the starter. The moment they will finish is however not determined yet when their function is called. But we would like to end the program only after both runners have passed the finish line. A simple sleep timer with a value longer than the maximum time produced by the random generator could be used, but it has two problems. First of all, if the running times of the athletes are much shorter than the maximum time, we will waste time waiting for something which has already happened. If on the other hand the time needed for an athlete function is close to the maximum and some other processing in the Go program happens which delays the execution of the last parts of the athlete function, we might stop the program before the finish message of the athlete functions is displayed. Remember, Go has a lot of processing going on over which we have only limited control, like the garbage collector. That may not be a problem in a small example like this, but in a more complex environment random delays caused by these processes may cause severe problems.

Waiting for goroutines with the sync package

We will therefore add the functionality to wait for the goroutines to complete by using the sync package. This package provides the WaitGroup type, a counter which keeps track of how many goroutines are still active. Once the counter is down to zero, it is safe to exit the main program.

Initializing the number of running goroutines

The first thing we have to do is define a wait group. This is a group of goroutines we want to wait for. We first tell the wait group that we will launch two goroutines by calling wg.Add(2). It is not necessary that we declare the total number of goroutines to wait for before any goroutine is started. It is also fine to call wg.Add(1) every time just before a goroutine is started. This can be a better approach, for example when starting goroutines in a loop where you don’t know the total number of running goroutines beforehand.

The only thing which is important though is that you call the wg.Add() method before the corresponding goroutine is actually launched. The reason is that we must be absolutely sure that wg.Add() is called before wg.Done() is used inside the goroutine. Otherwise wg.Wait() could see a counter of zero and return too early.
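The loop variant with wg.Add(1) could look like the sketch below. The function name runAll and the results slice are assumptions made for this illustration. Every goroutine writes only to its own slice element, so no extra locking is needed.

```go
package main

import (
	"fmt"
	"sync"
)

// runAll starts one goroutine per name and waits for all of them.
// The name runAll and the results slice are hypothetical.
func runAll(names []string) []string {
	var wg sync.WaitGroup
	results := make([]string, len(names))

	for i, name := range names {
		wg.Add(1) // register the goroutine before it is launched
		go func(i int, name string) {
			defer wg.Done()
			results[i] = name + " finished" // own element only
		}(i, name)
	}

	wg.Wait() // returns after every registered goroutine called Done
	return results
}

func main() {
	for _, line := range runAll([]string{"John", "Jane", "Jim"}) {
		fmt.Println(line)
	}
}
```

The number of athletes is now determined by the length of the slice, and the wait group counter follows automatically.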

Telling that a goroutine has finished

The wg.Done() method must be called inside the goroutine when the function is finished. It is tempting to put it at the end of all the processing statements just before the closing curly bracket of the athlete() function. But this can lead to problems which are difficult to detect if the function gets more complex and has multiple exit pathways. We could end up in some rare condition where the function ends, but never calls wg.Done(). And trust me, I have been in this situation before with C and inter-process locking mechanisms. It is a horror situation to debug. Normal debugging software is fine for sequential process flows, but once you have a few handfuls of parallel processes running in real-time, you are in uncharted territory. Only luck, or sometimes completely rewriting a routine from scratch can fix these errors.

Luckily the designers of Go have found a very elegant way to cope with this problem with the defer statement. Basically, defer is comparable to the on_exit() or atexit() functions in C, but with the important difference that it does not only work at the end of the program, but at the end of every function in which it is used. It defines which actions have to be taken when a routine ends, without the need to be explicit about the reason why the routine ended. In our case, whether the athlete() function ends through the normal flow of the code, through an early return or through a panic, the wg.Done() method is called before the goroutine completely ceases to exist.
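A function with more than one exit path shows why defer is the safer choice. The sketch below is a variation on the athlete example. The names runner and race, the injured flag and the result pointer are assumptions added for this illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// runner has two exit paths. The deferred wg.Done() covers both.
// The injured flag and the result pointer are hypothetical additions.
func runner(name string, injured bool, result *string, wg *sync.WaitGroup) {
	defer wg.Done()

	if injured {
		*result = name + " gave up"
		return // early exit, Done is still called
	}
	*result = name + " finished"
}

// race starts two runners and waits until both have returned.
func race() (string, string) {
	var wg sync.WaitGroup
	var john, jane string

	wg.Add(2)
	go runner("John", false, &john, &wg)
	go runner("Jane", true, &jane, &wg)
	wg.Wait() // would never return if one exit path skipped Done

	return john, jane
}

func main() {
	john, jane := race()
	fmt.Println(john) // prints John finished
	fmt.Println(jane) // prints Jane gave up
}
```

With the call placed just before the closing bracket instead, the early return would have to repeat it, and forgetting that once is enough to hang the program.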

Waiting for full completion of all goroutines

In our main routine we use the method wg.Wait() to wait for the two started goroutines to be completed. After that we put a message on the screen, and end the execution.
