Variables in Go
- Explicit Declaration of Variables
- Implicit Declaration of Variables
- Declaration and Initialization of Multiple Variables
- Shadowing Global Variables in Go
Variables in C and Go have much in common. They have to be declared before they can be used, and their scope can be global, limited to a function, or limited to a block. There are, however, some mainly syntactic differences which a C programmer should be aware of.
Explicit Declaration of Variables
In C, the standard way to declare a variable is to write the type, followed by the name of the variable. Go uses a somewhat different approach, loosely borrowed from the Pascal programming language. You first write the reserved word var, followed by the variable name and then the type. I personally like this approach. The var keyword is highlighted in most syntax-aware code editors and you understand right away what is happening.
As in C, the type can be a simple predefined type like int, a type defined in another package like time.Duration, or a user-defined structure. Variables can be initialized when they are declared, just as in C, but the semantics differ significantly from C.
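To make the order of keyword, name and type concrete, here is a minimal sketch. The point structure and the describe() function are hypothetical names that I made up for this illustration only.

```go
package main

import (
	"fmt"
	"time"
)

// point is a hypothetical user-defined structure, used only for illustration.
type point struct {
	x, y int
}

// describe declares three variables with the var keyword and returns
// their contents as one formatted string.
func describe() string {
	// In C this would be: int count = 3;
	var count int = 3
	// A type defined in another package.
	var timeout time.Duration = 2 * time.Second
	// A user-defined structure type.
	var origin point = point{1, 2}
	return fmt.Sprintf("%d %v %v", count, timeout, origin)
}

func main() {
	fmt.Println(describe()) // prints "3 2s {1 2}"
}
```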
In C, local variables which are not explicitly initialized during declaration have indeterminate contents until a value is assigned to them somewhere in the program flow. Modern compilers generally complain when variables with unassigned contents are used in a program. In the past, when compilers were sloppier in testing this, uninitialized variables were a huge attack surface for hackers. On most architectures local variables are allocated on the stack, and when they are not initialized they still contain whatever was last stored at that stack position. If that happened to be a password or some other security related information, you can imagine what a smart hacker can do.
Go has fixed this security hole by design. Every variable which is not explicitly initialized is set to the zero value of its type: 0 for numeric types, false for booleans, the empty string for strings and nil for pointers. This does not always have to mean that the content is really filled with zero bits. How zero is represented at the lowest hardware level may depend on the type of the variable. But for us programmers, it is good enough to know that all variables are initialized to a well-defined value by default.
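The default initialization can be checked with a few lines of code. This is only a sketch, and zeroValues() is a hypothetical helper name chosen for this example.

```go
package main

import "fmt"

// zeroValues declares variables without an initializer and reports
// what they contain.
func zeroValues() string {
	var i int     // numeric types start at 0
	var f float64 // floating point types start at 0 as well
	var s string  // strings start empty
	var ok bool   // booleans start as false
	var p *int    // pointers start as nil
	return fmt.Sprintf("%d %v %q %t %v", i, f, s, ok, p == nil)
}

func main() {
	fmt.Println(zeroValues()) // prints: 0 0 "" false true
}
```

None of these variables is ever assigned, yet reading them is well defined, where the C equivalent for local variables would not be.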
Implicit Declaration of Variables
When you had your first glance at Go source code, you may have noticed another Pascal similarity, the := operator. Pascal uses this operator for every assignment. Go only uses it in some places, which may be confusing at first. Sometimes you see the := operator, and in other places the = known from C.
The reason is that Go uses the C style assignment operator when a variable has already been declared, and the Pascal style operator for an implicit declaration. When using the := operator, the variable is declared on the fly, without the need for a line with the var keyword somewhere earlier in the code. It obviously creates shorter code, especially when a variable is only used inside a small block like a for loop. Personally I don’t recommend using it in large functions where a variable is used at multiple locations in the code. When a function is so large that it makes sense to declare the same variable name in two separate locations, the problem is not your variables, the problem is your function getting too big. In that case it is better to split the function into two separate functions.
Implicitly declared variables inherit the type of the expression used to initialize them. If the expression is an untyped constant, Go falls back to a default type: int for an integer constant and float64 for a floating point constant. This may not in all situations be the type you really want. If, for example, you want an 8-bit integer variable, you have to declare the variable explicitly beforehand, or use a type conversion (a cast in C terms) in the initializing expression.
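A small sketch of both operators side by side may help here. The sum() function is a hypothetical example and not part of the program discussed later on this page.

```go
package main

import "fmt"

// sum adds the numbers 1 up to and including limit.
func sum(limit int) int {
	// := declares total on the fly and gives it its first value.
	total := 0
	// i is declared by := as well and only exists inside the loop.
	for i := 1; i <= limit; i++ {
		// total already exists, so the plain = operator is used.
		total = total + i
	}
	return total
}

func main() {
	fmt.Println(sum(4)) // prints 10
}
```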
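The %T verb of the fmt package prints the type of a value, which makes it easy to see what Go picked. The following is a sketch with a hypothetical inferredTypes() helper.

```go
package main

import "fmt"

// inferredTypes returns the types Go selected for a number of
// implicitly and explicitly declared variables.
func inferredTypes() string {
	n := 10            // integer constant, becomes int
	x := 2.5           // floating point constant, becomes float64
	small := int8(10)  // conversion in the expression, becomes int8
	var tiny int8 = 10 // explicit declaration, int8 as requested
	return fmt.Sprintf("%T %T %T %T", n, x, small, tiny)
}

func main() {
	fmt.Println(inferredTypes()) // prints "int float64 int8 int8"
}
```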
Declaration and Initialization of Multiple Variables
It is possible to explicitly declare multiple variables on one line, and also to assign multiple values to them in that declaration. You can see it in the example code below, but personally I would recommend against using it. You only save a newline character and some spaces, at the cost of decreased readability of the code. Imagine that you have to debug a piece of code at three o’clock at night in an application which must be running again at six in the morning, when production at your client’s facility starts. The last thing you want is this type of condensed code while you’re trying to keep your eyes open with shots of strong coffee. It’s actually the same reason why Linus Torvalds recommends a TAB spacing of 8 characters. Night work as a programmer comes with its own set of rules.
package main

// Copyright (C) 2021 - Lammert Bies
// Distributed under terms of the MIT license.

import (
	"fmt"
	"unsafe"
)

var a int = 10
var b int8
var c, d int = 4, 5

func subf1() {
	a += 2
	fmt.Printf("subf1(): a = %d, b = %d, c = %3d sizeof(%d), d = %3d sizeof(%d)\n", a, b, c, unsafe.Sizeof(c), d, unsafe.Sizeof(d))
} /* subf1 */

func subf2() {
	a := a + 3
	c := 100
	d := int8(100)
	fmt.Printf("subf2(): a = %d, b = %d, c = %3d sizeof(%d), d = %3d sizeof(%d)\n", a, b, c, unsafe.Sizeof(c), d, unsafe.Sizeof(d))
} /* subf2 */

func main() {
	subf1()
	subf2()
	fmt.Printf("main() : a = %d, b = %d, c = %3d sizeof(%d), d = %3d sizeof(%d)\n", a, b, c, unsafe.Sizeof(c), d, unsafe.Sizeof(d))
} /* main */
This seems like just another small example Go program with variable declarations, but it shows a few very scary programming practices. In the global part of the source, we define four variables. The first is declared as an initialized integer, the second is a small integer without an explicit initial value, and in the third declaration two variables are declared with distinct initialization values. The main() function calls two sub-functions, then prints the contents on the screen and exits.
In the subf2() function two types of short variable declarations are used. In the simple version Go determines the type of the new variable from the type of the expression. In the creation of the local variable d we use a conversion to int8, which makes the new variable an 8-bit integer occupying one byte of memory.
The first scary thing happens in the fmt.Printf() statements. In order to show the size of the integer types, I had to use the unsafe.Sizeof() function. And usage of this package is…well…unsafe. According to its own documentation, packages that import unsafe may be non-portable and are not protected by the Go 1 compatibility guidelines. My advice is that if you need the unsafe package, go back to the drawing board and rethink your program structure.
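If the only goal is to show which type a variable got, the unsafe package may not be needed at all. The sketch below uses the %T verb and the Size() method from the standard reflect package instead. The typeAndSize() helper is a hypothetical name, and I have not used this approach in the example program above.

```go
package main

import (
	"fmt"
	"reflect"
)

// typeAndSize reports the type of v and its storage size in bytes
// without importing the unsafe package.
func typeAndSize(v interface{}) string {
	return fmt.Sprintf("%T %d", v, reflect.TypeOf(v).Size())
}

func main() {
	c := 100
	d := int8(100)
	fmt.Println(typeAndSize(c)) // the size of int depends on the platform
	fmt.Println(typeAndSize(d)) // prints "int8 1"
}
```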
Shadowing Global Variables in Go
But the second scary thing is much more concerning, because it shows an inherent design flaw in the Go language. To clarify the problem, I’ll first show the output of this program on my development system:
$ go run variables.go
subf1(): a = 12, b = 0, c =   4 sizeof(8), d =   5 sizeof(8)
subf2(): a = 15, b = 0, c = 100 sizeof(8), d = 100 sizeof(1)
main() : a = 12, b = 0, c =   4 sizeof(8), d =   5 sizeof(8)
The variable a is initialized to 10. The call to subf1() adds 2 to it, resulting in 12. Then the call to subf2() seems to add 3 to it, resulting in 15. But back in main() the value falls back to 12. The reason is that the short declaration operator := creates a new variable with its scope limited to the function subf2(), initialized from the global variable a which happens to have the same name. The local variable a shadows the global variable a without any compiler or runtime warning.
In my opinion as a developer with 30+ years of experience, this is a bug in the Go design. When using gcc to compile my C programs, I consistently use the -Wshadow compiler flag for this specific reason, to catch programming errors where I accidentally use local variables which shadow existing global variables. The Go compiler doesn’t do this. Worse even, a simple typo is enough to initialize the local variable with the global variable: using the := operator instead of the = operator, which differs in only one character.
When starting with Go, I thought it would automagically protect me from this type of programming error, which is relatively common in C. Turns out, Go may save you from some problems by design, but not from all.
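The one-character difference can be isolated in a short sketch. The variable counter and the two functions are hypothetical names, chosen to keep this example separate from the program above.

```go
package main

import "fmt"

var counter = 10

// addShadowed uses := and therefore only changes a new local variable
// which shadows the global counter.
func addShadowed() int {
	counter := counter + 3
	return counter
}

// addGlobal uses = and therefore updates the global counter itself.
func addGlobal() int {
	counter = counter + 3
	return counter
}

func main() {
	local := addShadowed()
	fmt.Println(local, counter) // prints "13 10", the global is untouched
	updated := addGlobal()
	fmt.Println(updated, counter) // prints "13 13", the global has changed
}
```

Both functions compile without a warning and return the same value. Only the state of the global variable afterwards reveals which operator was used.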
In theory, there is no difference between theory and practice.
But, in practice, there is.
JAN L.A. VAN DE SNEPSCHEUT