1. The Emergence of Go Language
Before diving into the basic syntax of Go, let's understand when and why Go was created, along with its key features.
Go was initially designed and invented by Robert Griesemer, Ken Thompson, and Rob Pike of Google in 2007. Their goal was to create a language that would be suitable for the networked and multi-core era, inspired by C. As a result, Go is often described as a "C-like language" or "C for the 21st century." Indeed, Go inherits many programming concepts from C, such as similar expression syntax, control flow structures, basic data types, pass-by-value for function arguments, and pointers. However, Go is also the most thorough departure from C: it discards C's flexible but dangerous pointer arithmetic, redesigns some of C's operator precedences, and refines many aspects of C in subtle but important ways.
2. Go Version of Hello World
In this section, we'll introduce the basic components of a Go program by using the classic "Hello World" example.
    package main

    import "fmt"

    func main() {
        // Output Hello World to the terminal
        fmt.Println("Hello world!")
    }
Similar to C, a Go program consists of:
Package declaration: Every source file must declare which package it belongs to on the first non-comment line, e.g., `package main`.
Importing packages: This tells the Go compiler which packages the program needs, e.g., `import "fmt"` imports the `fmt` package.
Functions: Similar to C, Go has functions that perform specific tasks. Every executable Go program must have a `main` function.
Variables: Go variable names consist of letters, digits, and underscores, where the first character cannot be a digit.
Statements/Expressions: In Go, a statement normally ends at the end of its line. Unlike other C-family languages, statements do not need a terminating semicolon (`;`) because Go inserts semicolons automatically.
Comments: Go supports both single-line comments (`//`) and multi-line comments (`/* */`). Multi-line comments are typically used for documentation or for commenting out code blocks.
Note: Identifiers are used to name program entities like variables and types. An identifier is a sequence of letters, numbers, and underscores, but the first character must be a letter or an underscore (not a number).
If an identifier (including constants, variables, types, function names, and struct fields) starts with an uppercase letter, it is exported (i.e., accessible from other packages).
If it starts with a lowercase letter, it is unexported (i.e., visible only within its own package, similar to package-private visibility in object-oriented languages).
3. Data Types
In Go, data types are used to declare functions and variables.
Data types help divide data into different sizes to efficiently manage memory. Specific classifications include:
Boolean Type: The boolean type has only two possible values: `true` or `false`.
Numeric Types: Integer types (such as `int`) and floating-point types (`float32` and `float64`). Go supports both, as well as complex numbers. Signed integers use two's complement representation, which also defines the behavior of bitwise operations.
String Type: A string is an immutable sequence of bytes. The bytes conventionally hold UTF-8 encoded Unicode text, so a single character may occupy more than one byte.
Derived Types:
(a) Pointer types
(b) Array types
(c) Struct types
(d) Channel types
(e) Function types
(f) Slice types
(g) Interface types
(h) Map types
3.0 Defining Variables
The general format for declaring a variable uses the `var` keyword: `var identifier typename`. For example, the following code defines a variable of type `int`.
    package main

    import "fmt"

    func main() {
        var a int = 27
        fmt.Println(a)
    }
3.0.1 If a Variable is Not Initialized
In Go, if a variable is declared with a type but not initialized, it will have a zero value. The zero value is the default value set by the system when the variable is not initialized.
Type | Zero Value |
---|---|
Numeric | 0 |
Boolean | false |
String | "" (empty string) |
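The table can be checked with a few uninitialized declarations. This is a minimal sketch; the helper name `zeroValues` is made up for illustration, and it also includes a pointer, whose zero value is `nil`.

```go
package main

import "fmt"

// zeroValues declares variables without initializing them and returns them
// unchanged. The function name is illustrative only.
func zeroValues() (int, float64, bool, string, *int) {
	var i int
	var f float64
	var b bool
	var s string
	var p *int
	return i, f, b, s, p
}

func main() {
	i, f, b, s, p := zeroValues()
	fmt.Printf("%d %g %t %q %v\n", i, f, b, s, p) // 0 0 false "" <nil>
}
```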
3.0.2 If a Variable Has No Type Specified
If no type is specified for a variable, Go will infer the type from the variable's initial value. For example:
    package main

    import "fmt"

    func main() {
        var d = true
        fmt.Println(d)
    }
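3.0.3 The := Operator
The `:=` operator is a short variable declaration: it declares and initializes a variable in one step, and it can only be used inside functions. If the variable has already been declared in the same scope (for example with `var`), using `:=` on it again causes a compile-time error, because `:=` must introduce at least one new variable on its left-hand side.
Usage format: `name := value`
For example, `intVal := 1` is equivalent to:
    var intVal int
    intVal = 1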
3.0.4 Declaring Multiple Variables
You can declare several variables in one statement. The short form `:=` is only available for local (non-global) variables:
    var x, y int
    var c, d int = 1, 2
    g, h := 123, "hello" // types are inferred: int and string
Global variables can be declared in a grouped `var` block as follows:
    var (
        vname1 v_type1
        vname2 v_type2
    )
Example:
    var (
        a int
        b bool
    )
3.0.5 Anonymous Variables
An anonymous variable is represented by an underscore (`_`), a special identifier also known as the blank identifier. It can appear on the left-hand side of a declaration or assignment, and a value of any type can be assigned to it, but the value is discarded and cannot be used later. The blank identifier cannot be read, so it cannot be assigned to other variables or used in expressions.
Example code:
    package main

    import "fmt"

    func GetData() (int, int) {
        return 10, 20
    }

    func main() {
        a, _ := GetData() // keep the first result, discard the second
        _, b := GetData() // discard the first result, keep the second
        fmt.Println(a, b)
    }
Note: Anonymous variables do not consume memory space and are not allocated memory. They can be used multiple times without conflicts.
3.0.6 Variable Scope
Scope refers to the region of source code in which a declared identifier (a constant, type, variable, function, or package) is accessible. In Go, variables can be classified into three kinds based on where they are defined:
Local Variables: A variable defined within a function is called a local variable. Its scope is limited to the function body. Function parameters and return values are also local variables. These variables exist while the function is being called and are normally destroyed when the function call ends.
Global Variables: A variable defined outside of any function is a global (package-level) variable. It only needs to be defined in one source file and can then be used in every source file of the same package. Other packages can use it too, provided they import the package and the variable's name starts with an uppercase letter. A global variable must be declared with the `var` keyword, because `:=` is not allowed outside functions.
Formal Parameters: The variables listed in the parentheses after the function name in a function definition are called formal parameters. They only come into effect when the function is called and are destroyed after the function finishes. While the function is not being called, its formal parameters occupy no memory and have no values. Inside the function, formal parameters act as local variables.
3.1 Basic Data Types
Type | Description |
---|---|
uint8 / uint16 / uint32 / uint64 | Unsigned 8 / 16 / 32 / 64-bit integers |
int8 / int16 / int32 / int64 | Signed 8 / 16 / 32 / 64-bit integers |
float32 / float64 | IEEE-754 32 / 64-bit floating-point numbers |
complex64 / complex128 | Complex numbers with float32 / float64 real and imaginary parts |
byte | Alias for uint8 |
rune | Alias for int32, used to represent a Unicode code point |
uintptr | Unsigned integer used to store a pointer address |
These are the basic data types in Go. With data types, we can define variables. In Go, variable names consist of letters, numbers, and underscores, with the first character being a letter or an underscore (not a number).
3.2 Pointers
Similar to C, Go allows programmers to decide when to use pointers. A variable is essentially a convenient name for a memory location that holds a data item. The address-of operator in Go is `&`; placed before a variable, it returns that variable's memory address.
A pointer variable is used to store the memory address of another variable.
3.2.1 Pointer Declaration and Initialization
Just like basic data types, pointers need to be declared before they can be used. The declaration format is `var var_name *var_type`, where `var_type` is the type of the value being pointed to, `var_name` is the pointer variable name, and `*` indicates that the variable is a pointer.
Example:
    var ip *int     // Pointer to an integer
    var fp *float32 // Pointer to a float
To initialize a pointer, we assign it the memory address of a corresponding variable:
    var a int = 20 // Declare a variable
    var ip *int    // Declare a pointer variable
    ip = &a        // Assign the address of variable `a` to pointer `ip`
3.2.2 Null Pointer
When a pointer is declared but not assigned the address of any variable, its value is `nil`, also known as a null pointer. It is conceptually the same as `null` or `NULL` in other languages and denotes a zero or empty value. Dereferencing a nil pointer causes a runtime panic.
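Declaration, address-of, dereferencing, and the nil check can be seen together in a short sketch. The helper `setThroughPointer` is a hypothetical name introduced here, not part of any library.

```go
package main

import "fmt"

// setThroughPointer writes v into the variable that p points to.
// It reports false instead of panicking when p is nil.
func setThroughPointer(p *int, v int) bool {
	if p == nil {
		return false
	}
	*p = v // dereference the pointer and assign
	return true
}

func main() {
	a := 20
	ip := &a         // ip holds the address of a
	fmt.Println(*ip) // 20

	setThroughPointer(ip, 27)
	fmt.Println(a) // 27

	var np *int // declared but never assigned, so it is nil
	fmt.Println(np == nil, setThroughPointer(np, 1)) // true false
}
```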
3.3 Arrays
Just like in C, Go provides an array data structure. An array is a fixed-length sequence of data items of the same type. The element type can be any type, such as integers, strings, or custom types.
3.3.1 Declaring Arrays
In Go, you need to specify both the element type and the number of elements in the array. The syntax is:
var variable_name [SIZE] variable_type
For example, to define a one-dimensional array:
var balance [10] float32
3.3.2 Initializing Arrays
There are multiple ways to initialize an array in Go:
Direct initialization:
    var balance = [5]float32{1000.0, 2.0, 3.4, 7.0, 50.0}
Using a short variable declaration:
    balance := [5]float32{1000.0, 2.0, 3.4, 7.0, 50.0}
When you do not want to write the array length explicitly, put `...` in its place and the compiler infers the length from the number of elements:
    var balance = [...]float32{1000.0, 2.0, 3.4, 7.0, 50.0}
or
    balance := [...]float32{1000.0, 2.0, 3.4, 7.0, 50.0}
To initialize only some elements, specify their indices; the remaining elements keep their zero value:
    balance := [5]float32{1: 2.0, 3: 7.0}
Note:
The number of elements in the braces (`{}`) must not exceed the length given in the square brackets (`[]`).
If you write `...` in the brackets instead of a number, Go sets the array length from the number of elements. Leaving the brackets completely empty declares a slice, not an array.
3.3.3 Meaning of Array Names in Go
In C, an array name represents the address of the first element of the array, but in Go, an array name refers to the entire array and represents the complete value of the array. An array variable refers to the entire array.
In Go, when an array variable is assigned or passed to a function, the entire array is copied. This can be costly for large arrays, so pointers to arrays are often used to avoid the overhead of copying.
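The copy semantics described above can be observed directly. In this sketch the function names `modifyCopy` and `modifyInPlace` are illustrative assumptions.

```go
package main

import "fmt"

// modifyCopy receives a copy of the array; the caller's array is unchanged.
func modifyCopy(arr [3]int) {
	arr[0] = 100
}

// modifyInPlace receives a pointer, so it changes the caller's array.
func modifyInPlace(arr *[3]int) {
	arr[0] = 100
}

func main() {
	a := [3]int{1, 2, 3}
	b := a // assignment copies all three elements
	b[1] = 20

	modifyCopy(a)
	fmt.Println(a, b) // [1 2 3] [1 20 3]

	modifyInPlace(&a)
	fmt.Println(a) // [100 2 3]
}
```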
3.3.4 Array Pointers
We can define an array pointer by combining the knowledge of arrays and pointers, as shown in the following code:
    var a = [...]int{1, 2, 3} // `a` is an array
    var b = &a                // `b` is a pointer to the array
Array pointers avoid wasting space when arrays are passed as parameters. You can also use an array pointer to iterate over the array with `for range`, as shown below:
    for i, v := range b { // Iterate over the array using the array pointer
        fmt.Println(i, v)
    }
More on Go loops will be covered in later sections.
3.4 Structs
With arrays, we can define multiple variables of the same type, but this can be limiting. Structs allow us to define multiple variables of different types.
3.4.1 Declaring Structs
Before declaring a struct variable, we first need to define a struct type using `type` and `struct`. `type` names the new type, and `struct` defines its layout. The syntax is as follows:
    type struct_variable_type struct {
        member definition
        member definition
        ...
    }
Once the struct type is defined, we can declare a struct variable using the following syntax:
    variable_name := struct_variable_type{value1, value2, ..., value_n}
or with named keys:
    variable_name := struct_variable_type{key1: value1, key2: value2, ..., key_n: value_n}
3.4.2 Accessing Struct Members
To access struct members, use the dot (`.`) operator. The syntax is: `struct_variable_name.member_name`.
Example:
    package main

    import "fmt"

    type Books struct {
        title  string
        author string
    }

    func main() {
        var book1 Books
        book1.title = "Go Programming Basics"
        book1.author = "mars.hao"
        // Print the fields so that the fmt import is actually used
        fmt.Println(book1.title, book1.author)
    }
3.4.3 Struct Pointers
The definition and initialization of struct pointers are similar to those of regular pointers. You can store the address of a struct variable in a pointer variable.
Syntax for defining a struct pointer: `var struct_pointer *Books`
Initialization is the same as for other pointers: `struct_pointer = &book1`. However, unlike C, which uses `->` for this, Go accesses struct members through a pointer with the `.` operator as well. The syntax is: `struct_pointer.title`
3.5 Strings
A string is an immutable sequence of bytes, typically used to contain human-readable text data. Unlike arrays, the elements of a string are immutable, making it a read-only byte array. Although the length of a string is fixed, the length is not part of the string's type.
3.5.1 String Definition and Initialization
The underlying structure of a Go string is described by `reflect.StringHeader` (a type that recent Go releases mark as deprecated, but which still illustrates the layout), as shown below:
    type StringHeader struct {
        Data uintptr
        Len  int
    }
In other words, a string structure consists of two pieces of information: the first is a pointer to the underlying byte array, and the second is the length of the string in bytes.
A string value is essentially this small struct, so assigning a string copies the `reflect.StringHeader` structure and does not copy the underlying byte array. Consequently, an array of strings can be regarded as an array of such structs.
As with arrays, the built-in `len` function returns the length of the string, measured in bytes.
3.5.2 UTF-8 Encoding in Strings
According to the Go language specification, Go source files are encoded in UTF-8. Therefore, string literals in Go source files are generally UTF-8 encoded (escape characters are not subject to this restriction). When discussing Go strings, we typically assume the string corresponds to a valid UTF-8 encoded sequence of characters.
Go strings can store any binary byte sequence, so a string is not guaranteed to hold valid UTF-8. When an invalid UTF-8 encoding is encountered during decoding, the special Unicode character `\uFFFD` (the replacement character) is produced. This character may display differently in various software applications, but in print it usually appears as a black hexagon or diamond shape with a white question mark (�) inside.
In the following example, we deliberately corrupt the second and third bytes of the first character. As a result, the first character will print as "�", and the second and third bytes will be ignored. The rest of the string, "abc", will still be decoded and printed correctly (one of the advantages of UTF-8 encoding is that errors in encoding do not propagate):
fmt.Println("\xe4\x00\x00\xe7\x95\x8cabc") // �界abc
However, when iterating through this corrupted UTF-8 string with `for range`, the second and third bytes of the first character are still visited individually. Because both were overwritten with `\x00`, each of them yields the value 0:
    // 0 65533 // \uFFFD, corresponding to �
    // 1 0     // Null character
    // 2 0     // Null character
    // 3 30028 // 界
    // 6 97    // a
    // 7 98    // b
    // 8 99    // c
3.5.3 Forced Type Conversion of Strings
As mentioned earlier, Go source code is typically UTF-8 encoded. If you don't want to decode the UTF-8 string and prefer to iterate through the raw byte sequence, there are two options:
You can convert the string to a `[]byte` byte sequence and then iterate over it (when the conversion appears directly in a `for range` clause, it generally does not incur runtime overhead).
You can iterate through the string using the traditional index-based method.
In addition, string-related type conversions mainly involve the `[]byte` and `[]rune` types. Each conversion may incur the cost of allocating new memory, and in the worst case its time complexity is O(n).
The conversion between strings and `[]rune` is a special case. Converting without a copy would require both types to share the same underlying memory layout, but the byte sequence underlying a string and the `[]int32` underlying a `[]rune` are laid out completely differently, so this conversion has to allocate new memory.
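A small sketch of both conversions and of index-based byte iteration follows. The helper name `byteAndRuneLen` and the sample string are assumptions made for the example.

```go
package main

import "fmt"

// byteAndRuneLen converts s to []byte and []rune and returns both lengths.
func byteAndRuneLen(s string) (nBytes, nRunes int) {
	b := []byte(s) // copy of the raw UTF-8 bytes
	r := []rune(s) // decoded code points, one int32 per character
	return len(b), len(r)
}

func main() {
	s := "Go界"
	nb, nr := byteAndRuneLen(s)
	fmt.Println(len(s), nb, nr) // 5 5 3

	// Index-based iteration visits raw bytes without decoding UTF-8.
	for i := 0; i < len(s); i++ {
		fmt.Printf("%x ", s[i]) // 47 6f e7 95 8c
	}
	fmt.Println()

	// The conversion copies, so changing b leaves s untouched.
	b := []byte(s)
	b[0] = 'g'
	fmt.Println(s, string(b)) // Go界 go界
}
```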
3.6 Slices
Simply put, slices are a simplified version of dynamic arrays. Since the length of a dynamic array is not fixed, the length of a slice cannot be part of its type either. Although arrays have their uses, their type and operations are not flexible enough, whereas slices are widely used due to their flexibility.
The key to efficient slice operations is minimizing memory allocations: try to ensure that `append` operations (which also underlie the insertions and deletions described later) do not exceed the slice's capacity (`cap`), thereby reducing the number of memory allocations and the size of each allocation.
3.6.1 Defining Slices
Let's first look at the structure definition of a slice in `reflect.SliceHeader`:
    type SliceHeader struct {
        Data uintptr // Points to the underlying array
        Len  int     // Slice length
        Cap  int     // Slice capacity
    }
Like arrays, the built-in `len` function returns the length of the slice, and the built-in `cap` function returns the capacity of the slice, which must be greater than or equal to its length.
A slice can be compared to `nil`, and it is considered `nil` only when its underlying data pointer is `nil`. In this case, the slice's length and capacity carry no meaning (a nil slice reports 0 for both). If the slice's underlying data pointer is `nil` but its length or capacity is non-zero, the slice has been corrupted.
As long as the slice's underlying data pointer, length, and capacity remain unchanged, traversing the slice and accessing or modifying its elements is the same as with an array. When assigning a slice or passing it as a parameter, the operation is similar to that of an array pointer: it only copies the slice header information (`reflect.SliceHeader`) and does not copy the underlying data. The biggest difference from arrays is that the type of a slice does not depend on its length; all slices with the same element type share the same slice type.
A slice can be defined with a `var` declaration, with a slice literal, by slicing an existing array or slice, or with the built-in `make` function.
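The common ways of defining a slice, and the resulting nil-ness, length, and capacity, can be sketched as follows. The helper `describe` is a hypothetical name used only for this example.

```go
package main

import "fmt"

// describe summarizes whether s is nil, plus its length and capacity.
func describe(s []int) string {
	return fmt.Sprintf("nil=%t len=%d cap=%d", s == nil, len(s), cap(s))
}

func main() {
	var a []int            // nil slice
	b := []int{}           // empty but non-nil slice
	c := []int{1, 2, 3}    // slice literal: len 3, cap 3
	d := c[:2]             // shares c's underlying array: len 2, cap 3
	e := make([]int, 2, 5) // len 2, cap 5

	for _, s := range [][]int{a, b, c, d, e} {
		fmt.Println(describe(s))
	}
}
```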
3.6.2 Adding Elements
The `append()` function is a built-in generic function that can add elements to a slice.
To append N elements at the end:
    var a []int
    a = append(a, 1)                 // Append 1 element
    a = append(a, 1, 2, 3)           // Append multiple elements, manual unpacking
    a = append(a, []int{1, 2, 3}...) // Append a slice, slice needs unpacking
Note: When appending at the end, if the capacity is insufficient, memory needs to be reallocated, which can result in significant memory allocation and data copying overhead. Even if the capacity is sufficient, you still need to update the slice itself using the return value of `append()` because the length of the new slice has changed.
To add elements at the start:
    var a = []int{1, 2, 3}
    a = append([]int{0}, a...)          // Add 1 element at the start
    a = append([]int{-3, -2, -1}, a...) // Add a slice at the start
Note: Adding elements at the beginning typically causes memory reallocation and will copy all existing elements. Therefore, performance when adding elements at the start of a slice is generally much worse than appending to the end.
Chain append operations:
    var a []int
    a = append(a[:i], append([]int{x}, a[i:]...)...)       // Insert x at position i
    a = append(a[:i], append([]int{1, 2, 3}, a[i:]...)...) // Insert a slice at position i
In each statement, the inner `append` call creates a temporary slice and copies the contents of `a[i:]` into it; the result is then appended to `a[:i]`. The index `i` is assumed to satisfy `0 <= i <= len(a)`.
Append and copy combination:
    a = append(a, 0)     // Extend the slice by 1 element
    copy(a[i+1:], a[i:]) // Move a[i:] one position to the right
    a[i] = x             // Set the newly added element
The chained `append` form creates a temporary slice on every insertion. Combining `append` with the `copy` function avoids creating that intermediate slice.
3.6.3 Deleting Elements
There are three cases depending on the position of the element to be deleted:
Deleting from the beginning: Directly move the data pointer, as shown:
    a = []int{1, 2, 3, ...}
    a = a[1:] // Delete the first element
    a = a[N:] // Delete the first N elements
Deleting from the middle: The elements after the deleted one must be moved forward. This can be done using `append` or `copy`:
    a = []int{1, 2, 3, ...}
    a = append(a[:i], a[i+1:]...)  // Delete the i-th element
    a = a[:i+copy(a[i:], a[i+1:])] // Delete the i-th element using copy
Deleting from the end:
    a = []int{1, 2, 3, ...}
    a = a[:len(a)-1] // Delete the last element
    a = a[:len(a)-N] // Delete the last N elements
Deleting elements from the end of a slice is the fastest operation.
3.7 Functions
A function is a collection of program instructions (statements) used to perform a specific task.
3.7.1 Function Classification
In Go, functions are first-class objects, meaning they can be stored in variables. Functions are mainly categorized into named and anonymous functions. Package-level functions are generally named functions, and a named function can be regarded as a special case of an anonymous function (one that has been bound to a name). When an anonymous function references variables from an enclosing scope, it becomes a closure. Closures are a core concept of functional programming languages.
Example code:
Named Function: Similar to ordinary functions in C, named functions have a function name, return values, and function parameters.
func Add(a, b int) int { return a + b }
Anonymous Function: An anonymous function is a function without a name. It consists of an unnamed function declaration and a function body.
var Add = func(a, b int) int { return a + b }
Explanation of Terms:
Closure Function: A closure is a function object that not only represents a function but also encapsulates a scope. This means that no matter where the function is called, it will prioritize using the scope it was originally created in.
First-Class Object: In languages that support closures, functions are first-class objects. This means functions can be stored in variables, passed as parameters to other functions, and dynamically created and returned by functions.
Package: In Go, every file belongs to a package. Go uses packages to organize files and manage the structure of project directories.
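To make the closure and first-class-object terms concrete, here is a minimal sketch; `newCounter` is a made-up name. Each returned function keeps its own captured `count` variable alive between calls.

```go
package main

import "fmt"

// newCounter returns a closure that captures the local variable count.
func newCounter() func() int {
	count := 0
	return func() int {
		count++ // modifies the captured variable, not a copy
		return count
	}
}

func main() {
	next := newCounter() // a function stored in a variable
	next()
	fmt.Println(next()) // 2

	other := newCounter() // a separate closure with its own count
	fmt.Println(other()) // 1
}
```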
3.7.2 Function Declaration and Definition
The syntax for defining a function in Go is as follows:
    func function_name([parameter list]) [return types] {
        // function body
    }
Explanation:
`func`: Declares the function.
`function_name`: The function name.
`parameter list`: The list of parameters.
`return types`: The return types of the function.
`function body`: The block of code that defines the function.
3.7.3 Function Parameters
Go functions can have multiple parameters and multiple return values. Parameters and return values are passed by value, meaning the callee and the caller exchange copies of the data. Additionally, Go supports variadic functions, which can accept a variable number of arguments. The variadic parameter must appear last in the parameter list, and its arguments are passed as a slice.
When a variadic parameter is of the empty interface type, whether the caller unpacks the slice argument or not leads to different results. The following code shows what unpacking means:
    func main() {
        var a = []interface{}{1, 2, 3}
        Print(a...) // Unpacking: prints 1 2 3
        Print(a)    // No unpacking: prints [1 2 3]
    }

    func Print(a ...interface{}) {
        fmt.Println(a...)
    }
When passing `a...`, the slice `a` is unpacked, which is equivalent to calling `Print(1, 2, 3)`. When passing `a` without unpacking, it is equivalent to calling `Print([]interface{}{1, 2, 3})`, so the whole slice arrives as a single argument.
3.7.4 Function Return Values
In Go, function return values can also be named, just like parameters.
Example:
    func Find(m map[int]int, key int) (value int, ok bool) {
        value, ok = m[key]
        return
    }
If return values are named, they can be modified by name within the function, and a deferred function can even modify them after the `return` statement has executed. Deferred closures need care, though, as the following loop shows:
    func main() {
        for i := 0; i < 3; i++ {
            defer func() { println(i) }()
        }
    }
    // Output with Go versions before 1.22:
    // 3
    // 3
    // 3
The deferred calls run only when `main` returns, after the loop has finished. Before Go 1.22, all iterations share a single variable `i`, whose value at that point is 3, so every `println` outputs 3. Since Go 1.22 (for modules that declare `go 1.22` or later), each iteration has its own `i`, and the same program prints 2, 1, 0, because deferred calls run in last-in, first-out order.
The `defer` statement delays the execution of an anonymous function that captures the local variable `i` of the enclosing function. Such a function is a closure, and a closure accesses a captured variable by reference, not by value.
With the older loop semantics this behavior can cause issues. One fix is to define a new local variable for each iteration inside the loop, as shown in the modified code:
    func main() {
        for i := 0; i < 3; i++ {
            i := i // Define a new local variable i inside the loop
            defer func() { println(i) }()
        }
    }
Alternatively, you can pass `i` as an argument to the deferred function, so that its value is evaluated when the `defer` statement executes rather than when the function runs:
    func main() {
        for i := 0; i < 3; i++ {
            // Pass i as an argument to the deferred function
            defer func(i int) { println(i) }(i)
        }
    }
Both fixed versions print 2, 1, 0.
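Returning to named return values: the way a deferred function can change a result after `return` can be seen in a short sketch. `Inc` is a hypothetical function written for this illustration.

```go
package main

import "fmt"

// Inc returns 43: the return statement first stores 42 in the named
// result v, then the deferred closure runs and increments v.
func Inc() (v int) {
	defer func() { v++ }()
	return 42
}

func main() {
	fmt.Println(Inc()) // 43
}
```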
3.7.5 Recursive Calls
In Go, functions can call themselves directly or indirectly, supporting recursive calls. Recursion depth is not restricted by a small fixed-size stack, because Go's runtime dynamically grows the function call stack as needed. The stack is still bounded by a maximum size (1 GB by default on 64-bit platforms), so unbounded recursion eventually terminates the program with a fatal stack overflow error. This involves knowledge of goroutines and dynamic stacks, which will be explained in future posts.
The syntax for recursion is similar to that in C:
    func recursion() {
        recursion() // Function calls itself (no base case, for illustration only)
    }

    func main() {
        recursion()
    }
3.8 Methods
A method is generally a feature of object-oriented programming (OOP). In C++, methods correspond to member functions of a class object, which are associated with a specific object in the virtual table. However, in Go, methods are associated with a type, which allows static binding of methods to be completed at compile time. In object-oriented programming, methods are used to express operations corresponding to an object's attributes. This allows users of the object to interact with it through methods, rather than directly manipulating the object itself.
Here is a set of functions written in a C-like, procedural style:
    // File object
    type File struct {
        fd int
    }

    // Open a file
    func OpenFile(name string) (f *File, err error) {
        // ...
    }

    // Close a file
    func CloseFile(f *File) error {
        // ...
    }

    // Read file data
    func ReadFile(f *File, offset int64, data []byte) int {
        // ...
    }
The three functions above are ordinary functions, each of which occupies a name in the package's namespace. However, the `CloseFile` and `ReadFile` functions operate specifically on objects of type `File`. We would prefer to bind these functions tightly to the type they operate on.
So, in Go, we modify them as follows:
    // Close the file
    func (f *File) CloseFile() error {
        // ...
    }

    // Read file data
    func (f *File) ReadFile(offset int64, data []byte) int {
        // ...
    }
By moving the first parameter of `CloseFile` and `ReadFile` in front of the function name, where it becomes the receiver, these two functions become methods exclusive to the `File` type (rather than methods of a particular `File` object).
From a code perspective, this is a minor change, but from a programming philosophy standpoint, Go has already entered the realm of object-oriented languages. We can add one or more methods to any custom type. The methods for a given type must be in the same package as the type definition, so it is not possible to add methods to built-in types like `int` (because the method definition and the type definition would not be in the same package). For any given type, each method must have a unique name, and like functions, methods do not support overloading.
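The points above can be illustrated with a short sketch. The types `Counter` and `MyInt` are hypothetical, introduced only for this example. The method expression `(*Counter).Add` shows that a method belongs to the type: the receiver is simply passed as the first argument.

```go
package main

import "fmt"

// Counter is a hypothetical type used only for this illustration.
type Counter struct {
	n int
}

// Add has a pointer receiver, so it can modify the Counter it is called on.
func (c *Counter) Add(delta int) { c.n += delta }

// Value has a value receiver and works on a copy of the Counter.
func (c Counter) Value() int { return c.n }

// MyInt is a custom type based on int; unlike int itself, it can have methods.
type MyInt int

// Double returns twice the receiver's value.
func (m MyInt) Double() MyInt { return m * 2 }

func main() {
	var c Counter
	c.Add(2) // shorthand for (&c).Add(2)

	// A method expression exposes the receiver as an ordinary first argument.
	add := (*Counter).Add
	add(&c, 3)

	fmt.Println(c.Value(), MyInt(21).Double()) // 5 42
}
```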
3.9 Interfaces
3.9.1 What is an Interface?
Go provides another data type called an interface, which groups together methods with common behaviors. Any type that implements these methods is considered to have implemented the interface.
The interface type in Go is an abstraction and generalization of the behavior of other types. Since the interface type is not tied to specific implementation details, it allows for greater flexibility and adaptability. Many object-oriented languages have a similar concept of interfaces, but Go's distinguishing feature is that interfaces are implemented implicitly: a type never declares which interfaces it implements. This is often described as the "duck typing" approach, although in Go the check is performed at compile time.
Duck typing means: if something walks like a duck and quacks like a duck, it can be treated as a duck. Similarly, in Go, if an object behaves like an implementation of a certain interface, it can be used as that interface type, even if it does not explicitly declare it.
For example, in C, when using `printf` to output to the terminal, you can only print a limited set of variable types. But in Go, you can use `fmt.Printf`, which internally calls `fmt.Fprintf` to print to any custom output stream object, including network streams or compressed files. Moreover, the data printed is not limited to the built-in basic types; any object that implements the `fmt.Stringer` interface can be printed. Even objects that do not implement `fmt.Stringer` can still be printed using reflection.
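A brief sketch of implicit interface implementation follows. The `Point` type and the `render` helper are assumptions for illustration. `Point` never mentions `fmt.Stringer`, yet `fmt.Fprintf` uses its `String` method, and a `bytes.Buffer` stands in for any output stream that implements `io.Writer`.

```go
package main

import (
	"bytes"
	"fmt"
)

// Point is a hypothetical type that satisfies fmt.Stringer implicitly.
type Point struct{ X, Y int }

// String makes Point a fmt.Stringer without any explicit declaration.
func (p Point) String() string { return fmt.Sprintf("(%d, %d)", p.X, p.Y) }

// render writes v to an in-memory buffer, standing in for any io.Writer.
func render(v interface{}) string {
	var buf bytes.Buffer
	fmt.Fprintf(&buf, "%v", v)
	return buf.String()
}

func main() {
	fmt.Println(render(Point{1, 2}))        // (1, 2)
	fmt.Println(render(struct{ A int }{3})) // {3}, printed via reflection
}
```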
3.9.2 Struct Type
An interface value is actually a struct containing two members. One member is a pointer to the actual data, and the other contains the type information. An empty interface and an interface with methods differ slightly in their data structure. The definitions below follow the older C implementation of the Go runtime. For an empty interface:
    struct Eface
    {
        Type* type;
        void* data;
    };
Where `Type` refers to:
    struct Type
    {
        uintptr size;
        uint32 hash;
        uint8 _unused;
        uint8 align;
        uint8 fieldAlign;
        uint8 kind;
        Alg *alg;
        void *gc;
        String *string;
        UncommonType *x;
        Type *ptrto;
    };
And for an interface with methods, the data structure is:
    struct Iface
    {
        Itab* tab;
        void* data;
    };
Where `Itab` refers to:
    struct Itab
    {
        InterfaceType* inter;
        Type* type;
        Itab* link;
        int32 bad;
        int32 unused;
        void (*fun[])(void); // Method table
    };
3.9.3 Assigning Concrete Type to Interface Type
When assigning a concrete type to an interface (abstract type), a type conversion is required. What happens during this conversion process?
If the conversion is to an empty interface, an `Eface` is returned, where the `data` pointer points to the original data, and the `type` pointer points to the `Type` structure of the data.
If the conversion is to an interface with methods, a check is performed to ensure the concrete type implements all the methods declared in the interface. This check is done during compile time by comparing the method tables of the concrete type and the interface type.
Method tables of concrete types: The `UncommonType` of `Type` contains a method table where all the methods implemented by a concrete type are collected.
Method tables of interface types: The `InterfaceType` referenced by the `Itab` of `Iface` contains a method table with the methods declared by the interface. The `fun` field of `Itab` is also a method table, with each entry being a function pointer (i.e., only the implementation, not the declaration).
These method tables are sorted, so a single scan is enough to compare them and check whether the `Type` implements all methods declared by the interface. The function pointers from the `Type` method table are then copied to the `fun` field of the `Itab`.
3.9.4 Obtaining the Concrete Type Information of Interface Type Data
When converting an interface value back to a concrete type, or inspecting it through reflection, type information is involved as well. The `reflect` package in Go provides the `TypeOf` and `ValueOf` functions to obtain the `Type` and `Value` of an interface variable.
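As a minimal sketch of recovering type information from an interface value, the helpers `inspect` and `asInt` below are hypothetical names. `inspect` assumes a non-nil argument, since `reflect.TypeOf(nil)` returns a nil `Type`.

```go
package main

import (
	"fmt"
	"reflect"
)

// inspect reports the dynamic type, kind, and value stored in v.
// It assumes v is not a nil interface value.
func inspect(v interface{}) string {
	t := reflect.TypeOf(v)
	val := reflect.ValueOf(v)
	return fmt.Sprintf("type=%s kind=%s value=%v", t, t.Kind(), val)
}

// asInt uses a type assertion, the non-reflective way back to a concrete type.
func asInt(v interface{}) (int, bool) {
	n, ok := v.(int)
	return n, ok
}

func main() {
	fmt.Println(inspect(42))   // type=int kind=int value=42
	fmt.Println(inspect("go")) // type=string kind=string value=go
	fmt.Println(asInt(42))     // 42 true
	fmt.Println(asInt("go"))   // 0 false
}
```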
3.10 Channel
3.10.1 Related Structure Definitions
In Go, channels can be stored in variables, passed as parameters to functions, and returned as function results. Let's first take a look at the structure definition of a channel:
struct Hchan { uintgo qcount; // Total number of data in queue q uintgo dataqsize; // Size of the circular queue q uint16 elemsize; // Current usage of the queue bool closed; uint8 elemalign; Alg* elemalg; // Interface for element type uintgo sendx; // Send index uintgo recvx; // Receive index WaitQ recvq; // Wait queue for goroutines blocked on recv WaitQ sendq; // Wait queue for goroutines blocked on send Lock; };
The core part of the `Hchan` structure is the circular queue where the channel data is stored. The purpose of each related field is noted in the comments. There is no data field in this structure; for a buffered channel, the buffer is allocated immediately after the `Hchan` structure.
Another important part is the two linked lists, `recvq` and `sendq`. One contains goroutines blocked on a read from the channel, and the other contains goroutines blocked on a write. If a goroutine is blocked on a channel, it is placed in either `recvq` or `sendq`. `WaitQ` is a linked list with a head and a tail node, where each member is a `SudoG` structure, defined as follows:
    struct SudoG {
        G*     g;            // g and selgen constitute
        uint32 selgen;       // a weak pointer to g
        SudoG* link;
        int64  releasetime;
        byte*  elem;         // Data element
    };
The most important fields in this structure are `g` and `elem`. `elem` is used to store the data for the goroutine. When reading from the channel, data is copied from the `Hchan` queue to the `elem` field of `SudoG`. When writing to the channel, data is copied from the `elem` field of `SudoG` to the `Hchan` queue.
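To make the roles of the queue count and the send and receive indexes concrete, here is a toy model of the circular queue. It is only a sketch under simplifying assumptions: elements are `int`, there is no locking, and a full or empty queue reports failure instead of parking a goroutine on `sendq` or `recvq`. The type `ring` and its methods are hypothetical names.

```go
package main

import "fmt"

// ring is a toy model of the circular queue inside a buffered channel.
type ring struct {
	buf    []int // storage for the buffered elements
	qcount int   // number of elements currently queued
	sendx  int   // index where the next sent element is stored
	recvx  int   // index from which the next element is received
}

func newRing(size int) *ring { return &ring{buf: make([]int, size)} }

// send stores v and reports false if the queue is full.
func (r *ring) send(v int) bool {
	if r.qcount == len(r.buf) {
		return false // a real channel would block the sender here
	}
	r.buf[r.sendx] = v
	r.sendx = (r.sendx + 1) % len(r.buf)
	r.qcount++
	return true
}

// recv removes the oldest element and reports false if the queue is empty.
func (r *ring) recv() (int, bool) {
	if r.qcount == 0 {
		return 0, false // a real channel would block the receiver here
	}
	v := r.buf[r.recvx]
	r.recvx = (r.recvx + 1) % len(r.buf)
	r.qcount--
	return v, true
}

func main() {
	r := newRing(2)
	fmt.Println(r.send(1), r.send(2), r.send(3)) // true true false
	v, ok := r.recv()
	fmt.Println(v, ok)       // 1 true
	fmt.Println(r.send(3))   // true
	fmt.Println(r.sendx)     // 1
}
```

After the third successful send, the send index has wrapped around the two-slot buffer, which is why it reads 1 rather than 3.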
3.10.2 Blocking Read/Write Channel Operations
The write operation code is as follows, where `c` is the channel and `v` represents the data:
    c <- v
The basic blocking write operation is implemented in the runtime library by the `runtime.chansend` function, declared as follows:
    void runtime·chansend(ChanType *t, Hchan *c, byte *ep, bool *pres, void *pc)
Here `ep` is the address of the variable `v`; by calling convention, the caller allocates the space for `ep` and only passes the variable's address. `pres` is used for channel operations within a `select` statement.
The core functions for blocking read operations are as follows:
    chanrecv(c *hchan, ep unsafe.Pointer, block bool) (selected, received bool)
and
    chanrecv(c *hchan, ep unsafe.Pointer, block bool) (selected bool)
The difference between the two lies in whether a boolean value is returned to indicate whether data was actually read from the channel, which corresponds to the two receive forms `v := <-c` and `v, ok := <-c`.
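At the language level, the two variants correspond to the one-result and two-result receive expressions. The following sketch shows both; the helper `drain` is a hypothetical name.

```go
package main

import "fmt"

// drain receives every value from c using the two-result form and
// returns them once the channel is closed and empty.
func drain(c <-chan int) []int {
	var out []int
	for {
		v, ok := <-c // ok is false once c is closed and drained
		if !ok {
			return out
		}
		out = append(out, v)
	}
}

func main() {
	c := make(chan int, 3)
	c <- 1
	c <- 2

	v := <-c       // one-result form: only the value is returned
	fmt.Println(v) // 1

	c <- 3
	close(c)
	fmt.Println(drain(c)) // [2 3]
}
```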
The blocking read and write operations are similar, so they are not explained further. Here are three additional details:
The blocking read/write operations mentioned above also have corresponding non-blocking operations, which are implemented using `select-case`.
A nil channel, that is, a channel variable set to `nil` or never initialized with `make`, blocks indefinitely when read from or written to.
A read from a closed channel never blocks: once any buffered data is drained, it returns the zero value of the channel's element type. (A write to a closed channel panics.) When a channel is closed, `closed` is first set to 1; then the `recvq` (read waiting queue) is processed, setting each `elem` in `SudoG` to the type's zero value; then the `sendq` (write waiting queue) is processed, setting each `elem` to `nil`; finally, all collected `SudoG` structures are woken up.
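The behavior of a closed channel can be observed from ordinary Go code. In this sketch, the helper names `recvClosed` and `sendClosed` are hypothetical; it shows that receives drain buffered data and then yield zero values, while a send on a closed channel panics.

```go
package main

import "fmt"

// recvClosed shows that receiving from a closed channel first yields any
// buffered value, then the zero value with ok == false.
func recvClosed() (first int, ok1 bool, second int, ok2 bool) {
	c := make(chan int, 1)
	c <- 7
	close(c)
	first, ok1 = <-c
	second, ok2 = <-c
	return
}

// sendClosed reports whether sending on a closed channel panics.
func sendClosed() (panicked bool) {
	defer func() {
		if recover() != nil {
			panicked = true
		}
	}()
	c := make(chan int, 1)
	close(c)
	c <- 1 // panics: send on closed channel
	return false
}

func main() {
	fmt.Println(recvClosed()) // 7 true 0 false
	fmt.Println(sendClosed()) // true
}
```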
3.10.3 Non-Blocking Read/Write Channel Operations
As mentioned above, non-blocking operations are implemented using `select-case`, which is converted to an `if-else` structure at compile time.
For example:
    select {
    case v = <-c:
        ...foo
    default:
        ...bar
    }
This will be compiled into:
    if selectnbrecv(&v, c) {
        ...foo
    } else {
        ...bar
    }
The function `selectnbrecv` calls `runtime.chanrecv` with an additional parameter telling it not to block when the operation cannot be completed, but to return a failure instead.
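In user code, the same pattern is usually wrapped in a small helper. The following is a minimal sketch; `tryRecv` is a hypothetical name, not a standard library function.

```go
package main

import "fmt"

// tryRecv attempts to receive from c without blocking. It returns
// ok == false when no value is immediately available.
func tryRecv(c <-chan int) (v int, ok bool) {
	select {
	case v = <-c:
		return v, true
	default:
		return 0, false
	}
}

func main() {
	c := make(chan int, 1)
	fmt.Println(tryRecv(c)) // 0 false
	c <- 42
	fmt.Println(tryRecv(c)) // 42 true
}
```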
However, unlike the cases of a `switch`, which are evaluated in order, the cases of a `select` are tried in random order. Each `select` corresponds to a `Select` structure containing an array of `Scase` entries, one per `case`. Each `Scase` holds the `Hchan` it operates on, and the `pollorder` array is randomly shuffled so that the `Scase` entries are visited in random order.
3.11 Map
The underlying principle of maps is a hash table. The structure definition is as follows:
    type Map struct {
        Key    *Type // Key type
        Elem   *Type // Value (elem) type
        Bucket *Type // Hash buckets
        Hmap   *Type // Underlying hash table metadata
        Hiter  *Type // Iterator for traversing the hash table
    }
The specific data structure of `Hmap` is:
    type hmap struct {
        count      int            // Current number of elements in the map
        flags      uint8          // Map status (being traversed/being written)
        B          uint8          // Log base 2 of the number of buckets (bucket count is always a power of 2)
        noverflow  uint16         // Number of overflow buckets, approximate when B >= 16
        hash0      uint32         // Random hash seed
        buckets    unsafe.Pointer // Pointer to current buckets
        oldbuckets unsafe.Pointer // Pointer to old buckets when resizing
        nevacuate  uintptr        // Progress indicator for bucket adjustments
        extra      *mapextra      // Represents overflow buckets
    }
Most of the `hmap` structure is related to hash buckets and overflow buckets. The `bmap` structure looks like this:
    type bmap struct {
        topbits  [8]uint8    // High 8 bits of each key's hash
        keys     [8]keytype  // All keys in the hash bucket
        elems    [8]elemtype // All values in the hash bucket
        overflow uintptr     // Link to the next overflow bucket
    }
A `bmap` hash bucket holds up to 8 key-value pairs. If more than 8 key-value pairs hash to the same bucket, overflow buckets are allocated and linked to the existing one.
The relationship is illustrated as follows:
When inserting data, the hash value is first computed for the key. The low-order B bits of the hash value are used as the index into the `buckets` array (there are 2^B buckets), and the high 8 bits of the hash value are stored in the bucket's `topbits` field (named `tophash` in the runtime source).
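The way a hash value is split can be sketched as follows, assuming a 64-bit hash and a table of 2^B buckets indexed by the low-order B bits. The function names are hypothetical, and details of the real runtime (such as reserving small top-byte values as slot markers) are left out.

```go
package main

import "fmt"

// bucketIndex selects a bucket using the low-order B bits of the hash,
// which is the same as hash modulo 2^B.
func bucketIndex(hash uint64, B uint8) uint64 {
	return hash & (uint64(1)<<B - 1)
}

// topHash returns the high 8 bits of a 64-bit hash. These bits are stored
// in the bucket so most non-matching slots can be rejected without
// comparing full keys.
func topHash(hash uint64) uint8 {
	return uint8(hash >> 56)
}

func main() {
	const hash = 0xAB00000000000035
	fmt.Println(bucketIndex(hash, 4))  // 5
	fmt.Printf("%#x\n", topHash(hash)) // 0xab
}
```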
Characteristics:
Maps are unordered (insertion order is not preserved, and resizing changes where elements are stored), so the order may differ each time a map is iterated. You cannot access a map via an index, only through keys.
The size of a map is not fixed, just like slices, and it is a reference type.
The built-in `len` function can be used on maps to return the number of keys in the map.
Map keys can be any comparable type, such as boolean, integer, floating-point, complex, and string types.
A map can be declared as follows:
    var a map[keytype]valuetype
Where `keytype` is the key type and `valuetype` is the corresponding value type. A map declared this way is `nil`: it can be read, but writing to it panics until it is initialized.
It can be initialized using `make`:
    map_variable = make(map[key_data_type]value_data_type)
Or initialized with an initial value:
var m map[string]int = map[string]int{"hunter":12,"tony":10}
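The three forms behave differently when written to. The following sketch compares them; the helper `writeOK` is a hypothetical name used to show that assigning into a nil map panics.

```go
package main

import "fmt"

// writeOK reports whether writing to m succeeds without panicking.
func writeOK(m map[string]int) (ok bool) {
	defer func() {
		if recover() != nil {
			ok = false // assignment to an entry in a nil map panics
		}
	}()
	m["x"] = 1
	return true
}

func main() {
	var a map[string]int                          // declared only: nil map
	b := make(map[string]int)                     // empty, ready for writes
	c := map[string]int{"hunter": 12, "tony": 10} // initialized with values

	fmt.Println(a == nil, len(a), a["missing"]) // true 0 0
	fmt.Println(writeOK(a))                     // false
	fmt.Println(writeOK(b))                     // true
	fmt.Println(len(b), len(c))                 // 1 2
}
```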
3.11.1 Inserting Data
The code for inserting data into a map is as follows:
map_variable["mars"] = 27
The insertion process is as follows:
Calculate the hash value based on the key.
Use the low-order `hmap.B` bits of the hash value (the hash modulo 2^B) to determine the bucket position.
Check if the key already exists; if it does, update the value.
If the key is not found, insert the key-value pair.
3.11.2 Deleting Data
The `delete(map, key)` function deletes an element from the map. The parameters are the map and the key to remove. The function does not return any value. The relevant code is:
    countryCapitalMap := map[string]string{"France": "Paris", "Italy": "Rome", "Japan": "Tokyo", "India": "New Delhi"}
    delete(countryCapitalMap, "France")
3.11.3 Looking Up Data
To look up a value by key in a map, use the syntax:
    map[key]
If the key does not exist, the zero value of the value type (e.g., an empty string for strings, 0 for integers) is returned. The program will not throw an error.
You can use the "ok idiom" to check if the key exists:
    value, ok := map[key]
Where `value` is the returned value, and `ok` is a boolean indicating whether the key exists in the map.
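A small example that combines lookup with the ok idiom and `delete` follows. The helper `lookup` and the placeholder string it returns are hypothetical.

```go
package main

import "fmt"

// lookup uses the ok idiom to distinguish a missing key from a key whose
// value happens to be the zero value.
func lookup(m map[string]string, key string) string {
	if v, ok := m[key]; ok {
		return v
	}
	return "(not found)"
}

func main() {
	capitals := map[string]string{"France": "Paris", "Italy": "Rome"}

	fmt.Println(lookup(capitals, "France")) // Paris

	delete(capitals, "France")
	fmt.Println(lookup(capitals, "France")) // (not found)
	fmt.Printf("%q\n", capitals["France"])  // "" (zero value, no error)

	delete(capitals, "Spain")  // deleting a missing key is a no-op
	fmt.Println(len(capitals)) // 1
}
```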
The lookup process in the map is as follows:
The key is hashed using a hash function.
The low-order B bits of the hash value determine which bucket the data belongs to.
After finding the bucket, the high 8 bits of the hash value are compared with the stored high bits in the bucket.
If the hashes match, the key is compared, and the corresponding value is returned.
If the hash high bits do not match or the key is not found, it checks the overflow chain (if any) until the end of the chain.
If the key is not found, the zero value for the value type is returned.
3.11.4 Resizing
Hash tables trade space for speed, and access time is directly related to the load factor. So, when a hash table is too full, it needs to be resized.
When the table grows, it goes from 2^B buckets to 2^(B+1) buckets. Resizing is triggered when the load factor (number of keys / number of buckets) exceeds 6.5, or when there are too many overflow buckets (the threshold grows with the bucket count and is capped at 32768).
Incremental resizing is triggered when the load factor exceeds 6.5. A new bucket array of twice the size is allocated, and data is gradually moved over from the old buckets.
Equal resizing occurs when the load factor is acceptable but the number of overflow buckets is high. The bucket count stays the same; the data is repacked more compactly, reducing overflow bucket usage and improving access speed.
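The two triggers can be sketched as simple predicates. This is a simplified illustration, assuming a load-factor threshold of 6.5 (written as 13/2 to stay in integer arithmetic) and an overflow threshold equal to the bucket count, capped at 2^15 = 32768. The function names and the string results are hypothetical.

```go
package main

import "fmt"

const bucketSize = 8

// overLoadFactor reports whether count entries in 2^B buckets exceed an
// average of 6.5 entries per bucket. Maps that fit in one bucket never grow.
func overLoadFactor(count int, B uint8) bool {
	buckets := int(1) << B
	return count > bucketSize && 2*count > 13*buckets
}

// tooManyOverflow reports whether there are about as many overflow buckets
// as regular buckets, with the threshold capped at 2^15.
func tooManyOverflow(noverflow int, B uint8) bool {
	if B > 15 {
		B = 15
	}
	return noverflow >= int(1)<<B
}

// growthKind names the kind of resize, if any, that would be triggered.
func growthKind(count, noverflow int, B uint8) string {
	switch {
	case overLoadFactor(count, B):
		return "double"
	case tooManyOverflow(noverflow, B):
		return "same-size"
	default:
		return "none"
	}
}

func main() {
	fmt.Println(growthKind(52, 0, 3)) // none (52/8 = 6.5, not above it)
	fmt.Println(growthKind(53, 0, 3)) // double
	fmt.Println(growthKind(20, 8, 3)) // same-size
}
```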
4. Common Statements and Keywords
Next, let's explore the basic concepts of Go language statements.
4.1 Conditional Statements
Similar to C language, the related conditional statements are shown in the table below:
Statement | Description |
---|---|
if statement | The if statement consists of a boolean expression followed by one or more statements. |
if...else | The if statement can be followed by an optional else statement, which is executed when the boolean expression is false. |
switch statement | The switch statement is used to execute different actions based on different conditions. |
select statement | The select statement is similar to a switch, but it randomly executes one of the available cases. If no case is available, it blocks until a case becomes available. |
if statement
The syntax is as follows:
if booleanExpression { /* Executes if boolean expression is true */ }
if-else statement
if booleanExpression { /* Executes if boolean expression is true */ } else { /* Executes if boolean expression is false */ }
switch statement
The variable `v` can be of any type, and `val1` and `val2` can be any values of the same type. The values are not limited to constants or integers, but they must have the same type as `v` or be expressions that evaluate to that type. Unlike in C, a matching case does not fall through to the next one unless `fallthrough` is used.
    switch v {
    case val1:
        ...
    case val2:
        ...
    default:
        ...
    }
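Two short examples follow: a switch on a string value with several values in one case, and a switch with no tag expression, which behaves like an if-else chain. The function names are hypothetical.

```go
package main

import "fmt"

// dayType switches on a string; one case may list several values.
func dayType(day string) string {
	switch day {
	case "Sat", "Sun":
		return "weekend"
	default:
		return "weekday"
	}
}

// classify uses a switch without a tag, so each case is a boolean
// expression and the first true one is executed.
func classify(n int) string {
	switch {
	case n < 0:
		return "negative"
	case n == 0:
		return "zero"
	default:
		return "positive"
	}
}

func main() {
	fmt.Println(dayType("Sun"), dayType("Mon")) // weekend weekday
	fmt.Println(classify(-3), classify(0))      // negative zero
}
```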
select statement
Select is a control structure in Go, similar to a switch but used for channel communication. Each case must be a communication operation, either a send or a receive. Select randomly executes one of the runnable cases. If no case is runnable, it blocks until one becomes runnable. A default clause, if present, is always runnable.
    select {
    case communicationClause:
        statement(s)
    case communicationClause:
        statement(s)
    /* You can define any number of cases */
    default: /* Optional */
        statement(s)
    }
Note:
Each case must be a communication operation.
All channel expressions are evaluated, as are all expressions whose values are to be sent.
If any communication is possible, it will be executed, and others will be ignored.
If multiple cases can run, select randomly chooses one to execute, which avoids starving any single case.
If no case is runnable: if a default clause exists, it will be executed. Otherwise, select blocks until a communication becomes possible.
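The random choice among runnable cases can be observed with a small experiment. In this sketch both channels are always ready, so over many runs each case is expected to be picked at least once; the exact counts vary from run to run. The function names are hypothetical.

```go
package main

import "fmt"

// pick runs one select over two channels that are both ready and reports
// which case was chosen.
func pick() string {
	a := make(chan int, 1)
	b := make(chan int, 1)
	a <- 1
	b <- 2
	select {
	case <-a:
		return "a"
	case <-b:
		return "b"
	}
}

// countPicks repeats pick n times and tallies the chosen cases.
func countPicks(n int) map[string]int {
	counts := make(map[string]int)
	for i := 0; i < n; i++ {
		counts[pick()]++
	}
	return counts
}

func main() {
	counts := countPicks(1000)
	// Both cases are expected to appear; the split is roughly even.
	fmt.Println(counts["a"] > 0, counts["b"] > 0)
}
```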
4.2 Loop Statements
4.2.1 Loop Handling Statements
In Go, loops are implemented using `for`, with three possible forms:
Similar to C language's `for`: `for init; condition; post {}`
Similar to C language's `while`: `for condition {}`
Similar to C language's `for(;;)`: `for {}`
Additionally, `for` loops can use `range` to iterate over slices, maps, arrays, and strings, as shown below:
    for key, value := range oldmap {
        newmap[key] = value
    }
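The following sketch uses each loop form once. The function names are hypothetical and exist only to give every form a small, checkable job.

```go
package main

import "fmt"

// sumTo uses the three-clause form: for init; condition; post.
func sumTo(n int) int {
	sum := 0
	for i := 1; i <= n; i++ {
		sum += i
	}
	return sum
}

// halvings uses the condition-only form, like a C while loop.
func halvings(n int) int {
	steps := 0
	for n > 1 {
		n /= 2
		steps++
	}
	return steps
}

// firstMultiple uses the bare form, like C's for(;;), and exits with return.
// The argument of must be non-zero.
func firstMultiple(of, atLeast int) int {
	n := atLeast
	for {
		if n%of == 0 {
			return n
		}
		n++
	}
}

// total uses range to iterate over the values of a map.
func total(values map[string]int) int {
	sum := 0
	for _, v := range values {
		sum += v
	}
	return sum
}

func main() {
	fmt.Println(sumTo(4))                             // 10
	fmt.Println(halvings(8))                          // 3
	fmt.Println(firstMultiple(7, 30))                 // 35
	fmt.Println(total(map[string]int{"a": 1, "b": 2})) // 3
}
```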
4.2.2 Loop Control Statements
Control Statement | Description |
---|---|
break | Exits the loop or switch statement |
continue | Skips the remaining statements of the current iteration and continues to the next iteration |
goto | Transfers control to a labeled statement |
break
The `break` statement is used to exit loops, just like in C language. In nested loops, labels can be used to specify which loop to break.
Example code:
    a := 0
    for a < 5 {
        fmt.Printf("%d\n", a)
        a++
        if a == 2 {
            break
        }
    }
    /* Output:
    0
    1
    */
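Since a plain `break` only leaves the innermost loop, a label is needed to leave an outer loop from inside a nested one. A minimal sketch, with the hypothetical helper `findPair` and label `outer`:

```go
package main

import "fmt"

// findPair returns the first (i, j) in 1..limit with i*j == target.
// The labeled break leaves both loops as soon as a pair is found.
func findPair(target, limit int) (int, int, bool) {
	x, y, found := 0, 0, false
outer:
	for i := 1; i <= limit; i++ {
		for j := 1; j <= limit; j++ {
			if i*j == target {
				x, y, found = i, j, true
				break outer
			}
		}
	}
	return x, y, found
}

func main() {
	fmt.Println(findPair(12, 5)) // 3 4 true
	fmt.Println(findPair(7, 3))  // 0 0 false
}
```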
continue
The `continue` statement is similar to `break`, but instead of exiting the loop, it skips the rest of the current iteration and continues with the next one. In nested loops, labels can be used to specify which loop to continue.
Example code (without label):
    fmt.Println("---- continue ---- ")
    for i := 1; i <= 3; i++ {
        fmt.Printf("i: %d\n", i)
        for i2 := 11; i2 <= 13; i2++ {
            fmt.Printf("i2: %d\n", i2)
            continue
        }
    }
    /* Output (after the header line):
    i: 1
    i2: 11
    i2: 12
    i2: 13
    i: 2
    i2: 11
    i2: 12
    i2: 13
    i: 3
    i2: 11
    i2: 12
    i2: 13
    */
Example code (with label):
    fmt.Println("---- continue label ----")
    re:
    for i := 1; i <= 3; i++ {
        fmt.Printf("i: %d\n", i)
        for i2 := 11; i2 <= 13; i2++ {
            fmt.Printf("i2: %d\n", i2)
            continue re
        }
    }
    /* Output (after the header line):
    i: 1
    i2: 11
    i: 2
    i2: 11
    i: 3
    i2: 11
    */
goto
The `goto` statement unconditionally transfers control to the specified label. It is usually used with conditional statements to implement conditional jumps, loops, or to exit loops. However, the use of `goto` is discouraged because it tends to make the program flow unclear.
Example code:
    var a int = 0
    LOOP:
    for a < 5 {
        if a == 2 {
            a = a + 1
            goto LOOP
        }
        fmt.Printf("%d\n", a)
        a++
    }
    /* Output:
    0
    1
    3
    4
    */
In the code above, `LOOP` is a label. When the `goto` statement is executed, control jumps to the statement marked by the `LOOP` label, so the value 2 is skipped and never printed.
4.3 Keywords
This section lists the Go keywords and what each one is used for:
Keyword | Usage |
---|---|
import | Imports the corresponding package file. |
package | Creates a package file, used to mark which package the file belongs to. |
chan | Channel, used for communication between goroutines. |
var | Variable declaration. Inside functions, the short declaration form := can be used instead; it cannot be used at package level. |
const | Constant declaration; const and var can appear together. |
func | Defines functions and methods. |
interface | Interface, a type that has a set of methods which define the behavior of the interface. |
map | Hash table. |
struct | Defines a structure. |
type | Declares types or defines aliases. |
for | for is the only loop structure in Go, as introduced earlier. |
break | Exits the innermost loop, switch, or select (or the labeled one). |
continue | Continues to the next iteration of the loop. |
select | Selects between multiple channel operations simultaneously. |
switch | Multi-branch selection, as introduced earlier. |
case | Used with switch and select. |
default | The default choice in a switch or select structure. |
defer | Used for resource release, called just before the function returns. |
if | Branch selection. |
else | Used with if. |
go | Starts a new goroutine via go func(). |
goto | Jumps to a labeled block of code (not recommended). |
fallthrough | Used in switch to continue to the next case. |
range | Used with for to iterate over arrays, slices, strings, maps, and channels. |
return | Returns from a function, optionally with result values. |