Modules
Since version 1.13, Go modules have been enabled by default and they are the best way to maintain a durable and reproducible build of Go code. The module system relies on two files being present in every project:
- A go.mod file with a list of dependencies (direct and indirect) and the version of each dependency being used.
- A go.sum file with the hash values for each dependency. The hash is produced from all the code related to the dependency and provides a mechanism to validate that new downloads of the dependency are exactly the same.
When starting a new Go project, it’s best to create a repo first and then clone that repo to the local machine. After cloning, initialize the project to use the Go module system by running the go mod init command.
$ go mod init github.com/ardanlabs/service
This command takes a module name or the name can be left out and the tooling will use the name associated with the cloned repository. After running the command, a single file named go.mod will be created.
module github.com/ardanlabs/service

go 1.20
The initial go.mod file will consist of the module’s name and the version of Go used to create the file. The Go version represents the minimal version of Go that is required to build the code in this project. The Go tooling can also use this version to ensure that the tooling runs with the default behavior related to that version of Go.
To see the Go environment settings, the Go tooling has a command named go env.
$ go env
...
GOARCH="arm64"
GOMODCACHE="/Users/bill/code/go/pkg/mod"
GOOS="darwin"
GOPATH="/Users/bill/code/go"
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/local/go"
GOSUMDB="sum.golang.org"
GOTOOLDIR="/usr/local/go/pkg/tool/darwin_arm64"
GOVERSION="go1.20.2"
...
This listing provides some of the more important environment variables that are listed when running go env.
Here are some guidelines to follow when working with the module system:
- Keep the go.mod file at the root of the project/repository to ensure any 3rd party tools and editors work properly.
- Don’t open the editor until the go.mod file is created. Do this to make sure the gopls service, that runs in the background, is running in the correct mode.
The Go tooling is capable of fetching code from an import path that is URL based.
package main

import "github.com/ardanlabs/conf"

func main() {
	conf.New()
}
The program imports a package named conf from the module github.com/ardanlabs/conf. The Go tooling can read that import path and download the source code onto the local machine.
Note: To prevent my editor from removing the import statement before I can ask the Go tooling to download the code, I pretend there is a function named New before saving the file. If I don’t use the conf namespace before hitting save, other tooling from the editor will remove the import.
To download the code for the conf package, run the go mod tidy command from the command line inside the project’s folder.
$ go mod tidy
go: finding module for package github.com/ardanlabs/conf
go: downloading github.com/ardanlabs/conf v1.5.0
go: found github.com/ardanlabs/conf in github.com/ardanlabs/conf v1.5.0
go: downloading github.com/google/go-cmp v0.3.1
The output for the tidy command is shown. The tooling decides to download version 1.5.0 of the conf module and version 0.3.1 of the go-cmp module. Once the command is complete, all the source code needed to use the package will be available inside the module cache.
The Go tooling reads the environment variable $GOMODCACHE to determine where to store all the code downloaded for the different dependencies needed for all the Go projects on the machine. The code is stored by the module name and the version (tag) that was requested by the project. Use "go env GOMODCACHE" to view the location on any machine.
~/code/go/pkg/mod/github.com/ardanlabs
$ ls -l
drwxr-xr-x 7 bill staff 224B Mar 9 16:35 conf
dr-xr-xr-x 14 bill staff 448B Jan 23 11:55 [email protected]
$ ls -l conf/
dr-xr-xr-x 15 bill staff 480B Feb 5 13:05 [email protected]
dr-xr-xr-x 15 bill staff 480B Feb 11 14:34 [email protected]
dr-xr-xr-x 15 bill staff 480B Jan 23 12:42 [email protected]
dr-xr-xr-x 15 bill staff 480B Feb 15 21:33 [email protected]
dr-xr-xr-x 15 bill staff 480B Mar 9 16:35 [email protected]
All of the versions of the module github.com/ardanlabs/conf that exist on my local machine are shown. Each version is stored in its own folder.
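The layout of the module cache can be sketched in a few lines of Go. This is a minimal illustration, not the tooling's own code: the function names escapePath and cacheDir are hypothetical, and the cache root is hard-coded. It assumes the convention that uppercase letters in a module path are stored as "!" followed by the lowercase letter, so folders stay unambiguous on case-insensitive file systems.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// escapePath replaces each ASCII uppercase letter with '!' followed by
// its lowercase form (hypothetical helper for this sketch).
func escapePath(p string) string {
	var b strings.Builder
	for _, r := range p {
		if r >= 'A' && r <= 'Z' {
			b.WriteRune('!')
			b.WriteRune(r + 'a' - 'A')
			continue
		}
		b.WriteRune(r)
	}
	return b.String()
}

// cacheDir returns the folder where a module version is expected to be
// stored inside the module cache: <cache>/<module path>@<version>.
func cacheDir(modCache, modPath, version string) string {
	return filepath.Join(modCache, filepath.FromSlash(escapePath(modPath)+"@"+version))
}

func main() {
	// The cache root is an assumption; run "go env GOMODCACHE" for the real value.
	fmt.Println(cacheDir("/Users/bill/code/go/pkg/mod", "github.com/ardanlabs/conf", "v1.5.0"))
}
```

Keeping the version in the folder name is what allows several versions of the same module to sit side by side for different projects.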
A module mirror helps facilitate the downloading of dependencies (source code) by being a proxy for the different version control systems (VCS) managing and maintaining the project’s code and dependencies. It also maintains a mirror copy of any source code that is downloaded from the different VCS for future requests. A module mirror is often called a proxy server since it handles the requests to a VCS on behalf of the system needing to download a dependency.
The Go team has provided a public module mirror which by configuration is the default module mirror most projects will use. The Go team has published the spec for the module mirror so others can build their own. To date, there are two other module mirrors that exist and can be used by Go projects. One is the open source Athens project and the other is from the company JFrog.
Inside the module mirror, the source code for a Go module is bundled into a zip file that is indexed by the repository name and version. The version can be a tag or commit id. Each bundle is also hashed and that hash value is recorded to lock that source code in time. The Go team has the checksum database to record these hash values for public use.
Thanks to the module files, module mirror, and checksum database, the following problems can be solved for all Go projects:
- A specific version of a module can be requested and downloaded.
- A record of what modules and the versions are being used can be maintained by the project and respected by the compiler.
- A specific version of a module can be checked for any changes when downloaded again, regardless of where the code is downloaded from, including directly from the VCS.
- A specific version of a module can be downloaded more efficiently when using a module mirror since the code is already bundled into a single compressed file.
- Rules on when to use different VCS environments for both public and private modules can be configured.
- Upgrading modules over time to gain access to bug and security fixes, plus new features.
2.2.1 Go’s Default Module Mirror
The Go tooling uses the GOPROXY environment variable to determine if a module mirror should be used when downloading modules and where the module mirror is located on the network. The variable can contain a comma-delimited list of URLs pointing to a set of module mirrors. The value “direct” can also be provided to tell the tooling to access the VCS directly.
GOPROXY="https://proxy.golang.org,direct"
The default value for the GOPROXY variable.
- https://proxy.golang.org : The module mirror maintained by Google and the Go team. This mirror will provide access to publicly available Go modules, such as those published on GitHub.
- direct : This value tells the Go tooling to fetch the source code for the module directly from the VCS.
The Go tooling will traverse the values specified in this variable and move to the next entry on the list if the HTTP request returns a 404 (Not Found) or 410 (Gone).
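The fallback behavior for a comma-delimited GOPROXY value can be sketched as follows. This is an illustration under stated assumptions, not the tooling's implementation: the fetchFunc type and the resolve function are hypothetical, and the network request is replaced by a fake that returns a status code.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"strings"
)

// fetchFunc reports the HTTP status code a source returns for a module.
// It is a hypothetical stand-in for the real network request.
type fetchFunc func(source string) int

// resolve walks the comma-separated GOPROXY value and returns the first
// source that answers with 200. It moves to the next entry only when the
// current one answers 404 or 410; any other status stops the search.
func resolve(goproxy string, fetch fetchFunc) (string, error) {
	for _, source := range strings.Split(goproxy, ",") {
		source = strings.TrimSpace(source)
		if source == "" {
			continue
		}
		switch status := fetch(source); status {
		case http.StatusOK:
			return source, nil
		case http.StatusNotFound, http.StatusGone:
			continue // Try the next entry in the list.
		default:
			return "", fmt.Errorf("%s: unexpected status %d", source, status)
		}
	}
	return "", errors.New("module not found in any source")
}

func main() {
	// Simulated responses: the mirror does not know the module, so the
	// search falls back to the VCS ("direct").
	fake := func(source string) int {
		if source == "direct" {
			return http.StatusOK
		}
		return http.StatusNotFound
	}
	source, err := resolve("https://proxy.golang.org,direct", fake)
	fmt.Println(source, err)
}
```

Stopping on any other status means a mirror that is down does not silently send private module paths to the next entry.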
To prevent calls to a public module mirror for modules that are located in a private VCS, the GONOPROXY variable can be used. This variable takes a comma-separated list of patterns that are matched against the prefix of the module path being looked up. The patterns are glob patterns (in the syntax of Go’s path.Match), not regular expressions.
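How a GONOPROXY value is matched against a module path can be sketched with the standard library. The patterns are globs in the syntax of path.Match, compared against the leading elements of the module path. The function name matchesPrefix and the sample patterns are hypothetical; this is a simplified illustration, not the tooling's own code.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// matchesPrefix reports whether any glob in the comma-separated list
// matches a prefix of modPath. A glob with N path elements is compared
// against the first N elements of the module path.
func matchesPrefix(patterns, modPath string) bool {
	parts := strings.Split(modPath, "/")
	for _, glob := range strings.Split(patterns, ",") {
		glob = strings.TrimSpace(glob)
		if glob == "" {
			continue
		}
		n := strings.Count(glob, "/") + 1
		if len(parts) < n {
			continue // The module path is shorter than the pattern.
		}
		prefix := strings.Join(parts[:n], "/")
		if ok, _ := path.Match(glob, prefix); ok {
			return true
		}
	}
	return false
}

func main() {
	// Hypothetical GONOPROXY value and module paths, for illustration only.
	patterns := "*.corp.example.com,github.com/ardanlabs"
	for _, p := range []string{
		"git.corp.example.com/team/billing",
		"github.com/ardanlabs/service",
		"github.com/google/go-cmp",
	} {
		fmt.Println(p, matchesPrefix(patterns, p))
	}
}
```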
If the module mirror shouldn’t be accessed, then the checksum database probably shouldn’t be accessed as well. The GONOSUMDB variable exists to prevent the same public checksum access for private modules. As a convenience, the Go tooling provides the GOPRIVATE variable to set both GONOPROXY and GONOSUMDB at the same time with the same values.
2.2.2 Go Tidy Semantics
When the go mod tidy command is used, the Go tooling has a specific workflow it performs to figure out the right version of a module to download.
$ go mod tidy
go: finding module for package github.com/ardanlabs/conf
> curl https://proxy.golang.org/github.com/ardanlabs/conf/@v/list
go: downloading github.com/ardanlabs/conf v1.5.0
> curl https://proxy.golang.org/github.com/ardanlabs/conf/@v/v1.5.0.zip
go: found github.com/ardanlabs/conf in github.com/ardanlabs/conf v1.5.0
The HTTP calls that are made by the Go tooling for the different stages that are outputted by the go mod tidy command.
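The two URLs used in these calls follow a fixed pattern, which can be sketched as below. The function names listURL and zipURL are hypothetical, and the sketch assumes an all-lowercase module path (paths containing uppercase letters are escaped before being placed in a URL, which is not shown here).

```go
package main

import "fmt"

// listURL returns the mirror endpoint that lists the known versions of a module.
func listURL(proxy, modPath string) string {
	return fmt.Sprintf("%s/%s/@v/list", proxy, modPath)
}

// zipURL returns the mirror endpoint for the zip bundle of one module version.
func zipURL(proxy, modPath, version string) string {
	return fmt.Sprintf("%s/%s/@v/%s.zip", proxy, modPath, version)
}

func main() {
	proxy := "https://proxy.golang.org"
	mod := "github.com/ardanlabs/conf"
	fmt.Println(listURL(proxy, mod))
	fmt.Println(zipURL(proxy, mod, "v1.5.0"))
}
```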
The list call will provide the list of bundles by version that currently exist in the module mirror for the specified module.
$ curl https://proxy.golang.org/github.com/ardanlabs/conf/@v/list
v1.3.0
v1.0.0
v1.3.3
v1.3.6
v1.3.1
v1.5.0
v1.3.4
v1.3.2
v1.2.1
v1.4.0
v1.2.0
v1.2.2
v1.3.5
v1.0.1
v1.1.0
The response returned by the HTTP call is displayed. Since this module will be a direct dependency for the project, the Go tooling will select the greatest version from the list, which is v1.5.0. Then the Go tooling will fetch the bundle for that version, unzip the bundle into a folder for that module and version, then delete the bundle file.
$ mkdir -p $(go env GOMODCACHE)/github.com/ardanlabs
$ cd $(go env GOMODCACHE)/github.com/ardanlabs
$ curl --output v1.5.0.zip \
https://proxy.golang.org/github.com/ardanlabs/conf/@v/v1.5.0.zip
$ unzip v1.5.0.zip -d v1.5.0/
$ rm -f v1.5.0.zip
The different command line calls that represent what the Go tooling will perform once the zip file is downloaded.
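Selecting the greatest version from the unordered list returned by the mirror can be sketched as follows. The version strings have to be compared numerically, part by part, since a plain string comparison would rank v1.9.0 above v1.10.0. The function names are hypothetical and the sketch only handles plain vMAJOR.MINOR.PATCH versions, so pre-release and build suffixes are skipped.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parse splits a version such as "v1.5.0" into its numeric parts.
// Pre-release and build suffixes are not handled in this sketch.
func parse(v string) ([3]int, error) {
	var out [3]int
	parts := strings.Split(strings.TrimPrefix(v, "v"), ".")
	if len(parts) != 3 {
		return out, fmt.Errorf("invalid version %q", v)
	}
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil {
			return out, fmt.Errorf("invalid version %q: %w", v, err)
		}
		out[i] = n
	}
	return out, nil
}

// less reports whether version a sorts before version b.
func less(a, b [3]int) bool {
	for i := range a {
		if a[i] != b[i] {
			return a[i] < b[i]
		}
	}
	return false
}

// greatest returns the highest valid version in list, or "" if none is valid.
func greatest(list []string) string {
	best := ""
	var bestParts [3]int
	for _, v := range list {
		parts, err := parse(v)
		if err != nil {
			continue // Skip entries this sketch cannot parse.
		}
		if best == "" || less(bestParts, parts) {
			best, bestParts = v, parts
		}
	}
	return best
}

func main() {
	// A subset of the versions returned by the list call.
	list := []string{"v1.3.0", "v1.0.0", "v1.3.6", "v1.5.0", "v1.4.0", "v1.2.2"}
	fmt.Println(greatest(list)) // prints v1.5.0
}
```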
Using a repository tag as a version identifier provides no guarantee that every time code is downloaded for a given version, the code will be the same. To provide that guarantee the module system generates hash values.
Imagine a situation where module X at version Y is being downloaded for the first time for a project. Once the module is uncompressed on disk, the Go tooling will generate two hash values and add those values to a file named go.sum. If the file doesn’t exist yet, it will be created.
When these values are added to the go.sum file, the Go tooling will reach out to the checksum database that is owned and controlled by the Go team. The checksum database maintains all of the hash values for all modules and versions that have ever been requested from the module mirror. The hash values always reflect the first time the module mirror downloads the code. This database acts as the global point of truth for detecting changes to module code.
The Go tooling will compare the hash values it generates locally with what has been recorded in the checksum database. If they match, there is a guarantee the code that was just downloaded is the same as it was the first time the module mirror saw it. If they don’t match, the owner of the module has tried to push a change without re-versioning the code and the code can’t be trusted. If access to the checksum database is turned off, then the system is reliant on the first time the module was downloaded for the project.
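The idea of hashing all the code in a module into a single value can be sketched with the standard library. The sketch below is modeled on the "h1:" style of hash seen in go.sum files: each file is hashed with SHA-256, the per-file hashes are written into a sorted summary, and the summary itself is hashed. The function name hash1 and the in-memory file set are assumptions made for illustration; this is not a replacement for the tooling's own hashing.

```go
package main

import (
	"crypto/sha256"
	"encoding/base64"
	"fmt"
	"sort"
)

// hash1 computes a single hash over a set of files, given as a map from
// file name to content. Files are processed in sorted order so the
// result does not depend on map iteration order.
func hash1(files map[string][]byte) string {
	names := make([]string, 0, len(files))
	for name := range files {
		names = append(names, name)
	}
	sort.Strings(names)

	summary := sha256.New()
	for _, name := range names {
		sum := sha256.Sum256(files[name])
		// One summary line per file: hex hash, two spaces, file name.
		fmt.Fprintf(summary, "%x  %s\n", sum[:], name)
	}
	return "h1:" + base64.StdEncoding.EncodeToString(summary.Sum(nil))
}

func main() {
	// Hypothetical file contents standing in for an unzipped module.
	original := map[string][]byte{
		"conf.go": []byte("package conf\n"),
		"go.mod":  []byte("module github.com/ardanlabs/conf\n"),
	}
	tampered := map[string][]byte{
		"conf.go": []byte("package conf // changed\n"),
		"go.mod":  []byte("module github.com/ardanlabs/conf\n"),
	}
	fmt.Println(hash1(original))
	fmt.Println(hash1(original) == hash1(tampered)) // prints false
}
```

Changing a single byte in any file produces a different value, which is what lets a comparison against a previously recorded hash detect modified code.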
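The checksum database to use is determined by the environment variable GOSUMDB. By default, this value is set to sum.golang.org. Requests to the checksum database can be disabled by setting the environment variable to off. Older releases also allowed invoking the go get command with the -insecure flag, but that flag was removed in Go 1.17. The GONOSUMDB or GOPRIVATE variables can be used instead to skip the checksum database for specific modules.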
The Go module system has the capability to maintain a copy of the source code for the modules being used inside the project using a folder named vendor. This folder will be at the root of the project and will contain a manifest file alongside the code.
$ go mod vendor
$ ls -al vendor
total 16
drwxrwxr-x 3 go go 4096 feb 12 17:05 .
drwxrwxr-x 3 go go 4096 feb 12 17:05 ..
drwxrwxr-x 3 go go 4096 feb 12 17:05 github.com
-rw-rw-r-- 1 go go 82 feb 12 17:05 modules.txt
The content of the vendor folder after running the vendor command. On build, the Go tooling will read the code found in the vendor folder to compile the program, instead of reading it from the module cache folder.
The modules.txt file stores the names of the modules and packages present in the vendor folder. The Go tooling will prompt to “re-vendor” if there is a mismatch between the dependencies listed in go.mod and the modules.txt file.
01 # github.com/ardanlabs/conf v1.5.0
02 ## explicit; go 1.13
03 github.com/ardanlabs/conf
The contents of the modules.txt file after the conf module is added to the project. Line 2 tells the Go tooling that this module requires a minimum of version 1.13 of Go.
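The structure of this manifest is simple enough to read with a few lines of Go. The sketch below is an assumption-laden illustration based only on the listing above: lines starting with "# " name a module and its version, lines starting with "## " carry annotations, and any other line names a vendored package. The type vendored and the function parseModules are hypothetical names.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// vendored describes one module entry found in modules.txt.
type vendored struct {
	Path     string
	Version  string
	Packages []string
}

// parseModules reads the text of a modules.txt file and groups the
// package lines under the module line that precedes them.
func parseModules(text string) []vendored {
	var mods []vendored
	sc := bufio.NewScanner(strings.NewReader(text))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		switch {
		case line == "" || strings.HasPrefix(line, "## "):
			// Skip blank lines and annotations such as "## explicit; go 1.13".
		case strings.HasPrefix(line, "# "):
			fields := strings.Fields(strings.TrimPrefix(line, "# "))
			if len(fields) == 0 {
				continue
			}
			m := vendored{Path: fields[0]}
			if len(fields) > 1 {
				m.Version = fields[1]
			}
			mods = append(mods, m)
		case len(mods) > 0:
			last := &mods[len(mods)-1]
			last.Packages = append(last.Packages, line)
		}
	}
	return mods
}

func main() {
	text := "# github.com/ardanlabs/conf v1.5.0\n" +
		"## explicit; go 1.13\n" +
		"github.com/ardanlabs/conf\n"
	for _, m := range parseModules(text) {
		fmt.Println(m.Path, m.Version, m.Packages)
	}
}
```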
Every project should vendor until it’s no longer practical or reasonable to do so. One reason not to choose vendoring is when the size of the project’s dependencies greatly exceeds the size of the code base.
One advantage of vendoring is the ability to easily access, read, and debug the module code without looking for the code in the module cache. This allows adding debug statements to any module code without infecting other local projects importing the same modules. Think of the vendor folder as a sandbox for module development.
Another benefit of vendoring is the ability to get a diff between the different versions of module code being upgraded. This provides the option to look at the changes being made before committing to them. Lastly, vendoring is useful for development environments with poor network quality. The idea is when pulling the latest code for the project from the VCS, all the code needed to build the project is provided. No other network calls are required to pull the dependencies.
Currently, the Go tooling is selecting version 1.5.0 of the conf package. If the list of tags is checked on the repo, it will show version 3.1.5 is actually the latest greatest version. Why did the Go tooling select version 1.5.0 over 3.1.5?
The Go tooling uses an import path naming convention to differentiate between different major versions of a module. The convention states to add the major version number to the end of the import path. If no major version number is specified, the module system assumes version v0 or v1 is being requested. This is done to ensure backwards compatibility.
To download the latest greatest major version 3 of the conf module, the import path requires that v3 be specified.
01 package main
02
03 import "github.com/ardanlabs/conf/v3"
04
05 func main(){
06 conf.New()
07 }
On line 3, /v3 is added to the import path. With this change, the Go tooling will request a list of v3 versions of the module and the latest greatest major version 3 of the module can be selected by the tooling.
$ curl https://proxy.golang.org/github.com/ardanlabs/conf/v3/@v/list
v3.1.1
v3.0.1
v3.1.2
v3.0.0
v3.1.3
A manual request to the module mirror is performed to see the current list of major version 3 tags.
01 module github.com/ardanlabs/conf/v3
02
03 go 1.13
04
05 require (
06 github.com/google/go-cmp v0.3.1
07 gopkg.in/yaml.v2 v2.4.0
08 )
The go.mod file of the conf module. Notice how /v3 is also in the module’s name on line 1. This tells the Go tooling that the tag represents a major 3 version of the module. The import used in the code must match what is listed in the module file for the module.
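The relationship between the /vN suffix and the version tag can be sketched as a small check. The function names pathMajor and compatible are hypothetical, and the sketch deliberately ignores special cases such as gopkg.in style paths and versions marked +incompatible.

```go
package main

import (
	"fmt"
	"strings"
)

// pathMajor returns the major version suffix of a module path
// ("v3" for ".../conf/v3") or "" when the path has no /vN suffix.
func pathMajor(modPath string) string {
	i := strings.LastIndex(modPath, "/")
	if i < 0 {
		return ""
	}
	last := modPath[i+1:]
	if len(last) < 2 || last[0] != 'v' {
		return ""
	}
	for _, c := range last[1:] {
		if c < '0' || c > '9' {
			return ""
		}
	}
	// Suffixes start at v2: v0 and v1 are implied by the absence of a suffix.
	if last == "v1" || last[1] == '0' {
		return ""
	}
	return last
}

// compatible reports whether a version such as "v3.1.3" can belong to
// the module path under the import path naming convention.
func compatible(modPath, version string) bool {
	major := strings.SplitN(version, ".", 2)[0] // For example "v3".
	suffix := pathMajor(modPath)
	if suffix == "" {
		return major == "v0" || major == "v1"
	}
	return major == suffix
}

func main() {
	fmt.Println(compatible("github.com/ardanlabs/conf", "v1.5.0"))    // prints true
	fmt.Println(compatible("github.com/ardanlabs/conf", "v3.1.3"))    // prints false
	fmt.Println(compatible("github.com/ardanlabs/conf/v3", "v3.1.3")) // prints true
}
```

The second call shows why the tooling never offered a major version 3 release for the import path without the suffix.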
It’s important to keep in mind that every major version of a module will have a different /v name. After changing the import and running go mod tidy again, the contents of the go.mod will change to reflect the new version of the module to be used.
$ go mod tidy
go: finding module for package github.com/ardanlabs/conf/v3
go: downloading github.com/ardanlabs/conf/v3 v3.1.3
go: found github.com/ardanlabs/conf/v3 in github.com/ardanlabs/conf/v3 v3.1.3
go: downloading gopkg.in/yaml.v2 v2.4.0
The Go tooling downloaded major version 3 of the conf module after appending v3 to the import path.
2.5.1 Minimal Version Selection
Eventually two modules being used for the project will have a dependency on the same third module, but at different versions. How does the Go tooling determine which version of the same module to use, since it can only select one version to build against?
Go leverages an algorithm built by the Go team called Minimal Version Selection (MVS). Essentially, the Go tooling will traverse the list of dependencies, and when two dependencies import different versions of the same module, the Go tooling will pick the higher version of the two. This means the latest greatest version of any given module may not be chosen.
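The core of the selection can be sketched as a walk over the requirement graph that keeps, for every module, the highest version any visited module asks for. This is a simplified illustration, not the tooling's implementation: the module paths are hypothetical, the names modVer, newer, and buildList are made up for the sketch, and only plain vMAJOR.MINOR.PATCH versions are compared.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// modVer identifies one version of one module.
type modVer struct {
	Path, Version string
}

// newer reports whether version a is higher than version b.
func newer(a, b string) bool {
	pa := strings.Split(strings.TrimPrefix(a, "v"), ".")
	pb := strings.Split(strings.TrimPrefix(b, "v"), ".")
	for i := 0; i < len(pa) && i < len(pb); i++ {
		na, _ := strconv.Atoi(pa[i])
		nb, _ := strconv.Atoi(pb[i])
		if na != nb {
			return na > nb
		}
	}
	return false
}

// buildList starts at the project's direct requirements, follows every
// requirement in the graph, and records the highest required version
// for each module path.
func buildList(direct []modVer, reqs map[modVer][]modVer) map[string]string {
	selected := map[string]string{}
	seen := map[modVer]bool{}
	queue := append([]modVer(nil), direct...)
	for len(queue) > 0 {
		mv := queue[0]
		queue = queue[1:]
		if seen[mv] {
			continue
		}
		seen[mv] = true
		if cur, ok := selected[mv.Path]; !ok || newer(mv.Version, cur) {
			selected[mv.Path] = mv.Version
		}
		queue = append(queue, reqs[mv]...)
	}
	return selected
}

func main() {
	a := modVer{"example.com/a", "v1.2.0"}
	b := modVer{"example.com/b", "v1.0.0"}
	// Module a requires d at v1.3.0 and module b requires d at v1.4.0.
	// A newer d v1.5.0 may exist, but no module in the graph requires it.
	reqs := map[modVer][]modVer{
		a: {{"example.com/d", "v1.3.0"}},
		b: {{"example.com/d", "v1.4.0"}},
	}
	selected := buildList([]modVer{a, b}, reqs)
	fmt.Println(selected["example.com/d"]) // prints v1.4.0
}
```

The result is v1.4.0: high enough to satisfy both requirements, and no higher than what some module in the graph explicitly asked for.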
The common perception of version numbers is that: the higher the version, the better the code. This may not always be true because semantic versioning is a social contract. Developers break this social contract everyday for one reason or the other. Breaking the contract means breaking builds.
The Go team believes the most stable version of any dependency is the one being listed by a module. The big argument against MVS is that security fixes can be missed since the latest greatest version is not being selected.
It’s best to upgrade the dependencies on a project often: at least once a month, though twice a month is better. When the dependencies get too far behind, the probability of running into an issue during an upgrade goes up.
There are two options available when updating dependencies. The first option is to only upgrade the direct dependencies and let MVS upgrade the remaining dependencies. The second option is to upgrade all dependencies to their latest greatest. Updating all the dependencies makes it easier to break the build, but if the build is successful and the project’s tests pass, that is a good first sign the upgrade did not affect anything negatively.
2.6.1 Upgrade via command line
The Go tooling has the capability to upgrade dependencies via the command line.
$ go get -u -v ./...
go: downloading github.com/go-playground/validator/v10 v10.14.1
go: downloading github.com/grpc-ecosystem/grpc-gateway/v2 v2.16.0
go: downloading github.com/sirupsen/logrus v1.9.3
go: upgraded github.com/go-playground/validator/v10 v10.14.0 => v10.14.1
go: upgraded github.com/grpc-ecosystem/grpc-gateway/v2 v2.15.2 => v2.16.0
go: upgraded github.com/sirupsen/logrus v1.9.2 => v1.9.3
I’m running a command to upgrade all of my direct and indirect dependencies.
Contact Bill Kennedy at [email protected] if you are having issues getting the project running.