Internal Tooling: From Script To Well-Equipped

The Problem

Over the past few years, I’ve been intimately involved with the internal tools at my company, specifically those used by product support and other technical teams to aggregate/lookup data. Things like collecting customer info from a database, pulling logs from 1500 cloud servers, or using the Zendesk Talk API to switch everyone to offline at 5PM because they haven’t implemented desk hours yet.

This article describes some of my main concerns with our current process/workflow, and explains the software I designed to bring us into the modern age and do things "right". The software in question is a suite of tools collected into a single CLI interface, with a project layout designed not only to solve the technical challenges, but to make those with minimal Go experience feel at home quickly with how it works.

Let's begin!

After being responsible for maintaining the old tools for just a short while, it quickly became apparent that something needed to change. Everything we had was in shell scripts, operating at the extreme edge of what shell scripts are capable of, and they had become basically unreadable. Without clogging up this article with irrelevancy, here are the greatest hits for the things that troubled me most:

  1. Completely avoidable use of sed, some of which looked like this: s/^(([0-9A-Za-z]{1,}-?){1,}):?.([0-9\.]{7,15})[^\[]+(\[.*)$/\4 -- \3 ${type}=\1/
  2. Two separate bash libraries that spread the work a script does across sometimes unmanageable distances. For example, a shell script might make a call that goes 5 functions deep, across two different libraries. That’s not too bad, unless it’s all written in bash and you’re trying to identify a bug.
  3. Zero logging. If, in the chaos of what I mentioned in #2, something broke deep in the chain of function calls, you typically just got garbage output and needed to add logging as you troubleshot, to trace the error to a specific point... and if the issue was with one of the many horrible regex strings, it was wise not to have lunch plans.
  4. The scripts were extremely fragile, and were the definition of spaghetti code. All it takes is for someone to sneeze on them, and you’re spending your day tracing back a regex that broke, with only weird or bad output to guide you. Sometimes the issue wouldn’t even be with the script itself, but with another tool entirely, one that we’re just calling. Speaking of logging and errors...
  5. No unit tests. Not a single one. I think this point can speak for itself.
  6. Duplicate repositories exist because we have different systems where tools are required. This means a lot of the code is duplicated across these repos. If we fix a bug in one repo’s library, it needs to be replicated for the other repos too.

These workflow problems (and some not mentioned here) were something I wanted to fix: the tooling needed to be modernised and improved to facilitate where we were going as a company. We needed a better solution, one that was easy to iterate on and build upon. The purpose of this article is to demonstrate my approach and solution to the issues (and more...) listed above.

As I started thinking about the “core” problem, there were many values I wanted to live by. A lot were basic software engineering principles, but I wanted to take some of them a step further in simplicity. The idea I was rolling with was a "suite" of tools, consolidated into a single application that could be deployed to multiple systems. I understood Go and knew it would be perfectly suited to my needs; however, Go wasn’t widely used in the company, and management rightfully raised a concern about what would happen to the project if I left the company or otherwise moved on. This meant I had to make the process of learning, iterating on, and understanding the code as easy as possible, through both documentation and design.

To explain things as clearly as possible, what I will do below is outline the core values that drove me, and demonstrate how I designed the application to solve each problem. The program I’ll be using for this is a sanitised and stripped-down version of my actual application, and is aptly named “building-tool-suite”, or “bts” for short. All of the code supporting this article can be found here: https://github.com/DiscoRiver/building-tool-suite

The Design

DISCLAIMER: All of the code shown below serves the purpose of demonstrating a design only.

Implementing custom configs

With our previous solution, custom configs were not possible; every command had to be run ad-hoc, which was a lot of effort for power users, who ended up with command aliases all over the place and not much flexibility. So I decided to closely tie flags and config items together using Viper and Cobra. Thanks to those libraries, the priority for which value gets used is as follows:

CLI Flag --> Config Item --> Default

So if someone has a custom config, they can continue to use flags ad-hoc, but can now define their default/general state through the config. They can even omit the flag entirely from both the command and the config, and the program will simply use a default value. To get an idea of how this looks in code, here is an example from cmd/root.go:

var (
	// Config keys. I use "core" for program-wide config items.
	DebugConfigKey = "core.Debug"

	// Custom config?
	configFileFlag string

	// Hold debug log value so we can assign it properly.
	debugLogs = false
)

func init() {
	// Get our config before anything else.
	cobra.OnInitialize(initConfig, initLogger)

	rootCmd.PersistentFlags().StringVar(&configFileFlag, "config", "", "Custom bts config")
	rootCmd.PersistentFlags().BoolVar(&debugLogs, "debug", false, "Enable debug logging")

	viper.BindPFlag(DebugConfigKey, rootCmd.PersistentFlags().Lookup("debug"))

	// Set config defaults.
	viper.SetDefault(terminator.DBTableHeaderForegroundColourConfigKey, "white")
	viper.SetDefault(terminator.DBTableHeaderBackgroundColourConfigKey, "")
}

func initLogger() {
	log.Verbose = viper.GetBool(DebugConfigKey)
}

func initConfig() {
  // Redacted
}
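
The repo’s initConfig is redacted, but to give a sense of its shape, here is a minimal sketch of what a typical Viper version might contain. The .bts filename and YAML format are my assumptions; only configFileFlag comes from the real code above:

func initConfig() {
	if configFileFlag != "" {
		// Use the custom config passed via --config.
		viper.SetConfigFile(configFileFlag)
	} else {
		// Otherwise look for a default config in the home directory.
		if home, err := os.UserHomeDir(); err == nil {
			viper.AddConfigPath(home)
			viper.SetConfigName(".bts")
			viper.SetConfigType("yaml")
		}
	}

	// A missing config file is fine; flags and defaults still apply.
	_ = viper.ReadInConfig()
}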

There are a couple of design choices worth mentioning here. Firstly, all config keys are declared up front, and should be declared as close as possible to where they are used (meaning wherever Viper is called to assign the config/flag value, such as viper.GetBool()). Typically I define a different config key prefix for each tool/package, while reserving core (or something similarly generic) for program-wide settings, such as the core.Debug we set in the code above.

Note: You may be wondering why the initLogger function is in root.go and not in the log package. That’s because the log package lives within pkg and is a generic log package (albeit extremely simple for this example), so adding Viper flags there would not make sense; instead, we define it as core in the cmd package.

Within init() we’re setting our initConfig and initLogger functions to run when the command is executed. Something to note about cobra.OnInitialize is that anything you configure or set within the functions passed to it won’t be available until the command is executed. For example, because initLogger isn’t run until the command executes, log.Verbose won’t be set correctly inside init(), so we can’t realistically use our logging package there. However, if init() is failing, we likely want an error on stdout anyway, so this hasn’t been a huge deal for me, and the simplicity of this design outweighs that particular consideration.

Next we’re defining two persistent flags, one for config and one for debug/verbose logging. I won’t go into detail about flags here, but here is the interesting part: we’re using viper.BindPFlag to bind the debug flag to the config key core.Debug. This means that when we call viper.GetBool(DebugConfigKey), we get one of 3 possible values: the flag, the config item, or the default value, in that order of priority. This is useful for us because a lot of the config and flag plumbing is abstracted away within Viper/Cobra, leaving us with the simplicity we’re looking for.
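
To make the precedence concrete, suppose a user’s config file sets core.Debug to true (a hypothetical setup, since initConfig is redacted):

bts getsomefile --file notes.txt                 # no flag: the config wins over the default, debug logging on
bts getsomefile --file notes.txt --debug=false   # an explicit flag wins over the config, debug logging off

With no config entry and no flag at all, the default (false) applies.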

Finally, we’re defining a couple of default values for our terminator package, which is located within internal. Packages there are closely integrated with our application, so it’s acceptable to use our Viper configuration in them.

Sub-Commands

Firstly, we need to define our root command, which is basically how the application is initiated from the terminal. Here is what we define in root.go (you’ll notice the previously discussed flags were assigned to this root command):

var (
	rootCmd = &cobra.Command{
		Use:   "bts",
		Short: "building-tool-suite (bts) is a sample program outlining the design of internal tools.",
		Long:  "building-tool-suite (bts) is a sample program outlining the design of internal tools into an easy-to-understand and maintainable package.",
	}
)

func Execute() error {
	return rootCmd.Execute()
}

Our Execute function is simply how we initiate the app. You can see this is all we call in main.go.
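
Before the build-tag logic discussed later comes into play, a minimal main.go needs nothing more than this (handling the returned error with a non-zero exit is my own addition, not shown in the repo’s main):

package main

import (
	"os"

	"github.com/discoriver/building-tool-suite/cmd"
)

func main() {
	// Cobra has already printed the error by this point;
	// we just set a non-zero exit code.
	if err := cmd.Execute(); err != nil {
		os.Exit(1)
	}
}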

Our rootCmd variable is our main application command; it’s where we add global flags and all sub-commands, and it’s what we execute when the application runs. Cobra takes care of all our documentation for flags etc., which keeps the program super simple: we just define our parameters in the cobra.Command type, and they are reflected in the terminal output.
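
You can see this by running bts --help, which prints something like the following (abridged and approximate):

Usage:
  bts [command]

Available Commands:
  getsomefile    Print out the bytes of the provided file.
  help           Help about any command
  parsesomequery Parse and print the results of a sample DB query.

Flags:
      --config string   Custom bts config
      --debug           Enable debug logging
  -h, --help            help for bts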

So, for our sub-commands, there are some consistent design choices we need to look at. Firstly, here is how we’re defining our two sub-commands, which are extremely simple examples:

// getSomeFile.go

var (
	getSomeFileFlags tools.GetSomeFileCommandFlag

	getSomeFileCmd = &cobra.Command{
		Use:   "getsomefile",
		Short: "Print out the bytes of the provided file.",
		Run: func(cmd *cobra.Command, args []string) {
			getSomeFileFlags.GetSomeFile()
		},
	}
)

// parseSomeDBQuery.go

var (
	parseSomeDBQueryFlags tools.ParseSomeDBQueryCommandFlag

	parseSomeDBQueryCmd = &cobra.Command{
		Use:   "parsesomequery",
		Short: "Parse and print the results of a sample DB query.",
		Run: func(cmd *cobra.Command, args []string) {
			parseSomeDBQueryFlags.ParseSomeDBQuery()
		},
	}
)

One key design choice I like here is the simplicity of how we call our tools. Within each individual tool’s package or file, we define its command flags as a struct, to make them easy to pass around. Since our examples are so small, here is our tools/getSomeFile.go tool for reference:

// tools/getSomeFile.go

package tools

import (
	"fmt"
	"github.com/discoriver/building-tool-suite/pkg/file"
	"github.com/discoriver/building-tool-suite/pkg/log"
)

type GetSomeFileCommandFlag struct {
	FilePath string
}

func (cmd *GetSomeFileCommandFlag) GetSomeFile() {
	if bytes, err := file.GetFileBytesPtr(cmd.FilePath); err != nil {
		log.Error("GetSomeFile:: Error reading file bytes: %s", err)
	} else {
		fmt.Print(bytes)
	}
}

I love this approach because it gives us a consistent place to put our flags when we assign them. Usually, I design each tool to be initiated with a single function on its command flag struct, which then does the work and passes things around to other processes as necessary. To keep our command structure as clean as possible, I think this is the best approach. Now, let’s take a look at our init function for getSomeFileCmd:

// cmd/getSomeFile.go

func init() {
	rootCmd.AddCommand(getSomeFileCmd)

	getSomeFileCmd.Flags().StringVarP(&getSomeFileFlags.FilePath, "file", "f", "", "Path to file")

	// Required flags.
	getSomeFileCmd.MarkFlagRequired("file")
}

An important distinction here is that for sub-commands we specify getSomeFileCmd.Flags() rather than getSomeFileCmd.PersistentFlags(), because these are flags that should only be available to the sub-command, whereas persistent flags are available globally, which is why we use them for rootCmd in cmd/root.go.

Additionally, for this sub-command, we’re using getSomeFileCmd.MarkFlagRequired to declare that a flag is required. Cobra handles all of the error handling for us, so if the flag isn’t present, the program will error, specifying that the flag is required. Just another convenience we don’t need to muddy up our program with, which means we’re still hitting super simplicity. These sub-commands can get pretty beefy when you start building complex tooling, but as long as we follow the same format for our sub-command layout (especially when it comes to init), we should be able to maintain them easily. Typically I go in this order in my init function for sub-commands (a full sketch follows the list):

  • rootCmd.AddCommand() to assign the sub-command to the root.
  • Define command flags.
  • Define required flags.
  • Define config/flag bindings.
  • Define default viper config values.
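
Here is a hypothetical sub-command init exercising all five steps; every "sometool" name and the config key are invented for illustration:

var (
	// Hypothetical config key, following the per-package convention.
	SomeToolColourConfigKey = "sometool.Colour"

	someToolFlags struct {
		ID     string
		Colour string
	}

	someToolCmd = &cobra.Command{
		Use:   "sometool",
		Short: "A made-up command to demonstrate init ordering.",
		Run:   func(cmd *cobra.Command, args []string) { /* do the work */ },
	}
)

func init() {
	// 1. Assign the sub-command to the root.
	rootCmd.AddCommand(someToolCmd)

	// 2. Define command flags.
	someToolCmd.Flags().StringVarP(&someToolFlags.ID, "id", "i", "", "Record ID")
	someToolCmd.Flags().StringVar(&someToolFlags.Colour, "colour", "", "Output colour")

	// 3. Define required flags.
	someToolCmd.MarkFlagRequired("id")

	// 4. Define config/flag bindings.
	viper.BindPFlag(SomeToolColourConfigKey, someToolCmd.Flags().Lookup("colour"))

	// 5. Define default viper config values.
	viper.SetDefault(SomeToolColourConfigKey, "white")
}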

Some people may prefer to wrap rootCmd.AddCommand() in a defer statement, and it would make logical sense to add the sub-command only once we’ve fully defined it, to make things more robust long-term, but it doesn’t actually add anything to what we’re doing here, so I haven’t.

If you find your requirements are more complex, maybe you’d want to make some changes to the flow or design of the sub-commands. For me, however, I used this design as a way to demonstrate something specific, which is the advantage of Go over our previous script repositories, so I need to be careful about complexity for non-Go people.

Database queries

A lot of data is gathered from our databases, and as such I wanted to make my db package as intuitive as possible. There is some code I want to keep private for good reason, but basically as part of an initialisation, we read in a .pgpass file before any execution starts, and test the db connection. If it fails, the application terminates and doesn’t initiate any actual work.

As for initiating queries: from a tool’s perspective, all it needs to do is run the actual query; there is zero DB configuration needed as far as the tools are concerned. As an example, let’s look at our entire tools/parseSomeDBQuery.go file:

// tools/parseSomeDBQuery.go

package tools

import (
	"github.com/discoriver/building-tool-suite/internal/db"
	"github.com/discoriver/building-tool-suite/internal/terminator"
	"github.com/discoriver/building-tool-suite/pkg/log"
)

type ParseSomeDBQueryCommandFlag struct {
	ID string
}

func (cmd *ParseSomeDBQueryCommandFlag) ParseSomeDBQuery() {
	rows, err := db.QSomeDBQuery(cmd.ID)
	if err != nil {
		log.Error("ParseSomeDBQuery:: Couldn't get results: %s", err)
		// Don't continue with a nil result.
		return
	}

	log.Debug("ParseSomeDBQuery:: Parsing %d rows", len(*rows))

	printer := terminator.NewTableOutput()
	printer.Print(rows)
}

You’ll see that all we need to do in our tools to initiate the DB query is call db.QSomeDBQuery(cmd.ID), because it handles all of our connection and row parsing. I prefix all of my query functions with Q so they’re easy to identify with autocomplete in my IDE (we have a lot of them). The db package itself is also pretty simple. I have the following function to connect to the DB (obviously it’s garbage for this example):

// internal/db/db.go

func connect() *sqlx.DB {
	var d *sqlx.DB
	
	// Connect to DB, exit program on failure.

	return d
}
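
In the real application, connect is only a few lines thanks to sqlx. Here’s a sketch of the general shape, assuming the lib/pq Postgres driver; the DSN is a placeholder for the details assembled from .pgpass:

// Sketch only; requires importing os, github.com/jmoiron/sqlx,
// and the driver: _ "github.com/lib/pq".
func connect() *sqlx.DB {
	d, err := sqlx.Connect("postgres", "host=localhost dbname=bts sslmode=disable")
	if err != nil {
		// Exit the program on failure, as described above.
		log.Error("db:: Connection failed: %s", err)
		os.Exit(1)
	}
	return d
}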

And our actual query function looks like this:

type SomeDBQueryColumns struct {
	Name string `db:"name" header:"name"`
	Type string `db:"type" header:"type"`
}

func QSomeDBQuery(ID string) (*[]SomeDBQueryColumns, error) {
	query := `select
				name
				,type
				from some_table
				where ID = $1`

	log.Debug("SQL:: Query: %s", query)

	db := connect()

	var SomeQueryResult []SomeDBQueryColumns
	if err := db.Select(&SomeQueryResult, query, ID); err != nil {
		return nil, err
	}
	return &SomeQueryResult, nil
}

So you see that the query function handles all the db connections, result parsing etc.

A couple of points to note about the design:

  • Every single query has its own struct, containing the desired return columns. Map these using the db tag, and use the header tag to name the table column when printing with tablewriter.
  • We always log the SQL query so we can see what’s running through the logs. Internally this is okay, and we want users to be able to see exactly what’s running so they’re able to go and run it themselves directly at the DB if something goes wrong with the tool (shit happens, I guess).
  • Don’t forget to use sql.Null<type> as the struct field type when a returned column might be null (see the sketch after this list). I’m mentioning it here because I forget more often than I’d like to admit. Don’t be like me.
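
As a concrete (and entirely hypothetical) example of that last point, if some_table had a nullable notes column, the struct and a safe read would look like this (requires database/sql):

type SomeNullableColumns struct {
	Name  string         `db:"name" header:"name"`
	Notes sql.NullString `db:"notes" header:"notes"` // may be NULL in the DB
}

// Check Valid before using the value.
func notesOrPlaceholder(c SomeNullableColumns) string {
	if c.Notes.Valid {
		return c.Notes.String
	}
	return "-"
}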

Again, this package can be way more complex, and is really dumbed down here so I can talk about it without getting into a bunch of unrelated stuff. For example I’m not closing the DB, I’m arguably building the queries insecurely, and I’m re-using too much code.

Build tags

This is the last important design feature I want to touch upon. In some cases, there could be a bunch of systems we need to run a tool suite on. For example, you may build certain tools that can be run locally, without needing to be on, let’s say, a jump box. Rather than creating separate repositories, we can build our Go program to include only certain tools in certain builds. This is where Cobra really demonstrates its value to us.

Jump boxes are usually accessing secure systems, and in my case initialisation was needed (for .pgpass, among other things), so I decided to build a jump box package that did some initialisation work. To make sure this only ran with the jump box build, I did a couple of things (this isn’t in the repo).

Firstly, I created the file build_jump.go in main (I could create a similar file for all other builds too, like build_local.go), with the following contents:

// +build jump

package main

var STATE = "jump"

Then my main.go would look like this:

package main

import (
	"github.com/discoriver/building-tool-suite/cmd"
	"github.com/discoriver/building-tool-suite/internal/jumpbox"
)

func main() {
	if STATE == "jump" {
		jumpbox.Initialise()
		cmd.Execute()
	} else {
		cmd.Execute()
	}
}

This way, we’re only initialising the jump box when we’re using that build.
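
Building each variant is then just a matter of passing the right tag when compiling (assuming a build_local.go counterpart defines STATE for local builds):

go build -tags jump -o bts .
go build -tags local -o bts .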

As I mentioned, we also want to include certain commands only in certain builds. Because we only ever call our tools from within the cmd package, we only need to specify build tags for each tool there, and nowhere else. If rootCmd.AddCommand() never gets called because the command isn’t part of the build tags we’re specifying, the tool’s code is never reached, and won’t exist in the final binary. There will be no documentation for it, and no evidence it exists. This makes the builds trivial to maintain and update.
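
To illustrate, a hypothetical jump-box-only command file in cmd needs nothing more than the tag at the top; everything else follows the same sub-command layout as before:

// +build jump

// cmd/someJumpTool.go -- hypothetical, not in the repo.

package cmd

import "github.com/spf13/cobra"

var someJumpToolCmd = &cobra.Command{
	Use:   "somejumptool",
	Short: "Only exists in builds made with -tags jump.",
	Run: func(cmd *cobra.Command, args []string) {
		// Tool work goes here.
	},
}

func init() {
	// Never compiled into other builds, so the command, its help text,
	// and its code simply don't exist there.
	rootCmd.AddCommand(someJumpToolCmd)
}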

Conclusion

So, this is my design; I hope it makes sense. I’m still getting used to writing about my ideas in a structured and useful way, so I may update this article if I come across a mistake, but I don’t expect many people will see it. If you did, however, I hope it was able to help you somehow.

Keep on solving problems! Peace.