Stata with Make#

GNU Make is a popular build automation tool widely used in software development. It is free, open-source, and cross-platform (Linux, macOS, and Windows). Make is language-agnostic: its recipes are mostly just shell commands, so it can drive any program that can be run from the command line. This makes GNU Make useful for Stata workflows, even though it was not designed with Stata in mind. Assuming familiarity with Make, the core issue is being able to invoke Stata as a shell command.

Statement of need#

Why automate a build, including a pipeline in Stata? So that I don’t feel like strangling myself when, six weeks or six months later, I hit the limits of my memory about which do file to run to get what, and what goes where. Why Make? The original Make dates from 1976, so it is almost 50 years old, yet GNU Make is still in development (GNU Make 4.3 came out in 2020 and added grouped targets) and can be expected to keep working for a long time.

Referencing Stata as a command#

Ignoring quirks and syntax specific to GNU Make (not what this note is about, another note perhaps), the first thing to do is to set up a variable to reference the path to the Stata executable.

Makefile#
STATA_PATH := "C:\Program Files (x86)\Stata13\StataMP-64.exe"

Change the path to match the location of the Stata executable on your machine. The := operator defines a simply expanded (non-recursive) variable: the right-hand side is expanded once, at the point where STATA_PATH is defined. The double quotes “…” are there because the path contains spaces.
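
To make the difference between the two assignment operators concrete, here is a minimal sketch. The variable names (VERSION, LAZY, EAGER) and the show target are hypothetical and only serve the illustration; they are not part of the workflow in this note.

```makefile
# Hypothetical variables, used only to illustrate = versus :=.
VERSION = 13

# Recursive (=): $(VERSION) is looked up every time LAZY is used.
LAZY = Stata$(VERSION)

# Simple (:=): $(VERSION) is expanded once, at this point in the file.
EAGER := Stata$(VERSION)

# Redefining VERSION afterwards only affects the recursive variable.
VERSION = 18

.PHONY: show
show:
	@echo "lazy:  $(LAZY)"
	@echo "eager: $(EAGER)"
```

Running make show on this file should report Stata18 for LAZY and Stata13 for EAGER, which is why := is the safer choice for a fixed path like STATA_PATH.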

I can now define a variable that can be referenced whenever I want to execute Stata on some do file using the executable:

Makefile#
EXECSTATA := $(STATA_PATH) -e do

The -e option runs Stata in batch mode and exits when the do file finishes, without prompting me to click OK for every task. I can now execute some.do by calling $(EXECSTATA) some.do. A -b option is also available; it runs the do file in the same way but prompts the user to click OK in Stata for every do file that is run.

Make with Stata#

If all I have is a single main.do file that logs everything in session.log, I can use the log file as the target.

Makefile#
session.log: main.do
	$(EXECSTATA) $<
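
The same idea can be generalised with a pattern rule. The sketch below assumes that Stata in batch mode writes a log named after the do file (for example, clean.do produces clean.log in the working directory); if your do files open their own logs under other names, the explicit rule above is the better fit. The file name clean.do is hypothetical.

```makefile
# Same assumed Windows path as in this note; adjust as required.
STATA_PATH := "C:\Program Files (x86)\Stata13\StataMP-64.exe"
EXECSTATA := $(STATA_PATH) -e do

# Build any <name>.log from <name>.do.
# Assumption: batch mode writes <name>.log next to where Stata is run.
%.log: %.do
	$(EXECSTATA) $<
```

With this in place, make clean.log would run clean.do only if clean.log is missing or older than clean.do.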

If main.do also calls other do and ado files, I can collect those in variables and chuck them into the dependencies.

Makefile#
SRC_DO := $(wildcard *.do)
SRC_ADO := $(wildcard *.ado)

session.log: main.do $(SRC_DO) $(SRC_ADO)
	$(EXECSTATA) $<

The automatic variable $< expands to the first dependency only, which guarantees that I run just main.do even though there are other dependencies. I can do the same if I depend on some input data files. That’s it. The main catch is referencing the Stata executable.
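
As a sketch of the data case, the rule below adds input data files as dependencies and echoes the automatic variables so the difference between $< and $^ is visible. The data/ directory and the .dta pattern are assumptions about the project layout, not something this note prescribes.

```makefile
# Same assumed Windows path as in this note; adjust as required.
STATA_PATH := "C:\Program Files (x86)\Stata13\StataMP-64.exe"
EXECSTATA := $(STATA_PATH) -e do

SRC_DO := $(wildcard *.do)
SRC_ADO := $(wildcard *.ado)
# Hypothetical layout: input data sets live in data/ as .dta files.
DATA := $(wildcard data/*.dta)

session.log: main.do $(SRC_DO) $(SRC_ADO) $(DATA)
	@echo "first dependency: $<"
	@echo "all dependencies: $^"
	$(EXECSTATA) $<
```

Any change to a do file, an ado file, or a data set makes session.log out of date, yet only main.do is passed to Stata. Note that $^ lists each dependency once, even though main.do also matches the *.do wildcard.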

Everything together: A Makefile for a minimal Stata workflow#

Makefile#
.DEFAULT_GOAL := help

STATA_PATH := "C:\Program Files (x86)\Stata13\StataMP-64.exe"
EXECSTATA := $(STATA_PATH) -e do
SRC_DO := $(wildcard *.do)
SRC_ADO := $(wildcard *.ado)
DATA := ...

session.log: ## Run main.do and produce session.log
session.log: main.do $(SRC_DO) $(SRC_ADO) $(DATA)
	@echo "==> $@"
	$(EXECSTATA) $<

# "all" lists session.log as a dependency; it needs no recipe of its own.
.PHONY: all
all: session.log ## Run all do files

.PHONY: help
help: ## Show this help message and exit
	@grep -E '^[a-zA-Z_-]+:.*?## .*$$' $(MAKEFILE_LIST) | sort | awk 'BEGIN {FS = ":.*?## "}; {printf "\033[36m%-16s\033[0m %s\n", $$1, $$2}'

Tabs

One gotcha is that recipe lines must be indented with tabs, not spaces. In some editors and IDEs, automatic conversion of tabs to spaces needs to be switched off for Makefiles. Otherwise, Make will stop with an error like this: Makefile:10: *** missing separator.  Stop.
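
If changing the editor settings is not an option, one possible workaround is the .RECIPEPREFIX special variable, which replaces the tab with another character. This is only a sketch: it needs GNU Make 3.82 or later, so older versions (such as the 3.81 that ships with macOS) will still fail with a missing separator error.

```makefile
# Requires GNU Make 3.82 or later.
# From here on, recipe lines start with ">" instead of a tab.
.RECIPEPREFIX := >

# Same assumed Windows path as in this note; adjust as required.
STATA_PATH := "C:\Program Files (x86)\Stata13\StataMP-64.exe"
EXECSTATA := $(STATA_PATH) -e do

session.log: main.do
> $(EXECSTATA) $<
```

The trade-off is that the Makefile looks unfamiliar to other Make users, so plain tabs remain the more portable choice.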

To see the help:

Bash#
make help
stdout#
all              Run all do files
help             Show this help message and exit

I can now just type make all to run all the do files.

Bash#
make all

Closing note

Make is not the only build automation tool available. Even for Stata specifically, Picard wrote PROJECT, a build automation tool available from SSC.
