Using Stata with Markdown

What are your options?

Original collage by Emmanuel Teitelbaum

Frequently when I am working in Stata, I find myself really missing the key features of R Markdown, like the ability to intersperse code with text and share my notes with other people in an attractive dynamic HTML or PDF document. I also find it really helpful from a workflow standpoint to be able to run code snippets in the text editor and to preview the document that I am writing in real time like you can in an R Notebook.

At one point, I thought my solution would just abandon Stata entirely for R. But I find that I still need Stata for certain kinds of analysis, and for some projects there is enough inertia that it makes sense to just keep doing them in Stata.

A little while back, though, I found I just couldn’t stand working in Stata’s .do file editor anymore. So I started a quest to figure out how I could best integrate Stata with Markdown in other environments. Here are some of the options I came across.

Hydrogen in Atom

I really love this setup. Atom is such a cool text editor. You can edit almost any language or document type, the color schemes are attractive and the keyboard shortcuts really help with efficiency.

The best thing about Atom is that you can use the Hydrogen package to run code interactively. You can even run code for multiple kernels/languages in the same document.

To create an interactive document with Stata, you need to install Kyle Barron’s stata_kernel, the Language Stata package and the Language Markdown package. stata_kernel is the Jupyter kernel for Stata that allows the code to run interactively, Language Stata provides Stata lanugage support, and Language Markdown provides support for Markdown (including R Markdown). I also installed Markdown Preview Plus (MPP), which provides a live updated preview of your document.

In case you are not familiar with Atom, each Jupyter kernel that you use is going to be installed in a slightly different way. For the stata_kernel, follow the instructions that Kyle Barron provides. You install Atom packages in Atom by hitting ctrl + shift + p in Windows/Linux or cmd + shift + p in macOS and typing install packages in the search field.

Once you have everything set up, you will be able to intersperse your code with text, run the code interactively, and preview the resulting document like this:

The only shortcoming here is that you cannot easily export the code along with the text to a shareable HTML or PDF document. For this, you can open your Markdown document in R and use the Statamarkdown package.

Statamarkdown in R

With Doug Hemken’s Statamarkdown, you can knit your .Rmd or .RMarkdown file in the usual way to create a document like an .html or .pdf or a blog post. There is a nice tutorial on how to use it here.

At the time I am writing this post, Statamarkdown is good for producing documents but does not work for running code interactively in a notebook. Also, Statamarkdown does not automatically remember what code you ran from one chunk to the next. In order to run a code chunk sequentially that builds on the previous chunk, you have to enable the collectcode = TRUE option. Here is what the output looks like:

  sysuse auto, clear
  summarize mpg weight
  regress mpg weight
    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
         mpg |         74     21.2973    5.785503         12         41
      weight |         74    3019.459    777.1936       1760       4840

      Source |       SS           df       MS      Number of obs   =        74
-------------+----------------------------------   F(1, 72)        =    134.62
       Model |   1591.9902         1   1591.9902   Prob > F        =    0.0000
    Residual |  851.469256        72  11.8259619   R-squared       =    0.6515
-------------+----------------------------------   Adj R-squared   =    0.6467
       Total |  2443.45946        73  33.4720474   Root MSE        =    3.4389

------------------------------------------------------------------------------
         mpg | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
      weight |  -.0060087   .0005179   -11.60   0.000    -.0070411   -.0049763
       _cons |   39.44028   1.614003    24.44   0.000     36.22283    42.65774
------------------------------------------------------------------------------

Statamarkdown creates a bunch of .do and .log files that you have to back and clean up afterwards. Despite these limitations and minor hassles, Statamarkdown does achieve the desired objective of allowing you to produce Stata ouput in an HTML or PDF document.

Markstat in Stata

Germán Rodriguez’s markstat is probably the best option if you want to produce a dynamic document but stay completely in the realm of Stata. With markstat you intersperse Markdown annotations with Stata code like this:

# Stata and Markdown

Write some Markdown-formatted text and see what happens.

## Run Stata Code

Now try running some Stata code:

  sysuse auto, clear
  summarize mpg weight
  regress mpg weight

## To Do List 

That was a great analysis. Next we will do the following:

1. One thing
2. Two thing
3. Red thing
4. Blue thing

Etc....

The code gets identified with indentations rather than back ticks. You then need to save it as a script (.stmd) file and then process the file by running the markstat command in Stata. You also need to have Pandoc installed.

markstat definitely produces attractive documents and slides and is a better solution than Statamarkdown in R if that is all you need to do.

Other Solutions

There are a few other solutions I looked at but did not end up not using.

Stata is promoting its pystata Python package, which allows you to run Stata in an IPython environment like Jupyter notebooks. There is also Stata’s dyndoc command, which converts a text file into an HTML file or Word document.

I also tried using pystata in conjunction with the reticulate package in R, which I definitely do not recommend!

I hope you find a Stata/Markdown solution that works for you. Let me know what you choose!

Associate Professor of Political Science and International Affairs

My research interests include comparative political economy, data science and the politics of South Asia.