Pandoc is a great tool for document conversion and generation. I’ve written a lot of training course and documentation content over the years, and have often used Pandoc to convert Markdown (.md) content to a Microsoft Word (.docx) document. This is especially helpful because I’ve found that writing in Markdown works well with version control in Git; the required Word document can then be generated once everything is complete.

In this article, I’ll share the GitHub Actions workflow configuration needed to automate Pandoc for document generation. This automation removes the need to install Pandoc and its dependencies on your local machine. The workflow can be triggered manually, or set up to trigger automatically on commits, to generate the Word document from the Markdown content in the Git repository.

Additionally, I have found it useful to have the GitHub Action output the generated document to the Artifacts associated with the Workflow run once it is complete.

Let’s take a look at the required steps for the GitHub Action and then the full code!

GitHub Action Workflow Overview

The purpose of this GitHub Actions workflow is to convert a Markdown file, typically a README.md, into a Word document (.docx) using the Pandoc tool. The resulting Word document can be a valuable addition to your project documentation or used for other purposes.

Let’s break down the key components of the workflow:

Job Configuration

jobs:
  convert_via_pandoc:
    runs-on: ubuntu-20.04
    container: 
      image: pandoc/latex:latest
    steps:

The workflow contains a single job named convert_via_pandoc that runs on an Ubuntu 20.04 runner. The job is configured to use the latest Pandoc Docker image (pandoc/latex:latest), which already has Pandoc and its dependencies installed. This gives every run the same conversion environment without any install steps. Note that the latest tag moves as new images are published, so pin a specific image tag if you need strictly reproducible output.

Because the job specifies a container, the GitHub Actions steps run inside the Docker container that has Pandoc installed.
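As a possible variation on the job configuration, the sketch below swaps in the smaller pandoc/core image. This rests on the assumption that only .docx output is needed, since Word output does not go through LaTeX; it is not the configuration used in this article, and the LaTeX image remains the safer choice if PDF output might be added later.

```yaml
jobs:
  convert_via_pandoc:
    runs-on: ubuntu-20.04
    container:
      # assumption: pandoc/core is sufficient because docx output does not need LaTeX
      image: pandoc/core:latest
```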

Job Steps

Checkout Repo Code

The first step checks out the contents of the repository. This ensures that the workflow has access to the necessary files, including the Markdown file to be converted.

- uses: actions/checkout@v4

Run Pandoc

The second step runs the Pandoc command to convert the Markdown file (./README.md) into a Word document (gh-actions-pandoc-readme.docx). The --toc flag generates a table of contents in the resulting Word document.

- name: run pandoc
  run: |
    pandoc -f markdown -t docx -o "gh-actions-pandoc-readme.docx" --toc "./README.md"
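The same step can carry extra Pandoc options. The following is a minimal sketch, assuming a styled Word file named reference.docx has been committed to the repository root (a hypothetical file, not part of the original project). It uses --reference-doc to borrow styles from that file and --toc-depth to limit the table of contents to two heading levels.

```yaml
- name: run pandoc with a style template
  run: |
    # reference.docx is a hypothetical template file in the repo root
    pandoc -f markdown -t docx \
      --toc --toc-depth=2 \
      --reference-doc="reference.docx" \
      -o "gh-actions-pandoc-readme.docx" "./README.md"
```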

Output Word Doc to Artifacts

The final step uploads the generated Word document as a build artifact. This allows users to easily access and download the converted document after the workflow completes.

- uses: actions/upload-artifact@v3
  with:
    name: Word Doc
    path: "gh-actions-pandoc-readme.docx"
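The upload step also accepts a few optional inputs that can make failures easier to spot. The sketch below adds two of them; the 14-day retention value is an arbitrary example, not a recommendation from this article.

```yaml
- uses: actions/upload-artifact@v3
  with:
    name: Word Doc
    path: "gh-actions-pandoc-readme.docx"
    # fail the step if the .docx was not produced (the default only warns)
    if-no-files-found: error
    # assumption: 14 days is an example value; omit to use the repository default
    retention-days: 14
```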

Full GitHub Actions Code

Here’s the full code for a GitHub Actions workflow that will use Pandoc to convert the README.md file in the root of the Git repository to a Word Document, then output the generated document (named gh-actions-pandoc-readme.docx) to the Artifacts for the Workflow run.

name: Generate Word Doc

on:
  workflow_dispatch:

jobs:
  convert_via_pandoc:
    runs-on: ubuntu-20.04
    container: 
      image: pandoc/latex:latest

    steps:
      # checkout repo contents
      - uses: actions/checkout@v4

      # run pandoc to generate word doc from markdown
      - name: run pandoc
        run: |
          pandoc -f markdown -t docx -o "gh-actions-pandoc-readme.docx" --toc "./README.md" 

      # output generated file to build artifacts
      - uses: actions/upload-artifact@v3
        with:
          name: Word Doc
          path: "gh-actions-pandoc-readme.docx"

Additionally, the Build5Nines/gh-actions-pandoc GitHub project contains an implementation of this code.
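The workflow above only runs when triggered manually through workflow_dispatch. To also run it automatically, the on: block could be extended along the lines of the sketch below. It assumes the default branch is named main and that only changes to README.md should regenerate the document; adjust both to fit your repository.

```yaml
on:
  # keep the manual trigger from the original workflow
  workflow_dispatch:
  # assumption: the default branch is named "main"
  push:
    branches:
      - main
    paths:
      - "README.md"
  # also build the document for pull requests that touch the README
  pull_request:
    paths:
      - "README.md"
```

The paths filter keeps the workflow from running on commits that do not change the Markdown source.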

Download Word Doc from Workflow Artifacts

Once the GitHub Actions workflow has been run, the Artifacts for the workflow run will contain the generated Word document. To download the artifact, navigate to the GitHub workflow run, then click on the artifact created by the workflow. GitHub delivers it as a .zip file containing the .docx.


Conclusion

This GitHub Actions workflow automates the process of converting a Markdown file to a Word document using Pandoc. By incorporating this workflow into your project, you can streamline your documentation process and offer project information in a more accessible format. Triggering the workflow manually through the GitHub Actions interface gives you flexibility, so document updates can be managed as part of your development workflow. You can also set up an automatic trigger based on Git commits or pull requests to automate the document generation.

Chris Pietschmann is a Microsoft MVP, HashiCorp Ambassador, and Microsoft Certified Trainer (MCT) with 20+ years of experience designing and building Cloud & Enterprise systems. He has worked with companies of all sizes from startups to large enterprises. He has a passion for technology and sharing what he learns with others to help enable them to learn faster and be more productive.
