Literate Agda with Org-mode

2018/11/03 20 mins read Agda Org Emacs
Literate Agda with Org-mode

Literate Agda with Org-mode

Musa's article image
( Read as PDF or see the source )


Literate Programming is essentially the idea that code is enclosed in documentation rather than the comments being surrounded by code. The idea is that software ought to be written like an essay to be read by a human; from this code for the machine can be extracted.

The articles on this blog are meant to be in such a format and as such I use Org-mode as my markup for producing the HTMLs and PDFs.

This article aims to produce an Org-friendly approach to working with the Agda language, which is special in comparison to many other languages: Coding is interactive via holes and it permits almost any sequence of characters as a legal lexeme thereby rendering a static highlighting theme impossible.

The result of this Elisp exploration is that by C-x C-a we can shift into Agda-mode and use its interactive features to construct our program; then return to an Org-mode literate programming style afterwards with C-x C-o ---both translations remember the position we're working at!

1 Agda Syntax Highlighting With org-agda-mode

We produce a new mode in a file named org-agda-mode.el so that Org-mode blocks marked with org-agda will have Agda approximated syntax.

;; To use generic-mode later below.
(require 'generic-x)

1.1 Keywords

We look at the agda2-highlight.el source file from the Agda repository for colours of keywords and reserved symbols such as =, ∀, etc.

(defface agda2-highlight-keyword-face
  '((t (:foreground "DarkOrange3")))
  "The face used for keywords."
    :group 'font-lock-faces)

(setq font-lock-keyword-face 'agda2-highlight-keyword-face)

(defface agda2-highlight-symbol-face
  '((((background light)) (:foreground "gray25"))
    (((background dark))  (:foreground "gray75")))
  "The face used for symbols like forall, =, as, ->, etc."
  :group 'font-lock-faces)

From Agda's “read the docs” website, we obtain the keywords for the language:

(setq org-agda-keywords '("=" "|" "->" "→" ":" "?" "\\" "λ" "∀" ".." "..." "abstract" "codata"
                          "coinductive" "constructor" "data" "do" "eta-equality" "field"
                          "forall" "hiding" "import" "in" "inductive" "infix" "infixl"
                          "infixr" "instance" "let" "macro" "module" "mutual" "no-eta-equality"
                          "open" "overlap" "pattern" "postulate" "primitive" "private" "public"
                          "quote" "quoteContext" "quoteGoal" "quoteTerm" "record" "renaming"
                          "rewrite" "Set" "syntax" "tactic" "unquote" "unquoteDecl" "unquoteDef"
                          "using" "where" "with"))

1.2 The generic-mode Definition

Agda colouring is approximated as defined below, but a convention is made: Function symbols begin with a lower case letter, whereas type symbols begin with a capital letter. Otherwise, I would need to resort to Agda's mechanism for determining whether a name is a type or not:

Parsing is Typechecking!

    'org-agda-mode                      ;; name of the mode
    (list '("{-" . "-}"))               ;; comments delimiter
    ;; font lock list: Order of colouring matters; 
    ;; the numbers refer to the subpart, or the whole(0), that should be coloured.
     ;; To begin with, after "module" or after "import" should be purple
     ;; Note the SPACE below.
     '("\\(module\\|import\\) \\([a-zA-Z0-9\-_\.]+\\)" 2 '((t (:foreground "purple")))) 
     ;; Agda special symbols: as
     '(" as" 0 'agda2-highlight-symbol-face)
     ;; Type, and constructor, names begin with a capital letter  --personal convention.
     ;; They're preceded by either a space or an open delimiter character. 
     '("\\( \\|\s(\\)\\([A-Z]+\\)\\([a-zA-Z0-9\-_]*\\)" 0 'font-lock-type-face)
     '("ℕ" 0 'font-lock-type-face)
     ;; variables & function names, as a personal convention, begin with a lower case
     '("\\([a-z]+\\)\\([a-zA-Z0-9\-_]*\\)" 0 '((t (:foreground "medium blue"))))
     ;; colour numbers
     '("\\([0-9]+\\)" 1   '((t (:foreground "purple")))) ;; 'font-lock-constant-face)
     ;; other faces to consider:
     ;; 'font-lock-keyword-face 'font-lock-builtin-face 'font-lock-function-name-face
     ;;' font-lock-variable-name-face
     nil                                                   ;; files that trigger this mode
     nil                                                   ;; any other functions to call
    "My custom Agda highlighting mode for use *within* Org-mode."     ;; doc string

(provide 'org-agda-mode)

; (describe-symbol 'define-generic-mode)
; (describe-symbol 'font-lock-function-name-face)

I do not insist that org-agda-mode be activated on any particular files by default.

2 (lagda-to-org) and (org-to-lagda)

Agda would not typecheck a non-lagda, or non-agda, file therefore I cannot use Org-mode multiple mode settings. I will instead merely swap the syntax of the modes then reload the desired mode. –It may not be ideal, but it does what I want in a fast enough fashion.

Below we put together a way to make rewrites ⟨pre⟩⋯⟨post⟩ ↦ ⟨newPre⟩⋯⟨newPost⟩ then use that with the rewrite tokens being #+BEGIN_SRC and |begin{code} for literate Agda, as well as their closing partners.

;; “The long lost Emacs string manipulation library”
(require 's)

(defun strip (pre post it)  
  "A simple extraction: it = ⟨pre⟩it₀⟨post⟩ ↦ it₀." 
  (s-chop-prefix pre (s-chop-suffix post it)) )

(defun rewrite-ends (pre post newPre newPost)
  "Perform the following in-buffer rewrite: ⟨pre⟩⋯⟨post⟩ ↦ ⟨newPre⟩⋯⟨newPost⟩.
  For example, for rewriting begin-end code blocks from Org-mode to something
  else, say a language's default literate mode.

  Warning: The body, the “⋯”, cannot contain the `#` character.
  I do this so that the search does not go to the very last occurence of `#+END_SRC`;
  which is my primary instance of `pre`.

  In the arguments, only symbol `\` needs to be escaped.

  Implementation: Match the pre, then any characteer that is not `#`, then the post.
  Hence, the body cannot contain a `#` character!
  In Agda this is not an issue, since we can use its Unicode cousin `♯` instead.
  (let* ((rxPre     (regexp-quote pre))
         (rxPost    (regexp-quote post))
         (altered (replace-regexp-in-string (concat rxPre "\\([^\\#]\\|\n\\)*" rxPost)
                  (lambda (x) (concat newPre (strip pre post x) newPost))
                  (buffer-string) 'no-fixed-case 'new-text-is-literal)))
      (insert altered)

The two rewriting utilities:

(defun lagda-to-org ()
  "Transform literate Agda blocks into Org-mode source blocks.
   Use haskell as the Org source block language since I do not have nice colouring otherwise.
  (let ((here (line-number-at-pos))) ;; remember current line
    (rewrite-ends "\\begin{code}\n" "\n\\end{code}" "#+BEGIN_SRC org-agda\n" "\n#+END_SRC")
    (rewrite-ends "\\begin{spec}\n" "\n\\end{spec}" "#+BEGIN_EXAMPLE org-agda\n" "\n#+END_EXAMPLE")
    ;; (sit-for 2) ;; necessary for the slight delay between the agda2 commands
    (org-goto-line here)    ;; personal function, see my

(defun org-to-lagda ()
  "Transform Org-mode source blocks into literate Agda blocks.
   Use haskell as the Org source block language since I do not have nice colouring otherwise.
  (let ((here (line-number-at-pos))) ;; remember current line
    (rewrite-ends "#+BEGIN_SRC org-agda\n" "#+END_SRC" "\\begin{code}\n" "\\end{code}")
    (rewrite-ends "#+BEGIN_EXAMPLE org-agda\n" "#+END_EXAMPLE" "\\begin{spec}\n" "\\end{spec}")
    (sit-for 1) ;; necessary for the slight delay between the agda2 commands
    (goto-line here)

Handy-dandy shortcuts:

(local-set-key (kbd "C-x C-a") 'org-to-lagda)
(local-set-key (kbd "C-x C-o") 'lagda-to-org)

;; Maybe consider a simple “toggle” instead?

3 Example

Here's some sample fragments, whose editing can be turned on with C-x C-a.

mmodule literate where

data  : Set where
  Zero : 
  Succ : 

double : 
double Zero = Zero
double (Succ n) = Succ (Succ (double n))

{- lengthy
        comment -}

{- No one line comment colouring … Yet -}

open import Data.Nat as Lib

camelCaseIdentifier-01 : Lib.
camelCaseIdentifier-01 = let it = 1234 in it

postulate magic : Set

hole : magic
hole = {!!}

Here's a literate Agda spec-ification environment, which corresponds to an Org-mode EXAMPLE block.

module this-is-a-spec {A : Set} (_≤_ : A → A → Set) where

  maximum-specfication : (candidate : A) → Set
  maximum-specfication c = ?

4 Summary

We now have the utility functions:

Command Action
C-x C-a transform org org-agda blocks to literate Agda blocs
C-x C-o transform literate Agda code delimiters to org org-agda src

This was fun: I learned a lot of elisp! Hopefully I can make use of this, in the small, if not in the large –in which case I'll need to return to the many COMMENT-ed out sections in this document.

5 Sources Consulted