miti.sh/2025-06-20-add-a-pygments-lexer-to-chroma.md at 95e9be6e6068b26e74b0d33949433b3aa0a5c938

Catalin Constantin Mititiuc 95e9be6e60 Fix spelling in blurb

2025-06-22 10:34:25 -07:00


{ title: "Add a Pygments Lexer to Chroma" blurb: "Pygments and Chroma are syntax highlighting libraries written in Python and Go, respectively. Chroma is missing a language we like, which Pygments already supports. We add support for our language to Chroma by converting the existing lexer from Pygments.

} $index

Introduction

Gitea uses Chroma for syntax highlighting. Chroma is based on the Python syntax highlighter, Pygments, and includes a script to help convert Pygments lexers for use with Chroma. We describe how below.

Setup

We're going to be using the python and golang Docker images. Docker Desktop is not required.

$ docker pull python
$ docker pull golang

Let's set up some aliases to make running the commands easier.

$ alias docker-run='docker run --rm -it -w /opt -v $PWD:/opt'
$ alias docker-run-go='docker-run golang'
$ alias docker-run-py='docker-run python'

Convert a Pygments lexer to a Chroma lexer with `pygments2chroma_xml.py`

$ git clone https://github.com/alecthomas/chroma.git
$ cd chroma

In the Chroma root directory, we run:

$ docker-run-py bash -c \
 "pip install pystache pygments && \
  python _tools/pygments2chroma_xml.py \
    pygments.lexers.scripting.LuaLexer > lexers/embedded/lua.xml && \
  pip list"

We should see this in the output:

Package  Version
-------- -------
pip      25.0.1
Pygments 2.19.2
pystache 0.6.8

The package list tells us which version of Pygments our lexer was generated from. The file lexers/embedded/lua.xml should now contain the tokenization rules for the Lua language.

::: filename-for-code-block lexers/embedded/lua.xml :::

<lexer>
  <config>
    <name>Lua</name>
    ...
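Before going further, it can be useful to confirm that the generated file parses as XML and to see roughly how many rules it holds. The following is a minimal sketch using only Go's standard library. The `sample` constant is a hypothetical, heavily shortened stand-in for lua.xml, and the `rules`, `state`, and `rule` element names are assumptions about the generated layout (only `lexer`, `config`, and `name` appear in the excerpt above). `lexerFile` and `summarize` are illustrative names, not part of Chroma.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"log"
)

// lexerFile mirrors only the parts of the XML we want to inspect.
// The rules>state>rule layout is an assumption about the generated file.
type lexerFile struct {
	Name   string `xml:"config>name"`
	States []struct {
		Name  string     `xml:"name,attr"`
		Rules []struct{} `xml:"rule"`
	} `xml:"rules>state"`
}

// sample is a hypothetical, shortened stand-in for lexers/embedded/lua.xml.
// To inspect the real file, read it with os.ReadFile instead.
const sample = `<lexer>
  <config>
    <name>Lua</name>
  </config>
  <rules>
    <state name="root">
      <rule pattern="--.*$"><token type="CommentSingle"/></rule>
      <rule pattern="\s+"><token type="Text"/></rule>
    </state>
  </rules>
</lexer>`

// summarize returns the lexer name and the number of rules per state.
func summarize(data []byte) (string, map[string]int, error) {
	var lf lexerFile
	if err := xml.Unmarshal(data, &lf); err != nil {
		return "", nil, err
	}
	counts := make(map[string]int)
	for _, s := range lf.States {
		counts[s.Name] = len(s.Rules)
	}
	return lf.Name, counts, nil
}

func main() {
	name, counts, err := summarize([]byte(sample))
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("lexer: %s\n", name)
	for state, n := range counts {
		fmt.Printf("state %q has %d rule(s)\n", state, n)
	}
}
```

A parse error here would suggest the conversion step was interrupted or wrote something other than XML to the file.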

Highlight some code with a Chroma lexer

Chroma provides a simple example test file we can modify to see what syntax highlighting with our new lexer looks like. First, though, we need to create a new Go module by running go mod init:

$ cd ..
$ docker-run-go go mod init main
go: creating new go.mod: module main
go: to add module requirements and sums:
	go mod tidy

Our program will depend on external modules, so let's go ahead and run go mod tidy as the output suggests.

$ docker-run-go go mod tidy

We should now have two additional files, go.mod and go.sum. go.sum holds package checksums, while go.mod should look like this:

::: filename-for-code-block go.mod :::

module main

go 1.25

require github.com/alecthomas/chroma/v2 v2.18.0

require github.com/dlclark/regexp2 v1.11.5 // indirect

Now we can create a main.go file and copy over the code from Chroma's example test file, with two changes: the code variable holds some Lua, print("hello"), and the lexer we pass into the Highlight function is "lua":

::: filename-for-code-block main.go :::

package main

import (
	"log"
	"os"

	"github.com/alecthomas/chroma/v2/quick"
)

func main() {
	code := `print("hello")`

	err := quick.Highlight(os.Stdout, code, "lua", "html", "monokai")
	if err != nil {
		log.Fatal(err)
	}
}

Now we can try running our main.go like this:

$ docker-run-go go run main.go
go: downloading github.com/alecthomas/chroma/v2 v2.18.0
go: downloading github.com/dlclark/regexp2 v1.11.5
<html>
<style type="text/css">
...

That should print markup (and styles) for highlighting that block of Lua code to the console. Notice, however, that Go downloads the Chroma package from the GitHub repo. If we want to use a local version of Chroma, we have to add a replace directive that points the import at our local directory:

$ docker-run-go go mod edit -replace \
github.com/alecthomas/chroma/v2@v2.18.0=./chroma

This adds the following line to our go.mod file:

::: filename-for-code-block go.mod :::

...

replace github.com/alecthomas/chroma/v2 v2.18.0 => ./chroma

Now, when we run main.go, we should no longer see Chroma being downloaded, because Go is using our local copy:

$ docker-run-go go run main.go
go: downloading github.com/dlclark/regexp2 v1.11.5
<html>
<style type="text/css">
...

We should also see a list of styles followed by the HTML markup for highlighting our Lua code (formatted for legibility):

<pre class="chroma">
  <code>
    <span class="line">
      <span class="cl">
        <span class="n">print</span>
        <span class="p">(</span>
        <span class="s2">&#34;hello&#34;</span>
        <span class="p">)</span>
      </span>
    </span>
  </code>
</pre>
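The `&#34;` sequences in the string token are numeric character references for the double quote character. Go's standard `html.EscapeString` produces the same encoding, which the short sketch below illustrates. Whether Chroma's HTML formatter calls this exact function is an assumption here, and `span` is a hypothetical helper written only for this example.

```go
package main

import (
	"fmt"
	"html"
)

// span is a hypothetical helper that wraps escaped token text in a
// <span> with the given CSS class, similar in shape to the markup above.
func span(class, text string) string {
	return fmt.Sprintf(`<span class="%s">%s</span>`, class, html.EscapeString(text))
}

func main() {
	// The Lua string literal, including its surrounding quotes.
	fmt.Println(span("s2", `"hello"`))
	// prints: <span class="s2">&#34;hello&#34;</span>
}
```

A browser decodes the references back into quotes, so the rendered page shows the original source text.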

Add test data

If we want to add our lexer to Chroma, we will need to create some test data for it. We can create a file in lexers/testdata called lua.actual and fill it with sample Lua code that exercises the tokens the lexer should recognize.

Record test output

Once we have test data, we need to record the expected output. We create another file called lexers/testdata/lua.expected. This is the file we will record to by running the following command from the Chroma root directory:

$ docker-run -e RECORD=true golang go test ./lexers

Once test output is recorded in lexers/testdata/lua.expected, we should visually inspect and verify that the expected data is correct.

Run tests

As a final confirmation, we can run the tests to make sure we have not broken anything:

$ docker-run-go go test ./lexers

Conclusion

If we followed all these steps correctly, our lexer should be ready to be pushed to a git repo and for us to open a pull request!

Bonus: Use local Pygments with `pygments2chroma_xml.py`

These lines in pygments2chroma_xml.py,

import pystache
from pygments import lexer as pygments_lexer
from pygments.token import _TokenType

import Pygments from the package we installed from the Python Package Index. If we want to convert a lexer from a local Pygments git repo instead, we can run the pygments2chroma_xml.py script from the root directory of that repo, so Python finds the local pygments package first.

$ git clone https://github.com/pygments/pygments.git
$ cd pygments
$ docker-run \
-v ../chroma/_tools/pygments2chroma_xml.py:/opt/pygments2chroma_xml.py \
python bash -c \
 "pip install pystache && \
  python pygments2chroma_xml.py pygments.lexers.scripting.LuaLexer && \
  pip list"

We should see the lexer output followed by

Package  Version
-------- -------
pip      25.0.1
pystache 0.6.8

which indicates no remote pygments package was installed.
