Publish post 'Add a Pygments Lexer to Chroma' #2
1
html/.gitignore
vendored
@ -2,6 +2,7 @@ app.css
code.html
index.html
pandoc.css
posts/add-a-pygments-lexer-to-chroma.html
posts/build-a-neovim-qt-appimage-from-source.html
posts/build-static-website-generator-part-1.html
posts/deploy-elixir-generated-html-with-docker-on-digitalocean.html
287
posts/2025-06-20-add-a-pygments-lexer-to-chroma.md
Normal file
@ -0,0 +1,287 @@
{
title: "Add a Pygments Lexer to Chroma"
blurb: "[Pygments][4] and [Chroma][5] are syntax highlighting libraries
written in [Python][6] and [Go][7], respectively. Chroma is missing a
language we like, which Pygments already supports. We add support for our
language to Chroma by converting the existing lexer from Pygments.

[4]: https://github.com/pygments/pygments
[5]: https://github.com/alecthomas/chroma
[6]: https://www.python.org/
[7]: https://go.dev/"
}
$index

## Introduction

[Gitea][8] uses [Chroma][9] for syntax highlighting. Chroma is based on the
Python syntax highlighter, [Pygments][10], and includes a [script][11] to help
convert Pygments lexers for use with Chroma. We describe how below.

[8]: https://github.com/go-gitea/gitea
[9]: https://github.com/alecthomas/chroma
[10]: https://github.com/pygments/pygments
[11]: https://github.com/alecthomas/chroma/blob/484750a96fc430f49d6b69cc2a2a8b7a67691446/_tools/pygments2chroma_xml.py

## Setup

We'll use the `python` and `golang` [Docker][3] images. Docker Desktop is
_not_ required.

```console
$ docker pull python
$ docker pull golang
```

Let's set up some aliases to make running the commands easier. Each one runs a
command in a throwaway container with the current directory mounted at `/opt`.

```console
$ alias docker-run='docker run --rm -it -w /opt -v "$PWD":/opt'
$ alias docker-run-go='docker-run golang'
$ alias docker-run-py='docker-run python'
```

[3]: https://docs.docker.com/engine/

## Convert a Pygments lexer to a Chroma lexer with `pygments2chroma_xml.py`

```console
$ git clone https://github.com/alecthomas/chroma.git
$ cd chroma
```

In the Chroma root directory, we run:

```console
$ docker-run-py bash -c \
  "pip install pystache pygments && \
  python _tools/pygments2chroma_xml.py \
  pygments.lexers.scripting.LuaLexer > lexers/embedded/lua.xml && \
  pip list"
```

We should see this in the output:

```
Package  Version
-------- -------
pip      25.0.1
Pygments 2.19.2
pystache 0.6.8
```

The `pip list` output records which version of Pygments our lexer was
generated from. The file `lexers/embedded/lua.xml` should now contain all the
tokenization rules for the [Lua](https://www.lua.org) language.

::: filename-for-code-block
`lexers/embedded/lua.xml`
:::

```xml
<lexer>
  <config>
    <name>Lua</name>
    ...
```
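
As a quick sanity check of a generated file, a small Go program can parse it
and count the rules in each state. This is only a sketch using the standard
library: the embedded `sample` is a hypothetical, heavily trimmed lexer, and
the element names it reads (`config/name`, `rules/state/rule`) are assumptions
based on the general shape of Chroma's XML lexers, not output copied from the
script.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// lexerFile mirrors only the parts of an XML lexer that this sketch reads.
type lexerFile struct {
	Name   string  `xml:"config>name"`
	States []state `xml:"rules>state"`
}

type state struct {
	Name  string `xml:"name,attr"`
	Rules []rule `xml:"rule"`
}

type rule struct {
	Pattern string `xml:"pattern,attr"`
}

// sample is a hypothetical lexer used for illustration only.
const sample = `<lexer>
  <config>
    <name>Lua</name>
    <alias>lua</alias>
  </config>
  <rules>
    <state name="root">
      <rule pattern="--.*$"><token type="CommentSingle"/></rule>
      <rule pattern="\s+"><token type="Text"/></rule>
    </state>
  </rules>
</lexer>`

// summarize parses lexer XML and returns its name and the rule count per state.
func summarize(data []byte) (string, map[string]int, error) {
	var lf lexerFile
	if err := xml.Unmarshal(data, &lf); err != nil {
		return "", nil, err
	}
	counts := make(map[string]int)
	for _, s := range lf.States {
		counts[s.Name] = len(s.Rules)
	}
	return lf.Name, counts, nil
}

func main() {
	name, counts, err := summarize([]byte(sample))
	if err != nil {
		panic(err)
	}
	fmt.Println(name, counts)
}
```

To check the real file, the `sample` constant could be swapped for the result
of `os.ReadFile("lexers/embedded/lua.xml")`.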

## Highlight some code with a Chroma lexer

Chroma provides a [simple example test file][1] we can modify to see what syntax
highlighting with our new lexer looks like. First, though, we need to create a
new Go module in the parent directory by running `go mod init`:

```console
$ cd ..
$ docker-run-go go mod init main
go: creating new go.mod: module main
go: to add module requirements and sums:
	go mod tidy
```

Our program will depend on Chroma, so let's run `go mod tidy` as the output
suggests. Note that `go mod tidy` only records modules that our Go files
actually import, so it needs to be run (or re-run) once the `main.go` shown
below exists.

```console
$ docker-run-go go mod tidy
```

Once `go mod tidy` has picked up the Chroma import, we should have two files,
`go.mod` and `go.sum`. `go.sum` holds package hashes, while `go.mod` should
look like this:

::: filename-for-code-block
`go.mod`
:::

```
module main

go 1.25

require github.com/alecthomas/chroma/v2 v2.18.0

require github.com/dlclark/regexp2 v1.11.5 // indirect
```

Now we can create a `main.go` file and copy over the code from Chroma's example
test file, with two changes: the `code` variable holds some Lua,
`print("hello")`, and the lexer name we pass to the `Highlight` function is
`"lua"`:

::: filename-for-code-block
`main.go`
:::

```go
package main

import (
	"log"
	"os"

	"github.com/alecthomas/chroma/v2/quick"
)

func main() {
	code := `print("hello")`

	// Arguments: writer, source, lexer name, formatter name, style name.
	err := quick.Highlight(os.Stdout, code, "lua", "html", "monokai")
	if err != nil {
		log.Fatal(err)
	}
}
```

Now we can try running our `main.go` like this:

```console
$ docker-run-go go run main.go
go: downloading github.com/alecthomas/chroma/v2 v2.18.0
go: downloading github.com/dlclark/regexp2 v1.11.5
<html>
<style type="text/css">
...
```

That prints the markup (and styles) for highlighting our Lua code to the
console. Notice, though, that Go downloaded the Chroma module from GitHub
rather than using our checkout. To build against our local version of Chroma,
we have to use a [`replace` directive][2] that points the module at our local
directory:

```console
$ docker-run-go go mod edit -replace \
  github.com/alecthomas/chroma/v2@v2.18.0=./chroma
```

This adds the following line to our `go.mod` file:

::: filename-for-code-block
`go.mod`
:::

```
...

replace github.com/alecthomas/chroma/v2 v2.18.0 => ./chroma
```

Now, when we run `main.go`, we should no longer see Chroma being downloaded,
because Go is using our local copy:

```console
$ docker-run-go go run main.go
go: downloading github.com/dlclark/regexp2 v1.11.5
<html>
<style type="text/css">
...
```

We should also see a list of styles followed by the HTML markup for
highlighting our Lua code (formatted for legibility):

```html
<pre class="chroma">
  <code>
    <span class="line">
      <span class="cl">
        <span class="n">print</span>
        <span class="p">(</span>
        <span class="s2">"hello"</span>
        <span class="p">)</span>
      </span>
    </span>
  </code>
</pre>
```
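
To read output like this at a glance, the short CSS classes can be mapped back
to token type names. The sketch below is illustrative only: it treats the
markup as well-formed XML (which holds for this snippet but not for HTML in
general), and the `shortNames` table covers just the three classes shown
above, with long names assumed to follow Chroma's token type naming.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// shortNames covers only the classes that appear in the snippet above
// (assumed mapping, not an exhaustive table).
var shortNames = map[string]string{
	"n":  "Name",
	"p":  "Punctuation",
	"s2": "LiteralStringDouble",
}

type token struct {
	Class string
	Text  string
}

// spans returns the class and text of every run of non-blank text that
// directly follows an opening tag carrying a class attribute.
func spans(markup string) ([]token, error) {
	dec := xml.NewDecoder(strings.NewReader(markup))
	var out []token
	var class string // class of the most recently opened element
	for {
		t, err := dec.Token()
		if err == io.EOF {
			return out, nil
		}
		if err != nil {
			return nil, err
		}
		switch el := t.(type) {
		case xml.StartElement:
			class = ""
			for _, a := range el.Attr {
				if a.Name.Local == "class" {
					class = a.Value
				}
			}
		case xml.CharData:
			text := strings.TrimSpace(string(el))
			if text != "" && class != "" {
				out = append(out, token{class, text})
			}
		case xml.EndElement:
			class = ""
		}
	}
}

func main() {
	markup := `<pre class="chroma"><code><span class="line"><span class="cl">` +
		`<span class="n">print</span><span class="p">(</span>` +
		`<span class="s2">"hello"</span><span class="p">)</span>` +
		`</span></span></code></pre>`
	toks, err := spans(markup)
	if err != nil {
		panic(err)
	}
	for _, t := range toks {
		fmt.Printf("%-20s %s\n", shortNames[t.Class], t.Text)
	}
}
```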

[1]: https://github.com/alecthomas/chroma/blob/484750a96fc430f49d6b69cc2a2a8b7a67691446/quick/example_test.go
[2]: https://go.dev/ref/mod#go-mod-file-replace

## Add test data

If we want to contribute our lexer to Chroma, we will need to create some test
data for it. We can create a file in `lexers/testdata` called `lua.actual` and
fill it with sample Lua source code that exercises the tokens the lexer should
recognize.

## Record test output

Once we have test data, we need to record the expected output. We create
another file called `lexers/testdata/lua.expected`. This is the file we
will record to by running the following command from the Chroma root directory:

```console
$ docker-run -e RECORD=true golang go test ./lexers
```

Once test output is recorded in `lexers/testdata/lua.expected`, we should
read through it and verify that the recorded tokens are correct.

## Run tests

As a final confirmation, we can run the tests to make sure we have not broken
anything:

```console
$ docker-run-go go test ./lexers
```

## Conclusion

If we followed all these steps correctly, our lexer should be ready to be
pushed to a `git` repo and submitted as a pull request!

## Bonus!: Use local Pygments with `pygments2chroma_xml.py`

These lines in `pygments2chroma_xml.py`,

```python
import pystache
from pygments import lexer as pygments_lexer
from pygments.token import _TokenType
```

normally import the Pygments package that `pip` installed from the
[Python Package Index](https://pypi.org/). But, if we want to convert a
Pygments lexer from a local `git` repo, we can simply mount the
`pygments2chroma_xml.py` script into the repo root directory and run it from
there. Python puts the script's directory first on its module search path, so
`pygments` then resolves to the local package. Assuming we clone Pygments next
to our `chroma` checkout:

```console
$ git clone https://github.com/pygments/pygments.git
$ cd pygments
$ docker-run \
  -v ../chroma/_tools/pygments2chroma_xml.py:/opt/pygments2chroma_xml.py \
  python bash -c \
  "pip install pystache && \
  python pygments2chroma_xml.py pygments.lexers.scripting.LuaLexer && \
  pip list"
```

We should see the lexer output followed by

```console
Package  Version
-------- -------
pip      25.0.1
pystache 0.6.8
```

which shows that no `pygments` package was installed from PyPI.