{
title: "Add a Pygments Lexer to Chroma"
blurb: "[Pygments][4] and [Chroma][5] are syntax highlighting libraries
written in [Python][6] and [Go][7], respectively. Chroma is missing a
language we like, which Pygments already supports. We add support for our
language to Chroma by converting the existing lexer from Pygments.

[4]: https://github.com/pygments/pygments
[5]: https://github.com/alecthomas/chroma
[6]: https://www.python.org/
[7]: https://go.dev/"
}
$index

## Introduction

[Gitea](https://github.com/go-gitea/gitea) uses [Chroma](https://github.com/alecthomas/chroma) for syntax highlighting. Chroma is based on the Python
syntax highlighter, [Pygments](https://github.com/pygments/pygments), and includes a [script](https://github.com/alecthomas/chroma/blob/484750a96fc430f49d6b69cc2a2a8b7a67691446/_tools/pygments2chroma_xml.py) to help convert Pygments
lexers for use with Chroma. The steps below show how, using Lua as the example language.

## Setup

We're going to be using the `python` and `golang` [Docker][4] images. Docker
Desktop is _not_ required. The commands assume `docker` can be run
[without `sudo`][3].

```console
$ docker pull python
$ docker pull golang
```

Let's set up some aliases to make running the commands easier. Each one runs a
command in a throwaway container with the current directory mounted at `/opt`.

```console
$ alias docker-run='docker run --rm -it -w /opt -v $PWD:/opt'
$ alias docker-run-go='docker-run golang'
$ alias docker-run-py='docker-run python'
```

[3]: https://docs.docker.com/engine/install/linux-postinstall/#manage-docker-as-a-non-root-user
[4]: https://docs.docker.com/engine/

## Convert a Pygments lexer to a Chroma lexer with `pygments2chroma_xml.py`

First, we clone the Chroma repository:

```console
$ git clone https://github.com/alecthomas/chroma.git
$ cd chroma
```

In the Chroma root directory, we run:

```console
$ docker-run-py bash -c \
    "pip install pystache pygments && \
    python _tools/pygments2chroma_xml.py \
    pygments.lexers.scripting.LuaLexer > lexers/embedded/lua.xml && \
    pip list"
```

We should see this in the output:

```
Package  Version
-------- -------
pip      25.0.1
Pygments 2.19.2
pystache 0.6.8
```

The `pip list` output tells us which version of Pygments we generated our lexer from.
The file `lexers/embedded/lua.xml` should now contain all the tokenization
rules for the [Lua](https://www.lua.org) language.

::: filename-for-code-block
`lexers/embedded/lua.xml`
:::

```xml
<lexer>
  <config>
    <name>Lua</name>
    ...
```

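To sanity-check a generated lexer without reading the whole file, we can count the rules in each state with Python's standard library. This is only a minimal sketch: the inline sample is a hypothetical, heavily abbreviated lexer, and the element names (`config`, `rules`, `state`, `rule`) are assumptions based on the usual layout of Chroma's XML lexers, not on the converter's exact output.

```python
import xml.etree.ElementTree as ET

# Hypothetical, abbreviated stand-in for lexers/embedded/lua.xml.
SAMPLE = """\
<lexer>
  <config>
    <name>Lua</name>
    <alias>lua</alias>
    <filename>*.lua</filename>
  </config>
  <rules>
    <state name="root">
      <rule pattern="--.*$">
        <token type="CommentSingle"/>
      </rule>
      <rule pattern="\\s+">
        <token type="Text"/>
      </rule>
    </state>
    <state name="string">
      <rule pattern=".">
        <token type="LiteralString"/>
      </rule>
    </state>
  </rules>
</lexer>
"""


def summarize_lexer(xml_text):
    """Return (lexer name, {state name: number of rules}) for a lexer XML string."""
    root = ET.fromstring(xml_text)
    name = root.findtext("config/name")
    states = {
        state.get("name"): len(state.findall("rule"))
        for state in root.findall("rules/state")
    }
    return name, states


name, states = summarize_lexer(SAMPLE)
print(name, states)
```

Passing the text of the real `lexers/embedded/lua.xml` to `summarize_lexer` should work the same way. A state with zero rules, or a missing name, would be a hint that the conversion did not go as planned.
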
## Highlight some code with a Chroma lexer

Chroma provides a [simple example test file][1] we can modify to see what syntax
highlighting with our new lexer looks like. First, though, we need to create a
new Go module by running `go mod init` in the directory that contains our
`chroma` clone:

```console
$ cd ..
$ docker-run-go go mod init main
go: creating new go.mod: module main
go: to add module requirements and sums:
	go mod tidy
```

We will need Chroma and its dependencies, so let's go ahead and run
`go mod tidy` as the output suggests. Note that `go mod tidy` only adds
requirements for packages that Go files in the module import, so if `go.mod`
gains no `require` lines, rerun it after creating the `main.go` file shown
further down.

```console
$ docker-run-go go mod tidy
```

We should now have two additional files, `go.mod` and `go.sum`. `go.sum` holds
checksums for the required modules, while `go.mod` should look like this:

::: filename-for-code-block
`go.mod`
:::

```
module main

go 1.25

require github.com/alecthomas/chroma/v2 v2.18.0

require github.com/dlclark/regexp2 v1.11.5 // indirect
```

Now we can create a `main.go` file from the code in Chroma's example test
file, with two changes: the `code` variable holds some Lua, `print("hello")`,
and the lexer we pass into the `Highlight` function is `"lua"`:

::: filename-for-code-block
`main.go`
:::

```go
package main

import (
	"log"
	"os"

	"github.com/alecthomas/chroma/v2/quick"
)

func main() {
	// The Lua source we want to highlight.
	code := `print("hello")`

	// Write the highlighted code to stdout using the "lua" lexer,
	// the "html" formatter, and the "monokai" style.
	err := quick.Highlight(os.Stdout, code, "lua", "html", "monokai")
	if err != nil {
		log.Fatal(err)
	}
}
```

Now we can try running our `main.go` like this:

```console
$ docker-run-go go run main.go
go: downloading github.com/alecthomas/chroma/v2 v2.18.0
go: downloading github.com/dlclark/regexp2 v1.11.5
<html>
<style type="text/css">
...
```

That should print markup (and styles) for highlighting our Lua snippet to the
console. Notice, though, that Go downloaded the Chroma module from GitHub, so
the lexer we just generated was not the one used. To use our local copy of
Chroma, we add a [`replace` directive][2] that points the module at our local
directory:

```console
$ docker-run-go go mod edit -replace \
    github.com/alecthomas/chroma/v2@v2.18.0=./chroma
```

Which adds this line to our `go.mod` file:

::: filename-for-code-block
`go.mod`
:::

```
...

replace github.com/alecthomas/chroma/v2 v2.18.0 => ./chroma
```

Now, when we run `main.go`, we should no longer see Chroma being downloaded,
because Go is using our local copy:

```console
$ docker-run-go go run main.go
go: downloading github.com/dlclark/regexp2 v1.11.5
<html>
<style type="text/css">
...
```

After the list of style rules, we should see the HTML markup for
highlighting our Lua code (formatted here for legibility):

```html
<pre class="chroma">
  <code>
    <span class="line">
      <span class="cl">
        <span class="n">print</span>
        <span class="p">(</span>
        <span class="s2">"hello"</span>
        <span class="p">)</span>
      </span>
    </span>
  </code>
</pre>
```

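When the output gets longer, it can be easier to compare token classes than to read raw markup. The sketch below uses Python's standard `html.parser` to list each piece of text with the class of its innermost `<span>`. The sample string is the markup shown above collapsed onto one line, and `TokenCollector` is a hypothetical helper name, not something Chroma provides.

```python
from html.parser import HTMLParser

# The highlighted output from above, collapsed onto one line.
SAMPLE = (
    '<pre class="chroma"><code><span class="line"><span class="cl">'
    '<span class="n">print</span><span class="p">(</span>'
    '<span class="s2">"hello"</span><span class="p">)</span>'
    "</span></span></code></pre>"
)


class TokenCollector(HTMLParser):
    """Collect (css_class, text) pairs using the innermost open <span>."""

    def __init__(self):
        super().__init__()
        self.stack = []   # classes of the <span> elements currently open
        self.tokens = []  # (class, text) pairs in document order

    def handle_starttag(self, tag, attrs):
        if tag == "span":
            self.stack.append(dict(attrs).get("class", ""))

    def handle_endtag(self, tag):
        if tag == "span" and self.stack:
            self.stack.pop()

    def handle_data(self, data):
        # Skip whitespace-only text such as indentation between tags.
        if self.stack and data.strip():
            self.tokens.append((self.stack[-1], data))


collector = TokenCollector()
collector.feed(SAMPLE)
collector.close()
tokens = collector.tokens
print(tokens)
```

Here `n`, `p`, and `s2` are the short class names for name, punctuation, and double-quoted string tokens. If a token we care about shows up under an unexpected class, the lexer rules are the first place to look.
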
[1]: https://github.com/alecthomas/chroma/blob/484750a96fc430f49d6b69cc2a2a8b7a67691446/quick/example_test.go
[2]: https://go.dev/ref/mod#go-mod-file-replace

## Add test data

If we want to add our lexer to Chroma, we will need to create some test data
for it. We can create a file in `lexers/testdata` called `lua.actual` and
fill it with sample Lua source code that exercises the tokens we want the
lexer to recognize.

## Record test output

Once we have test data, we need to record the expected output. We create
another file called `lexers/testdata/lua.expected`. This is the file the test
run records to when we run the following command, with `RECORD=true` set, from
the Chroma root directory:

```console
$ docker-run -e RECORD=true golang go test ./lexers
```

Once test output is recorded in `lexers/testdata/lua.expected`, we should
visually inspect it and verify that the recorded tokens are correct.

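Part of that inspection can be automated. The sketch below assumes the `.expected` file is a JSON list of objects with `type` and `value` keys, and that unrecognized input is recorded with the type `Error`. Both are assumptions about Chroma's test data format, and the sample strings are hypothetical stand-ins for the real files. It checks that joining the token values reproduces the source and that no error tokens were recorded.

```python
import json

# Hypothetical contents of lexers/testdata/lua.actual and lua.expected.
ACTUAL = 'print("hello")\n'
EXPECTED = """\
[
  {"type": "Name", "value": "print"},
  {"type": "Punctuation", "value": "("},
  {"type": "LiteralStringDouble", "value": "\\"hello\\""},
  {"type": "Punctuation", "value": ")"},
  {"type": "Text", "value": "\\n"}
]
"""


def check_expected(actual_text, expected_json):
    """Return (round_trips, error_tokens) for a recorded token list."""
    tokens = json.loads(expected_json)
    # A lexer that consumes all of its input should give the source back
    # when the token values are concatenated in order.
    rebuilt = "".join(tok["value"] for tok in tokens)
    # Assumed name of the token type used for unrecognized input.
    errors = [tok for tok in tokens if tok["type"] == "Error"]
    return rebuilt == actual_text, errors


round_trips, errors = check_expected(ACTUAL, EXPECTED)
print(round_trips, errors)
```

A failed round trip is a reason to look more closely, not proof of a bug, since lexer options may normalize things like a trailing newline. Any error tokens, on the other hand, usually point at source text that no rule matched.
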
## Run tests

As a final confirmation, we can run the tests from the Chroma root directory,
this time without `RECORD=true`, to make sure we have not broken anything:

```console
$ docker-run-go go test ./lexers
```

## Conclusion

If we followed all these steps correctly, our lexer should be ready to be
pushed to a `git` repo, and we can open a pull request!

## Bonus!: Use local Pygments with `pygments2chroma_xml.py`

These lines in `pygments2chroma_xml.py`,

```python
import pystache
from pygments import lexer as pygments_lexer
from pygments.token import _TokenType
```

import whichever `pygments` package Python finds first. In the steps above,
that was the release `pip` installed from the [Python Package Index](https://pypi.org/). But, if we
want to convert a Pygments lexer from a local `git` repo, we can import it
by simply running the `pygments2chroma_xml.py` script from the repo root
directory, where Python picks up the local `pygments` directory instead. The
commands below assume the Pygments clone sits next to our `chroma` directory.

```console
$ git clone https://github.com/pygments/pygments.git
$ cd pygments
$ docker-run \
    -v ../chroma/_tools/pygments2chroma_xml.py:/opt/pygments2chroma_xml.py \
    python bash -c \
    "pip install pystache && \
    python pygments2chroma_xml.py pygments.lexers.scripting.LuaLexer && \
    pip list"
```

We should see the lexer output followed by

```console
Package  Version
-------- -------
pip      25.0.1
pystache 0.6.8
```

which indicates no remote `pygments` package was installed.

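The reason this works is Python's import search order: when we run `python script.py`, the script's directory is placed first on `sys.path`, so a package directory sitting next to the script is found before anything installed elsewhere. The sketch below imitates that with a temporary directory and a made-up package name (`localpkg_demo` is hypothetical and stands in for the `pygments` directory of a clone).

```python
import importlib
import os
import sys
import tempfile

# Hypothetical package name, chosen so it cannot clash with a real install.
PKG = "localpkg_demo"

with tempfile.TemporaryDirectory() as repo_root:
    # Lay out repo_root/localpkg_demo/__init__.py, like pygments/ in a clone.
    os.mkdir(os.path.join(repo_root, PKG))
    with open(os.path.join(repo_root, PKG, "__init__.py"), "w") as f:
        f.write("ORIGIN = 'local checkout'\n")

    # Running a script puts its directory first on sys.path;
    # inserting repo_root at index 0 imitates that here.
    sys.path.insert(0, repo_root)
    try:
        importlib.invalidate_caches()
        module = importlib.import_module(PKG)
        origin = module.ORIGIN
        found_in_repo = os.path.realpath(module.__file__).startswith(
            os.path.realpath(repo_root)
        )
    finally:
        # Undo the changes so the demo leaves the interpreter as it found it.
        sys.path.remove(repo_root)
        sys.modules.pop(PKG, None)

print(origin, found_in_repo)
```

To confirm the same thing for the real conversion, printing `pygments.__file__` inside the container should show a path under `/opt` rather than under `site-packages`.
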