{
blurb: "Add a new lexer to Chroma"
}
$index

## Introduction

[Gitea](https://github.com/go-gitea/gitea) uses [Chroma](https://github.com/alecthomas/chroma) for syntax highlighting. Chroma is based on the Python
syntax highlighter, [Pygments](https://github.com/pygments/pygments), and includes a [script](https://github.com/alecthomas/chroma/blob/484750a96fc430f49d6b69cc2a2a8b7a67691446/_tools/pygments2chroma_xml.py) to help convert Pygments
lexers for use with Chroma. This post describes that process.

## Convert a Pygments lexer to a Chroma lexer with `pygments2chroma_xml.py`

In the Chroma root directory, we run:

```console
$ docker run --rm -it -w /opt -v $PWD:/opt python bash -c \
    "pip install pystache pygments && pip list \
    && python _tools/pygments2chroma_xml.py \
    pygments.lexers.scripting.LuaLexer > lexers/embedded/lua.xml"
```

As output, we should see this in our terminal:

```
Package  Version
-------- -------
pip      25.0.1
Pygments 2.19.2
pystache 0.6.8
```

The `pip list` output records which version of Pygments our lexer was generated from.
The file `lexers/embedded/lua.xml` should now contain all the tokenization
rules for the [Lua](https://www.lua.org) language.

::: filename-for-code-block
`lexers/embedded/lua.xml`
:::

```xml
<lexer>
  <config>
    <name>Lua</name>
    ...
```

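To get a quick overview of a generated lexer without reading the whole file, we could summarize it with a few lines of Python. This is a minimal sketch: the `SAMPLE_LEXER_XML` fragment is hand-written for illustration (it is not the real output for Lua), and the `rules`/`state`/`rule` layout is assumed from the general shape of Chroma's XML lexers. The helper name `summarize_lexer` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hand-written, abridged fragment in the assumed shape of a Chroma XML lexer.
# It is NOT the actual content of lexers/embedded/lua.xml.
SAMPLE_LEXER_XML = """
<lexer>
  <config>
    <name>Lua</name>
    <alias>lua</alias>
    <filename>*.lua</filename>
  </config>
  <rules>
    <state name="root">
      <rule pattern="--.*$">
        <token type="CommentSingle"/>
      </rule>
      <rule pattern="\\s+">
        <token type="Text"/>
      </rule>
    </state>
  </rules>
</lexer>
"""


def summarize_lexer(xml_text):
    """Return the lexer name and the number of rules in each state."""
    root = ET.fromstring(xml_text.strip())
    name = root.findtext("config/name")
    rule_counts = {
        state.get("name"): len(state.findall("rule"))
        for state in root.findall("rules/state")
    }
    return name, rule_counts


if __name__ == "__main__":
    print(summarize_lexer(SAMPLE_LEXER_XML))
```

To try it on the real file, we would pass the contents of `lexers/embedded/lua.xml` instead of the sample string.
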
## Highlight some code with our new lexer

Chroma provides a [simple example test file][1] we can modify to see what syntax
highlighting with our new lexer looks like. First, though, we need to create a
new Go module by running `go mod init`:

```console
$ docker run --rm -it -w /opt -v $PWD:/opt golang:tip-bookworm \
    go mod init main
go: creating new go.mod: module main
go: to add module requirements and sums:
	go mod tidy
```

We also need the modules our code depends on, so let's go ahead and run
`go mod tidy` as the output suggests:

```console
$ docker run --rm -it -w /opt -v $PWD:/opt golang:tip-bookworm \
    go mod tidy
```

We should now have two additional files, `go.mod` and `go.sum`. `go.sum` holds
checksums for the downloaded modules, while `go.mod` should look like this:

::: filename-for-code-block
`go.mod`
:::

```
module main

go 1.25

require github.com/alecthomas/chroma/v2 v2.18.0

require github.com/dlclark/regexp2 v1.11.5 // indirect
```

Now we can create a `main.go` file and copy over the code from Chroma's example
test file, updating the `code` variable and the lexer name we pass to the
`Highlight` function for Lua. Because `go mod tidy` derives requirements from
the imports in our source files, we rerun it after creating `main.go` if
`go.mod` does not list Chroma yet.

::: filename-for-code-block
`main.go`
:::

```go
package main

import (
	"log"
	"os"

	"github.com/alecthomas/chroma/v2/quick"
)

func main() {
	code := `print("hello")`

	// Arguments: output writer, source code, lexer, formatter, style.
	err := quick.Highlight(os.Stdout, code, "lua", "html", "monokai")
	if err != nil {
		log.Fatal(err)
	}
}
```

Now we can try running our `main.go` like this:

```console
$ docker run --rm -it -w /opt -v $PWD:/opt golang:tip-bookworm go run main.go
go: downloading github.com/alecthomas/chroma/v2 v2.18.0
go: downloading github.com/dlclark/regexp2 v1.11.5
<html>
<style type="text/css">
...
```

That should print markup (and styles) for highlighting that block of Lua
code to the console. Notice, though, that it downloads the Chroma package from
the GitHub repo. If we want to use a local version of Chroma, we have to add a
[`replace` directive][2] that points the module at our local directory:

```console
$ docker run --rm -it -w /opt -v $PWD:/opt golang:tip-bookworm \
    go mod edit -replace github.com/alecthomas/chroma/v2@v2.18.0=./chroma
```

This adds the following line to our `go.mod` file:

::: filename-for-code-block
`go.mod`
:::

```
...

replace github.com/alecthomas/chroma/v2 v2.18.0 => ./chroma
```

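If we want to double-check that the local override is in place before running anything, a small script can list the `replace` directives in `go.mod`. This is only a sketch: the helper name `parse_replace_directives` is hypothetical, and it handles only the single-line form shown above, not the parenthesized `replace ( ... )` block form.

```python
# go.mod content as shown in this post, with the replace line added.
SAMPLE_GO_MOD = """\
module main

go 1.25

require github.com/alecthomas/chroma/v2 v2.18.0

require github.com/dlclark/regexp2 v1.11.5 // indirect

replace github.com/alecthomas/chroma/v2 v2.18.0 => ./chroma
"""


def parse_replace_directives(go_mod_text):
    """Return (module, version, target) for each single-line replace directive."""
    directives = []
    for line in go_mod_text.splitlines():
        line = line.split("//", 1)[0].strip()  # drop trailing comments
        if not line.startswith("replace ") or "=>" not in line:
            continue
        old, new = line[len("replace "):].split("=>", 1)
        old_parts = old.split()
        # The version on the left-hand side is optional in go.mod.
        version = old_parts[1] if len(old_parts) > 1 else None
        directives.append((old_parts[0], version, new.strip()))
    return directives


if __name__ == "__main__":
    for module, version, target in parse_replace_directives(SAMPLE_GO_MOD):
        print(f"{module} {version} -> {target}")
```
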
Now, when we run `main.go`, we should no longer see Chroma being downloaded,
because Go is using our local copy:

```console
$ docker run --rm -it -w /opt -v $PWD:/opt golang:tip-bookworm go run main.go
go: downloading github.com/dlclark/regexp2 v1.11.5
<html>
<style type="text/css">
...
```

We should also see a list of styles followed by the HTML markup for
highlighting our Lua code (formatted for legibility):

```html
<pre class="chroma">
  <code>
    <span class="line">
      <span class="cl">
        <span class="n">print</span>
        <span class="p">(</span>
        <span class="s2">"hello"</span>
        <span class="p">)</span>
      </span>
    </span>
  </code>
</pre>
```

[1]: https://github.com/alecthomas/chroma/blob/484750a96fc430f49d6b69cc2a2a8b7a67691446/quick/example_test.go
[2]: https://go.dev/ref/mod#go-mod-file-replace

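The short CSS classes in the highlighted HTML (`n`, `p`, `s2`) are abbreviations for token types. As a rough way to read the output without a browser, we could pull the classes and their text back out with Python's standard HTML parser. This is a sketch under assumptions: the `CLASS_TO_TOKEN` table covers only the three classes that appear in this post, and `extract_tokens` is a hypothetical helper name.

```python
from html.parser import HTMLParser

# Only the classes seen in this post; the full mapping is much larger.
CLASS_TO_TOKEN = {"n": "Name", "p": "Punctuation", "s2": "LiteralStringDouble"}

# The markup from this post, formatted for legibility.
SAMPLE_HTML = """
<pre class="chroma">
  <code>
    <span class="line">
      <span class="cl">
        <span class="n">print</span>
        <span class="p">(</span>
        <span class="s2">"hello"</span>
        <span class="p">)</span>
      </span>
    </span>
  </code>
</pre>
"""


class TokenSpanCollector(HTMLParser):
    """Collect (token type, text) pairs from spans with a known class."""

    def __init__(self):
        super().__init__()
        self._classes = []  # stack of class names for the open <span> tags
        self.tokens = []

    def handle_starttag(self, tag, attrs):
        if tag == "span":
            self._classes.append(dict(attrs).get("class"))

    def handle_endtag(self, tag):
        if tag == "span" and self._classes:
            self._classes.pop()

    def handle_data(self, data):
        # Skip indentation-only text and spans we have no mapping for.
        if self._classes and self._classes[-1] in CLASS_TO_TOKEN and data.strip():
            self.tokens.append((CLASS_TO_TOKEN[self._classes[-1]], data))


def extract_tokens(html_text):
    collector = TokenSpanCollector()
    collector.feed(html_text)
    collector.close()
    return collector.tokens


if __name__ == "__main__":
    for token_type, text in extract_tokens(SAMPLE_HTML):
        print(f"{token_type:20} {text}")
```
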
## Add test data

If we want to add our lexer to Chroma, we will need to create some test data
for it. We can create a file in `lexers/testdata` called `lua.actual` and
add sample Lua source code to it that exercises the tokens we want to test.

## Record test output

Once we have test data, we need to record the expected output. We create
another file called `lexers/testdata/lua.expected`. This is the file the test
run records to when we run the following command from the Chroma root directory:

```console
$ docker run --rm -it -w /opt -v $PWD:/opt -e RECORD=true golang:tip-bookworm \
    go test ./lexers
```

Once test output is recorded in `lexers/testdata/lua.expected`, we should
visually inspect it and verify that the recorded tokens are correct.

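Visual inspection can be backed up with a couple of mechanical checks. The sketch below assumes the recorded file is a JSON array of objects with `type` and `value` keys; the sample data is hand-written for illustration and is not the real recorded output for Lua. The helper names are hypothetical. It checks two things: that the token values join back into the original source, and that no token has the `Error` type, which would suggest input the lexer could not match.

```python
import json

# Hand-written stand-ins for lua.actual and lua.expected (illustrative only).
SAMPLE_ACTUAL = 'print("hello")\n'
SAMPLE_EXPECTED = r'''
[
  {"type": "Name", "value": "print"},
  {"type": "Punctuation", "value": "("},
  {"type": "LiteralStringDouble", "value": "\"hello\""},
  {"type": "Punctuation", "value": ")"},
  {"type": "Text", "value": "\n"}
]
'''


def reconstruct_source(tokens):
    """Join the token values back into a single string."""
    return "".join(token["value"] for token in tokens)


def find_error_tokens(tokens):
    """Return every token whose type is Error."""
    return [token for token in tokens if token["type"] == "Error"]


if __name__ == "__main__":
    tokens = json.loads(SAMPLE_EXPECTED)
    print("round-trips:", reconstruct_source(tokens) == SAMPLE_ACTUAL)
    print("error tokens:", find_error_tokens(tokens))
```

With real files, a mismatch in the round-trip check is not necessarily a bug (a lexer may, for example, normalize a trailing newline), so it is a prompt to look closer rather than a verdict.
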
## Run tests

As a final confirmation, we can run the tests to make sure we have not broken
anything:

```console
$ docker run --rm -it -w /opt -v $PWD:/opt golang:tip-bookworm \
    go test ./lexers
```

## Conclusion

If we followed all these steps correctly, our lexer should be ready to push
to a `git` repo so we can open a pull request!

## Bonus!: Use local Pygments with `pygments2chroma_xml.py`

These lines in `pygments2chroma_xml.py`,

```python
import pystache
from pygments import lexer as pygments_lexer
from pygments.token import _TokenType
```

import Pygments, which in the earlier command was the copy `pip` installed from the [Python Package Index](https://pypi.org/). But if we are working on a
Pygments lexer locally, we might want to convert it to a Chroma lexer for
testing. Python puts the script's directory first on its module search path, so
if the script sits next to the `pygments` package in a Pygments checkout, the
local copy is the one that gets imported. We can import a local version of
Pygments when running `pygments2chroma_xml.py` by running the following from
the Pygments root directory:

```console
$ docker run --rm -it -w /opt -v $PWD:/opt \
    -v /path/to/chroma/_tools/pygments2chroma_xml.py:/opt/pygments2chroma_xml.py \
    python bash -c "pip install pystache && pip list \
    && python pygments2chroma_xml.py pygments.lexers.scripting.LuaLexer"
```

We should see

```console
Package  Version
-------- -------
pip      25.0.1
pystache 0.6.8
```

which indicates that no Pygments package from PyPI is installed. Following that, we
should also see the lexer markup output:

```console
<lexer>
  <config>
    ...
```

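If we want more direct confirmation of which Pygments copy Python would pick up, we can ask the import system where a module would be loaded from. This is a small sketch using the standard library's `importlib.util.find_spec`; the helper name `module_origin` is hypothetical, and it is meant for top-level module names only.

```python
import importlib.util


def module_origin(name):
    """Return the file a top-level module would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    return spec.origin if spec is not None else None


if __name__ == "__main__":
    # Inside the container we would pass "pygments" here.
    print(module_origin("json"))
```

Run from `/opt` in the container above, we would expect `module_origin("pygments")` to point into the mounted checkout rather than a `site-packages` directory, though the exact path depends on how the checkout is laid out.
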