Publish post 'Add a Pygments Lexer to Chroma' #2
@ -1,62 +1,74 @@
|
||||
{
|
||||
blurb: "Add a new lexer to chroma"
|
||||
blurb: "Add a new lexer to Chroma"
|
||||
}
|
||||
$index
|
||||
|
||||
## Intro
|
||||
## Introduction
|
||||
|
||||
Gitea uses Chroma for syntax highlighting. Chroma doesn't have a MoonScript
|
||||
lexer. It does has a Python script that can convert Pygments lexers, though,
|
||||
and Pygments has a MoonScript lexer.
|
||||
[Gitea](https://github.com/go-gitea/gitea) uses [Chroma](https://github.com/alecthomas/chroma) for syntax highlighting. Chroma is based on the Python
|
||||
syntax highlighter, [Pygments](https://github.com/pygments/pygments), and includes a [script](https://github.com/alecthomas/chroma/blob/484750a96fc430f49d6b69cc2a2a8b7a67691446/_tools/pygments2chroma_xml.py) to help convert Pygments
|
||||
lexers for use with Chroma. This post describes that process.
|
||||
|
||||
## Run MoonScript lexer generation script
|
||||
## Convert a Pygments lexer to a Chroma lexer with `pygments2chroma_xml.py`
|
||||
|
||||
To create the lexer, in the Chroma root directory run:
|
||||
In the Chroma root directory, we run:
|
||||
|
||||
```console
|
||||
$ docker run --rm -it -w /opt -v $PWD:/opt python bash -c \
|
||||
"pip install pystache pygments \
|
||||
"pip install pystache pygments && pip list \
|
||||
&& python _tools/pygments2chroma_xml.py \
|
||||
pygments.lexers.scripting.MoonScriptLexer > lexers/embedded/moonscript.xml \
|
||||
&& pip list"
|
||||
pygments.lexers.scripting.LuaLexer > lexers/embedded/lua.xml"
|
||||
```
|
||||
|
||||
## Use the Chroma MoonScript lexer to highlight some code
|
||||
As output, we should see this in our terminal:
|
||||
|
||||
Create a file like this:
|
||||
```
|
||||
Package Version
|
||||
-------- -------
|
||||
pip 25.0.1
|
||||
Pygments 2.19.2
|
||||
pystache 0.6.8
|
||||
```
|
||||
|
||||
This just helps us know what version of Pygments we generated our lexer from.
|
||||
The file `lexers/embedded/lua.xml` should now contain all the tokenization
|
||||
rules for the [Lua](https://www.lua.org) language.
|
||||
|
||||
::: filename-for-code-block
|
||||
`main.go`
|
||||
`lexers/embedded/lua.xml`
|
||||
:::
|
||||
|
||||
```go
|
||||
package main
|
||||
|
||||
import (
|
||||
"fmt"
|
||||
"os"
|
||||
|
||||
"github.com/alecthomas/chroma/v2/quick"
|
||||
)
|
||||
|
||||
func main() {
|
||||
code := `package main
|
||||
|
||||
func main() { }
|
||||
`
|
||||
|
||||
fmt.Println(quick.Highlight(os.Stdout, code, "go", "html", "monokai"))
|
||||
}
|
||||
```xml
|
||||
<lexer>
|
||||
<config>
|
||||
<name>Lua</name>
|
||||
...
|
||||
```
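
Before moving on, it can help to get a quick overview of the generated file. The following is a minimal sketch, not part of Chroma: it assumes the generated XML has a `<config>` element with a `<name>`, and a `<rules>` element containing `<state>` elements that hold `<rule>` children. The helper `summarize_lexer` and the hand-written `SAMPLE_LEXER_XML` are hypothetical and only stand in for the real, much longer `lexers/embedded/lua.xml`.

```python
import xml.etree.ElementTree as ET

# Hand-written stand-in for a generated lexer file (an assumption for
# illustration; the real lexers/embedded/lua.xml is much longer).
SAMPLE_LEXER_XML = """
<lexer>
  <config>
    <name>Lua</name>
    <alias>lua</alias>
    <filename>*.lua</filename>
  </config>
  <rules>
    <state name="root">
      <rule pattern="--.*$">
        <token type="CommentSingle"/>
      </rule>
      <rule pattern="\\s+">
        <token type="Text"/>
      </rule>
    </state>
    <state name="string">
      <rule pattern=".">
        <token type="LiteralString"/>
      </rule>
    </state>
  </rules>
</lexer>
"""


def summarize_lexer(xml_text):
    """Return (lexer name, {state name: rule count}) for a lexer XML string."""
    root = ET.fromstring(xml_text.strip())
    name = root.findtext("config/name")
    # Count only the <rule> elements that are direct children of each state.
    states = {
        state.get("name"): len(state.findall("rule"))
        for state in root.findall("rules/state")
    }
    return name, states


if __name__ == "__main__":
    name, states = summarize_lexer(SAMPLE_LEXER_XML)
    print(name)
    for state_name, rule_count in states.items():
        print(f"{state_name}: {rule_count} rule(s)")
```

To try it on the real file, we could pass the contents of `lexers/embedded/lua.xml` to `summarize_lexer` instead of the sample string.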
|
||||
|
||||
I did one of these:
|
||||
## Highlight some code with our new lexer
|
||||
|
||||
Chroma provides a [simple example test file][1] we can modify to see what syntax
|
||||
highlighting with our new lexer looks like. First, though, we need to create a
|
||||
new Go module by running `go mod init`:
|
||||
|
||||
```console
|
||||
$ docker run --rm -it -w /opt -v $PWD:/opt golang:tip-bookworm \
|
||||
go mod init main
|
||||
go: creating new go.mod: module main
|
||||
go: to add module requirements and sums:
|
||||
go mod tidy
|
||||
```
|
||||
|
||||
Which gave me the `go.mod` file.
|
||||
We will need the required modules, so let's go ahead and run `go mod tidy` as the
|
||||
output suggests.
|
||||
|
||||
```console
|
||||
$ docker run --rm -it -w /opt -v $PWD:/opt golang:tip-bookworm \
|
||||
go mod tidy
|
||||
```
|
||||
|
||||
We should now have 2 additional files, `go.mod` and `go.sum`. `go.sum` has some
|
||||
package hashes while `go.mod` should look like this:
|
||||
|
||||
::: filename-for-code-block
|
||||
`go.mod`
|
||||
@ -67,23 +79,55 @@ module main
|
||||
|
||||
go 1.25
|
||||
|
||||
require (
|
||||
github.com/alecthomas/chroma/v2 v2.18.0 // indirect
|
||||
github.com/dlclark/regexp2 v1.11.5 // indirect
|
||||
)
|
||||
require github.com/alecthomas/chroma/v2 v2.18.0
|
||||
|
||||
require github.com/dlclark/regexp2 v1.11.5 // indirect
|
||||
```
|
||||
|
||||
Then I did one of these:
|
||||
Now we can create a `main.go` file and copy over the code from Chroma's example
|
||||
test file, but we update the `code` variable and the lexer we pass into the
|
||||
`Highlight` function for Lua:
|
||||
|
||||
|
||||
::: filename-for-code-block
|
||||
`main.go`
|
||||
:::
|
||||
|
||||
```go
|
||||
package main
|
||||
|
||||
import (
|
||||
"log"
|
||||
"os"
|
||||
|
||||
"github.com/alecthomas/chroma/v2/quick"
|
||||
)
|
||||
|
||||
func main() {
|
||||
code := `print("hello")`
|
||||
|
||||
err := quick.Highlight(os.Stdout, code, "lua", "html", "monokai")
|
||||
if err != nil {
|
||||
log.Fatal(err)
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
Now we can try running our `main.go` like this:
|
||||
|
||||
```console
|
||||
$ docker run --rm -it -w /opt -v $PWD:/opt golang:tip-bookworm \
|
||||
go run main.go
|
||||
$ docker run --rm -it -w /opt -v $PWD:/opt golang:tip-bookworm go run main.go
|
||||
go: downloading github.com/alecthomas/chroma/v2 v2.18.0
|
||||
go: downloading github.com/dlclark/regexp2 v1.11.5
|
||||
<html>
|
||||
<style type="text/css">
|
||||
...
|
||||
```
|
||||
|
||||
And that should output markup (and styles) for highlighting that block of Go
|
||||
And that should output markup (and styles) for highlighting that block of Lua
|
||||
code to the console. Notice, though, that it's downloading the Chroma package from
|
||||
the GitHub repo. We want to use our local version of chroma, so we use `go mod
|
||||
edit` to [replace the chroma import with our local version](https://go.dev/ref/mod#go-mod-file-replace):
|
||||
the GitHub repo. If we want to use a local version of Chroma, we have to use a
|
||||
[`replace` directive][2] to import Chroma from our local directory:
|
||||
|
||||
```console
|
||||
$ docker run --rm -it -w /opt -v $PWD:/opt golang:tip-bookworm \
|
||||
@ -102,49 +146,76 @@ Which adds this line to our `go.mod` file:
|
||||
replace github.com/alecthomas/chroma/v2 v2.18.0 => ./chroma
|
||||
```
|
||||
|
||||
Now we can put some MoonScript in `main.go`.
|
||||
|
||||
```go
|
||||
code := `print "Hello, #{@name}!"`
|
||||
|
||||
fmt.Println(quick.Highlight(os.Stdout, code, "moonscript", "html", "monokai"))
|
||||
```
|
||||
|
||||
And we have it:
|
||||
Now, when we run `main.go`, we should no longer see Chroma being imported,
|
||||
because it's using our local copy:
|
||||
|
||||
```console
|
||||
$ docker run --rm -it -w /opt -v $PWD:/opt golang:tip-bookworm \
|
||||
go run main.go
|
||||
$ docker run --rm -it -w /opt -v $PWD:/opt golang:tip-bookworm go run main.go
|
||||
go: downloading github.com/dlclark/regexp2 v1.11.5
|
||||
<html>
|
||||
<style type="text/css">
|
||||
...
|
||||
```
|
||||
|
||||
That should output syntax highlighting using our local version of chroma.
|
||||
We should also see a list of styles followed by the HTML markup for
|
||||
highlighting our Lua code (formatted for legibility):
|
||||
|
||||
## Create testdata
|
||||
```html
|
||||
<pre class="chroma">
|
||||
<code>
|
||||
<span class="line">
|
||||
<span class="cl">
|
||||
<span class="n">print</span>
|
||||
<span class="p">(</span>
|
||||
<span class="s2">"hello"</span>
|
||||
<span class="p">)</span>
|
||||
</span>
|
||||
</span>
|
||||
</code>
|
||||
</pre>
|
||||
```
|
||||
|
||||
Create a file in `lexers/testdata` called `moonscript.actual`. Add the tokens
|
||||
from the language in this file.
|
||||
[1]: https://github.com/alecthomas/chroma/blob/484750a96fc430f49d6b69cc2a2a8b7a67691446/quick/example_test.go
|
||||
[2]: https://go.dev/ref/mod#go-mod-file-replace
|
||||
|
||||
## Add test data
|
||||
|
||||
If we want to add our lexer to Chroma, we will need to create some test data
|
||||
for it. We can create a file in `lexers/testdata` called `lua.actual` and
|
||||
add the language tokens to it.
|
||||
|
||||
## Record test output
|
||||
|
||||
Create another file called `lexers/testdata/moonscript.expected`. This is the
|
||||
file we will record to.
|
||||
Once we have test data, we need to record the expected output. We create
|
||||
another file called `lexers/testdata/lua.expected`. This is the file we
|
||||
will record to by running the following command from the Chroma root directory:
|
||||
|
||||
```console
|
||||
$ RECORD=true go test ./lexers
|
||||
$ docker run --rm -it -w /opt -v $PWD:/opt -e RECORD=true golang:tip-bookworm \
|
||||
go test ./lexers
|
||||
```
|
||||
|
||||
Visually inspect and verify that the expected data is correct.
|
||||
Once test output is recorded in `lexers/testdata/lua.expected`, we should
|
||||
visually inspect and verify that the expected data is correct.
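
A small script can support the visual inspection. The sketch below is an assumption-laden helper, not part of Chroma's test suite: it assumes the `.expected` file is a JSON array of objects with `type` and `value` keys, and that a token type named `Error` marks text the lexer could not match. The function names and the sample tokens are hypothetical and hand-written, not recorded output.

```python
import json

# Hand-written sample input and tokens (assumptions for illustration only;
# real recorded output may use different token types).
SAMPLE_SOURCE = 'print("hello")\n'
SAMPLE_EXPECTED = json.dumps([
    {"type": "Name", "value": "print"},
    {"type": "Punctuation", "value": "("},
    {"type": "LiteralStringDouble", "value": '"hello"'},
    {"type": "Punctuation", "value": ")"},
    {"type": "Text", "value": "\n"},
])


def expected_covers_source(source, expected_json):
    """Check that the token values, joined in order, reproduce the source."""
    tokens = json.loads(expected_json)
    return "".join(token["value"] for token in tokens) == source


def error_tokens(expected_json):
    """Return every token whose type is exactly "Error"."""
    return [t for t in json.loads(expected_json) if t["type"] == "Error"]


if __name__ == "__main__":
    print("covers source:", expected_covers_source(SAMPLE_SOURCE, SAMPLE_EXPECTED))
    print("error tokens:", error_tokens(SAMPLE_EXPECTED))
```

With real data, we could read `lexers/testdata/lua.actual` and `lexers/testdata/lua.expected` from disk and pass their contents to these functions. A passing check does not replace reading the tokens, since it says nothing about whether each token type is the right one.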
|
||||
|
||||
## Run tests
|
||||
|
||||
As a final confirmation, we can run the tests to make sure we have not broken
|
||||
anything:
|
||||
|
||||
```console
|
||||
$ go test ./lexers
|
||||
$ docker run --rm -it -w /opt -v $PWD:/opt golang:tip-bookworm \
|
||||
go test ./lexers
|
||||
```
|
||||
|
||||
## Bonus!: Use local `pygments` with `pygments2chroma_xml.py`
|
||||
## Conclusion
|
||||
|
||||
These lines in `pygments2chroma_xml.py`:
|
||||
If we followed all these steps correctly, our lexer should be ready to be
|
||||
pushed to a Git repository, and we can then open a pull request!
|
||||
|
||||
## Bonus!: Use local Pygments with `pygments2chroma_xml.py`
|
||||
|
||||
These lines in `pygments2chroma_xml.py`,
|
||||
|
||||
```python
|
||||
import pystache
|
||||
@ -152,19 +223,20 @@ from pygments import lexer as pygments_lexer
|
||||
from pygments.token import _TokenType
|
||||
```
|
||||
|
||||
Import pygments from pip? How do we get it to load a local version of
|
||||
`pygments`?
|
||||
|
||||
In Pygments root directory:
|
||||
import Pygments from the [Python Package Index](https://pypi.org/). But, if we are working on a
|
||||
Pygments lexer locally, we might want to convert it to a Chroma lexer for
|
||||
testing. We can import a local version of Pygments when running
|
||||
`pygments2chroma_xml.py` by running the following from the Pygments root
|
||||
directory:
|
||||
|
||||
```console
|
||||
$ docker run --rm -it -w /opt -v $PWD:/opt \
|
||||
-v ../gitea-syntax-highlight/chroma/_tools/pygments2chroma_xml.py:/opt/pygments2chroma_xml.py \
|
||||
-v path/to/chroma/_tools/pygments2chroma_xml.py:/opt/pygments2chroma_xml.py \
|
||||
python bash -c "pip install pystache && pip list \
|
||||
&& python pygments2chroma_xml.py pygments.lexers.scripting.LuaLexer"
|
||||
```
|
||||
|
||||
Should see.
|
||||
We should see
|
||||
|
||||
```console
|
||||
Package Version
|
||||
@ -173,8 +245,8 @@ pip 25.0.1
|
||||
pystache 0.6.8
|
||||
```
|
||||
|
||||
That shows no remote pygments package is installed. After that you will see the
|
||||
lexer markup output.
|
||||
which indicates no remote Pygments package is installed. Following that, we
|
||||
should also see the lexer markup output.
|
||||
|
||||
```console
|
||||
<lexer>
|
||||