canto-parser – Typst Universe

A Typst package to render Cantonese text with Jyutping (粵拼) or Yale (耶魯) romanizations from “Cantonese words JSON array”.

Sample output

package output

Input Data Format

The render-word-groups function accepts an array of JSON objects. Each object represents a segmented word or symbol and must follow this structure:

Field	Type	Description
`word`	string	The Chinese character(s) or text snippet.
`jyutping`	string (optional)	The combined Jyutping for the word (e.g., `"hou2jek6"`). Use `null` for non-Chinese text.
`yale`	array (optional)	An array of Yale romanization strings, segmented by character (e.g., `["hóu", "yehk"]`).

Example:

[
  {
    "word": "你好",
    "jyutping": "nei5hou2",
    "yale": ["néih", "hóu"]
  },
  {
    "word": "!",
    "jyutping": null,
    "yale": null
  }
]

Usage

Option A: Automatic (Recommended)

Use the auto-canto package. This is the easiest way as it uses a WASM-based logic to segment and annotate text directly within Typst without needing Python.

Option B: Pre-processed JSON (Manual)

If you prefer to use PyCantonese for specific linguistic needs, you can generate a JSON file using the script below.

View Python helper script

import pycantonese
import json
import sys

# Script to generate compatible JSON format
lines = sys.argv[1].split('\n')
result = []

for line in lines:
    words = pycantonese.segment(line)
    for word in words:
        pairs = pycantonese.characters_to_jyutping(word)
        jyutping = "".join([p[1] for p in pairs if p[1]]) if pairs else None
        # Note: adjust pycantonese call to return array for yale if needed
        yale = [pycantonese.jyutping_to_yale(p[1]) for p in pairs if p[1]] if pairs else None
        result.append({"word": word, "jyutping": jyutping, "yale": yale})
    result.append({"word": "\n", "jyutping": None})

print(json.dumps(result, ensure_ascii=False))

Generate your data: python script.py "你好" > data.json

Import and render:

#import "@preview/canto-parser:0.2.1": *
#let data = json("data.json")
#render-word-groups(data)

License

MIT

How to add

Copy this into your project and use the import as canto-parser

#import "@preview/canto-parser:0.2.1"

Check the docs for more information on how to import packages .

About

Author:

Vincent Tam

License:

MIT

Current version:

0.2.1

Last updated:

March 4, 2026

First released:

March 4, 2026

Minimum Typst version:

0.14.2

Archive size:

3.08 kB

Repository:

GitHub

Categories:

Explore more packages by Vincent Tam

Where to report issues?

This package is a project of Vincent Tam. Report issues on their repository. You can also try to ask for help with this package on the Forum.

Please report this package to the Typst team using the contact form if you believe it is a safety hazard or infringes upon your rights.

Version history

Version	Release Date
0.2.1	March 4, 2026

Typst GmbH did not create this package and cannot guarantee correct functionality of this package or compatibility with any version of the Typst compiler or app.