auto-jrubby

auto-jrubby is a Typst package that provides automatic Japanese morphological analysis and furigana (ruby) insertion.

It leverages a Rust-based WASM plugin to tokenize text using Lindera (a morphological analysis library) and uses the rubby package to render the furigana.

Features

  • Automatic Furigana Generation: Automatically determines readings for Kanji based on context and renders them as ruby text.
  • Smart Okurigana Alignment: Intelligent handling of mixed Kanji/Hiragana words (e.g., 食べる is rendered with ruby over 食 only, leaving べる untouched; see the example after this list).
  • Morphological Analysis Table: Visualize the text structure (Part of Speech, Detailed POS, Readings, Base forms) via a formatted data table.
  • Customizable Styling: Supports custom ruby sizing and positioning via the rubby package backend.
  • High Performance: Powered by a Rust WASM plugin using Lindera for fast and accurate tokenization.
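
A minimal illustration of the okurigana handling (assuming a Japanese-capable font is configured in your document):

#import "@preview/auto-jrubby:0.2.0": *
#set text(lang: "ja")

// 食べる: the reading た is placed over 食 only; the okurigana べる stays plain.
#show-ruby("ご飯を食べる")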

Usage

Basic Furigana

To automatically add readings to Japanese text:

#import "@preview/auto-jrubby:0.2.0": *
#set text(font: "Hiragino Sans", lang: "ja")

#let sample = "ルビ(英語: ruby)は、文章内の任意の文字に対しふりがなや説明、異なる読み方といった役割の本文の横に付属される文字。"
#show-ruby(sample)

(Rendered output: the sample sentence with furigana.)

Morphological Analysis Table

To debug or display the linguistic structure of the text:

#import "@preview/auto-jrubby:0.2.0": *
#set text(font: "Hiragino Sans", lang: "ja")

#show-analysis-table("すももも桃も桃のうち")

(Rendered output: the morphological analysis table for the example sentence.)

API Reference

show-ruby

Renders the input text with automatic furigana.

#let show-ruby(
  input-text,
  size: 0.5em,
  leading: 1.5em,
  ruby-func: auto,
  user-dict: none,
  dict: "ipadic"
)

Parameters:

  • input-text (string): The Japanese text to analyze and render.
  • size (length): The font size of the ruby text. Defaults to 0.5em.
  • leading (length): The vertical space between lines to accommodate ruby text. Defaults to 1.5em.
  • ruby-func (function | auto): A custom ruby function from the rubby package.
    • If auto, it uses the default configuration (get-ruby(size: size)).
    • If provided, it allows advanced customization of ruby positioning (e.g., a specific pos or alignment); see the sketch after this parameter list.
  • user-dict (string | array | none): Optional user dictionary for custom tokenization.
    • If string: A CSV-formatted string with custom dictionary entries.
    • If array: An array of arrays, where each inner array represents a CSV row.
    • If none: No user dictionary is used.
  • dict (string): The dictionary to use for tokenization. Must be one of:
    • "ipadic" (default): Standard Japanese dictionary
    • "unidic": Alternative dictionary with different grammatical analysis

show-analysis-table

Renders a table displaying the morphological breakdown of the text.

#let show-analysis-table(
  input-text,
  user-dict: none,
  dict: "ipadic"
)

Parameters:

  • input-text (string): The text to analyze.
  • user-dict (string | array | none): Optional user dictionary for custom tokenization.
  • dict (string): The dictionary to use. Must be one of: "ipadic" or "unidic".
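
For example, the two dictionaries' analyses can be compared by passing dict explicitly:

#import "@preview/auto-jrubby:0.2.0": *
#set text(font: "Hiragino Sans", lang: "ja")

// Same sentence as the earlier example, analyzed with UniDic instead of the default IPADIC.
#show-analysis-table("すももも桃も桃のうち", dict: "unidic")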

Table Columns:

  1. Surface Form (表層形): The word as it appears in the text.
  2. Part of Speech (品詞): Grammatical category (Noun, Verb, etc.).
  3. Details (詳細): Sub-category (e.g., Proper Noun, Suffix).
  4. Reading (読み): Katakana reading.
  5. Base Form (基本形): The dictionary form of the word.

tokenize

Low-level function that returns the raw analysis data from the WASM plugin (an array of dictionaries decoded from its JSON output). Useful if you want to process the analysis data manually; see the example after the field list below.

#let tokenize(
  input-text,
  user-dict: none,
  dict: "ipadic"
)

Parameters:

  • input-text (string): The text to tokenize.
  • user-dict (string | array | none): Optional user dictionary for custom tokenization.
  • dict (string): The dictionary to use. Must be one of: "ipadic" or "unidic".

Returns: An array of dictionaries containing:

  • surface: The surface form of the token
  • pos: Part of speech
  • sub_pos: Sub-category of the part of speech
  • reading: Katakana reading
  • base: Base (dictionary) form
  • ruby_segments: Array of segments with text and ruby fields for furigana rendering
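
A small sketch of manual post-processing, using only the fields documented above, that lists each token with its part of speech and reading:

#import "@preview/auto-jrubby:0.2.0": *

#let tokens = tokenize("すももも桃も桃のうち")

// One bullet per token: surface form, part of speech, and katakana reading.
#for t in tokens [
  - #t.surface (#t.pos): #t.reading
]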

User Dictionary Format

The user dictionary allows you to define custom word segmentation and readings. It uses a simple CSV format with three columns:

<surface>,<part_of_speech>,<reading>

  • surface: The word as it appears in text
  • part_of_speech: Custom part-of-speech label (e.g., "カスタム名詞")
  • reading: Katakana reading for the word

Usage Examples:

Method 1: Inline string

#let user-dict-str = "東京スカイツリー,カスタム名詞,トウキョウスカイツリー
東武スカイツリーライン,カスタム名詞,トウブスカイツリーライン
とうきょうスカイツリー駅,カスタム名詞,トウキョウスカイツリーエキ"

#show-ruby("東京スカイツリーの最寄り駅はとうきょうスカイツリー駅です", user-dict: user-dict-str)

Method 2: Array of arrays

#let user-dict-array = (
  ("東京スカイツリー", "カスタム名詞", "トウキョウスカイツリー"),
  ("東武スカイツリーライン", "カスタム名詞", "トウブスカイツリーライン"),
  ("とうきょうスカイツリー駅", "カスタム名詞", "トウキョウスカイツリーエキ")
)

#show-ruby("東京スカイツリーの最寄り駅はとうきょうスカイツリー駅です", user-dict: user-dict-array)

Method 3: Load from CSV file

#let user-dict-from-file = csv("user_dict.csv")

#show-ruby("東京スカイツリーの最寄り駅はとうきょうスカイツリー駅です", user-dict: user-dict-from-file)

Under the Hood

This package uses Lindera (a Rust port of Kuromoji) with two available dictionary options:

  • IPADIC: Standard Japanese morphological dictionary
  • UniDic: Alternative dictionary with different part-of-speech classifications

The processing workflow:

  1. The text is passed from Typst to the Rust WASM plugin.
  2. Lindera tokenizes the text using the specified dictionary and retrieves readings.
  3. A custom algorithm aligns the readings with the surface form to separate okurigana (kana endings of verbs/adjectives) from the kanji stems (illustrated after this list).
  4. The structured data is returned to Typst and rendered using the rubby package for furigana display.
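
To make step 3 concrete: an inflected word such as 食べる (reading タベル) is split so that only the kanji stem carries a reading. The value below is an illustrative sketch of the ruby_segments field described earlier, not verbatim plugin output:

// Approximate shape of ruby_segments for the token 食べる.
(
  // Kanji stem: carries the ruby reading (shown in hiragana here; the plugin may use katakana).
  (text: "食", ruby: "た"),
  // Okurigana ending: no ruby (possibly an empty string, depending on the plugin output).
  (text: "べる", ruby: none),
)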

License

This project is distributed under the AGPL-3.0-or-later License. See LICENSE for details.