v2026.4.1
All Bundles

Phi3Tokenizer

Phi-3 tokenizer for encoding text to token IDs and decoding token IDs back to text. Loads a HuggingFace tokenizer.json file and supports BPE encoding/decoding. Example tokenizer := Phi3Tokenizer->New("phi3-mini/tokenizer.json"); ids := tokenizer->Encode("What is 2+2?"); text := tokenizer->Decode(ids); text->PrintLine();

Operations

Decode

Decode token IDs to human-readable text

method : public : Decode(token_ids:Int[]) ~ String

Parameters

NameTypeDescription
token_idsIntarray of token IDs

Return

TypeDescription
Stringdecoded text string

Encode

Encode text to token IDs using BPE with special token handling. Added/special tokens are matched as whole strings before BPE is applied.

method : public : Encode(text:String) ~ Int[]

Parameters

NameTypeDescription
textStringinput text to tokenize

Return

TypeDescription
Intarray of token IDs

EncodeBPE

Encode text to token IDs using BPE (no special token handling)

method : private : EncodeBPE(text:String) ~ Int[]

Parameters

NameTypeDescription
textStringinput text to tokenize

Return

TypeDescription
Intarray of token IDs

GetVocabSize

Gets the vocabulary size

method : public : GetVocabSize() ~ Int

Return

TypeDescription
Intvocabulary size

IsLoaded

Checks if the tokenizer was loaded successfully

method : public : IsLoaded() ~ Bool

Return

TypeDescription
Booltrue if loaded

New

Constructor.

New(tokenizer_path:String)

Parameters

NameTypeDescription
tokenizer_pathStringpath to tokenizer.json file