Objeck Documentation

Bundle: ONNX Runtime inference library for running pre-trained models. It supports object detection (YOLO), image classification (ResNet), segmentation (DeepLab), pose estimation (OpenPose), face recognition, and Phi-3 text generation. Compile with -lib onnx.

Phi3VisionSession

Phi-3 Vision multimodal session for image understanding and text generation. It uses three separate ONNX models: a vision encoder, a text embedding model, and a text decoder. It accepts raw image bytes (JPEG/PNG) and a pre-tokenized prompt split into prefix tokens (before the image placeholder) and suffix tokens (after the image placeholder).

Example

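# Paths to the three ONNX models that make up a Phi-3 Vision session
model_dir := "phi3v-directml/";
session := Phi3VisionSession->New(
  model_dir + "phi-3-v-128k-instruct-vision.onnx",
  model_dir + "phi-3-v-128k-instruct-text-embedding.onnx",
  model_dir + "model.onnx");
# Raw JPEG/PNG file bytes, passed to the session as read from disk
image_bytes := System.IO.File.FileReader->ReadBinaryFile("photo.jpg");
# Pre-tokenized prompt: token IDs before and after the image placeholder
prefix := [32010, 29871];
suffix := [32007, 32001];
# End-of-sequence token IDs
eos := [32000, 32007];
# Generate at most 256 tokens; temperature 0.0 selects greedy decoding
result := session->Generate(image_bytes, prefix, suffix, 256, 0.0, eos);
each(token in result->GetTokens()) {
  token->PrintLine();
};
# Close the vision, embedding, and decoder sessions
session->Close();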

Close

Closes all three sessions (vision, embedding, and decoder).

method : public : Close() ~ Nil

Generate

Generates text tokens from an image and a pre-tokenized prompt. The prompt is split into prefix tokens (before the image placeholder) and suffix tokens (after the image placeholder).

method : public : Generate(image_bytes:Byte[], prefix_tokens:Int[], suffix_tokens:Int[], max_tokens:Int, temperature:Float, eos_tokens:Int[]) ~ API.Onnx.Phi3Result

Parameters

Name	Type	Description
image_bytes	Byte[]	raw image file bytes (JPEG/PNG)
prefix_tokens	Int[]	token IDs before the image placeholder
suffix_tokens	Int[]	token IDs after the image placeholder
max_tokens	Int	maximum number of tokens to generate
temperature	Float	sampling temperature (0.0 for greedy decoding)
eos_tokens	Int[]	end-of-sequence token IDs

Return

Type	Description
Phi3Result	generation result with output token IDs
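
The prefix/suffix split that Generate expects can be sketched as follows. This is a minimal illustration and not part of the bundle: the class PromptSplit, the helpers FindPlaceholder and Slice, and the value 99999 used as an image placeholder marker are all hypothetical. It assumes the prompt has already been tokenized into a single Int array with one placeholder entry where the image belongs.

```objeck
class PromptSplit {
  function : Main(args : String[]) ~ Nil {
    # Hypothetical tokenized prompt; 99999 is an assumed marker for the image position
    tokens := [32010, 29871, 99999, 32007, 32001];
    split := FindPlaceholder(tokens, 99999);
    if(split < 0) {
      "no image placeholder found"->PrintLine();
    }
    else {
      # Tokens before the marker form the prefix, tokens after it the suffix
      prefix := Slice(tokens, 0, split);
      suffix := Slice(tokens, split + 1, tokens->Size());
      prefix->Size()->PrintLine();
      suffix->Size()->PrintLine();
    };
  }

  # Returns the index of the first placeholder entry, or -1 if there is none
  function : FindPlaceholder(tokens : Int[], placeholder : Int) ~ Int {
    each(i : tokens) {
      if(tokens[i] = placeholder) {
        return i;
      };
    };
    return -1;
  }

  # Copies tokens[begin_idx] up to, but not including, tokens[end_idx]
  function : Slice(tokens : Int[], begin_idx : Int, end_idx : Int) ~ Int[] {
    result := Int->New[end_idx - begin_idx];
    for(i := begin_idx; i < end_idx; i += 1;) {
      result[i - begin_idx] := tokens[i];
    };
    return result;
  }
}
```

The resulting prefix and suffix arrays have the shape of the prefix_tokens and suffix_tokens parameters. The marker itself is dropped, since the image is supplied separately through image_bytes.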

New (constructor)

Creates a session from the vision encoder, text embedding, and text decoder model files.

New(vision_model:String, embed_model:String, decoder_model:String)

Parameters

Name	Type	Description
vision_model	String	path to vision encoder ONNX model
embed_model	String	path to text embedding ONNX model
decoder_model	String	path to text decoder ONNX model

New (constructor)

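Creates a session from the vision encoder, text embedding, and text decoder model files, with additional session configuration parameters.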

New(vision_model:String, embed_model:String, decoder_model:String, config:Map<String,String>)

Parameters

Name	Type	Description
vision_model	String	path to vision encoder ONNX model
embed_model	String	path to text embedding ONNX model
decoder_model	String	path to text decoder ONNX model
config	Map<String,String>	session configuration parameters