v2026.2.1

TextSimilarity

Text similarity metrics for comparing documents and strings

vec1 := [1.0, 2.0, 3.0];
vec2 := [2.0, 4.0, 6.0];
similarity := System.NLP.TextSimilarity->CosineSimilarity(vec1, vec2);

tokens1 := ["the", "cat", "sat"];
tokens2 := ["the", "dog", "sat"];
jaccard := System.NLP.TextSimilarity->JaccardSimilarity(tokens1, tokens2);

distance := System.NLP.TextSimilarity->LevenshteinDistance("kitten", "sitting");

Operations

CosineSimilarity

Calculates cosine similarity between two vectors

function : CosineSimilarity(vec1:Float[], vec2:Float[]) ~ Float

Parameters

Name    Type       Description
vec1    Float[]    first vector
vec2    Float[]    second vector

Return

Type     Description
Float    cosine similarity (0.0 to 1.0)
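
For reference, cosine similarity is the dot product of the two vectors divided by the product of their lengths. The Python sketch below illustrates that formula only; it is not the library's implementation. The name cosine_similarity and the handling of zero vectors and mismatched lengths are assumptions made for this example.

```python
import math

def cosine_similarity(vec1, vec2):
    """Cosine of the angle between two equal-length vectors (hypothetical helper)."""
    if len(vec1) != len(vec2):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(vec1, vec2))
    norm1 = math.sqrt(sum(a * a for a in vec1))
    norm2 = math.sqrt(sum(b * b for b in vec2))
    # Assumption: a zero vector has no direction, so this sketch reports 0.0.
    if norm1 == 0.0 or norm2 == 0.0:
        return 0.0
    return dot / (norm1 * norm2)

# vec2 is 2 * vec1, so the vectors point the same way and the result is close to 1.0.
print(cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
```

For vectors with only non-negative components, such as term counts, the result falls between 0.0 and 1.0. The plain formula can return negative values when components have mixed signs.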

JaccardSimilarity

Calculates Jaccard similarity between two sets of tokens

function : JaccardSimilarity(tokens1:String[], tokens2:String[]) ~ Float

Parameters

Name       Type        Description
tokens1    String[]    first token array
tokens2    String[]    second token array

Return

Type     Description
Float    Jaccard similarity (0.0 to 1.0)
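
For reference, Jaccard similarity is the size of the intersection of the two token sets divided by the size of their union. The Python sketch below illustrates that definition only; it is not the library's implementation. The name jaccard_similarity and the value returned for two empty inputs are assumptions made for this example.

```python
def jaccard_similarity(tokens1, tokens2):
    """|A intersect B| / |A union B| over the distinct tokens (hypothetical helper)."""
    set1, set2 = set(tokens1), set(tokens2)
    union = set1 | set2
    # Assumption: two empty token lists are treated as identical.
    if not union:
        return 1.0
    return len(set1 & set2) / len(union)

# Shared tokens: "the", "sat" (2). Distinct tokens overall: 4.
print(jaccard_similarity(["the", "cat", "sat"], ["the", "dog", "sat"]))  # 0.5
```

Because the inputs are reduced to sets, repeated tokens count once and token order does not affect the result.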

LevenshteinDistance

Calculates Levenshtein distance (edit distance) between two strings

function : LevenshteinDistance(str1:String, str2:String) ~ Int

Parameters

Name    Type      Description
str1    String    first string
str2    String    second string

Return

Type    Description
Int     edit distance (number of single-character insertions, deletions, or substitutions needed)
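
For reference, Levenshtein distance is usually computed with dynamic programming over prefixes of the two strings. The Python sketch below shows the common two-row variant; it is not the library's implementation, and the name levenshtein_distance is an assumption made for this example.

```python
def levenshtein_distance(str1, str2):
    """Minimum number of single-character edits turning str1 into str2 (hypothetical helper)."""
    # previous[j] is the distance between the str1 prefix handled so far and str2[:j].
    previous = list(range(len(str2) + 1))
    for i, ch1 in enumerate(str1, start=1):
        current = [i]  # turning str1[:i] into the empty string takes i deletions
        for j, ch2 in enumerate(str2, start=1):
            cost = 0 if ch1 == ch2 else 1
            current.append(min(
                previous[j] + 1,         # delete ch1
                current[j - 1] + 1,      # insert ch2
                previous[j - 1] + cost,  # substitute, or keep a matching character
            ))
        previous = current
    return previous[-1]

# kitten -> sitten -> sittin -> sitting: two substitutions and one insertion.
print(levenshtein_distance("kitten", "sitting"))  # 3
```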

NormalizedEditDistance

Calculates normalized edit distance (0.0 = identical, 1.0 = completely different)

function : NormalizedEditDistance(str1:String, str2:String) ~ Float

Parameters

Name    Type      Description
str1    String    first string
str2    String    second string

Return

Type     Description
Float    normalized distance (0.0 to 1.0)
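
The reference does not state how the distance is normalized. One common choice, assumed here, is to divide the Levenshtein distance by the length of the longer string, which keeps the result between 0.0 and 1.0. The Python sketch below illustrates that choice only; the function names and the handling of two empty strings are assumptions made for this example, and the library may normalize differently.

```python
def levenshtein_distance(str1, str2):
    """Two-row dynamic-programming edit distance (hypothetical helper)."""
    previous = list(range(len(str2) + 1))
    for i, ch1 in enumerate(str1, start=1):
        current = [i]
        for j, ch2 in enumerate(str2, start=1):
            cost = 0 if ch1 == ch2 else 1
            current.append(min(previous[j] + 1,
                               current[j - 1] + 1,
                               previous[j - 1] + cost))
        previous = current
    return previous[-1]

def normalized_edit_distance(str1, str2):
    """Edit distance divided by the longer length (assumed normalization)."""
    longest = max(len(str1), len(str2))
    # Assumption: two empty strings are identical, so the distance is 0.0.
    if longest == 0:
        return 0.0
    return levenshtein_distance(str1, str2) / longest

# 3 edits over a longer length of 7.
print(round(normalized_edit_distance("kitten", "sitting"), 3))  # 0.429
```

Under this normalization, identical strings give 0.0 and strings that share no characters in any aligned position, such as "abc" and "xyz", give 1.0.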