module Similarity
Defined in:
similarity.cr
Class Method Summary
-
.calculate_signature(text : String) : Array(Int32)
Calculate MinHash signature for a given text
-
.clear_cache : Nil
Clear the signatures cache (useful for testing or rebuilds)
-
.create_signature(post : Markdown::File, lang : String) : Signature
Create MinHash signature for a post
-
.create_tasks(posts : Array(Markdown::File)) : Nil
Create tasks to calculate and store signatures for all posts
-
.enable(is_enabled : Bool, posts : Array(Markdown::File))
Enable similarity feature (actual work is done in Posts.create_tasks)
-
.find_related(post : Markdown::File, lang : String, limit : Int32 = 5) : Array(RelatedPost)
Find related posts for a given post
-
.get_all_signatures(lang : String) : Array(Signature)
Get all signatures from the kv store with caching
-
.get_signature(post_link : String, lang : String) : Signature | Nil
Retrieve a post's signature from the kv store
-
.jaccard_similarity(sig1 : Signature, sig2 : Signature) : Float64
Calculate Jaccard similarity between two MinHash signatures
-
.ngram_size : Int32
-
.ngram_size=(ngram_size : Int32)
-
.num_permutations : Int32
Configuration for MinHash generation
-
.num_permutations=(num_permutations : Int32)
Configuration for MinHash generation
-
.store_index(post_links : Array(String), lang : String) : Nil
Store the index of all post links
-
.store_signature(post : Markdown::File, lang : String) : Nil
Store a post's signature in the kv store
Class Method Detail
.calculate_signature(text : String) : Array(Int32)
Calculate MinHash signature for a given text
The signature is an array of minimum hash values taken over all n-grams in the document, one value per hash function.
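
The description above can be illustrated with a minimal sketch. It is not the module's actual implementation: the constants `NUM_PERMUTATIONS` and `NGRAM_SIZE` are hypothetical stand-ins for the `num_permutations` and `ngram_size` settings, the helpers `ngrams` and `seeded_hash` are invented for illustration, and the choice of character n-grams and a seeded FNV-1a style hash is an assumption.

```crystal
# Hypothetical stand-ins for the module's num_permutations and ngram_size settings.
NUM_PERMUTATIONS = 64
NGRAM_SIZE       = 3

# Split text into overlapping character n-grams (assumed tokenization).
def ngrams(text : String, n : Int32) : Array(String)
  chars = text.downcase.chars
  # Texts shorter than n yield a single n-gram: the whole text.
  return [text.downcase] if chars.size <= n
  (0..chars.size - n).map { |i| chars[i, n].join }
end

# Seeded FNV-1a style hash (an assumption), masked to a non-negative Int32.
def seeded_hash(gram : String, seed : Int32) : Int32
  h = 2166136261_u32 ^ seed.to_u32
  gram.each_byte do |b|
    h ^= b.to_u32
    h = h &* 16777619_u32 # wrapping multiply
  end
  (h & 0x7fffffff_u32).to_i32
end

# One minimum per hash function, taken over every n-gram of the text.
def minhash_signature(text : String) : Array(Int32)
  grams = ngrams(text, NGRAM_SIZE)
  (0...NUM_PERMUTATIONS).map do |seed|
    grams.min_of { |g| seeded_hash(g, seed) }
  end
end
```

With this sketch, `minhash_signature("some post body")` would return an array of `NUM_PERMUTATIONS` integers, and the same text always produces the same array.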
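.create_signature(post : Markdown::File, lang : String) : Signature
Create MinHash signature for a post
.create_tasks(posts : Array(Markdown::File)) : Nil
Create tasks to calculate and store signatures for all posts
.enable(is_enabled : Bool, posts : Array(Markdown::File))
Enable similarity feature (actual work is done in Posts.create_tasks)
.get_all_signatures(lang : String) : Array(Signature)
Get all signatures from the kv store with caching
.get_signature(post_link : String, lang : String) : Signature | Nil
Retrieve a post's signature from the kv store
.jaccard_similarity(sig1 : Signature, sig2 : Signature) : Float64
Calculate Jaccard similarity between two MinHash signatures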
Returns a value between 0.0 (no similarity) and 1.0 (identical)
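
As a rough illustration of how such a score can be derived, the usual MinHash estimate is the fraction of signature positions at which two signatures hold the same value. The sketch below assumes a signature can be treated as a plain `Array(Int32)`; the real `Signature` type is not shown on this page, and `estimate_jaccard` is a hypothetical name.

```crystal
# Estimate Jaccard similarity as the share of positions where two
# equal-length MinHash signatures agree.
def estimate_jaccard(a : Array(Int32), b : Array(Int32)) : Float64
  raise ArgumentError.new("signatures must have equal length") unless a.size == b.size
  return 0.0 if a.empty?
  matches = a.zip(b).count { |pair| pair[0] == pair[1] }
  matches.to_f / a.size
end

puts estimate_jaccard([1, 2, 3, 4], [1, 2, 9, 4]) # three of four positions match: 0.75
```

Longer signatures (a larger `num_permutations`) generally tighten the estimate at the cost of more hashing per post.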
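.store_index(post_links : Array(String), lang : String) : Nil
Store the index of all post links
.store_signature(post : Markdown::File, lang : String) : Nil
Store a post's signature in the kv store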
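
To show how stored signatures might feed `.find_related`, here is a hedged sketch of ranking candidates by estimated similarity and keeping the top `limit` results. The `Sig` and `Related` records, and the `similarity` and `related` helpers, are hypothetical; the real `Signature` and `RelatedPost` types, and the actual ranking logic, are not documented on this page.

```crystal
# Hypothetical shapes standing in for Signature and RelatedPost.
record Sig, link : String, hashes : Array(Int32)
record Related, link : String, score : Float64

# Fraction of matching positions between two signatures (0.0 if incomparable).
def similarity(a : Sig, b : Sig) : Float64
  return 0.0 if a.hashes.empty? || a.hashes.size != b.hashes.size
  a.hashes.zip(b.hashes).count { |pair| pair[0] == pair[1] }.to_f / a.hashes.size
end

# Score every other post against the target, highest score first.
def related(target : Sig, all : Array(Sig), limit : Int32 = 5) : Array(Related)
  all.reject { |s| s.link == target.link }
    .map { |s| Related.new(s.link, similarity(target, s)) }
    .sort_by { |r| -r.score }
    .first(limit)
end
```

In this sketch the target post is excluded from its own results by comparing links, which assumes links are unique per language.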