🔍

Autocomplete

Search Algorithm

A context-aware autocomplete engine using n-gram language modeling. Combines a character trie for prefix matching with a word-level n-gram tree for context-aware predictions, intersecting results to surface the most relevant completions based on both the current word fragment and preceding context.

Interactive Demo

Try typing words like "the", "bee", "love", or "moon" - trained on Bee Movie script and A Midsummer Night's Dream.

How It Works

Dual Tree Architecture

Uses both a character trie for prefix matching and an n-gram word tree for context-aware predictions.

N-Gram Context

The word tree captures 4-gram sequences, allowing predictions based on preceding words in a phrase.

Set Intersection

Combines results from both trees using set intersection and probability multiplication for optimal relevance.

Probability Ranking

Tracks word frequencies during training to rank suggestions by likelihood of occurrence.

Code Preview

autocomplete.js

// Find completions using both trees
findCompletions(input) {
    const words = input.split(/\s+/);
    const lastWord = words[words.length - 1];

    // Character-based completions (prefix match)
    const charCompletions = this.charTree.search(
        lastWord.split(''));

    // Word-based completions (context-aware)
    const context = words.slice(0, -1);
    const wordCompletions = this.wordTree.search(context);

    // Intersect and combine probabilities
    const intersection = this.intersect(
        charCompletions, wordCompletions);

    return intersection.sort((a, b) =>
        b.probability - a.probability);
}

Technologies

JavaScript Node.js Trie N-Grams NLP

Links

View on GitHub