Need help designing an algorithm for similarities between words

Jamin2112 · Feb 25, 2014

For fun, I'm trying to make a C++ program that takes a word from the user and comes up with an ordered list of suggested words, similar to the kind of thing you have on your cell phone when sending SMS messages.

So far I have:

- An std::vector<std::string> of the 5000 most common English words
- An std::map<std::pair<char,char>,int> of the pairwise distances of keys on a standard computer keyboard. For instance, (A,A) → 0, (A,Q) → 1, (W,C) → 3, (Z,M) → 6.

The idea is that when the user types in a word, it is checked against every one of the 5000 most common English words using some algorithm that uses my map. That map is supposed to help detect if a user hit a wrong key or two when typing in a word. Any ideas?
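For reference, here is one way the key-distance map described above could be built. This is only a sketch under an assumption: the distance is taken to be the row difference plus the column difference on an unstaggered QWERTY letter grid, which happens to reproduce the four example values quoted above. The names KeyDistances and buildKeyDistances are made up for illustration.

```cpp
#include <cstdlib>
#include <map>
#include <string>
#include <utility>
#include <vector>

using KeyDistances = std::map<std::pair<char, char>, int>;

// Build pairwise distances between the 26 letter keys.
// Assumption: keys sit on an unstaggered grid and the distance is
// |row difference| + |column difference| (Manhattan distance).
KeyDistances buildKeyDistances() {
    const std::vector<std::string> rows = {"QWERTYUIOP", "ASDFGHJKL", "ZXCVBNM"};

    // Record the (row, column) position of every letter key.
    std::map<char, std::pair<int, int>> position;
    for (int r = 0; r < static_cast<int>(rows.size()); ++r) {
        for (int c = 0; c < static_cast<int>(rows[r].size()); ++c) {
            position[rows[r][c]] = std::make_pair(r, c);
        }
    }

    // Fill in the distance for every ordered pair of keys.
    KeyDistances distances;
    for (const auto& a : position) {
        for (const auto& b : position) {
            const int rowGap = std::abs(a.second.first - b.second.first);
            const int colGap = std::abs(a.second.second - b.second.second);
            distances[std::make_pair(a.first, b.first)] = rowGap + colGap;
        }
    }
    return distances;
}
```

The map is keyed by uppercase letters, so typed input would need to be uppercased before a lookup.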

mfb · Feb 25, 2014

What do you do with missing or additional letters?

The Levenshtein distance could be interesting; you would just have to apply weights to the character substitutions in some way.
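As a rough sketch of what a weighted Levenshtein distance might look like: the standard dynamic-programming table, with the usual unit substitution cost replaced by a caller-supplied cost (for example, a keyboard distance). The function name weightedEditDistance and the insertDeleteCost parameter are assumptions for illustration; how to price a missing or extra letter relative to a wrong key is a tuning choice the thread leaves open.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Weighted edit distance between what was typed and a candidate word.
// substitutionCost(a, b) should return 0 when a == b.
// insertDeleteCost is a hypothetical tuning parameter: the price of one
// missing or one additional letter.
int weightedEditDistance(const std::string& typed,
                         const std::string& candidate,
                         const std::function<int(char, char)>& substitutionCost,
                         int insertDeleteCost) {
    const std::size_t n = typed.size();
    const std::size_t m = candidate.size();

    // d[i][j] = cheapest way to turn typed[0..i) into candidate[0..j).
    std::vector<std::vector<int>> d(n + 1, std::vector<int>(m + 1, 0));
    for (std::size_t i = 1; i <= n; ++i) d[i][0] = static_cast<int>(i) * insertDeleteCost;
    for (std::size_t j = 1; j <= m; ++j) d[0][j] = static_cast<int>(j) * insertDeleteCost;

    for (std::size_t i = 1; i <= n; ++i) {
        for (std::size_t j = 1; j <= m; ++j) {
            const int dropTyped = d[i - 1][j] + insertDeleteCost;   // extra letter typed
            const int addMissing = d[i][j - 1] + insertDeleteCost;  // letter left out
            const int substitute =
                d[i - 1][j - 1] + substitutionCost(typed[i - 1], candidate[j - 1]);
            d[i][j] = std::min({dropTyped, addMissing, substitute});
        }
    }
    return d[n][m];
}
```

With a substitution cost of 1 for any mismatch and insertDeleteCost of 1, this reduces to the plain Levenshtein distance. Passing a lookup into the keyboard-distance map as substitutionCost makes near-miss keys cheaper than far-apart ones.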

Jamin2112 · Feb 25, 2014

mfb said:

What do you do with missing or additional letters?

That's my problem -- I don't know!

An easy distance formula for comparing words of the same length is to sum up the distances between letters at corresponding indices. For instance, ("rhe","the")→1+0+0=1, so "the" is going to be one of the top suggestions. But yeah, I'm having trouble thinking of a general formula.
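The same-length formula above could be sketched roughly like this. The function name sameLengthDistance and the convention of returning -1 when the words differ in length (or a key pair is missing from the map) are assumptions for illustration, since the formula does not say what to do in those cases.

```cpp
#include <cctype>
#include <cstddef>
#include <map>
#include <string>
#include <utility>

// Sum of key distances between letters at corresponding indices.
// Returns -1 (a hypothetical "not comparable" marker) when the words have
// different lengths or a letter pair is not present in the map.
int sameLengthDistance(const std::string& typed,
                       const std::string& candidate,
                       const std::map<std::pair<char, char>, int>& keyDistances) {
    if (typed.size() != candidate.size()) return -1;

    int total = 0;
    for (std::size_t i = 0; i < typed.size(); ++i) {
        // Assumption: the map is keyed by uppercase letters, as in (A,Q) → 1.
        const char a = static_cast<char>(std::toupper(static_cast<unsigned char>(typed[i])));
        const char b = static_cast<char>(std::toupper(static_cast<unsigned char>(candidate[i])));

        const auto it = keyDistances.find(std::make_pair(a, b));
        if (it == keyDistances.end()) return -1;
        total += it->second;
    }
    return total;
}
```

Sorting the 5000 words by this value would put "the" near the top for the input "rhe", but every word of a different length is left unranked, which is exactly the gap a general formula has to cover.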
