Detectors
Detectors are used to detect spelling errors (non-word and real-word). Different corrections require different detectors. For example, the detector for non-word correction checks whether a token is a non-word error (i.e., a word not in the dictionary), while the detector for real-word correction checks whether a token is a real-word error (i.e., a valid word that is not the intended word). This page uses the detector for non-word spelling (1-to-1) correction to illustrate the concept of a detector. Please refer to each process for the details of the different types of detectors.
The non-word 1-to-1 detector checks whether a token is a spelling error. A token is valid (does not need to be corrected) if it is known to the (checking) dictionary or is a spelling error exception. The details are described as follows:
I. Dictionary
II. Algorithm
III. Exception Examples
| Input | Notes |
|---|---|
| year-long | spelling variant |
| dont's | possessive |
| 123 | digit |
| 123.456 | digit |
| _ | punctuation |
| 12-35-00 | digit and punctuation |
| 12.35.00 | digit and punctuation |
| clinicaltrials.gov | url |
| http://www.yahoo.com?test=1%20try%20abc | url |
| 123@gmail.com | email |
| -0.25mm | measurement |
| 30mg/50kg | measurement |
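
The validity check described above (a token is valid if it is in the checking dictionary or is a spelling error exception) can be sketched roughly as follows. This is only an illustrative sketch, not the actual detector: the function names, the toy dictionary, the possessive handling, and the regular expressions are all assumptions made for this example, loosely modeled on the exception table.

```python
import re

# Hypothetical exception patterns, loosely modeled on the table above.
# They are illustrative assumptions, not the rules used by the real detector.
EXCEPTION_PATTERNS = [
    # digits and/or punctuation only: 123, 123.456, _, 12-35-00, 12.35.00
    re.compile(r"(?:\d|[^\w\s]|_)+"),
    # URL with a small, assumed set of top-level domains
    re.compile(r"(?:https?://)?[\w-]+(?:\.[\w-]+)*\.(?:com|gov|org|edu|net)(?:[/?#]\S*)?"),
    # email address
    re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    # measurement: a number followed by a unit, optionally "/number unit"
    re.compile(r"[+-]?\d+(?:\.\d+)?[A-Za-z]+(?:/\d+(?:\.\d+)?[A-Za-z]+)?"),
]


def is_exception(token: str) -> bool:
    """Return True if the whole token matches one of the exception patterns."""
    return any(p.fullmatch(token) for p in EXCEPTION_PATTERNS)


def is_valid_token(token: str, dictionary: set) -> bool:
    """A token is valid if the dictionary knows it or it is an exception."""
    word = token.lower()
    if word in dictionary:
        return True
    # Assumed possessive handling: strip a trailing 's and look up the base.
    if word.endswith("'s") and word[:-2] in dictionary:
        return True
    return is_exception(token)


def is_nonword_error(token: str, dictionary: set) -> bool:
    """Non-word 1-to-1 detection: flag tokens that are not valid."""
    return not is_valid_token(token, dictionary)


# Toy checking dictionary (an assumption for this example only).
TOY_DICTIONARY = {"year-long", "doctor", "diabetes"}

for tok in ["year-long", "doctor's", "12-35-00", "30mg/50kg", "diabetis"]:
    print(tok, is_nonword_error(tok, TOY_DICTIONARY))
```

In this sketch, only `diabetis` is flagged as a non-word error; the other tokens are treated as valid because they are either in the toy dictionary or match an exception pattern. Using `fullmatch` rather than `search` is a deliberate choice, so that a token counts as an exception only when the entire token fits the pattern.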