Performance Tests on Phonetic Similarity Score
I. Test Setup
- Data: Training Set
- Gold Standard: non-word only
- Dictionary: CSpell (Lexicon-based)
- Corpus: none
- Ranking: Orthographic ranking
II. Test Results
- Tests on various phonetic coding systems within the orthographic similarity score ranking.
| ID | Phonetic | Precision | Recall | F1 |
|---|---|---|---|---|
| 11 | Double Metaphone | 0.7490 | 0.7519 | 0.7505 |
| 12 | Refined Soundex | 0.7332 | 0.7370 | 0.7351 |
| 13 | Caverphone-2 | 0.7172 | 0.7209 | 0.7191 |
| 14 | Metaphone | 0.7487 | 0.7506 | 0.7497 |
| 15 | Metaphone-3 | 0.7452 | 0.7481 | 0.7466 |
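The F1 column is the harmonic mean of precision and recall. The sketch below recomputes it from the precision and recall reported for tests 11-15; the function name `f1_score` and the dictionary layout are illustrative only, and because the reported precision and recall are already rounded, the recomputed value can differ from the reported F1 in the fourth decimal place.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the table for tests 11-15.
reported = {
    11: (0.7490, 0.7519, 0.7505),
    12: (0.7332, 0.7370, 0.7351),
    13: (0.7172, 0.7209, 0.7191),
    14: (0.7487, 0.7506, 0.7497),
    15: (0.7452, 0.7481, 0.7466),
}

for test_id, (precision, recall, f1) in reported.items():
    recomputed = f1_score(precision, recall)
    print(f"test {test_id}: recomputed {recomputed:.4f}, reported {f1:.4f}")
```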
- Tests on various weighting factors (WF) for the edit distance costs (delete, insert, substitute, and transpose), with Double Metaphone as the phonetic system in the orthographic similarity score.
| ID | Delete | Insert | Substitute | Transpose | Precision | Recall | F1 | Notes |
|---|---|---|---|---|---|---|---|---|
| 1 | 0.95 | 0.95 | 0.95 | 0.95 | 0.7490 | 0.7519 | 0.7505 | Same WF for all costs |
| 2 | 1.00 | 0.95 | 0.95 | 0.95 | 0.7349 | 0.7377 | 0.7363 | Increasing one WF |
| 3 | 0.95 | 1.00 | 0.95 | 0.95 | 0.7275 | 0.7313 | 0.7294 | |
| 4 | 0.95 | 0.95 | 1.00 | 0.95 | 0.7413 | 0.7442 | 0.7427 | |
| 5 | 0.95 | 0.95 | 0.95 | 1.00 | 0.7490 | 0.7519 | 0.7505 | |
| 6 | 0.90 | 0.95 | 0.95 | 0.95 | 0.7275 | 0.7313 | 0.7294 | Decreasing one WF |
| 7 | 0.95 | 0.90 | 0.95 | 0.95 | 0.7439 | 0.7468 | 0.7453 | |
| 8 | 0.95 | 0.95 | 0.90 | 0.95 | 0.7172 | 0.7209 | 0.7191 | |
| 9 | 0.95 | 0.95 | 0.95 | 0.90 | 0.7439 | 0.7468 | 0.7453 | |
| 10 | 0.95 | 0.90 | 0.95 | 1.00 | 0.7439 | 0.7468 | 0.7453 | Trial and error to find the WF of cost and phonetic |
| 99-1 | 0.95 | 0.95 | 0.95 | 0.90 | 0.7375 | 0.7403 | 0.7389 | |
- Tests on various weighting factors (WF) on costs of the edit distance (delete, insert, substitute, and transpose). The WF for orthographic is 1.0, 1.0, 1.0.
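To make the role of the four weighting factors concrete, the following is a minimal sketch of an edit distance with a separate cost for each operation (the optimal string alignment variant of Damerau-Levenshtein). It is not CSpell's implementation: the function name `weighted_edit_distance` is hypothetical, and treating each WF directly as the cost of one operation is an assumption. How the resulting distance is combined with the phonetic code into the orthographic similarity score is not shown here.

```python
def weighted_edit_distance(
    source: str,
    target: str,
    delete: float = 0.95,
    insert: float = 0.95,
    substitute: float = 0.95,
    transpose: float = 0.95,
) -> float:
    """Cheapest sequence of weighted edits that turns source into target.

    Assumption: each weighting factor is used as the cost of one operation.
    Transposition swaps two adjacent characters (optimal string alignment).
    """
    rows, cols = len(source) + 1, len(target) + 1
    dist = [[0.0] * cols for _ in range(rows)]

    # Turning a prefix into the empty string needs only deletions,
    # and building a prefix from the empty string needs only insertions.
    for i in range(1, rows):
        dist[i][0] = i * delete
    for j in range(1, cols):
        dist[0][j] = j * insert

    for i in range(1, rows):
        for j in range(1, cols):
            same = source[i - 1] == target[j - 1]
            best = min(
                dist[i - 1][j] + delete,
                dist[i][j - 1] + insert,
                dist[i - 1][j - 1] + (0.0 if same else substitute),
            )
            # Adjacent characters swapped, e.g. "ab" -> "ba".
            if (
                i > 1
                and j > 1
                and source[i - 1] == target[j - 2]
                and source[i - 2] == target[j - 1]
            ):
                best = min(best, dist[i - 2][j - 2] + transpose)
            dist[i][j] = best

    return dist[-1][-1]


# Weights of test 10: cheaper insert, more expensive transpose.
print(weighted_edit_distance("ab", "ba", insert=0.90, transpose=1.00))
```

With a per-operation cost, raising one WF makes candidates that need that operation rank lower relative to candidates reachable by the other operations, which is the effect tests 2-9 probe one operation at a time.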
III. Discussion
- From the results of tests 11-15, we chose Double Metaphone, which had the highest F1 (0.7505), as the phonetic system in the orthographic similarity score.
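- From the results of tests 2-5, we observed that raising the transpose WF gave the best F1 among the increases (0.7505, equal to the baseline in test 1); raising the delete, insert, or substitute WF lowered F1.
- From the results of tests 6-9, we observed that lowering the insert WF gave the best F1 among the decreases (0.7453, tied with lowering the transpose WF), although every decrease scored below the baseline.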
- Tests 10 through 99-1 look for the best F1 by trial and error, that is, by lowering the cost of insert and raising the cost of transpose.
- Use test 13 for the weighting factors for costs of delete, insert, substitute and transpose.
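The comparisons in this discussion can be reproduced from the table by subtracting the baseline F1 of test 1 from each single-WF variation. The sketch below does that for tests 2-9; the variable and function names are illustrative, and the values are copied from the results table.

```python
BASELINE_F1 = 0.7505  # test 1: all four WF set to 0.95

# test ID -> (changed cost, new WF, reported F1), copied from tests 2-9.
single_wf_tests = {
    2: ("delete", 1.00, 0.7363),
    3: ("insert", 1.00, 0.7294),
    4: ("substitute", 1.00, 0.7427),
    5: ("transpose", 1.00, 0.7505),
    6: ("delete", 0.90, 0.7294),
    7: ("insert", 0.90, 0.7453),
    8: ("substitute", 0.90, 0.7191),
    9: ("transpose", 0.90, 0.7453),
}


def f1_deltas(tests: dict, baseline: float) -> dict:
    """Return test ID -> F1 change relative to the baseline, rounded to 4 places."""
    return {tid: round(f1 - baseline, 4) for tid, (_, _, f1) in tests.items()}


deltas = f1_deltas(single_wf_tests, BASELINE_F1)
for tid, (cost, wf, _) in single_wf_tests.items():
    print(f"test {tid}: {cost} WF = {wf:.2f}, F1 change = {deltas[tid]:+.4f}")
```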