Performance Tests - Ensemble on Training Set
I. Introduction
These tests compare the performance of the different ranking methods used by the Ensemble spelling correction (original code) on the training set.
II. Setup
Program: ${C_SPELL}/SpellCorrection/bin/runSpellingAllData
Options:
  0 (all data)
  3, 4 (non-word, real-word)
  0, 1, 2, 3, 4 (methods)
Input: ${C_SPELL}/SpellCorrection/CHQA_SpellCorrection_Dataset/AllData/
Output:
  ${C_SPELL}/SpellCorrection/CHQA_SpellCorrection_Dataset/ResultAllData/LinearWeighted_nw_OUT_*
  ${C_SPELL}/SpellCorrection/CHQA_SpellCorrection_Dataset/ResultAllData/LinearWeighted_rw_OUT_*
Backup on:
III. Performance Results
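| Methods | Original GoldStd TP | Original GoldStd Ret | Original GoldStd Rel | Original GoldStd Precision | Original GoldStd Recall | Original GoldStd F1 | Revised GoldStd TP | Revised GoldStd Ret | Revised GoldStd Rel | Revised GoldStd Precision | Revised GoldStd Recall | Revised GoldStd F1 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0. PreProcess | 289 | 347 | 814 | 0.8329 | 0.3550 | 0.4978 | 289 | 347 | 774 | 0.8329 | 0.3734 | 0.5156 |
| 1. Orthographic | 495 | 824 | 814 | 0.6007 | 0.6081 | 0.6044 | 511 | 824 | 774 | 0.6201 | 0.6602 | 0.6395 |
| 2. Corpus Frequency | 361 | 810 | 814 | 0.4457 | 0.4435 | 0.4446 | 366 | 810 | 774 | 0.4519 | 0.4729 | 0.4621 |
| 3. Word Embedding | 350 | 807 | 814 | 0.4337 | 0.4300 | 0.4318 | 358 | 807 | 774 | 0.4436 | 0.4625 | 0.4529 |
| 4. Ensemble | 530 | 825 | 814 | 0.6424 | 0.6511 | 0.6467 | 552 | 825 | 774 | 0.6691 | 0.7132 | 0.6904 |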
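
| Methods | Original GoldStd TP | Original GoldStd Ret | Original GoldStd Rel | Original GoldStd Precision | Original GoldStd Recall | Original GoldStd F1 | Revised GoldStd TP | Revised GoldStd Ret | Revised GoldStd Rel | Revised GoldStd Precision | Revised GoldStd Recall | Revised GoldStd F1 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Ensemble (non-word) | 531 | 825 | 926 | 0.6436 | 0.5734 | 0.6065 | 556 | 825 | 964 | 0.6739 | 0.5768 | 0.6216 |
| Ensemble (real-word) | 498 | 718 | 926 | 0.6936 | 0.5378 | 0.6058 | 517 | 718 | 964 | 0.7201 | 0.5363 | 0.6147 |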
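
The reported figures are consistent with the usual definitions Precision = TP/Ret, Recall = TP/Rel, and F1 = 2PR/(P+R), reading TP as true positives, Ret as the number of corrections returned, and Rel as the number of corrections in the gold standard. The following is a minimal sketch for re-deriving a row from its counts. The function name `prf` is hypothetical and not part of the original code.

```python
def prf(tp, ret, rel):
    """Return (precision, recall, F1) from raw counts.

    tp  -- true positives (correct corrections returned)
    ret -- retrieved (all corrections returned)
    rel -- relevant (all corrections in the gold standard)
    Hypothetical helper for illustration only.
    """
    precision = tp / ret if ret else 0.0
    recall = tp / rel if rel else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Row "4. Ensemble", Original GoldStd: TP=530, Ret=825, Rel=814
p, r, f = prf(530, 825, 814)
print(f"{p:.4f} {r:.4f} {f:.4f}")  # 0.6424 0.6511 0.6467
```

The zero checks only guard against division by zero for an empty result or an empty gold standard; they do not affect any row reported here.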