Dictionary Analysis
I. Introduction
Four dictionaries (Jazzy, Ensemble, Medline, and Lexicon) are compared. Here is the summary:
| | Jazzy | Ensemble | Medline | Lexicon |
|---|---|---|---|---|
| Size | 159,345 | 459,038 | 496,387 | 558,353 |
| Files | 1 + 10 | 1 + 10 | 1 | 1 |
| Preserved Case | No (LC) | No (LC) | No (LC) | Yes |
| Verified | No | No | No | Yes |
| General English | Yes | Yes | Yes | Yes |
| Biomedical | No | Yes | Yes | Yes |
| Coded | No | No | No | Yes (extra information is available) |
II. Analysis and Tests
Analysis and performance tests were conducted on various dictionaries to guide better dictionary generation. Please see the following URL for details:
From the above results, we observe:
III. Overlap and Contain Check
Target (Tar): Lexicon (lexicon.ewLc.dic, 534,330)
| Source (Src) | Src+Tar | Src-Tar | Tar-Src |
|---|---|---|---|
| Ensemble (medical.dic, 299,670) | 71,212 | 228,458 | 463,118 |
| Medline (medline.dic, 496,387) | 212,961 | 283,426 | 321,369 |
| Jazzy (eng_com.dic, 150,843) | 104,853 | 45,990 | 429,477 |
| Jazzy (spVar10File.dic, 8,502) | 6,198 | 2,304 | 528,132 |
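
The three counts in the table above can be reproduced with plain set operations. The following is a minimal sketch, not the tool actually used for this analysis. It assumes each dictionary file holds one word per line and that comparison is done on lowercased entries. The names `normalize`, `overlap`, and `load_words` are hypothetical helpers introduced only for illustration.

```python
from __future__ import annotations

from typing import Iterable, NamedTuple


class Overlap(NamedTuple):
    both: int      # Src+Tar: words found in both dictionaries
    src_only: int  # Src-Tar: words found only in the source
    tar_only: int  # Tar-Src: words found only in the target


def normalize(words: Iterable[str]) -> set[str]:
    """Strip and lowercase each entry, dropping blank lines (assumed convention)."""
    return {w.strip().lower() for w in words if w.strip()}


def overlap(src_words: Iterable[str], tar_words: Iterable[str]) -> Overlap:
    """Count shared and exclusive words between a source and a target word list."""
    src, tar = normalize(src_words), normalize(tar_words)
    return Overlap(len(src & tar), len(src - tar), len(tar - src))


def load_words(path: str) -> list[str]:
    """Read a dictionary file, assuming one word per line in UTF-8."""
    with open(path, encoding="utf-8") as f:
        return f.read().splitlines()


# Toy data standing in for real dictionary files.
result = overlap(["Aspirin", "cat", "dog"], ["aspirin", "cat", "zebra", "yak"])
print(result)
```

With real files, the call would look like `overlap(load_words("medical.dic"), load_words("lexicon.ewLc.dic"))`. A quick sanity check on any row is that Src+Tar plus Src-Tar equals the source size, and Src+Tar plus Tar-Src equals the target size. For the Ensemble row, 71,212 + 228,458 = 299,670 and 71,212 + 463,118 = 534,330.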