ASCII LEXICON: Reports and Review
I. Log Files
Three log files are generated while every LEXICON line that contains non-ASCII characters is converted to ASCII. These log files are the raw sources used to generate the report files in the next steps. They are:
| Log File | Description | Tags |
|---|---|---|
| LEXICON.asciiBaseLog | Log for the ASCII conversion of citations and spVars | |
| LEXICON.asciiLineLog | Log for other line-by-line ASCII conversions | |
| LEXICON.asciiLog | Log for the final clean up | Log for duplicated lines |
II. Reports
Reports are generated from the log files as follows:
| Report | Description | Action | 2010 | 2011 | 2012 |
|---|---|---|---|---|---|
| baseChange.rpt | non-ASCII citations converted to ASCII | fgrep "|base|change|from-spVar|" LEXICON.asciiBaseLog | 270 | 368 | 0 |
| baseDeleteNotLex.rpt | non-ASCII citations are deleted (not known to LEXICON) | fgrep "|base|delete|not-Lex|" LEXICON.asciiBaseLog | 29 | 42 | 8 |
| spVarDeleteAsciiDupBase.rpt | ASCII spVars are deleted (duplicated from citation) | fgrep "|base|delete|ascii-dup-base|" LEXICON.asciiBaseLog | 270 | 368 | 0 |
| spVarDeleteDupBase.rpt | non-ASCII spVars are deleted (duplicated from citation) | fgrep "|base|delete|dup-base|" LEXICON.asciiBaseLog | 1248 | 1522 | 2102 |
| spVarDeleteDupSpVar.rpt | non-ASCII spVars are deleted (duplicated from spVars) | fgrep "|base|delete|dup-spVar|" LEXICON.asciiBaseLog | 2437 | 3384 | 3916 |
| spVarDeleteNotLex.rpt | non-ASCII spVars are deleted (not known to LEXICON) | fgrep "|base|delete|not-Lex|" LEXICON.asciiBaseLog | 259 | 363 | 430 |
| baseSpVarNotLex.rpt | non-ASCII citations and spVars are deleted (not known to LEXICON) | | 284 | 401 | 430 |
| asciiLineChange.rpt | non-ASCII lines are changed (known to LEXICON) | fgrep "|change|ascii-base|" LEXICON.asciiLineLog | 24 | 29 | 42 |
| asciiLineDelete.rpt | non-ASCII lines are deleted (not known to LEXICON or not used) | fgrep "|delete|not-Lex|" LEXICON.asciiLineLog | 78 | 93 | 97 |
| abbreviationChange.rpt | non-ASCII abbreviations are changed (known to LEXICON) | fgrep "abbreviation_of=" asciiLineChange.rpt | 3 | 5 | 9 |
| abbreviationDeleteNotLex.rpt | non-ASCII abbreviations are deleted (not known to LEXICON) | fgrep "abbreviation_of=" asciiLineDelete.rpt | 1 | 1 | 3 |
| acronymChange.rpt | non-ASCII acronyms are changed (known to LEXICON) | fgrep "acronym_of=" asciiLineChange.rpt | 20 | 23 | 32 |
| acronymDeleteNotLex.rpt | non-ASCII acronyms are deleted (not known to LEXICON) | fgrep "acronym_of=" asciiLineDelete.rpt | 6 | 11 | 13 |
| nominalizationChange.rpt | non-ASCII nominalizations are changed (known to LEXICON) | fgrep "nominalization=" asciiLineChange.rpt | 1 | 1 | 1 |
| nominalizationDeleteNotLex.rpt | non-ASCII nominalizations are deleted (not known to LEXICON) | fgrep "nominalization=" asciiLineDelete.rpt | 2 | 2 | 0 |
| complDelete.rpt | non-ASCII compl are deleted (not used) | fgrep "compl=" asciiLineDelete.rpt | 2 | 4 | 4 |
| irregDelete.rpt | non-ASCII irreg are deleted (duplicated or not known to LEXICON) | fgrep "variants=irreg|" asciiLineDelete.rpt | 66 | 74 | 76 |
| trademarkDelete.rpt | non-ASCII trademarks are deleted (not used) | fgrep "trademark=" asciiLineDelete.rpt | 1 | 1 | 1 |
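The Action column can be scripted. The following is a minimal sketch, not the actual LEXICON tooling: `make_report`, `count_lines`, and `check_partition` are hypothetical helper names, and `grep -F` is used as the POSIX equivalent of `fgrep`. It also checks one property visible in the table: the six reports derived from asciiLineDelete.rpt add up to its count in each release (for 2010: 1 + 6 + 2 + 2 + 66 + 1 = 78).

```shell
#!/bin/sh
# Hypothetical helpers (assumptions, not part of the LEXICON distribution).

# Print the number of lines in a file, without padding.
count_lines() {
    wc -l < "$1" | tr -d ' '
}

# Filter an input log (or report) into a report file and print its line count.
make_report() {
    pattern=$1   # fixed string, e.g. "|delete|not-Lex|"
    input=$2     # e.g. LEXICON.asciiLineLog
    report=$3    # e.g. asciiLineDelete.rpt

    # -F treats the pattern as a fixed string, so "|" matches literally.
    # grep exits 1 when nothing matches; that is a valid empty report.
    # Any other non-zero status (e.g. unreadable input) is an error.
    grep -F -- "$pattern" "$input" > "$report" || [ "$?" -eq 1 ] || return 2

    count_lines "$report"
}

# Succeed only if the sub-report line counts add up to the parent's count.
check_partition() {
    parent=$1
    shift
    total=0
    for sub in "$@"; do
        total=$((total + $(count_lines "$sub")))
    done
    [ "$total" -eq "$(count_lines "$parent")" ]
}
```

For example, `make_report "|delete|not-Lex|" LEXICON.asciiLineLog asciiLineDelete.rpt` writes the report and prints its count; `check_partition asciiLineDelete.rpt abbreviationDeleteNotLex.rpt acronymDeleteNotLex.rpt nominalizationDeleteNotLex.rpt complDelete.rpt irregDelete.rpt trademarkDelete.rpt` then confirms that no deleted line falls outside the per-field reports.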
III. Review
| Field | Report | Actions |
|---|---|---|
| | summary.rpt | |
| citation | baseDeleteNotLex.out | |
| spVar | spVarDeleteNotLex.out | |
| irreg | irregDelete.out | |
| abbreviation | abbreviationDeleteNotLex.out | |
| acronym | acronymDeleteNotLex.out | |
| nominalization | nominalizationDeleteNotLex.out | |
| compl | complDelete.out | |
| trademark | trademarkDelete.out | |