ABSTRACT
This four-part study presents character frequencies for letters, numerals, and special characters, drawn from large samples of European languages on the Multilingual Corpus 1 compact disc published by the European Corpus Initiative of the Association for Computational Linguistics. Part One has frequency tables for Spanish. Part Two has tables for French, Italian, Portuguese, Latin, and Greek. Part Three will similarly treat English, German, Dutch, Norwegian, and Swedish. Part Four will include frequencies for the remaining European languages in monolingual Corpus folders: Albanian, Bulgarian, Czech, Estonian, Gaelic, Lithuanian, Maltese, Russian, Serbian, and Turkish. Sample sizes, except for Bulgarian, Estonian, and Latin, are a minimum of one million consecutive characters. The main tables for each language include combined and separate counts for upper- and lower-case letters, with and without marks or accents, as well as tables for numerals and "special" characters.
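
To make the table layout concrete, the following is a minimal Python sketch of how such counts might be tallied. It is not the procedure used in the study: the function names (`strip_marks`, `char_frequencies`), the table keys, and the definition of "special" as any character that is neither a letter, a numeral, nor whitespace are all assumptions made for illustration.

```python
import unicodedata
from collections import Counter


def strip_marks(ch: str) -> str:
    """Return the base character with combining marks removed (e.g. 'é' -> 'e')."""
    decomposed = unicodedata.normalize("NFD", ch)
    base = "".join(c for c in decomposed if not unicodedata.combining(c))
    return base or ch


def char_frequencies(text: str) -> dict:
    """Tally letters (three ways), numerals, and other non-space characters."""
    tables = {
        "letters_exact": Counter(),        # case and marks kept separate
        "letters_case_folded": Counter(),  # upper and lower combined, marks kept
        "letters_unmarked": Counter(),     # case and marks both combined
        "numerals": Counter(),
        "special": Counter(),              # assumed: not letter, numeral, or space
    }
    # NFC so that an accented letter is counted as one character, not two.
    for ch in unicodedata.normalize("NFC", text):
        if ch.isalpha():
            tables["letters_exact"][ch] += 1
            tables["letters_case_folded"][ch.lower()] += 1
            tables["letters_unmarked"][strip_marks(ch).lower()] += 1
        elif ch.isdigit():
            tables["numerals"][ch] += 1
        elif not ch.isspace():
            tables["special"][ch] += 1
    return tables


if __name__ == "__main__":
    sample = "Él comió 3 peras, ¿no?"
    for name, counts in char_frequencies(sample).items():
        print(name, counts.most_common())
```

Letters that have no Unicode decomposition, such as "ø" or "ß", are left unchanged by `strip_marks` in this sketch, so a real tabulation would need an explicit policy for them.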
