Re: Corpora: multilingual texts
Ted E. Dunning (firstname.lastname@example.org)
Tue, 2 Dec 1997 12:36:16 -0800
I did some work on language identification and have an evaluation
corpus available for anybody who wants to try their hand. This corpus
was developed by taking random samples from a Spanish/English parallel
corpus.
I include with the test corpus both a technical report (somewhat
outdated) and working code (also somewhat outdated).
You can ftp the 1995 version of the test corpus/paper/code from
If you want the latest description and code, please email me.