Some days ago A. Harley <firstname.lastname@example.org> put forward
a question concerning the unbiased evaluation of taggers
in terms of correctness, speed etc. It seems to me that this is
a serious issue, and because I'm not aware of any systematic studies
in the field, I prepared a text sample and fed it into five
publicly available taggers, namely:
Manual correction of the output is necessary for any evaluation
of a tagger's correctness. Therefore, please feel free to download
the technical report that contains the tagged data:
Leidner, Jochen (1997): Evaluating Taggers for English: Some Evidence.
(Technical Report CLUE-TR-971101)
and the data files at
If anybody would like to participate in the evaluation of the data
and share their results, feel free to mail them to this list or to me.
Note, however, that any correctness figure depends on the tagset used:
n% correctness with tagset A may still be a worse result than (n-1)%
correctness with a more detailed tagset B, which draws finer distinctions.
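A minimal sketch of this tagset effect, in Python. The tags, the
fine-to-coarse mapping and the five-token sample are all invented for
illustration; they are not taken from the report or from any of the
taggers. The same tagger output is scored once against a detailed
tagset and once after collapsing it to a coarse one:

```python
def accuracy(predicted, gold):
    """Fraction of tokens whose predicted tag equals the gold tag."""
    if len(predicted) != len(gold):
        raise ValueError("tag sequences must have the same length")
    if not gold:
        raise ValueError("cannot score an empty sample")
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)


# Hypothetical mapping from a detailed tagset "B" to a coarse tagset "A".
FINE_TO_COARSE = {
    "DT": "D",
    "NN": "N", "NNS": "N",
    "VB": "V", "VBD": "V", "VBZ": "V",
}


def coarsen(tags):
    """Collapse detailed tags to their coarse class."""
    return [FINE_TO_COARSE[t] for t in tags]


# Invented sample: manually corrected tags vs. one tagger's output.
gold_fine = ["DT", "NN", "VBD", "DT", "NNS"]
pred_fine = ["DT", "NN", "VBZ", "DT", "NN"]

fine_acc = accuracy(pred_fine, gold_fine)
coarse_acc = accuracy(coarsen(pred_fine), coarsen(gold_fine))

print(f"detailed tagset: {fine_acc:.0%}")
print(f"coarse tagset:   {coarse_acc:.0%}")
```

Both errors in the invented sample (tense, number) disappear once the
tags are collapsed, so the coarse score comes out higher although the
tagger output is identical. Scores from different tagsets are therefore
only comparable after mapping them onto a common tagset.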
-- Jochen Leidner email@example.com CLUE http://www.linguistik.uni-erlangen.de/~leidner/