Re: Corpora: Clause "splitting"

Pasi Tapanainen
Mon, 19 Oct 1998 20:50:20 +0200

Ruslan Mitkov wrote:
> In the light of Tony Rose's query, may I inform you that I am also
> interested in parser-free clause "splitting" - detecting clause boundaries
> in complex sentences without using (full) parsing.
> Any references / algorithms / implementations
> related to this topic would be appreciated.


Gregory Grefenstette and Pasi Tapanainen: What is a Word, What is a
Sentence? Problems of Tokenization. In Proceedings of the 3rd
International Conference on Computational Lexicography (COMPLEX'94),
pages 79-87. Budapest, 1994.

It discusses some general principles of tokenization and sentence
boundary detection, and their accuracy.