If you are indeed working on texts derived from the BNC, then a fairly
obvious thing to check would be whether the lines are in fact duplicated in
the BNC itself. Go to http://sara.natcorp.ox.ac.uk/lookup.html and type one
of your repeated phrases into the box.
There are (still) a few erroneous text duplications. More interestingly,
there are several cases of genuine repetition-with-variants caused by
different newspapers (or the same newspaper at different times) re-using the
same agency material.
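
For anyone who wants to check their own source texts for this sort of thing before blaming the concordancer, here is a rough sketch in Python. The function names and the normalisation (lowercasing, collapsing whitespace) are my own assumptions, not anything WordSmith or the BNC tools do, and it only catches exact repeats, not the repetition-with-variants cases mentioned above.

```python
from collections import Counter


def normalise(line):
    # Collapse runs of whitespace and lowercase, so trivial layout
    # differences do not hide a repeated line.
    return " ".join(line.split()).lower()


def repeated_lines(lines, min_count=2):
    # Count each non-blank normalised line and keep those seen
    # at least min_count times.
    counts = Counter(normalise(l) for l in lines if l.strip())
    return {text: n for text, n in counts.items() if n >= min_count}


if __name__ == "__main__":
    # Hypothetical sample lines, purely for illustration.
    sample = [
        "The minister said  that talks would resume.",
        "the minister said that talks would resume.",
        "A different line.",
    ]
    print(repeated_lines(sample))
```

If a phrase turns up here more than once, the duplication is in the source text itself and the concordance is simply reporting it faithfully.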
If you're not using the BNC, of course, this is irrelevant, except insofar as
it illustrates the general principle that one should *always* suspect the
From: email@example.com [mailto:firstname.lastname@example.org] On
Behalf Of Anne Harrap
Sent: 17 December 2002 10:52
To: corpora list - messages to list
Subject: [Corpora-List] Wordsmith concordance
Does anyone else get a lot of duplicated entries when doing a
concordance in Wordsmith?
Not sure if this is a bug or we are doing something wrong...
Languages Centre Documentalist
School of Languages
Oxford Brookes University
Tel: +44 865 483723
Fax: +44 865 483791
This archive was generated by hypermail 2b29 : Thu Dec 19 2002 - 10:58:42 MET