Randall Jones wrote:
I have a question that I hesitate to ask because I'm sure the answer is
obvious. I have a tagged German text. I want to run WordList in WordSmith
Tools in a way that the tags will differentiate homographs, e.g. sein
(verb and pronoun), da (adverb and conjunction), etc. I would think that
because the words have different tags they would appear separately in the
list. However, thus far I have only succeeded in ignoring the tags or
having them treated as separate words. In both cases the different uses of
sein etc. are grouped together.
What am I doing wrong?
There should be an obvious solution but there isn't, I'm afraid.
In WordSmith 3.0, a way to solve this problem is to ensure your tags can be
seen as part of the "word". As you will know, the apostrophe is by default,
for English, included in a word as an "acceptable mid-word character" so to
speak. If your text were tagged like this you'd get the results you want:
John'PROPERNOUN is'VERB on'PREP the'DET john'NOUN
You could also set another symbol as an acceptable mid-word character, say %:
John%PROPERNOUN is%VERB on%PREP the%DET john%NOUN
(I haven't tested this but it *should* work. Test on a small text first,
then if OK, you could make a copy of your corpus and use Text Converter to
make the changes.)
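
To try the idea out on a small text before touching the corpus, something like the Python sketch below may help. It is not part of WordSmith and does not reproduce WordList's behaviour exactly; it only imitates the principle of treating ' and % as acceptable mid-word characters. The names retag and word_list, the tag labels, and the assumption that the original tagging looks like sein_VERB are all invented for illustration.

```python
import re
from collections import Counter

# A "word" is a run of letters (umlauts and ß included), optionally joined
# to further runs of letters by an apostrophe or a percent sign.
TOKEN = re.compile(r"[^\W\d_]+(?:['%][^\W\d_]+)*")

# ASSUMPTION: the corpus is tagged as word_TAG with upper-case tags.
UNDERSCORE_TAG = re.compile(r"(?<=[^\W\d_])_(?=[A-Z]+\b)")


def retag(text):
    """Rewrite word_TAG as word%TAG so the tag becomes part of the word."""
    return UNDERSCORE_TAG.sub("%", text)


def normalise(token):
    """Lower-case the word part only, leaving any tag untouched."""
    word, sep, tag = token.partition("%")
    return word.lower() + sep + tag


def word_list(text):
    """Count tokens, keeping word%TAG together as a single entry."""
    return Counter(normalise(m.group(0)) for m in TOKEN.finditer(text))


sample = "Er will hier sein_VERB und sein_PRON Hund will da_ADV sein_VERB"
counts = word_list(retag(sample))

for entry, freq in counts.most_common():
    print(freq, entry)
```

With the tag glued to the word, the verb and pronoun uses of sein come out as separate entries instead of being grouped together.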
In WS4 (emerging blinking into the daylight from a long dark tunnel) I will
think of a neater way than this of working! Am still refining tag treatment
so this query came at a good moment.
Applied English Language Studies Unit
University of Liverpool
Liverpool L69 3BX, UK.