Does anyone have any ideas on how to measure and compare
generalisations in text corpora? Working and conducting research
in the advanced EFL context (university level), I would very much
appreciate a set of simple corpus-analysis measurements to help me
automatically distinguish student texts that ramble from those
that develop ideas in greater depth.
My original plan was to use WordNet to tag my data semantically
and (mainly) to compare statistics on the hypernym/hyponym depth
of the nouns found in the corpora. However, this does not seem to
be working very well, and it is very laborious, since I do not
have access to any semantic tagging software.
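
To make the depth idea concrete, here is a minimal sketch of the
kind of statistic I have in mind. Everything in it is an assumption
made for illustration: the tiny hand-built taxonomy TOY_HYPERNYMS
stands in for WordNet's noun hierarchy (it is not real WordNet data),
the function names are hypothetical, and the nouns are assumed to be
already extracted and sense-disambiguated, which is exactly the step
I cannot yet automate.

```python
from statistics import mean

# Hypothetical toy taxonomy: each noun maps to its direct hypernym
# (None marks the root). A stand-in for WordNet, NOT real WordNet data.
TOY_HYPERNYMS = {
    "entity": None,
    "object": "entity",
    "thing": "entity",
    "animal": "object",
    "dog": "animal",
    "poodle": "dog",
}


def hypernym_depth(noun, hypernyms=TOY_HYPERNYMS):
    """Count hypernym links from `noun` up to the root; None if unknown."""
    if noun not in hypernyms:
        return None
    depth = 0
    parent = hypernyms[noun]
    while parent is not None:
        depth += 1
        parent = hypernyms[parent]
    return depth


def mean_noun_depth(nouns, hypernyms=TOY_HYPERNYMS):
    """Average depth over the nouns found in the taxonomy; None if none are."""
    depths = [hypernym_depth(n, hypernyms) for n in nouns]
    known = [d for d in depths if d is not None]
    return mean(known) if known else None


# Two invented "texts", reduced to their nouns.
general_text = ["thing", "object", "entity"]
specific_text = ["poodle", "dog", "animal"]

print(mean_noun_depth(general_text), mean_noun_depth(specific_text))
```

A lower mean depth would suggest more general vocabulary, although
whether that actually separates rambling texts from in-depth ones is
precisely the open question.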

I would very much welcome any ideas, suggestions or pointers that
could help me develop, refine or improve my approach. Also, if you
happen to know of any semantic tagging systems available to the
public or for research, I'd be grateful for a tip.

Finally, a while ago I posted this query:

> Hello Everyone,
> I have two questions regarding WordNet (whether 1.5 or 1.6), which I
> want to use in my project on EFL learners' lexicon.
> 1. Have any of you used it in (especially) any corpus-related
> research? Can you refer me to any publications, reports etc.?
> 2. Are you aware of any critical appraisals of this resource
> anywhere? Again pointers will be very helpful.

I am still interested in getting information on the above, and so
are quite a few other people who have contacted me about a summary.
So if you have anything to share, please do e-mail me. I will
post an updated summary of all the responses. For those interested
now, here are the two responses so far:

1. Lluís Padró suggested taking a look at the
Acquilex group publications at:

2. Adam Kilgarriff has said
he's unaware of any critical reviews of WN so far. He's done a lot of
digging too, in connection with the WSD evaluation program SENSEVAL.


