Re: Corpora: Chomsky/Harris

From: Stefan Th. Gries
Date: Mon Apr 02 2001 - 11:14:25 MET DST


    I would agree with most of what Michael wrote in his recent posting,
    especially when it comes to the relation between Chomskyan linguistics and
    corpus-based approaches. While I would not consider them the enemy at MIT, I
    nevertheless believe that Chomskyan linguistics and much of what I see as
    corpus linguistics are widely disparate domains that differ on many issues
    defining scientific research agendas:
    - the issue of what counts as data;
    - the issue of how data are analysed;
    - the range of questions that is considered liable to fruitful
    investigation;
    - the range of questions actually being investigated.
    Put differently, I think many, if not most, corpus linguists investigate
    completely different things on the basis of completely different data and
    methods of analysis. I am not sure whether this gap is impossible to bridge
    or not, but I am sure that, for a large number of research questions, each
    side has little to offer the other.
        To give one example, I recently attended a predominantly generative
    conference (admitting me there shows that there need not be any such enmity
    as hinted at in previous postings!) and gave a talk on a corpus-based
    approach to the influence of processing on Preposition Stranding (PS) in
    English, PS also being at least touched upon by other presenters. While I of
    course do not claim that my work can represent corpus linguistics as a
    whole, the discrepancies between the corpus-based approach and the other,
    generative, analyses were obvious (and also mentioned in some comments from
    the audience):
    - my analysis was corpus-based, although acceptability judgements on the
    part of the investigating linguist could perhaps also have provided similar
    answers (which I find highly unlikely, given my multifactorial/statistical
    approach);
    - my analysis was only concerned with performance;
    - my analysis did not focus on any structural representation (not to say,
    tree diagram) of the phenomenon under consideration that was compatible with
    recent/contemporary analyses.
    On the other hand, from a corpus-linguistic point of view at least, it was
    difficult for me to value the generative analyses presented - not because
    they were not sound in the framework presented, but because they asked
    questions completely different from mine and, e.g., exhibited a range of
    (from my point of view) acceptability judgements of sentences that were, in
    the case of two presenters, re-interpreted after every question from the
    audience.
        To cut a long story short, what is it that we have to offer to
    generative theory if we frequently talk about frequencies, probabilistic
    processes etc. - and what does generative theory have to offer to us, given
    the steadily increasing number of empty nodes and functional categories and
    the neglect of the linguistic material that is actually being produced?
