Corpora: text collection difficulties

From: Janice McAlpine
Date: Thu Nov 02 2000 - 20:26:49 MET

    Dear Colleagues,
        I am in charge of the development of the Strathy Corpus
    of Canadian English, at Queen's University in Canada. This corpus now
    contains about 14 million words of published Canadian writing,
    carefully edited to mirror published hard copy. It is supplemented
    by hundreds of millions of words of newspaper writing on CD-ROM. The
    Strathy Corpus has been created to study contemporary Canadian usage.
        The corpus was begun in 1981, at which time computers were
    a novelty and the word "Internet" did not exist. Writers and
    publishers were often honoured and eager to have their work
    consigned to an electronic repository devoted to the study of
    Canadian English. Now writers and publishers are extremely
    wary of giving permission to reproduce their works in electronic
    form. For one thing, they fear piracy. They also feel they should
    be paid. Newspapers and broadcasters now have commercial partners
    which exist specifically to exploit the market for searchable
    versions of news media. Therefore, newspapers are no longer giving us
    last year's CD-ROMs. Also, just this year, Cancopy, Canada's
    centralized copyright release clearinghouse, has announced
    that they will handle requests from universities to place
    authors' texts in electronic reserves and LANs.
         The upshot of all this is that I don't think I can get
    free published texts at the rate at which we need them anymore--
    unless I make a text solicitation campaign my full-time job (and I
    have many other duties!). Am I just losing my touch, or have others
    found that the temper of the times has changed regarding
    text donation?
         All suggestions and comments are welcome.

    Janice McAlpine
    Director, Strathy Language Unit
    Department of English
    Queen's University
    Kingston, On
