Corpora: Please help with Kids Corpus repetition study!

From: James P. Salsman
Date: Fri Jan 12 2001 - 05:05:18 MET

    Recently I have been studying the CMU/LDC Kids Corpus of children's
    oral English reading, in order to help build reading skills evaluation
    systems. I have come to the point where I need a lot of help, and have
    set up a web page to make it easy for anyone to contribute as much or
    as little as they like. It is fun, too, because the page has embedded
    audio of the Pittsburgh children who were recorded for the Corpus,
    repeating part of what they were supposed to say. Please have a look
    and listen:

    Please also try to submit some of the requested characterizations;
    each one takes less than a minute. We need to collect thousands of
    submissions to build a statistical model of the kinds of repetitions
    that occurred, which will help computerized reading skills assessment
    systems tell the difference between harmless self-corrections and bona
    fide mispronunciations. So, please forward this message to any of your
    friends and associates who might be able to contribute, too.

    The resulting data will remain available for everyone at:

    Thank you!

