RE: Corpora: Broadcast corpus

From: Raman Chandrasekar
Date: Mon Jan 17 2000 - 18:36:44 MET


    LDC does have transcribed broadcast news. See
    <> under the heading
    Broadcast text. You'll see the following:


    Broadcast text

    LDC98T31  1996 CSR Hub-4 Language Model
    LDC97T22  1996 English Broadcast News Transcripts (Hub-4)
    LDC98T28  1997 English Broadcast News Transcripts (Hub-4)
    LDC98T24  1997 Mandarin Broadcast News Transcripts (Hub-4NE)
    LDC98T29  1997 Spanish Broadcast News Transcripts (Hub-4NE)
    LDC99T36  USC Marketplace Broadcast News Transcripts

    However, access to these collections may require LDC membership. I'm
    cc'ing LDC on this; hopefully they'll get back to you directly.
       -- Raman Chandrasekar

    -----Original Message-----
    From: Mirjam Sepesy Maucec
    Sent: Sunday, January 16, 2000 10:41 PM
    Subject: Corpora: Broadcast corpus


    My research topic is domain-based adaptation of language models. For my
    work I badly need a text corpus with topic tags.
    The Broadcast corpus seems appropriate. Where can I get it? I can't find
    it in the LDC catalog. I also wrote two e-mails to Primary Source Media
    asking for information, but got no answer.

    Please help!




    Mirjam Sepesy Maucec
    Faculty of Electrical Engineering and Computer Science
    University of Maribor
    Smetanova 17
    2000 MARIBOR
    tel: ++386 (062) 220 7225

