Corpora: EMNLP 2001 call for participation

From: Lillian Lee (llee@CS.Cornell.EDU)
Date: Thu May 03 2001 - 23:25:24 MET DST



    New: Preliminary program.
         Note the May 7 early registration deadline!

    2001 Conference on Empirical Methods in Natural Language Processing

    Sponsored by the Intelligent Information Systems Institute (IISI)
    (a joint Cornell University/Air Force Research Laboratory institute)

      * Early registration: by May 7 (lower registration fee)
      * Late registration: May 8-26
      * On site registration also available

    SIGDAT, the Association for Computational Linguistics' special
    interest group on linguistic data and corpus-based approaches to NLP,
    invites participation at EMNLP 2001 at Carnegie Mellon University,
    Pittsburgh, PA, USA, on June 3 and 4, immediately preceding the meeting
    of the North American Chapter of the ACL (NAACL 2001). We have
    arranged an exciting program devoted to advances in all areas of
    traditional interest to SIGDAT and related fields, as well as to
    this year's theme:
      "What Works and What Doesn't: Successes and Challenges".

    We'll have two days of paper presentations, plus:

    * An invited talk by Eric Brill, Microsoft Research.

    * A panel debating the efficacy of the Expectation-Maximization (EM)
      algorithm. The panel will begin with an introduction to EM.
      Confirmed panelists:
        - Eugene Charniak, Brown University
        - Kevin Knight, ISI
        - Stefan Riezler, Xerox PARC

    * A panel on industrial perspectives on natural language technology.
      Confirmed panelists:
        - Adam Berger, Eizel Technologies
        - Joshua Goodman, Microsoft Research
        - Lynette Hirschman, MITRE



    PRELIMINARY PROGRAM, JUNE 3

    8:45-9:00 Welcome

    9:00-9:25 Limitations of Co-training for Natural Language
                   Learning from Large Datasets
                 David Pierce and Claire Cardie

    9:25-9:50 A Sequential Model for Multi-class Classification
                 Yair Even-Zohar and Dan Roth

    9:50-10:15 Learning Within-Sentence Semantic Coherence
                 Elena Eneva, Rose Hoberman, and Lucian Lita

    10:15-10:45 BREAK

    10:45-11:10 Knowledge Sources for Word-Level Translation Models
                 Philipp Koehn and Kevin Knight

    11:10-11:35 Improving Lexical Mapping Model of English-Korean
                   Bitext Using Structural Features
                 Seonho Kim, Juntae Yoon and Mansuk Song

    11:35-11:45 SHORT BREAK

    11:45-12:45 INVITED TALK
                 Eric Brill

    12:45-2:00 LUNCH

    2:00-2:25 Stacking classifiers for anti-spam filtering of e-mail
                 Georgios Sakkis, Ion Androutsopoulos, Georgios Paliouras,
                   Vangelis Karkaletsis, Constantine D. Spyropoulos, and
                   Panagiotis Stamatopoulos

    2:25-2:50 Feature Space Restructuring for SVMs with Application
                   to Text Categorization
                 Hiroya Takamura and Yuji Matsumoto

    2:50-3:15 Using Bins to Empirically Estimate Term Weights for
                   Text Categorization
                 Carl Sable and Ken Church

    3:15-3:25 SHORT BREAK

    3:25-3:50 Question Answering Using a Large Text Database: A
                   Machine Learning Approach
                 Hwee Tou Ng, Jennifer Lai Pheng Kwan, and Yiyuan Xia

    3:50-4:15 Information Extraction using the Structured Language Model
                 Ciprian Chelba and Milind Mahajan

    4:15-4:30 SHORT BREAK

    PANEL: Efficacy of the Expectation-Maximization (EM) algorithm
                   (includes an introduction to the EM algorithm)
                 Eugene Charniak, Kevin Knight, Stefan Riezler (confirmed
                   so far)


    PRELIMINARY PROGRAM, JUNE 4

    8:35-9:00 Classifying Semantic Relations between Noun Compounds
                   using a Domain-Specific Lexical Hierarchy
                 Barbara Rosario and Marti Hearst

    9:00-9:25 The Unknown Word Problem: A Morphological Analysis of
                   Japanese Using Maximum Entropy Aided by a Dictionary
                 Kiyotaka Uchimoto, Satoshi Sekine, Hitoshi Isahara

    9:25-9:50 Is Knowledge-Free Induction of Multiword Unit Dictionary
                   Headwords a Solved Problem?
                 Patrick Schone and Daniel Jurafsky

    9:50-10:15 Latent Semantic Analysis for Text Segmentation
                 Freddy Y. Y. Choi, Peter Wiemer-Hastings, and Johanna Moore

    10:15-10:45 BREAK

    10:45-11:10 Detecting short passages of similar text in large
                   document collections
                 Caroline Lyon, Bob Dickerson and James Malcolm

    11:10-11:35 Hybrid text mining for finding abbreviations and their
                   definitions
                 Youngja Park and Roy J. Byrd

    11:35-11:45 SHORT BREAK

    PANEL: Industrial perspectives on natural language technology
                 Adam Berger, Joshua Goodman, Lynette Hirschman (confirmed
                   so far)

    12:45-2:00 LUNCH

    2:00-2:25 Automatic Corpus-based Tone Prediction using K-ToBI
                 Jin-seok Lee, Byeongchang Kim and Gary Geunbae Lee

    2:25-2:50 Probabilistic Context-Free Grammars for Syllabification and
                   Grapheme-to-Phoneme Conversion
                 Karin Mueller

    2:50-3:00 SHORT BREAK

    3:00-3:25 Comparing Data-driven Learning Algorithms for PoS Tagging
                   of Swedish
                 Beata Megyesi

    3:25-3:50 Impact of quality and quantity of corpora on stochastic
                   generation
                 Srinivas Bangalore, John Chen, and Owen Rambow

    3:50-4:15 Corpus Variation and Parser Performance
                 Daniel Gildea

    4:15-4:30 REFRESHMENTS (CLOSE)


    Program Chair: Lillian Lee, Cornell University
    Program Co-Chair: Donna Harman, NIST
    Publication Chair: David Yarowsky, Johns Hopkins University

    Program Committee:

    Regina Barzilay, Columbia University
    Thorsten Brants, Xerox PARC
    Chris Brew, Ohio State University
    Eugene Charniak, Brown University
    Key-Sun Choi, KAIST
    Kenneth Church, AT&T Labs - Research
    Stephen Clark, University of Edinburgh
    Michael Collins, AT&T Labs - Research
    Eric Gaussier, Xerox
    Marti Hearst, UC Berkeley
    Don Hindle, AnswerLogic
    Changning Huang, Microsoft
    Rebecca Hwa, University of Maryland
    Hitoshi Iida, Sony
    Paul Jacobs, AnswerLogic
    Christian Jacquemin, LIMSI
    Maghi King, University of Geneva
    Wessel Kraaij, TNO TPD
    Maria Lapata, Saarland University/University of Edinburgh
    Elizabeth Liddy, Syracuse University
    Marc Light, MITRE
    Dekang Lin, University of Alberta
    Kim-Teng Lua, National University of Singapore
    Lluís Màrquez, Technical University of Catalonia
    Diana McCarthy, University of Sussex
    Helen Meng, The Chinese University of Hong Kong
    Paola Merlo, University of Geneva
    Rada Mihalcea, Southern Methodist University
    Guenter Neumann, DFKI
    Jian-Yun Nie, University of Montreal
    Franz Josef Och, RWTH Aachen
    Ted Pedersen, University of Minnesota, Duluth
    Roni Rosenfeld, Carnegie Mellon University
    Anoop Sarkar, University of Pennsylvania
    Erik Tjong Kim Sang, University of Antwerp
    Paola Velardi, University of Rome "La Sapienza"
    Atro Voutilainen, Conexor
    Kiri Wagstaff, Cornell University
    Roman Yangarber, New York University
    Joe Zhou, Intel
