SECOND MULTILINGUAL ENTITY TASK EVALUATION
Evaluation: 2-6 March 1998
Conference: (coincides with MUC-7)
Washington, D.C. area
The Human Language Systems Tipster Text Program of the
Defense Advanced Research Projects Agency
Information Technology Office
The Second Multilingual Entity Task Evaluation (MET-2) is the
outcome of a successful experimental run of MET in the spring of 1996.
MET-1 was an evaluation of systems that marked Named Entities in
Japanese, Chinese, and Spanish newspaper articles and the results were
reported anonymously during one day of the Tipster Phase II 24-month
meeting. Please refer to the Proceedings for further information on
MET-1. MET-2 will be run in conjunction with MUC-7.
The Message Understanding Conferences have provided an ongoing forum
for assessing the state of the art and practice in English text
analysis technology and for exchanging information on innovative
computational techniques in the context of fully implemented systems
that perform realistic tasks in English. The evaluations have
provided researchers and potential sponsors and customers with a
quantitative means to appreciate the strengths and weaknesses of the
technologies, and the results reported at the conferences have
sparked customer interest in the potential utility of the technologies
and their extension to foreign languages.
While the Seventh Message Understanding Conference (MUC-7) will
provide an opportunity for both new and experienced MUC participants
to participate in an evaluation of a range of tasks, MET-2 will focus
on the Named Entity task *only* with future plans for higher level
information extraction. The languages for MET-2 will be Japanese and
Chinese with an additional, experimental track using Thai.
Participation in MET-2 is actively sought from both new and
veteran organizations. With an established test methodology and
multilingual task descriptions, MET-2 offers a good opportunity for
organizations to try out new ideas for handling NLP problems in a
multilingual setting that are of both scientific and practical interest.
The portion of the MUC-7 conference devoted to MET-2 will consist
primarily of presentations and discussions of innovative techniques,
system design, and test results in the multilingual area. There will
also be an opportunity for participants to demo their evaluation
systems. Attendance at the conference is limited to evaluation
participants and to guests invited by the ARPA Tipster Text Program.
MET-2 will be represented in the MUC-7 conference proceedings,
including the reporting of participant test results, with sites
identified for Japanese and Chinese. The Thai results will be
reported anonymously because some participants were involved in
preparing the dataset.
1 July 97: Application deadline for participation
15 July 97: Release of training data and scorer
8 September 97: Release of Dry Run training data and scorer
29 Sept - 3 Oct 97: MET-2 Dry Run (all participants)
6 February 98: Release of formal test training data and scorer
2-6 March 98: MET-2 Formal Run
7-9 April 98: 7th Message Understanding Conference (tentative
dates) with MET-2 session included
DATA AND TASK DESCRIPTION:
The texts to be used for system development and testing are news
articles in Japanese, Chinese, and Thai from various sources listed
below and supplied by arrangement with the Linguistic Data Consortium
(LDC). These articles will be distributed to MET-2 participants.
* Japanese: Kyodo, Nikkei
* Chinese: Xinhua, People's Daily, China Radio Broadcast
* Thai: partially unknown at this time
The Named Entity task (NE) covers named organizations, people,
and locations, date/time expressions, and numeric expressions limited
to monetary amounts and percentages. As output, it requires production
of SGML tags within the supplied texts. The English task definition
for MUC-7 has been updated to coincide with the general MET-2 task
definition.
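The required output format can be sketched briefly. MUC-style NE annotation wraps each entity in an SGML element (ENAMEX for names, TIMEX for date/time expressions, NUMEX for monetary amounts and percentages) with a TYPE attribute. The patterns and sentence below are illustrative only; the released language-specific task definitions are authoritative.

```python
import re

# Toy patterns for illustration only; real NE systems are far richer
# and must follow the MET-2/MUC-7 task definitions.
PATTERNS = [
    (re.compile(r"\$\d+(?:\.\d+)? (?:million|billion)"), "NUMEX", "MONEY"),
    (re.compile(r"\d+(?:\.\d+)?%"), "NUMEX", "PERCENT"),
    (re.compile(r"\bMarch \d{1,2}, \d{4}\b"), "TIMEX", "DATE"),
]

def tag_entities(text):
    """Wrap recognized spans in SGML tags of the form
    <ELEM TYPE="...">span</ELEM>, as the NE task requires."""
    for pattern, elem, etype in PATTERNS:
        text = pattern.sub(
            lambda m, e=elem, t=etype:
                '<%s TYPE="%s">%s</%s>' % (e, t, m.group(0), e),
            text)
    return text

print(tag_entities("Profits rose 12% to $3 million on March 2, 1998."))
# -> Profits rose <NUMEX TYPE="PERCENT">12%</NUMEX> to
#    <NUMEX TYPE="MONEY">$3 million</NUMEX> on
#    <TIMEX TYPE="DATE">March 2, 1998</TIMEX>.
```

The key point is that the tags are inserted in place within the supplied text, leaving the surrounding characters untouched.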
Three online resources are available:
* a World Wide Web site that allows automated testing following
the rules of MET-1;
* an anonymous ftp site containing this call for participation and
the MET-2 participation agreement;
* a password-protected ftp site which will provide MET-2 data,
definitions (general and language-specific), and scoring software
for download at the release times noted above.
The URL of the website is http://muc.saic.com. The anonymous ftp
site is ftp.muc.saic.com.
The MET-1 website is password-protected and you need to obtain
permission from Tom Keenan (firstname.lastname@example.org) to be given a
password by Nancy Chinchor (email@example.com). MET-2 participants
will automatically be given passwords to access the password-protected
ftp site containing MET-2 data and resources. The anonymous ftp site
is available at all times to everyone.
TEST PROTOCOL AND EVALUATION CRITERIA:
MET-2 participants may evaluate in any one of the languages or in
any combination. Participants will have access to shared resources
such as the training texts and annotations, task documentation, and
scoring software.
The test set used for all languages will consist of 100
texts. All MET-2 participants are encouraged to participate in the dry
run and take advantage of material available.
The formal test will be conducted during the first week in March.
It will be carried out by the participants at their own sites in
accordance with a prepared test procedure and the results submitted to
the password-protected ftp site for official scoring by SAIC.
Systems will be evaluated using recall and precision metrics,
F-measure, and error-based metrics. The computation of these metrics
is based on the scoring categories of correct, partial, incorrect,
spurious, missing, and noncommittal. MET-2 participants will be able
to familiarize themselves with the evaluation criteria through usage
of the evaluation software, which will be released along with the
training data.
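The metrics above can be sketched from the scoring categories. Under the usual MUC conventions (a sketch only; the released scorer is authoritative), a partial match earns half credit, "possible" counts everything in the answer key, and "actual" counts everything the system produced:

```python
def muc_scores(correct, partial, incorrect, spurious, missing):
    """Recall, precision, and F-measure from MUC-style scoring
    categories. Partial matches earn half credit, following the
    usual MUC convention; the released scorer is authoritative."""
    # Everything in the answer key (what the system should have found).
    possible = correct + partial + incorrect + missing
    # Everything the system actually produced.
    actual = correct + partial + incorrect + spurious
    credit = correct + 0.5 * partial
    recall = credit / possible if possible else 0.0
    precision = credit / actual if actual else 0.0
    # Balanced F-measure (harmonic mean of precision and recall).
    f = (2 * precision * recall / (precision + recall)
         if precision + recall else 0.0)
    return recall, precision, f

r, p, f = muc_scores(correct=80, partial=10, incorrect=5,
                     spurious=5, missing=5)
print("R=%.3f P=%.3f F=%.3f" % (r, p, f))
# -> R=0.850 P=0.850 F=0.850
```

The error-based metrics mentioned above are computed from the same categories but normalize by the number of errors rather than by possible or actual counts.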
INSTRUCTIONS FOR RESPONDING TO THE CALL FOR PARTICIPATION:
Organizations within and outside the U.S. are invited to respond
to this call for participation. At minimum, participants must have
developed, by the time of the test, a system that can accept texts
without manual preprocessing, process them without human
intervention, and output annotations in the expected format.
Organizations should plan on allocating approximately two
person-months of effort for participation in the evaluation and
conference. It is understood that organizations will vary with
respect to experience with SGML text annotation, resources,
contractual demands/expectations, etc. Recognition of such factors
will be made in any analyses of the results.
Organizations wishing to participate in the evaluation and
conference must respond by July 1, 1997 by submitting a short
statement of interest via email and a signed copy of the MET-2
participation agreement via surface mail.
1. The statement of interest should be submitted via email
to firstname.lastname@example.org and should include the following:
a. Language(s) (choose one or more)
b. Primary point of contact. Please include name, surface
and email addresses, and phone and fax numbers.
2. The participation agreement can be downloaded from the ftp
site. A signed copy should be sent by surface mail to Nancy Chinchor,
Science Applications International Corporation, 10260 Campus
Pt. Dr. M/S A2-F, San Diego, CA 92121, USA.
Questions concerning this call for participation may be sent by email
to Nancy Chinchor (email@example.com) with a copy to Tom Keenan
(firstname.lastname@example.org).
MUC-7 PLANNING COMMITTEE:
Ralph Grishman, New York University, program co-chair
Elaine Marsh, Naval Research Laboratory, program co-chair
Chinatsu Aone, Systems Research and Applications
Lois Childs, Lockheed Martin
Nancy Chinchor, Science Applications International
Jim Cowie, New Mexico State University
Rob Gaizauskas, University of Sheffield
Megumi Kameyama, SRI International
Tom Keenan, U.S. Department of Defense
Boyan Onyshkevych, U.S. Department of Defense
Martha Palmer, University of Pennsylvania
Beth Sundheim, NCCOSC NRaD
Marc Vilain, MITRE
Ralph Weischedel, BBN Systems and Technologies