Re: [Corpora-List] Tag-set conversion

From: Timothy Baldwin (tbaldwin@csli.Stanford.EDU)
Date: Fri Jan 31 2003 - 02:32:42 MET

    > Does anybody know of an existing tool to translate between the BNC C5
    > tag-set and the Penn Tree Bank tag-set?

    Assuming you are running Solaris or Linux, you could use the tools supplied
    with cass, as developed by Marc Light and Steve Abney:

    Their use is documented in the cass manual supplied in the tarball, but for
    the record, you run:

     bncsents BNCFILE | tagfixes -f bnc.fxc

    where BNCFILE is a BNC source file.
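
For anyone who only needs a rough conversion and cannot run the cass tools, a lookup table is one simple way to approximate the mapping. The sketch below is not how tagfixes or bnc.fxc work internally. It is a small, deliberately partial C5-to-Penn table written for illustration. The names C5_TO_PENN, convert_tag and convert_sentence, and the "UNK" fallback marker, are all hypothetical. The mapping is lossy in places (for example, C5 NP0 does not distinguish singular from plural proper nouns, so it is mapped to NNP here).

```python
# Partial, illustrative C5 -> Penn Treebank tag table.
# Assumption: this is a hand-written sketch, not the contents of bnc.fxc.
C5_TO_PENN = {
    "AJ0": "JJ", "AJC": "JJR", "AJS": "JJS",   # adjectives
    "AT0": "DT", "AV0": "RB",                  # article, general adverb
    "CJC": "CC", "CRD": "CD",                  # coordinator, cardinal number
    "NN1": "NN", "NN2": "NNS", "NP0": "NNP",   # nouns (NP0 -> NNP is lossy)
    "PNP": "PRP",                              # personal pronoun
    "PRP": "IN", "PRF": "IN",                  # prepositions (PRF is "of")
    "TO0": "TO",                               # infinitive marker
    "VVB": "VBP", "VVD": "VBD", "VVG": "VBG",  # lexical verbs
    "VVI": "VB", "VVN": "VBN", "VVZ": "VBZ",
}


def convert_tag(c5_tag, unknown="UNK"):
    """Map one C5 tag to a Penn tag; "UNK" is a hypothetical fallback marker."""
    # BNC ambiguity tags look like "VVD-VVN"; keep the first (preferred) tag.
    primary = c5_tag.split("-", 1)[0]
    return C5_TO_PENN.get(primary, unknown)


def convert_sentence(tagged):
    """Convert a list of (word, C5 tag) pairs to (word, Penn tag) pairs."""
    return [(word, convert_tag(tag)) for word, tag in tagged]


if __name__ == "__main__":
    sentence = [("the", "AT0"), ("dogs", "NN2"), ("barked", "VVD-VVN")]
    print(convert_sentence(sentence))
```

Tags missing from the table come back as the fallback marker, which makes the gaps easy to spot and fill in before relying on the output.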

    You could alternatively just retag the BNC using a Penn-style tagger, of
    course, given that the BNC data was for the most part automatically tagged.

