**Previous message:** Ted Pedersen: "Re: Corpora: Negative mutual information?"
**In reply to:** David Campbell: "Corpora: Negative mutual information?"
**Reply:** Leonel Ruiz Miyares: "Corpora: Tesis doctoral"
**Messages sorted by:** [ date ] [ thread ] [ subject ] [ author ]

> I have a question about calculating mutual information for bigrams
> in text. According to every definition I've seen of MI, the values
> are non-negative. However, I've found that for some bigrams made
> of common words in very uncommon bigrams, the value is less than
> zero. Does anyone know how to interpret a negative mutual
> information?

Where have you seen a definition suggesting (pointwise) MI must be

non-negative? The definition is based on a comparison between the

observed co-occurrence probability for the two words (i.e. the joint

probability P(x,y)), compared with the co-occurrence probability one

would expect to see if the two words were independent (i.e. the

product of the marginal probabilities P(x) and P(y)); namely

I(x,y) = log [ P(x,y) / P(x)P(y) ]

If the two words occur together *exactly* as frequently as one would

expect by chance, the ratio inside the log is equal to 1, giving us

I(x,y) = 0; if they occur more frequently than one would expect by

chance, the ratio is greater than 1 so I(x,y) > 0; and conversely if

they occur less frequently than one would expect by chance, the ratio

is less than 1 so I(x,y) < 0.
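The three cases above can be checked numerically. Here is a minimal sketch (not from the original post) that estimates pointwise MI from raw corpus counts; the counts themselves are made up for illustration:

```python
import math

def pmi(count_xy, count_x, count_y, n_bigrams, n_unigrams):
    """Pointwise mutual information from raw corpus counts.

    P(x,y) is estimated from the bigram count, P(x) and P(y) from
    the unigram counts. All counts here are hypothetical.
    """
    p_xy = count_xy / n_bigrams
    p_x = count_x / n_unigrams
    p_y = count_y / n_unigrams
    return math.log2(p_xy / (p_x * p_y))

# Two frequent unigrams that co-occur only once: the ratio inside
# the log is well below 1, so PMI comes out negative.
print(pmi(count_xy=1, count_x=5000, count_y=4000,
          n_bigrams=1_000_000, n_unigrams=1_000_000))
```

With these numbers the observed joint probability (1e-6) is far below the independence expectation (0.005 × 0.004 = 2e-5), so the value is negative, exactly the situation described in the question.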

Nothing in principle or in practice prevents this last case, and the

interpretation is that the two words are for some reason dissociated

rather than associated, e.g. for linguistic reasons. For example,

"he" and "write" are probably both quite frequent unigrams, but the

bigram "he write" is highly unlikely because it violates number

agreement between the subject and the verb. Hence one would predict

I(he,write) < 0.

That said, note that the *average* mutual information between two

random variables X and Y is defined as the relative entropy

D( P(x,y) || P(x)P(y) ) between the joint and the independence

distributions. Like any relative entropy, that value is indeed

guaranteed to be non-negative; e.g. see Cover, T. M. and Thomas,

J. A. (1991), Elements of Information Theory, Wiley, New York. The

term "mutual information" is sometimes used to refer to the

information-theoretic quantity of average mutual information, and

sometimes used to refer to pointwise mutual information, which is a

potential source of confusion.
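The distinction is easy to verify: the average MI is the probability-weighted sum of the pointwise values, and that sum is non-negative even when individual terms are not. A small sketch, using a made-up joint distribution over pronouns and verb forms:

```python
import math

def average_mi(joint):
    """Average mutual information I(X;Y) = D( P(x,y) || P(x)P(y) ).

    `joint` maps (x, y) pairs to probabilities summing to 1.
    Individual pointwise terms may be negative, but the weighted
    sum is guaranteed >= 0, like any relative entropy.
    """
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0.0) + p
        py[y] = py.get(y, 0.0) + p
    return sum(p * math.log2(p / (px[x] * py[y]))
               for (x, y), p in joint.items() if p > 0)

# Hypothetical toy distribution: "he write" and "they writes" are
# dissociated (negative pointwise MI), yet the average is positive.
joint = {("he", "writes"): 0.30, ("he", "write"): 0.05,
         ("they", "writes"): 0.05, ("they", "write"): 0.60}
print(average_mi(joint))
```

Here the terms for ("he", "write") and ("they", "writes") are negative, but the agreeing pairs contribute enough positive mass that the total stays above zero, as the Cover and Thomas result guarantees.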

Philip

----------------------------------------------------------------

Philip Resnik, Assistant Professor

Department of Linguistics and Institute for Advanced Computer Studies

1401 Marie Mount Hall UMIACS phone: (301) 405-6760

University of Maryland Linguistics phone: (301) 405-8903

College Park, MD 20742 USA Fax: (301) 405-7104

http://umiacs.umd.edu/~resnik E-mail: resnik@umiacs.umd.edu

**Next message:** larry moss: "Corpora: FG/MOL Second CFP"

This archive was generated by hypermail 2b29 : Fri Mar 09 2001 - 01:18:16 MET