>I'm considering submitting a grant to put a 2,800,000 word corpus of
>historical Portuguese texts on the web ...
>My question concerns getting copyrights for the texts.

What you propose would seem to be a clear case of fair use for
scholarship or research purposes, for which prior permission cannot
be required. Two sites devoted to this issue are:
(in particular, )
The following is taken from the Stanford University Libraries site
(dated 1998):

Copyright and Fair Use, Stanford University Libraries

I. Fair Use for Teaching and Research

The "fair use" doctrine allows limited reproduction of copyrighted works
for educational and research purposes. The relevant portion of the copyright
statute provides that the "fair use" of a copyrighted work, including
reproduction "for purposes such as criticism, news reporting, teaching
(including multiple copies for classroom use), scholarship, or research"
is not an infringement of copyright. The law lists the following factors as
ones to be evaluated in determining whether a particular use of a
copyrighted work is a permitted "fair use," rather than an infringement
of the copyright:

(1) the purpose and character of the use, including whether such use
    is of a commercial nature or is for nonprofit educational purposes;

(2) the nature of the copyrighted work;

(3) the amount and substantiality of the portion used in relation to the
    copyrighted work as a whole; and

(4) the effect of the use upon the potential market for or value of the
    copyrighted work.

Although all of these factors will be considered, the last factor is the
most important in determining whether a particular use is "fair."
Where a work is available for purchase or license from the copyright
owner in the medium or format desired, copying of all or a significant
portion of the work in lieu of purchasing or licensing a sufficient number
of "authorized" copies would be presumptively unfair. Where only a
small portion of a work is to be copied and the work would not be used
if purchase or licensing of a sufficient number of authorized copies
were required, the intended use is more likely to be found to be fair.

