My question concerns obtaining copyright permission for the texts. A
person at the granting agency indicated that it would probably be
necessary to get written permission from the publishers of each of the 122
texts in the corpus. I wonder whether this is really necessary or
feasible, however, given a number of factors, e.g.:
1) Many of the publishers are very small overseas operations, and many of
them have gone out of business in the 30-40 years since a given text was
published, or have merged with another company.
2) Most of the 122 blocks of text are just short selections from the entire
work (perhaps 30-40 pages from a 200-300 page book).
3) Most importantly, users of the corpus would never be able to access even
one entire page of the work. All of the hits would be displayed in
context, with about one or two lines of text both before and after the hit
(see http://mdavies.for.ilstu.edu/corpus/ for an example of this from an
equivalent corpus of historical Spanish texts that I have created). I was
told by someone at another funding agency that since the format and display
of the texts were greatly altered between the original form (the entire book
or page) and my site (isolated hits in context), there were no copyright
issues, and I wouldn't need to obtain permission from the publishers.
Has anyone else run into similar problems, and can anyone suggest what the
proper application of (U.S.) copyright law might be in this case?
Thanks in advance,
Mark Davies, Associate Professor, Spanish Linguistics
Dept. of Foreign Languages, Illinois State University
Normal, IL 61790-4300