Corpora: Digital recording procedures
Fri, 15 May 1998 09:26:32 +0100
I am looking for advice on the best way to proceed with the recording of a
large corpus of isolated units.
The corpus we'll be using is a dictionary comprising a number of fields
identifying each entry (i.e. syntactic status, phonetic transcription,
We plan to use a Pentium II 266 MHz PC with 128 MB RAM, 1 UDMA HD to
boot, 2 UW SCSI 4.3 HDs, a professional microphone and a Fidji
sound card. The recording will most certainly be done at a 44 kHz sampling
rate (though compression levels and storage capacity might later lead us to
convert this to 22 or even 11 kHz).
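To give an idea of how the sampling rate drives the storage requirement, here is a rough sketch of the arithmetic for uncompressed PCM. The figures are assumptions for illustration only, not decisions we have taken: 16-bit mono samples, a hypothetical 50,000 entries, about 1.5 seconds per entry, and the usual exact rates of 44,100, 22,050 and 11,025 Hz behind "44, 22 and 11 kHz".

```python
def pcm_bytes(sample_rate_hz, seconds, bits_per_sample=16, channels=1):
    """Size in bytes of uncompressed PCM audio (file header excluded)."""
    return int(sample_rate_hz * seconds * (bits_per_sample // 8) * channels)


def corpus_megabytes(n_entries, seconds_per_entry, sample_rate_hz):
    """Total corpus size in megabytes (10**6 bytes) at a given sampling rate."""
    total = n_entries * pcm_bytes(sample_rate_hz, seconds_per_entry)
    return total / 1_000_000


# Hypothetical corpus: 50,000 entries of about 1.5 s each.
for rate in (44_100, 22_050, 11_025):
    print(rate, "Hz:", round(corpus_megabytes(50_000, 1.5, rate)), "MB")
```

Halving the sampling rate halves the storage, so the choice between the three rates is essentially a trade-off between disk space and the bandwidth kept in the recordings.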
We have had a close look at Sound Forge, but hesitate to go ahead with it.
What we need is as straightforward a routine as possible, so that a minimal
amount of tailoring remains to be done after the takes.
An ideal routine, as I see it, would look like this:
- from within the database (not yet chosen but, as a working assumption,
something like Access or FileMaker Pro), the recording software is launched
and remains permanently active;
- at his/her leisure, the person reading out the words can activate a macro
command which will move down the list to the next word to be recorded and
start the recording proper; another command will then end the recording;
- to speed up the whole process, the individual sound files should be
automatically identified and not have to be individually given an
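The routine above could be sketched roughly as follows. This is only an illustration under stated assumptions, not a description of any existing package: the class `RecordingSession`, the `capture` callable (standing in for whatever actually records from the sound card between the "start" and "stop" commands and returns raw 16-bit mono PCM bytes) and the six-digit file naming scheme are all hypothetical.

```python
import wave
from pathlib import Path


class RecordingSession:
    """Walks a word list and saves one automatically named WAV file per entry."""

    def __init__(self, entries, out_dir, capture, sample_rate=44_100):
        self.entries = list(entries)    # (entry_id, headword) pairs from the database
        self.out_dir = Path(out_dir)
        self.capture = capture          # hypothetical: blocks until "stop", returns PCM bytes
        self.sample_rate = sample_rate
        self.index = -1                 # no entry selected yet

    def next_entry(self):
        """Move down the list; return the next (entry_id, headword) or None at the end."""
        if self.index + 1 >= len(self.entries):
            return None
        self.index += 1
        return self.entries[self.index]

    def path_for(self, entry_id):
        """Derive the file name from the entry id, so nothing is typed by hand."""
        return self.out_dir / f"{int(entry_id):06d}.wav"

    def record_current(self):
        """Record the currently selected entry and write it as 16-bit mono WAV."""
        if self.index < 0:
            raise RuntimeError("call next_entry() before recording")
        entry_id, _headword = self.entries[self.index]
        pcm = self.capture()
        path = self.path_for(entry_id)
        self.out_dir.mkdir(parents=True, exist_ok=True)
        with wave.open(str(path), "wb") as wav:
            wav.setnchannels(1)
            wav.setsampwidth(2)         # 2 bytes = 16 bits per sample
            wav.setframerate(self.sample_rate)
            wav.writeframes(pcm)
        return path
```

In use, the reader's macro would call `next_entry()` to display the next word and then `record_current()` to take it; tying the file name to the entry id keeps the sound files linked to the database records without any manual naming.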
I will not go into the details of subsequent equalising and denoising,
which come as a matter of course.
Many thanks to all for your suggestions.
Charge de Mission pour les Nouvelles Technologies
(EA 1226 FORELL-AIT) Universite de Poitiers,
95 avenue du Recteur Pineau, 86022, Poitiers, France
Tel. (prof) 05.49.45.32.02
Fax. (prof) 05.49.45.32.07
Tel. (pers.) 05.49.43.59.79