gencgm
Manual Reference Pages - GENCGM (1)
gencgm - generate a cochleogram
CONTENTS
Synopsis
Description
I. Display Defaults
II. Leaky Integration
References
Files
See Also
Bugs
Copyright
Acknowledgements
SYNOPSIS
gencgm [ option=value | -option ] [ filename ]
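As an illustration of the synopsis, the sketch below assembles a gencgm argument list in Python. The helper name build_gencgm_command and the file name input_wave are hypothetical; the option names are taken from this page. Whether values should carry a unit suffix is not stated here, so plain numbers are an assumption.

```python
def build_gencgm_command(options=None, flags=(), filename=None):
    """Assemble an argument list of the form:
    gencgm [ option=value | -option ] [ filename ]
    (hypothetical helper; it only builds the list and does not run anything).
    """
    cmd = ["gencgm"]
    # option=value pairs, in the order given
    for name, value in (options or {}).items():
        cmd.append(f"{name}={value}")
    # bare switches written as -option
    for flag in flags:
        cmd.append(f"-{flag}")
    # optional input file comes last
    if filename is not None:
        cmd.append(filename)
    return cmd


# Example: two stages of lowpass filtering and a 10 ms frame step.
command = build_gencgm_command(
    {"stages_idt": 2, "frstep_epn": 10}, filename="input_wave"
)
```

The resulting list can be passed to subprocess.run on a system where gencgm is installed.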
DESCRIPTION
Gencgm converts the input wave into a simulated neural activity pattern (NAP) and summarises the NAP as a sequence of excitation patterns (EPNs) (Patterson et al. 1992a, 1993a). The operation takes place in three stages: spectral analysis, neural encoding, and temporal integration (Patterson et al. 1995). In the spectral analysis stage, the input wave is converted into an array of filtered waves, one for each channel of a gammatone auditory filterbank. The surface of the array of filtered waves is AIM's representation of basilar membrane motion (BMM) as a function of time. In the neural encoding stage, compression, adaptation and suppression are used to convert each wave from the filterbank into a simulation of the aggregate neural response to that wave. The array of responses is AIM's simulation of the neural activity pattern (NAP) in the auditory nerve at about the level of the cochlear nucleus. Finally, the NAP is converted into a sequence of excitation patterns (EPNs) by calculating the envelope of the NAP and extracting spectral slices from the envelope every frstep_epn ms (Patterson et al. 1994). The envelope is calculated continuously, by lowpass filtering the individual channels of the NAP as they flow from the cochlea simulation.
When the sequence of excitation patterns is presented in spectrographic format, it is referred to as a cochleogram (CGM). The spectrographic format has time on the abscissa (x-axis), filter centre-frequency on the ordinate (y-axis), and activity level as the degree of black in the display. In AIM, the suffix cgm is used to distinguish this spectral representation from the other spectral representations provided by the software (asa: auditory spectral analysis, sgm: auditory spectrogram, and epn: excitation pattern).
The neural activity pattern produced by gencgm is the same as that produced by gennap. The primary differences are in the defaults for the Displays and the fact that the Leaky Integration is used to construct spectral slices from the NAP rather than to simulate loss of phase locking. As a result, this manual entry is restricted to describing the options that differ from those in gennap.
I. DISPLAY DEFAULTS
The default values for three of the display options are reset to produce a spectrographic format rather than a landscape. Specifically, display=greyscale, bottom=0 and top=2500. The number of channels is set to 128 for compatibility with the auditory spectrum modules, genasa and genepn. When using AIM as a preprocessor for speech recognition, the number of channels would typically be reduced to between 24 and 32. Use option downsample if it is necessary to reduce the output to fewer than 24 channels across the speech range.
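The effect of the greyscale defaults can be sketched as a mapping from activity level to a grey level. This is an illustration only: it assumes that bottom and top bound the displayed activity range, that the mapping between them is linear, and that the display has 256 grey levels. None of these details is stated on this page, and the function name to_grey_level is hypothetical.

```python
def to_grey_level(activity, bottom=0.0, top=2500.0, levels=256):
    """Map an activity value to a grey level, 0 (white) to levels - 1 (black).

    Assumed behaviour: values outside [bottom, top] are clipped and the
    range in between is mapped linearly onto the available grey levels.
    """
    clipped = min(max(activity, bottom), top)
    fraction = (clipped - bottom) / (top - bottom)
    return int(round(fraction * (levels - 1)))


# One spectral frame (one activity value per channel) as grey levels.
frame = [0.0, 500.0, 2500.0, 4000.0]
grey_frame = [to_grey_level(value) for value in frame]
```

Under these assumptions, activity at or above top is drawn fully black and activity at or below bottom is left white.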
NOTE: The cochlea simulations impose compression of one form or another on the NAP and the notes on compression in the man pages for gennap apply to gencgm as well.
Transduction
transduction
Neural transduction switch (at, meddis, off).
Switch. Default: at.
II. LEAKY INTEGRATION
stages_idt
Number of stages of lowpass filtering.
Default unit: scalar. Default value: 2.

tup_idt
The time constant for each filter stage.
Default unit: ms. Default value: 8 ms.
The Equivalent Rectangular Duration (ERD) of a two stage lowpass filter is about 1.6 times the time constant of each stage, or 12.8 ms in the current case.
downsample
The time between successive spectral frames.
Default unit: ms. Default value: 10 ms.
Downsample is simply another name for frstep_epn, provided to facilitate a different mode of thinking about time-series data.
frstep_epn
The time between successive spectral frames.
Default unit: ms. Default value: 10 ms.
With a frstep_epn of 10 ms, gencgm will produce spectral frames at a rate of 100 per second.
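The leaky integration and frame extraction controlled by these options can be sketched as follows. This is a minimal illustration, not AIM's implementation: it assumes each stage is a first-order (one-pole) lowpass filter with unity gain at DC, and the function names leaky_integrate and extract_frames are hypothetical. Only the parameter names stages_idt, tup_idt and frstep_epn, and their default values, come from this page.

```python
import math


def leaky_integrate(channel, sample_rate_hz, tup_idt_ms=8.0, stages_idt=2):
    """Smooth one NAP channel with a cascade of one-pole lowpass stages."""
    # Per-sample decay for time constant tup_idt (assumed discretisation).
    sample_period_ms = 1000.0 / sample_rate_hz
    decay = math.exp(-sample_period_ms / tup_idt_ms)
    out = list(channel)
    for _ in range(stages_idt):
        state = 0.0
        smoothed = []
        for sample in out:
            # Leaky integrator: old state leaks away, new input is added.
            state = decay * state + (1.0 - decay) * sample
            smoothed.append(state)
        out = smoothed
    return out


def extract_frames(nap, sample_rate_hz, frstep_epn_ms=10.0,
                   tup_idt_ms=8.0, stages_idt=2):
    """Return spectral slices (one value per channel) every frstep_epn ms.

    nap is a list of channels, each a list of samples of equal length.
    """
    envelopes = [
        leaky_integrate(ch, sample_rate_hz, tup_idt_ms, stages_idt)
        for ch in nap
    ]
    step = max(1, int(round(sample_rate_hz * frstep_epn_ms / 1000.0)))
    n_samples = len(envelopes[0])
    # Take a slice across all channels at the end of each frame step.
    return [
        [env[i] for env in envelopes]
        for i in range(step - 1, n_samples, step)
    ]


# Example: two channels of constant activity, one second at 1000 Hz.
nap = [[1.0] * 1000, [2.0] * 1000]
frames = extract_frames(nap, sample_rate_hz=1000.0)
```

With a 1000 Hz NAP sample rate (an arbitrary choice for the example) and the default frstep_epn of 10 ms, one second of input yields 100 frames, each holding one envelope value per channel.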
REFERENCES
Patterson, R.D., Holdsworth, J. and Allerhand, M. (1992a). "Auditory Models as preprocessors for speech recognition," In: The Auditory Processing of Speech: From the auditory periphery to words, M.E.H. Schouten (ed), Mouton de Gruyter, Berlin, 67-83.
Patterson, R.D., Allerhand, M.H. and Holdsworth, J. (1993a). "Auditory representations of speech sounds," In: Visual representations of speech signals, Eds. Martin Cooke, Steve Beet, and Malcolm Crawford, John Wiley & Sons, Chichester, 307-314.
Patterson, R.D., Anderson, T., and Allerhand, M. (1994). "The auditory image model as a preprocessor for spoken language," in Proc. Third ICSLP, Yokohama, Japan, 1395-1398.
Patterson, R.D., Allerhand, M., and Giguere, C. (1995). "Time-domain modelling of peripheral auditory processing: A modular architecture and a software platform," J. Acoust. Soc. Am. 98-3, (in press).
FILES
The options file for gencgm.
SEE ALSO
gensgm, genasa, genepn, gennap, genbmm
BUGS
None currently known.
COPYRIGHT
Copyright (c) Applied Psychology Unit, Medical Research Council, 1995
Permission to use, copy, modify, and distribute this software without fee is hereby granted for research purposes, provided that this copyright notice appears in all copies and in all supporting documentation, and that the software is not redistributed for any fee (except for a nominal shipping charge). Anyone wanting to incorporate all or part of this software in a commercial product must obtain a license from the Medical Research Council.
The MRC makes no representations about the suitability of this software for any purpose. It is provided "as is" without express or implied warranty.
THE MRC DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS, IN NO EVENT SHALL THE A.P.U. BE LIABLE FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
ACKNOWLEDGEMENTS
The AIM software was developed for Unix workstations by John Holdsworth and Mike Allerhand of the MRC APU, under the direction of Roy Patterson. The physiological version of AIM was developed by Christian Giguere. The options handler is by Paul Manson. The revised SAI module is by Jay Datta. Michael Akeroyd extended the postscript facilities and developed the xreview routine for auditory image cartoons.
The project was supported by the MRC and grants from the U.K. Defense Research Agency, Farnborough (Research Contract 2239); the EEC Esprit BR Programme, Project ACTS (3207); and the U.K. Hearing Research Trust.
SunOS 5.6    GENCGM (1)    4 September 1995