Manual Reference Pages - GENASA (1)
NAME
genasa - generate auditory spectral analysis
CONTENTS
Synopsis
Description
I. Display Defaults
II. Rectification and Compression
III. Leaky Integration
References
Files
See Also
Bugs
Copyright
Acknowledgements
SYNOPSIS
genasa [ option=value | -option ] [ filename ]
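As an illustration of the option=value form in the synopsis, the following Python sketch assembles a genasa command line. The helper genasa_command and the input file name vowel.raw are hypothetical; the option names are taken from this manual entry, and it is an assumption that numeric values are passed without a unit suffix.

```python
def genasa_command(filename=None, **options):
    """Build an argument list of the form: genasa option=value ... [filename].

    Hypothetical helper for illustration; it only formats the arguments and
    does not run genasa.
    """
    argv = ["genasa"]
    for name, value in options.items():
        argv.append(f"{name}={value}")
    if filename is not None:
        argv.append(filename)
    return argv


# Power compression with exponent 0.5 and a 10 ms frame step.
# "vowel.raw" is a placeholder input file name.
cmd = genasa_command("vowel.raw", compress=0.5, frstep_epn=10)
print(" ".join(cmd))  # genasa compress=0.5 frstep_epn=10 vowel.raw
```

The resulting list could be handed to subprocess.run on a system where genasa is installed.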
DESCRIPTION
The genasa module of the AIM software performs a time-domain spectral
analysis on the input wave using a bank of auditory filters, and
summarises the information in a sequence of auditory spectra. The
spectral analysis converts the input wave into an array of filtered
waves, one for each channel of the filterbank. The surface of the
array of filtered waves is AIM's representation of basilar membrane
motion (BMM) as a function of time (Patterson et al. 1995). The
sequence of auditory spectra is produced by calculating the envelope
of the BMM and extracting spectral slices from the envelope at regular
intervals (see frstep_epn below). The envelope is calculated by
rectifying, compressing, and lowpass filtering the individual BMM waves
as they flow from the filterbank (Patterson et al. 1992a, 1993a,
Patterson, 1994a).

The auditory spectrum produced by genasa is intended to simulate the
spectral representation of a sound as it occurs in the peripheral
auditory system just prior to neural transduction. As a result, the
frequency resolution of the analysis varies with the centre frequency
of the channel, and the distribution of channels across frequency is
chosen to match that in the auditory system (Patterson and Moore,
1986; Glasberg and Moore, 1990). The auditory spectrum is a plot of
the activity in each channel as a function of the centre frequency of
the auditory filter (in ERBs). The representation is referred to as
an auditory spectrum to distinguish it from the Fourier energy
spectrum (Patterson, 1994a). The suffix asa is short for auditory
spectral analysis; it is used to distinguish this spectral
representation from three other spectral representations provided by
the AIM software (epn: excitation pattern; sgm: auditory
spectrogram; cgm: cochleogram).

The spectral analysis performed by genasa is the same as that
performed by genbmm. The primary differences are in the defaults for
the Displays, the Compression and the Leaky Integration used to
construct the spectral slices from the BMM. As a result, this manual
entry is restricted to describing the options that differ from
those in genbmm.
I. DISPLAY DEFAULTS
The default values for three of the display options are reset to
produce a spectral format rather than a landscape; specifically,
display=excitation, bottom=0 and top=2500. The number of channels is
increased to 128 to ensure reasonable frequency resolution in the
spectral display.
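The channel distribution referred to above can be sketched with the ERB-rate formulae of Glasberg and Moore (1990). The Python fragment below spaces 128 centre frequencies uniformly on the ERB-rate scale. The frequency limits f_min and f_max are assumptions chosen for illustration, not documented genasa defaults, and the function names are hypothetical.

```python
import math


def hz_to_erb_rate(f_hz):
    """ERB-rate (number of ERBs) at frequency f_hz (Glasberg and Moore, 1990)."""
    return 21.4 * math.log10(4.37 * f_hz / 1000.0 + 1.0)


def erb_rate_to_hz(erb_rate):
    """Inverse of hz_to_erb_rate."""
    return (10.0 ** (erb_rate / 21.4) - 1.0) * 1000.0 / 4.37


def erb_bandwidth(f_hz):
    """Equivalent rectangular bandwidth in Hz at centre frequency f_hz."""
    return 24.7 * (4.37 * f_hz / 1000.0 + 1.0)


def centre_frequencies(n_channels=128, f_min=100.0, f_max=6000.0):
    """Centre frequencies in Hz, uniformly spaced on the ERB-rate scale.

    f_min and f_max are illustrative assumptions.
    """
    lo, hi = hz_to_erb_rate(f_min), hz_to_erb_rate(f_max)
    step = (hi - lo) / (n_channels - 1)
    return [erb_rate_to_hz(lo + i * step) for i in range(n_channels)]


cfs = centre_frequencies()
print(len(cfs), round(cfs[0], 1), round(cfs[-1], 1))
```

Because the spacing is uniform in ERBs, the channels are densely packed at low frequencies and progressively wider apart at high frequencies, which matches the variation of filter bandwidth with centre frequency.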
II. RECTIFICATION AND COMPRESSION
The adaptive thresholding process begins with rectification and
compression of the BMM. The default form of compression is
logarithmic; it has the advantage of transforming the exponential
envelope of the ringing response of the gammatone filter into a linear
decay with time. There is evidence, however, that auditory
compression may be better represented by power compression with an
exponent in the range of 0.5. It is also advisable to insert power
compression before the Meddis haircell when driving it with a
gammatone filter. For a discussion of these issues, see
docs/aimMeddisHewitt. To accommodate power compression and the
assembly of different configurations of AIM, the rectification and
compression options are presented separately in the options list
before the neural transduction section.
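The effect of the rectification and compression stages can be sketched in Python as follows. This is a minimal illustration and not the AIM implementation: the function names are hypothetical, the floor of 1 applied before taking the logarithm is an assumption made to keep the log defined, and the sign-preserving form of power compression for unrectified input is likewise an assumption.

```python
import math


def rectify(wave, mode=1):
    """mode 0: no rectification, 1: half-wave, 2: full-wave."""
    if mode == 1:
        return [max(v, 0.0) for v in wave]
    if mode == 2:
        return [abs(v) for v in wave]
    return list(wave)


def compress(wave, form="log"):
    """form is 'log', 'off', or a power exponent between 0 and 1."""
    if form == "off":
        return list(wave)
    if form == "log":
        # Assumption: values below 1 are floored so that the log stays defined.
        return [math.log10(max(v, 1.0)) for v in wave]
    p = float(form)
    # Assumption: the sign is preserved when the input is not rectified.
    return [math.copysign(abs(v) ** p, v) for v in wave]


# An exponentially decaying envelope becomes a straight line after log
# compression: successive differences are constant.
envelope = [1000.0 * math.exp(-n / 20.0) for n in range(50)]
logged = compress(envelope, "log")
steps = [b - a for a, b in zip(logged, logged[1:])]
print(max(steps) - min(steps) < 1e-9)  # True
```

The final lines illustrate the point made above about logarithmic compression: the exponential decay of the filter's ringing response is transformed into a linear decay with time.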
rectify
Apply half-wave rectification to filtered waves.
Switch. Default value: off.
If rectify is on, the BMM is half-wave rectified. The log compressor
Note: Full-wave rectification is produced if rectify is set to 2.
compress
Apply compression to filtered waves. The form of the compression can
be either logarithmic (log), or a power function (with a value between
0 and 1).
Switch. Choices: log, 0-1, off. Default value: log.
The default compressor is logarithmic, not because it is a
NOTE: When using the physiological version of AIM with the
Transduction
transduction
Neural transduction switch (at, meddis, off).
Switch. Default value: off.
III. LEAKY INTEGRATION
stages_idt
Number of stages of lowpass filtering.
Default unit: scalar. Default value: 2.
tup_idt
The time constant for each filter stage.
Default unit: ms. Default value: 8 ms.
The Equivalent Rectangular Duration (ERD) of a two-stage lowpass
filter is about 1.6 times the time constant of each stage, or
12.8 ms in the current case.
downsample
The time between successive spectral frames.
Default unit: ms. Default value: 10 ms.
Downsample is simply another name for frstep_epn, provided to
facilitate a different mode of thinking about time-series data.
frstep_epn
The time between successive spectral frames.
Default unit: ms. Default value: 10 ms.
With a frstep_epn of 10 ms, genasa will produce
spectral frames at a rate of 100 per second.
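The leaky integration and frame extraction described in this section can be sketched in Python as a cascade of first-order lowpass filters followed by sampling across channels. This is an illustration under stated assumptions rather than the AIM code: each stage is modelled as a simple one-pole recursive filter, the function names are hypothetical, and the sample rate of 20000 Hz is chosen only for the example.

```python
import math


def leaky_integrate(wave, sample_rate_hz, tup_ms=8.0, stages=2):
    """Pass wave through `stages` one-pole lowpass filters in cascade.

    Each stage has time constant tup_ms (assumed one-pole model).
    """
    # Per-sample decay factor: exp(-dt / tau), with dt = 1 / sample_rate.
    a = math.exp(-1000.0 / (sample_rate_hz * tup_ms))
    out = list(wave)
    for _ in range(stages):
        state = 0.0
        smoothed = []
        for v in out:
            state = a * state + (1.0 - a) * v
            smoothed.append(state)
        out = smoothed
    return out


def spectral_frames(channels, sample_rate_hz, frstep_ms=10.0):
    """Take one slice across all channels every frstep_ms milliseconds."""
    step = int(round(sample_rate_hz * frstep_ms / 1000.0))
    n = len(channels[0])
    return [[ch[t] for ch in channels] for t in range(step - 1, n, step)]


# One second of constant input in one channel and silence in another.
rate = 20000  # illustrative sample rate in Hz
channels = [[1.0] * rate, [0.0] * rate]
smoothed = [leaky_integrate(ch, rate) for ch in channels]
frames = spectral_frames(smoothed, rate)
print(len(frames))  # 100 frames for one second of input
```

With the default frame step of 10 ms, one second of input yields 100 frames, in agreement with the rate quoted above.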
REFERENCES
Glasberg, B.R. and Moore, B.C.J. (1990). "Derivation of auditory filter shapes from notched-noise data," Hearing Research, 47, 103-138.

Patterson, R.D. and Moore, B.C.J. (1986). "Auditory filters and excitation patterns as representations of frequency resolution," In: Frequency Selectivity in Hearing, B.C.J. Moore (Ed.), Academic Press, London, 123-177.

Patterson, R.D., Holdsworth, J. and Allerhand, M. (1992a). "Auditory models as preprocessors for speech recognition," In: The Auditory Processing of Speech: From the auditory periphery to words, M.E.H. Schouten (Ed.), Mouton de Gruyter, Berlin, 67-83.

Patterson, R.D., Allerhand, M.H. and Holdsworth, J. (1993a). "Auditory representations of speech sounds," In: Visual representations of speech signals, Martin Cooke, Steve Beet, and Malcolm Crawford (Eds.), John Wiley & Sons, Chichester, 307-314.

Patterson, R.D. (1994a). "The sound of a sinusoid: Spectral models," J. Acoust. Soc. Am. 96, 1409-1418.

Patterson, R.D., Anderson, T., and Allerhand, M. (1994). "The auditory image model as a preprocessor for spoken language," In: Proc. Third ICSLP, Yokohama, Japan, 1395-1398.

Patterson, R.D., Allerhand, M., and Giguere, C. (1995). "Time-domain modelling of peripheral auditory processing: A modular architecture and a software platform," J. Acoust. Soc. Am. 98-3, (in press).
FILES
.genasarc The options file for genasa.
SEE ALSO
genbmm, gensgm
BUGS
None currently known.
COPYRIGHT
Copyright (c) Applied Psychology Unit, Medical Research Council, 1995
Permission to use, copy, modify, and distribute this software without fee
is hereby granted for research purposes, provided that this copyright
notice appears in all copies and in all supporting documentation, and that
the software is not redistributed for any fee (except for a nominal
shipping charge). Anyone wanting to incorporate all or part of this
software in a commercial product must obtain a license from the Medical
Research Council.

The MRC makes no representations about the suitability of this
software for any purpose. It is provided "as is" without express or
implied warranty.

THE MRC DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE, INCLUDING
ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS, IN NO EVENT SHALL
THE A.P.U. BE LIABLE FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES
OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS,
WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION,
ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS
SOFTWARE.
ACKNOWLEDGEMENTS
The AIM software was developed for Unix workstations by John
Holdsworth and Mike Allerhand of the MRC APU, under the direction of
Roy Patterson. The physiological version of AIM was developed by
Christian Giguere. The options handler is by Paul Manson. The revised
SAI module is by Jay Datta. Michael Akeroyd extended the postscript
facilities and developed the xreview routine for auditory image
cartoons.

The project was supported by the MRC and grants from the U.K. Defense
Research Agency, Farnborough (Research Contract 2239); the EEC Esprit
BR Programme, Project ACTS (3207); and the U.K. Hearing Research Trust.
SunOS 5.6 | GENASA (1) | 4 September 1995 |