genwav
Manual Reference Pages - GENWAV (1)
genwav - display the wave in filename.
CONTENTS
Synopsis
Description
Files
See Also
Bugs
Copyright
Acknowledgements
SYNOPSIS
genwav [ option=value | -option ] [ filename ]
DESCRIPTION
Genwav sets up an X window and displays a segment of the input wave in the window. The size of the window and the size of the wave are determined by options, as are a number of other input/output functions and printing functions. These options have no direct bearing on the auditory processing performed by AIM. For convenience, then, these Non-Auditory options are associated with the instruction genwav (the one non-auditory instruction), and they are listed at the top of the options tables prior to the auditory options. There are also a large number of Silent Options which control lesser used functions, both auditory and non-auditory. They are listed in docs/aimSilentOptions and there is documentation for some of them at the end of the listing.
There are three classes of Non-Auditory options:
I) DISPLAY OPTIONS that determine the format of the auditory representations of sound on the screen, or on paper when printed.
II) OUTPUT OPTIONS that determine the format and content of files used to store the auditory representations of sounds.
III) INPUT OPTIONS that determine how the wave in the input file should be interpreted.
The output options are presented before the input options so that the input options will be adjacent to the filterbank options in the options tables produced by genbmm and subsequent instructions.
I. DISPLAY OPTIONS
The AIM modules produce output in the form of a set of functions, one for each channel of the auditory filterbank. For example, the output of genbmm is a set of functions that simulate basilar membrane motion produced in response to the input wave. By default, the AIM software puts an X window up on the computer screen and displays the output in the window. This section describes the options that control these displays. There are also a number of Silent Options associated with displays (see docs/aimSilentOptions).
The display options are: title, x0_win, y0_win, width_win, height_win, display, view, top, bottom, magnification, pensize, hiddenline.
A. The Display Window Title, Position, and Size
title Title of output display.
Character string. Default: input file name. The title of the output being displayed. If no title is given, the display bears the name of the file of the input wave.
x0_win Left edge of window
Unit: pixels. Default: centre. The left edge of the window into which the display will be drawn, relative to the left edge of the screen (i.e. the x-coordinate of the window within the screen). A value of centre will cause centring in the horizontal dimension (provided the window manager does not override).
y0_win Lower edge of window
Unit: pixels. Default: centre. The lower edge of the window into which the display will be drawn, relative to the lower edge of the screen (i.e. the y-coordinate of the window within the screen). A value of centre will cause centring in the vertical dimension (provided the window manager does not override).
Taken as a pair, x0_win and y0_win determine the origin of the window relative to the screen origin, which is assumed to be the lower left corner of the screen.
width_win Window width
Unit: pixels. Default: 640. The width of the window into which the display will be drawn.
height_win Window height
Unit: pixels. Default: 480. The height of the window into which the display will be drawn.
B. Display Controls
display Display output on screen
Switch. Default: on. Normally this switch is on and a bitmap of the output is displayed in a graphical window on the computer screen. The switch is provided because the time taken to create the displays is considerable, and it is useful to turn the display off when using AIM as a preprocessor for speech recognition.
top The largest positive value visible in the display
Scalar. Default value: 1024 (for genwav). Each of the functions in the multi-channel output of a module is displayed in a transparent window. Provided the channel density is not too low, the functions are related and the set of functions produces a display that looks like a complex landscape. Top determines the largest positive value that will appear in the transparent windows of the individual functions, so top must be as large as the largest value in the full set of functions. Increasing top has the effect of moving the viewer farther up above the landscape.
bottom The largest negative value visible in the display
Scalar. Default value: -1024 (for genwav). Bottom determines the largest negative value that will appear in the transparent windows of the individual functions, so bottom must be as large in the negative direction as the largest negative value in the full set of functions. Increasing bottom in the negative direction has the effect of deepening the valleys in the landscape.
magnification Display magnification
Scalar. Default: 1.0. The degree to which the amplitude of the functions in the display should be magnified before being displayed. This parameter is merely for adjusting the visual contrast of the display. The magnification option is a multiplier, so a value of 1 implies drawing to scale, while a value of 10 implies ten times (10x) the size of values in the module output and 0.1 implies one tenth of the output size. Magnification is related to, but separate from, the gain options, which affect the values of the output functions and the values stored in any output files. Magnification is an alternative to top and bottom as a means of controlling the size of the functions in the display.
pensize The size of the lines in the displays and the dots on the spiral
Unit: pixels. Default: 1. This option allows the user to specify the thickness of the lines in the display and the size of the dots on spiral auditory images. It also affects the lines and dots in postscript plots. It is provided primarily for use with printers, which have much more resolution than computer screens. On laser printers a value of 3-5 gives reasonable line thickness. On the screen, a pensize greater than 1 produces slow drawing and a jagged, blurred display.
hiddenline Draw with overlapping parts of functions hidden
Switch. Default: on. This switch specifies whether or not a hidden line algorithm should be used when drawing the display. It also affects printed displays. In almost all cases, hiddenline results in more attractive displays of waveforms, and it often makes complex displays easier to understand, so the default is on. Note: hiddenline almost doubles the drawing time, so it is sometimes useful to switch it off on slower machines.
II. OUTPUT OPTIONS
The output options are listed and described before the input options so that the input options will be adjacent to the filterbank options in the listings produced by genbmm and subsequent modules. The output options are downchannel, erase_ctn, animate_ctn, bitmap_ctn, postscript, output, and header. There are also a number of Silent Options associated with output (see docs/aimSilentOptions).
downchannel Average adjacent channels of multichannel representations
Units: Number of averagings. Default value: 0. There is interaction between channels in the transmission-line filterbank of the physiological version of AIM, and in the neural encoding of the functional version of AIM. The minimum channel density for these processes to operate properly is four channels per ERB and two channels per ERB, respectively. For broadband signals like speech this means that the minimum number of channels is on the order of 128 and 64, respectively. This channel density can produce cluttered displays and, more importantly, it is far too many channels for current speech recognition systems, which typically use 12-24 channels. This is not just a computer power problem; the recognition systems actually perform less well with extra channels. Accordingly, the downchannel option makes it possible to reduce the channel density at output, so that AIM can operate with the appropriate channel density and still provide output that is compatible with displays and speech recognition systems.
Downchannel averages pairs of adjacent channels, and the option value specifies how many times it should execute the averaging process. Each averaging reduces the number of channels by a factor of 2, so for proper transmission-line filtering and an output file with 16 channels, set channels_afb=128 and downchannel=3 (three successive halvings of the number of channels).
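The repeated pairwise averaging described above can be sketched in Python as follows. This is an illustration only, not AIM's code: the function name, the list-of-lists data layout, and the decision to reject an odd number of channels are all assumptions made for the example.

```python
def downchannel(channels, times):
    """Average adjacent pairs of channels, `times` times over.

    `channels` is a list of equal-length lists of samples, one list per
    filterbank channel (a hypothetical layout). Each pass halves the
    number of channels.
    """
    for _ in range(times):
        if len(channels) % 2:
            # Assumption: how AIM treats an odd channel count is not documented here.
            raise ValueError("need an even number of channels for each pass")
        channels = [
            [(a + b) / 2.0 for a, b in zip(lower, upper)]
            for lower, upper in zip(channels[0::2], channels[1::2])
        ]
    return channels


# 128 channels of 4 samples each; channel k holds the constant value k.
bank = [[float(k)] * 4 for k in range(128)]
reduced = downchannel(bank, 3)
print(len(reduced))   # 16
print(reduced[0][0])  # 3.5, the mean of channels 0 to 7
```

With 128 input channels and three passes, the sketch returns 16 channels, matching the channels_afb=128, downchannel=3 example.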
A. Animated Cartoons
Four of the AIM instructions produce output in the form of sequences of spectral frames (gensgm, gencgm, genasa and genepn). Bitmap versions of the displays of the frames can be stored by AIM and replayed by review and xreview. When the sequence of frames is played rapidly, it appears as an animated cartoon that shows the dynamic behaviour of the spectrum of the sound.
Similarly, the AIM instructions for auditory images (gensai and genspl) produce sequences of landscape frames, and bitmap versions of the landscape displays can also be stored by AIM and replayed by review and xreview. Indeed, it was the desire to produce auditory image cartoons that led to the development of much of the AIM software package. The animated cartoons of auditory images show the dynamic behaviour of features in the images, like the motion of formants in diphthongs and the motion of notes in a melody.
This section describes the options that control the construction and storage of sequences of bitmaps; there is a separate manual entry for the xreview routine that replays the bitmaps (man xreview).
erase_ctn Erase the current frame before presenting the next frame
Switch. Default value: on. Normally, when presenting a sequence of frames as an animated cartoon, one wants to erase the current frame before presenting the next. When the frames are spectra, however, the set of frames can together form a meaningful display; for example, the set of rising spectra produced at the onset of a sound produces a contour map of the onset. Switching erase_ctn off enables the user to observe the full set of spectra simultaneously. (See aimdemo_gtf_spectra or aimdemo_tlf_spectra.)
animate_ctn Store frames in memory and replay all of them as a cartoon
Switch. Default value: off. When this option is on, AIM stores the bitmaps of the frames it produces in the memory of the machine and replays them rapidly when the instruction is complete. Type RETURN to animate the cartoon again; type q RETURN to exit the instruction. (This option was important when machines were slower and before the availability of review and xreview. It is now largely obsolete.)
bitmap_ctn Store bitmaps of frames in a file for replay as a cartoon
Switch. Default value: off. When this option is on, bitmaps of the frames produced for the input in file_name will be stored in file_name.ctn. The sequence of frames can later be replayed using either
> review file_name or,
> xreview file_name
Both of these programs enable the user to vary the rate of animation, the section of the sequence to be viewed, etc. The xreview version has a window interface with useful information and is the preferred version in most cases.
B. Output Files for Printing and Postprocessing
postscript Produce printer-ready output
Switch. Default value: off. This switch causes AIM to produce a printer-ready version of the displays it presents on the computer screen. For example, the NAP of a 32-ms section of cegc can be printed using
> gennap length=32 postscript=on cegc | lpr -Plw
where lpr is the Unix printer-driver and the lw of -Plw specifies the destination printer. You may need to check the name of your system's printer driver and laser printer.
Alternatively, the postscript version of the display may be directed to a file using an instruction like
> gennap length=32 postscript=on cegc > cegc_nap.ps
and printed later at the user's convenience. In this example, the file name cegc_nap.ps is not generated by AIM; the _nap.ps suffix is added by the user following standard conventions to indicate that the file contains a NAP in postscript form.
NOTE: There are a very large number of Silent Options associated with postscript printing that greatly facilitate preparation of displays for publication (see docs/aimSilentOptions).
THREE POSTSCRIPT CAUTIONS:
Postscript files of landscape displays from AIM are very large. As a result, we recommend
a) that you NOT switch postscript on without redirecting the output to a file, as it will cause the output to be displayed on the screen in a seemingly endless listing,
b) that you be careful NOT to print postscript files on a printer which does not understand the Postscript language, as it can cause the printer to put out an extremely long file, one column per page!
c) that you NOT set postscript=on in an options file, as it will generate large files in the directory without your noticing.
output Generate an output file
Switch. Default value: off. This switch causes the array of functions that defines AIM's simulation of basilar membrane motion, or a neural activity pattern, or an auditory image, to be stored in a file for subsequent processing by the aimtools or other, user-defined, operators. By convention, the file is given the same name as the input file, but with a suffix reflecting the entry point, to distinguish it from the input file on the one hand and from other output files on the other hand. The naming system enables the user to construct and store a set of output files for one input file without the need to specify a sequence of file names. The suffixes are those used to identify the modules in the listing produced by gen -help. So, for example, the following command line:
> gennap output=on length=32 cegc
will produce an output file named cegc.nap containing a multiplexed version of the functions that define the NAP of the first 32 ms of cegc.
The spectrographic representations produced by gensgm and gencgm can be stored in the same way, as can the sequences of spectra produced by genasa and genepn. It is the output files of genasa and gencgm that are used to interface AIM with speech recognition systems (Robinson et al., 1990; Patterson et al., 1995; Giguere and Woodland, 1994a). Details of the file formats are presented in docs/aimFileFormat.
header Put a header on the output file
Flag. Default value: on. By default, a header is prepended to each output file so that subsequent processors have access to the history of the file. Details of the header structure are presented in docs/aimFileFormat.
Note: There is an AIM tool hdr which will remove the header from an AIM output file (man hdr).
III. INPUT OPTIONS
The input options enable the user to process a subsection of the input wave, and to specify characteristics of the wave.
The input options are: input_wave, start_wave, length_wave, samplerate, swap_wave, dB_wave.
input_wave Default input wave name
Filename. Default value: none. The name of the wave file to process. This option permits simple repetitive processing of the same input file without repetitive typing. It also enables one to circumvent the Unix convention of having the file name last on the command line. This option is overridden if the user supplies a wave file name at the end of the command line.
start_wave Start point in wave
Default unit: ms. Default value: 0. The point in the input wave at which processing should begin. The start_wave option is expressed in milliseconds and its default value is the beginning of the file (i.e. 0 ms into the file).
length_wave Length of wave
Default unit: ms. Default value: remainder. The number of milliseconds of the wave that ought to be processed, beyond the start point. The special value remainder indicates that the entire length of the wave from the start point to the end of the file should be processed.
samplerate Input wave sample rate
Default unit: Hertz. Default value: 20,000 Hz. The rate at which the input wave was sampled.
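To show how start_wave, length_wave and samplerate relate, the following Python sketch converts a segment specified in milliseconds into sample indices. It is not taken from AIM: the function name, the use of None to stand for the special value remainder, and the rounding and clipping behaviour are assumptions made for the example.

```python
def segment_bounds(start_ms, length_ms, samplerate_hz, total_samples):
    """Return (first, one_past_last) sample indices for a segment.

    `length_ms=None` stands in for the special value "remainder".
    Rounding to the nearest sample and clipping at the end of the file
    are assumptions; AIM's own behaviour may differ.
    """
    start = int(round(start_ms * samplerate_hz / 1000.0))
    start = min(start, total_samples)
    if length_ms is None:
        end = total_samples
    else:
        length = int(round(length_ms * samplerate_hz / 1000.0))
        end = min(total_samples, start + length)
    return start, end


# One second of wave at the default 20,000 Hz; 32 ms starting 10 ms in.
print(segment_bounds(10, 32, 20000, 20000))    # (200, 840)
print(segment_bounds(10, None, 20000, 20000))  # (200, 20000)
```

At the default sample rate each millisecond corresponds to 20 samples, so a 32-ms segment spans 640 samples.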
swap_wave Swap the bytes in each binary pair of the input file
Switch. Default: off. The order of the bytes in short integers varies between manufacturers. Specifically, the order for Sun, SGI and HP is opposite to that for DEC and IBM. The default setting (off) is for the latter byte order.
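The effect of swapping the bytes in each pair can be illustrated with a short Python sketch using the standard struct module. The function name is hypothetical, and the sketch only shows what byte swapping does to 16-bit samples; it says nothing about how AIM implements swap_wave.

```python
import struct


def swap_shorts(raw):
    """Swap the two bytes of every 16-bit sample in `raw` (a bytes object)."""
    if len(raw) % 2:
        raise ValueError("expected a whole number of 16-bit samples")
    count = len(raw) // 2
    # Read the samples in one byte order and write them back in the other.
    values = struct.unpack("<%dh" % count, raw)
    return struct.pack(">%dh" % count, *values)


raw = bytes([0x00, 0x01, 0x00, 0x02])
print(struct.unpack("<2h", raw))               # (256, 512)
print(struct.unpack("<2h", swap_shorts(raw)))  # (1, 2)
```

The same two samples read as 256 and 512 in one byte order and as 1 and 2 in the other, which is why a wave recorded on a machine with the opposite byte order looks like noise until the bytes are swapped.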
dB_wave Input wave level for physiological AIM
Units: dB. Default: 60 dB. This option sets the level of the input wave for the physiological version of AIM, that is, the route with the transmission line filterbank. (The functional version of AIM is level-independent and dB_wave is ignored when the gammatone filterbank is used.)
dB_wave is a scaling parameter that tells AIM the level of the wave in your input file relative to AIM's internal standard. It is used for calibration and investigation of the effects of level in the auditory system.
Calibration:
To calibrate AIM for a given recording setup, put a sinusoid of known level (dBSPL) into the recording system and store a sample of it as short integers in a headerless wave file. Calculate the rms amplitude of the sinusoid (RMS) (see note below) and then use the following equation to calculate the appropriate value of dB_wave.
dB_wave = dBSPL - 20*log10(RMS/200)
For example, if the sinusoid has a known level of 60 dB SPL, and the recorded version produces a wave with an RMS amplitude of 467.3, then dB_wave should be set to 52.6.
Note: The RMS value of a stored input wave can be calculated using the AIM tool stats as follows:
> stats stat=rms line=on <input-wave>
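The calibration equation can be checked with a few lines of Python. The function names are illustrative and not part of AIM; the reference amplitude of 200 is taken from the equation above, and the logarithm is taken to be base 10, which reproduces the worked example.

```python
import math


def rms_amplitude(samples):
    """Root-mean-square amplitude of a sequence of sample values."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def db_wave(db_spl, rms, reference_rms=200.0):
    """dB_wave = dBSPL - 20*log10(RMS/200), as given in the calibration note."""
    return db_spl - 20.0 * math.log10(rms / reference_rms)


# The worked example: a 60 dB SPL sinusoid recorded with an RMS of 467.3.
print(round(db_wave(60.0, 467.3), 1))  # 52.6
```

A recording whose RMS amplitude is exactly 200 gives dB_wave equal to the known level, so 200 acts as the reference amplitude in this equation.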
Investigation of Level Effects:
If you change the value of dB_wave from 60 to 80 dB, the SAME input file is assumed to represent a stimulus that is 20 dB HIGHER in level. This enables you to investigate the effects of level with a fixed input file.
Output scaling:
You can scale the genbmm output into absolute units of basilar membrane velocity in cm/s by multiplying the genbmm output numbers by:
antilog[(dB_wave-60)/20] / (4000*gain_tlf)
where dB_wave and gain_tlf are the values specified at run time.
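As a sketch, the scaling factor can be computed in Python as shown below. The function name is hypothetical, antilog[x] is read as 10^x (an assumption consistent with the decibel convention used elsewhere on this page), and the gain_tlf value of 1.0 in the example is purely illustrative, not a documented default.

```python
def bmm_velocity_scale(db_wave, gain_tlf):
    """Multiplier that converts genbmm output numbers to cm/s.

    Implements antilog[(dB_wave-60)/20] / (4000*gain_tlf), reading
    antilog[x] as 10**x (an assumption).
    """
    return 10.0 ** ((db_wave - 60.0) / 20.0) / (4000.0 * gain_tlf)


# gain_tlf=1.0 is an illustrative value only.
print(bmm_velocity_scale(60.0, 1.0))  # 0.00025
print(bmm_velocity_scale(80.0, 1.0))  # 0.0025
```

Raising dB_wave by 20 dB multiplies the factor by 10, in line with the level effects described above.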
FILES
.genwavrc The options file for genwav.
SEE ALSO
genbmm
BUGS
COPYRIGHT
Copyright (c) Applied Psychology Unit, Medical Research Council, 1995
Permission to use, copy, modify, and distribute this software without fee is hereby granted for research purposes, provided that this copyright notice appears in all copies and in all supporting documentation, and that the software is not redistributed for any fee (except for a nominal shipping charge). Anyone wanting to incorporate all or part of this software in a commercial product must obtain a license from the Medical Research Council.
The MRC makes no representations about the suitability of this software for any purpose. It is provided "as is" without express or implied warranty.
THE MRC DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS, IN NO EVENT SHALL THE A.P.U. BE LIABLE FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
ACKNOWLEDGEMENTS
The AIM software was developed for Unix workstations by John Holdsworth and Mike Allerhand of the MRC APU, under the direction of Roy Patterson. The physiological version of AIM was developed by Christian Giguere. The options handler is by Paul Manson. The revised SAI module is by Jay Datta. Michael Akeroyd extended the postscript facilities and developed the xreview routine for auditory image cartoons.
The project was supported by the MRC and grants from the U.K. Defense Research Agency, Farnborough (Research Contract 2239); the EEC Esprit BR Programme, Project ACTS (3207); and the U.K. Hearing Research Trust.
| SunOS 5.6 | GENWAV (1) | 16 April 1997 |