genspl
Manual Reference Pages - GENSPL (1)
genspl - generate a spiral auditory image
CONTENTS
Synopsis/syntax
Description
Options
Examples
References
Bugs
Files
See Also
Copyright
Acknowledgements
SYNOPSIS/SYNTAX
genspl [ option=value | -option ] filename
DESCRIPTION
The spiral auditory image is an alternative display of the auditory image which emphasises the pitch of a sound and de-emphasises the timbre, or sound quality. That is, it emphasises the global temporal structure that arises in the NAPs of periodic sounds and focuses attention on the scale of the structure. It de-emphasises the fine-structure of the time-interval patterns in the auditory image and the shape of all but the largest features (Patterson, 1986, 1987b).
Since the underlying information is the same for the spiral and rectangular auditory images, the options that control the auditory image itself are the same for genspl as they are for gensai. The options are those with the suffix _ai and they have the same defaults in both cases. The options are napdecay_ai, stdecay_ai, stcrit_ai, stlag_ai, decay_ai, and stinfo_ai; they are described in the manual entry for gensai.
With respect to displays, it is difficult to present the functions in the individual channels of the auditory image in a spiral form; the fine detail of the functions gets lost in the spiral perspective. Accordingly, in the spiral perspective each of the separate SAI pulses is replaced by a dot positioned at the time of the peak of the SAI pulse. [Previously, this representation was referred to as a pulse ribbon (Patterson, 1986, 1987b).] Conceptually, the spiral auditory image is a set of concentric spirals, one for each channel of the auditory image. The channel with the highest centre frequency is on the inside with the smallest radius; the channel with the lowest centre frequency is on the outside with the largest radius. The spiral lines are omitted for clarity, leaving just the dots. Sets of dots in adjacent channels often form bars aligned along spokes emanating from the centre of the spiral. These bars show that the same period exists in a range of adjacent channels in the NAP. Furthermore, this information about correlation across channels appears on the same spoke as the information indicating that the pattern repeats in time. Thus the multi-channel spiral maps both spectral and temporal information concerning the periodicity of the sound onto a single spatial vector: a spoke of the spiral. It is this property that enables the spiral representation to explain octave perception (Patterson, 1986, 1990; Patterson et al., 1993b; Robinson and Patterson, 1995a,b).
OPTIONS
The input and output options for the spiral auditory image are the same as for the rectangular auditory image. Furthermore, the spiral auditory image of an extended sound is a cartoon just like the linear auditory image, so spiral images of dynamic sounds can be generated, stored, animated, and reviewed in the same way as with the rectangular auditory image (gensai).
I. DISPLAY OPTIONS FOR THE SPIRAL AUDITORY IMAGE
The options that control the position and size of the spiral image window on the screen are the same as for all previous windows. The options that control the timing of the frames and the portion of the auditory image presented in the display (frstep_aid, pwidth_aid, and nwidth_aid) are the same as for gensai. Note, however, that nwidth_aid is disabled (ignored) because logarithms are not defined for non-positive numbers. This means that the spiral auditory image suppresses most of the information in transients unless they occur periodically with a period less than 35 ms. The minimum observable time-interval is controlled by a new option, orig_spd.
There are three new display options for the spiral view of the auditory image: form_spd, zeroline_spd, and orig_spd. The mathematics of spirals as they pertain to the spiral auditory image are set out in Patterson and Nimmo-Smith (1986), Patterson et al. (1990) and Allerhand et al. (1991).
| form_spl | The form of the spiral time line (Archimedean or logarithmic). Switch. Default: Archimedean. The software offers two visual representations of the logarithmic spiral, both of which have the base 2. Both representations gather doublings in time onto a single spoke of the spiral, and so both have the general property that a = log2(t/T), where a is the angle between the horizontal axis and the radius drawn to a point on the spiral, T is the sampling period (the reciprocal of the sampling rate), and t is integration time, the horizontal axis of the rectangular auditory image. Both T and t are in seconds; the angle a is measured in revolutions, or circuits, of the spiral. Every time t doubles, a increases by 1, and so the integer part of a (the characteristic of the logarithm) specifies the circuit of the spiral. The fractional part of a (the mantissa of the logarithm) specifies the angle within the circuit. The Archimedean spiral is like a coil of rope; that is, the radius increases by the thickness of the rope on each successive circuit. The form of the Archimedean spiral is r = k*a = k*log2(t/T), where r is the radius from the centre of the spiral to a point on the spiral and k is a constant for scaling the display (the thickness of the rope). The logarithmic spiral has the form r = k*[2**a] = k*[2**log2(t/T)], or r = k*[t/T], where ** means exponentiation, and r and k are the radius and a different scaling constant. The logarithmic version of the spiral has the advantage that image time is linear along the path of the spiral. However, it has the disadvantage that it expands exponentially and gives unwarranted emphasis to the longer integration intervals. As a result, the default form of the spiral is Archimedean. |
| zeroline_spd | Spiral axis, or time line. Switch. Default: off. When the switch is set to "on", a spiral axis, or time line, is plotted. It is presented on the outside of the circuit, one channel below the lowest filter channel, just as in the rectangular auditory image. The default value is "off" because the spiral axis contains a large number of points and is slow to calculate and plot. Note: The size of the spiral display is scaled so that the radius associated with the current value of pwidth fits inside the rectangle specified for the window (width_win and height_win). By default, width_win and height_win are set equal, which produces a square display window. The spiral window does not have to be square, however, and rectangular windows sometimes give a useful sense of depth. |
| orig_spd | Spiral start point and spiral orientation. Default units: revolutions. Default value: 4.072 revolutions. This option determines the minimum integration interval that appears on the spiral. As a result, it determines the starting angle for the spiral and the angular orientation of structures/features that appear in the image. The option enables the user to set the orientation of the main spoke of the spiral for a given combination of sampling rate and stimulus period. Periods that are an exact power-of-2 times the base period, T, appear on the spoke proceeding horizontally from the centre of the spiral towards the right. By removing a portion of a circuit, the orientation of the spiral can be set to suit the user. A reduction in orig_spd of 0.25 will rotate the main spoke from horizontal to vertical. When a sound has a long period, like 16 ms, the structure that appears in the spiral image (that is, the activity that is aligned on spokes) falls mainly in the outer circuits of the spiral. orig_spd enables the user to omit the higher octaves that occupy the central section of the spiral, and so focus in on the relevant octave of the sound. The integer part of orig_spd determines the number of octaves omitted. Note: orig_spd can scale and rotate the spiral simultaneously because integer changes in the parameter cause a scaling without rotation and fractional changes cause rotations that are large relative to the accompanying expansion or contraction. The default value, 4.072, assigns a vertical spoke to a period of 8 ms (and its base-2 relatives) when the sampling rate is 20 kHz (or a base-2 relative). |
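The spiral formulas and the default orig_spd value can be checked numerically. The following is a minimal Python sketch, not part of AIM: the function names are hypothetical, and it assumes that orig_spd is simply subtracted from the angle a before plotting, which is one reading of the option description rather than a statement about the genspl source.

```python
import math


def spiral_angle(t, fs):
    """Angle a = log2(t/T) in revolutions, where T = 1/fs is the sampling period."""
    return math.log2(t * fs)


def radius_archimedean(a, k=1.0):
    """Archimedean form: r = k*a, so the radius grows by k on each circuit."""
    return k * a


def radius_logarithmic(a, k=1.0):
    """Logarithmic form: r = k*2**a, which equals k*(t/T)."""
    return k * 2.0 ** a


def displayed_angle(t, fs, orig_spd=4.072):
    """ASSUMPTION: orig_spd (in revolutions) is subtracted from a before plotting."""
    return spiral_angle(t, fs) - orig_spd


fs = 20000.0    # sampling rate in Hz, as in the orig_spd description
period = 0.008  # stimulus period of 8 ms

a = spiral_angle(period, fs)
shown = displayed_angle(period, fs)
circuit = math.floor(shown)   # integer part: which circuit of the spiral
fraction = shown - circuit    # fractional part: angle within the circuit

print(f"a = {a:.3f} revolutions")
print(f"after orig_spd: circuit {circuit}, fraction of a revolution {fraction:.3f}")
print(f"Archimedean radius / k = {radius_archimedean(shown):.3f}")
print(f"logarithmic radius / k (no orig_spd) = {radius_logarithmic(a):.1f}")
```

Under that assumption, the fractional part for an 8-ms period at 20 kHz comes out very close to 0.25, a quarter of a revolution, which agrees with the vertical spoke described for the default value of 4.072.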
There are also two existing display options which have related but somewhat different functions that are worth specific mention. They are bottom and pensize.
| bottom | Threshold value for the production of a dot in the spiral auditory image. Units: auditory image strength. Default value: 25 image units. This threshold specifies the value that a pulse in the auditory image must reach, or exceed, in order for it to be presented as a dot in the spiral image. |
| pensize | The size of the dots on the spiral. Default units: pixels. Default value: 2 pixels. The dots on the spiral are actually small squares and the value of pensize determines the number of pixels along the side of the square. This same option controls line width in the rectangular displays. Spiral plots are printed in the same way as other displays, that is, by setting postscript=on and routing them to a printer. Note, however, that printers have greater resolution than computer screens and the default mapping of a 2-pixel square from screen to printer may be a bit large on a small spiral plot. The full resolution of the printer can be accessed by setting pensize=1 and varying the SilentOption figlinewidth. A good combination for most printers is pensize=1 figlinewidth=0.25. (This value, 0.25, is the default for figlinewidth.) |
EXAMPLES
In order to understand the spiral mapping, generate a rectangular auditory image of the first note in cegc; that is,
> gensai input=cegc leng=96 top=4000 x0_win=10
or
> gensai input=cegc_br leng=96 top=4000 x0_win=10
For convenience, this note is referred to as C3 although, with a period of 8 ms, it is actually closer to B2. Now imagine the rectangular pulse ribbon that would be formed by replacing each SAI pulse in this rectangular image with a dot. The spiral auditory image is a mapped version of this rectangular pulse ribbon produced by compressing the pulse ribbon vertically by a factor of about 10, stretching it horizontally by a factor of about 2, and then wrapping it counterclockwise into a spiral, with the right-hand edge of the rectangular pulse ribbon at the centre of the spiral and the left-hand edge of the pulse ribbon at the end of the outer circuit. The spiral version of this same auditory image is generated by
> genspl input=cegc leng=96 x0_win=560
or
> genspl input=cegc_br leng=96 x0_win=560
If the spiral version is generated from a separate xterm, the rectangular and spiral images can be presented side by side for comparison.
The dots from vertical columns of pulses in the rectangular auditory image merge into short bars in the spiral version of the image because of the vertical compression. The bars fall along spokes radiating from the centre of the spiral. The dots from the arches of pulses on either side of the vertical column in the rectangular auditory image appear in a stretched form like "wings" in the spiral auditory image. In the case of C3, three of the bars are aligned on one spoke of the spiral (the vertical spoke); they represent the strong correlations that occur in the auditory image for sections of the NAP separated by 1, 2, and 4 cycles. So, much of the information about periodicity that is distributed across the temporal dimension in the rectangular auditory image is gathered together onto a single spatial vector in the spiral image. The bar associated with the third cycle in the auditory image appears on a secondary spoke that proceeds downwards and a little to the right of vertical in the image.
It is convenient to describe the angle of the spokes on the spiral auditory image in terms of minutes on an analogue clock, since the orientation and scale of the analogue clock are more widely used than degrees, radians, or musical cents, and the twelve numbers on the clock occur at intervals that correspond on the spiral to the notes of the chromatic, equal-temperament scale. For example, the two spokes delineated by C3 are at 0 and 25 minutes past the hour. A musical interval of 100 cents is 5 minutes on the clock and so, if the vertical spoke is C, the 25-minute spoke corresponds to F.
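The clock-face description can be reproduced with a little arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it assumes that the clock position is the fractional part of the base-2 logarithm scaled to 60 minutes, with longer time intervals moving anticlockwise (so that a rising pitch moves clockwise).

```python
import math


def minutes_from_cents(cents):
    """Clock position of a pitch interval: 1200 cents = one revolution = 60 minutes."""
    return (cents / 20.0) % 60.0


def minutes_from_period_ratio(ratio):
    """Clock position of a time interval that is `ratio` times the base period.

    ASSUMPTION: longer intervals (lower pitches) rotate anticlockwise, so the
    base-2 logarithm enters with a minus sign before wrapping onto the clock face.
    """
    return (-math.log2(ratio) * 60.0) % 60.0


# Bars produced by intervals of 1, 2, 3 and 4 cycles of the note.
for cycles in (1, 2, 3, 4):
    print(cycles, "cycle(s):", round(minutes_from_period_ratio(cycles), 1), "minutes")

# A 500-cent interval (a perfect fourth, C to F in equal temperament).
print("500 cents:", minutes_from_cents(500), "minutes")
```

With these assumptions the intervals of 1, 2, and 4 cycles all fall at 0 minutes, and the 3-cycle interval falls a little under 25 minutes, which is why the secondary spoke lies close to the equal-temperament position of F.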
A pitch glide in the spiral auditory image
The spiral auditory image, like its rectangular counterpart, is not limited to periodic sounds. When the pitch of a sound glides smoothly from one note to another the pattern on the spiral auditory image rotates smoothly from one position to another, and when the pitch changes abruptly from one note to another, the spiral pattern dissolves at the end of the first note and forms again in a different orientation at the start of the next note. There are glides between the notes of cegc and they illustrate the dynamic behaviour of the spiral auditory image. A spiral cartoon of cegc can be generated with
> genspl input=cegc bitmap=on
or
> genspl input=cegc_br bitmap=on
and it can be reviewed using
> review cegc or > review cegc_br
> xreview cegc or > xreview cegc_br
As the sound in cegc proceeds from C3 to E3, G3 and C4, the primary spoke of the pattern on the spiral rotates clockwise from 0 to 20, 35, and 60 minutes, completing one revolution as the pitch rises an octave. Note, however, that each of the spokes has been extended by one circuit towards the centre of the spiral and there are now eight bars rather than four, occupying four spokes rather than two. Thus, in AIM, octaves are perceived to be similar because they produce spoke patterns with the same orientation on the spiral auditory image. Note also that when the note is G3, the secondary spoke of the spiral pattern is in the vertical position (C). The interaction of spokes on the spiral is extended into a theory of musical consonance in Patterson (1986, 1987a). It is shown that the notes of the major triad are those that have a spoke that coincides with the main spoke of the tonic, and the notes of the minor triad are mirror images of those of the major triad.
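The spoke positions quoted for the four notes follow directly from the 5-minutes-per-semitone rule. The short sketch below is a worked check, not AIM code: the names are hypothetical, and it assumes that the secondary spoke stays a fixed 25 minutes clockwise of the primary spoke as the whole pattern rotates.

```python
# Semitone offsets of the notes of cegc above the first note (equal temperament).
SEMITONES_ABOVE_C3 = {"C3": 0, "E3": 4, "G3": 7, "C4": 12}

MINUTES_PER_SEMITONE = 5  # 100 cents corresponds to 5 minutes on the clock
SECONDARY_OFFSET = 25     # ASSUMPTION: secondary spoke is 25 minutes past the primary


def primary_minutes(note):
    """Clock position of the primary spoke, counted clockwise from vertical."""
    return SEMITONES_ABOVE_C3[note] * MINUTES_PER_SEMITONE


def secondary_minutes(note):
    """Clock position of the secondary spoke, wrapped onto the 60-minute face."""
    return (primary_minutes(note) + SECONDARY_OFFSET) % 60


for note in ("C3", "E3", "G3", "C4"):
    print(note, "primary:", primary_minutes(note), "secondary:", secondary_minutes(note))
```

For G3 the secondary spoke wraps round to 0 minutes, the vertical position occupied by the primary spoke of C, which is the coincidence of spokes that the consonance argument relies on.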
REFERENCES
Allerhand, M., Patterson, R.D., Robinson, K., and Rice, P. (1991). Spiral VOS Interim Report V: Application of the SVOS algorithm. APU Contract Report. (Appendix contains mathematics of spirals.)
Patterson, R.D. (1986). "Spiral detection of periodicity and the spiral form of musical scales," Psychology of Music 14, 44-61.
Patterson, R.D. and Nimmo-Smith, I. (1986). "Thinning periodicity detectors for modulated pulse streams," In B.C.J. Moore and R.D. Patterson (Eds.), Auditory Frequency Selectivity (NATO ASI Series A: Life Sciences, Vol. 19). New York: Plenum, 299-307.
Patterson, R.D. (1987a). "A pulse ribbon model of peripheral auditory processing," In William A. Yost and Charles S. Watson (Eds.), Auditory Processing of Complex Sounds. Hillsdale, N.J.: Erlbaum, 167-169.
Patterson, R.D. (1987b). "A pulse ribbon model of monaural phase perception," J. Acoust. Soc. Am. 82, 1560-1586.
Patterson, R.D. (1988). "Timbre cues in monaural phase perception: Distinguishing within-channel cues and between-channel cues," In H. Duifhuis, J.W. Horst and H.P. (Eds.), Basic Issues in Hearing. Proceedings of the 8th International Symposium on Hearing. London: Academic Press, 351-358.
Patterson, R.D. (1990). "The tone height of multi-harmonic sounds," Music Perception 8, 203-214.
Patterson, R.D., Allerhand, M., Holdsworth, J., and Rice, P. (1990). Spiral VOS Interim Report IV: Optimisation of the SVOS algorithm. APU Contract Report. (Appendices contain mathematics of spirals.)
Patterson, R.D., Milroy, R. and Allerhand, M. (1993b). "What is the octave of a harmonically rich note?" Contemporary Music Review Vol. 9, Harwood, Switzerland, 69-81.
Robinson, K.L. and Patterson, R.D. (1995a). "The duration required to identify the instrument, the octave, or the pitch-chroma of a musical note," Music Perception 13, (in press).
Robinson, K.L. and Patterson, R.D. (1995b). "The stimulus duration required to identify vowels, their octave, and their pitch-chroma," J. Acoust. Soc. Am. 98, (in press).
FILES
.gensplrc The options file for genspl.
SEE ALSO
gensai
COPYRIGHT
Copyright (c) Applied Psychology Unit, Medical Research Council, 1995
Permission to use, copy, modify, and distribute this software without fee is hereby granted for research purposes, provided that this copyright notice appears in all copies and in all supporting documentation, and that the software is not redistributed for any fee (except for a nominal shipping charge). Anyone wanting to incorporate all or part of this software in a commercial product must obtain a license from the Medical Research Council.
The MRC makes no representations about the suitability of this software for any purpose. It is provided "as is" without express or implied warranty.
THE MRC DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS, IN NO EVENT SHALL THE A.P.U. BE LIABLE FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
ACKNOWLEDGEMENTS
The AIM software was developed for Unix workstations by John Holdsworth and Mike Allerhand of the MRC APU, under the direction of Roy Patterson. The physiological version of AIM was developed by Christian Giguere. The options handler is by Paul Manson. The revised SAI module is by Jay Datta. Michael Akeroyd extended the postscript facilities and developed the xreview routine for auditory image cartoons.
The project was supported by the MRC and grants from the U.K. Defense Research Agency, Farnborough (Research Contract 2239); the EEC Esprit BR Programme, Project ACTS (3207); and the U.K. Hearing Research Trust.
| SunOS 5.6 | GENSPL (1) | 29 August 1995 |