Manual Page - genspl(1)


Manual Reference Pages  - GENSPL (1)

NAME

genspl - generate a spiral auditory image


CONTENTS

Synopsis/syntax
Description
Options
Examples
References
Bugs
Files
See Also
Copyright
Acknowledgements

SYNOPSIS/SYNTAX

genspl [ option=value | -option ] filename

DESCRIPTION

The spiral auditory image is an alternative display of the auditory
image which emphasises the pitch of a sound and de-emphasises the
timbre, or sound quality. That is, it emphasises the global temporal
structure that arises in the NAPs of periodic sounds and focuses
attention on the scale of the structure. It de-emphasises the
fine-structure of the time-interval patterns in the auditory image and
the shape of all but the largest features (Patterson, 1986, 1987b).
Since the underlying information is the same for the spiral and
rectangular auditory images, the options that control the auditory
image itself are the same for genspl as they are for gensai. The
options are those with the suffix _ai and they have the same defaults
in both cases. The options are napdecay_ai, stdecay_ai, stcrit_ai,
stlag_ai, decay_ai, and stinfo_ai; they are described in the manual
entry for gensai.

With respect to displays, it is difficult to present the functions in
the individual channels of the auditory image in a spiral form; the
fine detail of the functions gets lost in the spiral perspective.
Accordingly, in the spiral perspective each of the separate SAI pulses
is replaced by a dot positioned at the time of the peak of the SAI
pulse. [Previously, this representation was referred to as a pulse
ribbon (Patterson, 1986, 1987b).] Conceptually, the spiral auditory
image is a set of concentric spirals, one for each channel of the
auditory image. The channel with the highest centre frequency is on
the inside with the smallest radius; the channel with the lowest
centre frequency is on the outside with the largest radius. The
spiral lines are omitted for clarity, leaving just the dots. Sets of
dots in adjacent channels often form bars aligned along spokes
emanating from the centre of the spiral. These bars show that the
same period exists in a range of adjacent channels in the NAP.
Furthermore, this information about correlation across channels
appears on the same spoke as the information indicating that the
pattern repeats in time. Thus the multi-channel spiral maps both
spectral and temporal information concerning the periodicity of the
sound onto a single spatial vector -- a spoke of the spiral. It is
this property that enables the spiral representation to explain octave
perception (Patterson, 1986, 1990; Patterson et al, 1993b; Robinson
and Patterson, 1995a,b).

OPTIONS

The input and output options for the spiral auditory image are the
same as for the rectangular auditory image. Furthermore, the spiral
auditory image of an extended sound is a cartoon just like the linear
auditory image, so spiral images of dynamic sounds can be generated,
stored, animated, and reviewed in the same way as with the
rectangular auditory image (gensai).

    I. DISPLAY OPTIONS FOR THE SPIRAL AUDITORY IMAGE

The options that control the position and size of the spiral image
window on the screen are the same as for all previous windows. The
options that control the timing of the frames and the portion of the
auditory image presented in the display (frstep_aid, pwidth_aid,
and nwidth_aid) are the same as for gensai. Note, however, that
nwidth_aid is disabled (ignored) because logarithms are not defined
for non-positive numbers. This means that the spiral auditory image
suppresses most of the information in transients unless they occur
periodically with a period less than 35 ms. The minimum observable
time-interval is controlled by a new option, orig_spd.

There are three new display options for the spiral view of the
auditory image: form_spd, zeroline_spd, and orig_spd. The mathematics
of spirals as they pertain to the spiral auditory image are set out in
Patterson and Nimmo-Smith (1986), Patterson et al. (1990) and
Allerhand et al. (1991).

form_spd The form of the spiral time line (Archimedean or logarithmic)

Switch: Default, archimedian.

The software offers two visual representations of the logarithmic
spiral, both of which have the base 2. Both representations gather
doublings in time onto a single spoke of the spiral, and so both have
the general property that

        a = log2(t/T)

where a is the angle between the horizontal axis and the radius drawn
to a point on the spiral, T is the sampling period (the reciprocal of
the sampling rate), and t is 'integration time' -- the horizontal axis
of the rectangular auditory image. Both T and t are in seconds; the
angle a is measured in revolutions, or circuits, of the spiral. Every
time t doubles, a increases by 1, and so the integer part of a (the
characteristic of the logarithm) specifies the circuit of the spiral.
The fractional part of a (the mantissa of the logarithm) specifies the
angle within the circuit.
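
As a rough illustration of this relation, the following Python sketch
splits the angle a into its circuit and its within-circuit part. It is
not part of AIM: the function name spiral_angle and the 20 kHz sampling
rate are assumptions made only for the example.

```python
import math

def spiral_angle(t, T):
    """Return (a, circuit, fraction) for integration time t and
    sampling period T, both in seconds (hypothetical helper)."""
    if t <= 0 or T <= 0:
        # log2 is undefined for non-positive arguments
        raise ValueError("t and T must be positive")
    a = math.log2(t / T)       # angle in revolutions
    circuit = math.floor(a)    # integer part: which circuit of the spiral
    fraction = a - circuit     # fractional part: angle within the circuit
    return a, circuit, fraction

T = 1 / 20000                  # assumed sampling rate of 20 kHz

a1, c1, f1 = spiral_angle(0.008, T)   # an 8 ms interval
a2, c2, f2 = spiral_angle(0.016, T)   # a 16 ms interval (one doubling)

print(c1, round(f1, 4))
print(c2, round(f2, 4))
```

Doubling the interval moves the point out by one circuit and leaves the
within-circuit angle unchanged, so both intervals fall on the same
spoke.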

The Archimedean spiral is like a coil of rope; that is, the radius
increases by the thickness of the rope on each successive
circuit. The form of the Archimedean spiral is

        r = k*a = k*log2(t/T)

where r is the radius from the centre of the spiral to a point on the
spiral and k is a constant for scaling the display (the thickness of
the rope).

The logarithmic spiral has the form

        r = k*[2**a] = k*[2**log2(t/T)] or

        r = k*[t/T]

where ** means exponentiation, and r and k are the radius and a
different scaling constant. The logarithmic version of the spiral has
the advantage that image time is linear along the path of the spiral.
However, it has the disadvantage that it expands exponentially and
gives unwarranted emphasis to the longer integration intervals. As a
result, the default form of the spiral is archimedian.
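
The difference between the two forms can be seen by evaluating both
radius formulas for an interval and its double. This is a minimal
sketch, not AIM code: the function names, k = 1, and the 20 kHz
sampling rate are assumptions chosen for illustration.

```python
import math

def radius_archimedean(t, T, k=1.0):
    """r = k * log2(t/T): the radius grows by k on each circuit."""
    return k * math.log2(t / T)

def radius_logarithmic(t, T, k=1.0):
    """r = k * (t/T): the radius doubles on each circuit."""
    return k * (t / T)

T = 1 / 20000                      # assumed sampling period, in seconds

for t in (0.004, 0.008, 0.016):    # successive doublings of t
    print(t, radius_archimedean(t, T), radius_logarithmic(t, T))
```

Each doubling of t adds a constant k to the first radius but doubles
the second, which is why the logarithmic form devotes most of the
display area to the longest intervals.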

zeroline_spd Spiral axis, or time line

Switch: Default, off

When the switch is set to "on", a spiral axis, or time line, is
plotted. It is presented on the outside of the circuit, one channel
below the lowest filter channel, just as in the rectangular auditory
image. The default value is "off" because the spiral axis contains a
large number of points and it is slow to calculate and plot.

Note: The size of the spiral display is scaled so that the radius
associated with the current value of pwidth fits inside the rectangle
specified for the window (width_win and height_win). By default,
width_win and height_win are set equal which produces a square display
window. The spiral does not have to be in a square window, however, and
rectangular windows sometimes give a useful sense of depth.

orig_spd Spiral start point and spiral orientation

Default units: revolutions. Default value 4.072 revolutions.

This option determines the minimum integration interval that appears
on the spiral. As a result, it determines the starting angle for the
spiral and the angular orientation of structures/features that appear
in the image. The option enables the user to set the orientation of
the main spoke of the spiral for a given combination of sampling rate
and stimulus period. Periods that are an exact power-of-2 times the
base period, T, appear on the spoke proceeding horizontally from the
centre of the spiral towards the right. By removing a portion of a
circuit, the orientation of the spiral can be set to suit the user. A
reduction in orig_spd of 0.25 will rotate the main spoke from
horizontal to vertical.

When a sound has a long period, like 16 ms, the structure that appears
in the spiral image (that is, the activity that is aligned on spokes)
falls mainly in the outer circuits of the spiral. orig_spd enables
the user to omit the higher octaves that occupy the central section of
the spiral, and so focus in on the relevant octave of the sound. The
integer part of orig_spd determines the number of octaves omitted.

Note: orig_spd can scale and rotate the spiral simultaneously because
integer changes in the parameter cause a scaling without rotation and
fractional changes cause rotations that are large relative to the
accompanying expansion or contraction. The default value, 4.072,
assigns a vertical spoke to a period of 8 ms (and its base-2
relatives) when the sampling rate is 20 kHz (or a base-2 relative).
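
The default can be checked numerically. The sketch below assumes that
orig_spd is simply subtracted from a = log2(t/T) before plotting; the
manual does not state the internal computation, so this is an
interpretation, and the function name displayed_angle is hypothetical.

```python
import math

ORIG_SPD = 4.072                   # default value, in revolutions

def displayed_angle(t, T, orig_spd=ORIG_SPD):
    """Angle in revolutions after removing orig_spd circuits
    (assumed form: log2(t/T) - orig_spd)."""
    return math.log2(t / T) - orig_spd

T = 1 / 20000                      # 20 kHz sampling rate
angle = displayed_angle(0.008, T)  # period of 8 ms
fraction = angle - math.floor(angle)

# Under the same assumption, the shortest interval shown is T * 2**orig_spd
# and the integer part of orig_spd counts the octaves omitted.
min_interval = T * 2 ** ORIG_SPD
octaves_omitted = math.floor(ORIG_SPD)

print(round(fraction, 3))          # close to a quarter of a revolution
print(min_interval, octaves_omitted)
```

The fractional part comes out very close to 0.25, a quarter turn from
the horizontal, which agrees with the vertical spoke described above
for 8 ms at 20 kHz.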

There are also two existing display options which have related but
somewhat different functions that are worth specific mention. They are
bottom and pensize.

bottom Threshold value for the production of a dot in the spiral auditory image.

Units: auditory image strength. Default value, 25 image units.

This threshold specifies the value that a pulse in the auditory image
must reach, or exceed, in order for it to be presented as a dot in the
spiral image.

pensize The size of the dots on the spiral

Default units: pixels. Default value, 2 pixels.

The dots on the spiral are actually small squares and the value of
pensize determines the number of pixels along the side of the square.
This same option controls line width in the rectangular displays.

Spiral plots are printed in the same way as other displays, that is,
by setting postscript=on and routing them to a printer. Note, however,
that printers have greater resolution than computer screens and the
default mapping of a 2-pixel square from screen to printer may be a
bit large on a small spiral plot. The full resolution of the printer
can be accessed by setting pensize=1 and varying the SilentOption
figlinewidth. A good combination for most printers is pensize=1
figlinewidth=0.25. (This value, 0.25, is the default for
figlinewidth.)

EXAMPLES

In order to understand the spiral mapping, generate a rectangular
auditory image of the first note in cegc; that is,

> gensai input=cegc leng=96 top=4000 x0_win=10 or

> gensai input=cegc_br leng=96 top=4000 x0_win=10

For convenience, this note is referred to as C3 although, with a
period of 8 ms, it is actually closer to B2. Now imagine the
rectangular pulse ribbon that would be formed by replacing each SAI
pulse in this rectangular image with a dot. The spiral auditory image
is a mapped version of this rectangular pulse ribbon produced by
compressing the pulse ribbon vertically by a factor of about 10,
stretching it horizontally by a factor of about 2, and then wrapping
it counterclockwise into a spiral, with the right-hand edge of the
rectangular pulse ribbon at the centre of the spiral and the left-hand
edge of the pulse ribbon at the end of the outer circuit. The spiral
version of this same auditory image is generated by

> genspl input=cegc leng=96 x0_win=560 or

> genspl input=cegc_br leng=96 x0_win=560

If the spiral version is generated from a separate xterm, the
rectangular and spiral images can be presented side by side for
comparison.

The dots from vertical columns of pulses in the rectangular auditory
image merge into short bars in the spiral version of the image
because of the vertical compression. The bars fall along spokes
radiating from the centre of the spiral. The dots from the arches of
pulses on either side of the vertical column in the rectangular
auditory image appear in a stretched form like "wings" in the spiral
auditory image. In the case of C3, three of the bars are aligned on
one spoke of the spiral (the vertical spoke); they represent the
strong correlations that occur in the auditory image for sections of
the NAP separated by 1, 2, and 4 cycles. So, much of the information
about periodicity that is distributed across the temporal dimension in
the rectangular auditory image is gathered together onto a single
spatial vector in the spiral image. The bar associated with the third
cycle in the auditory image appears on a secondary spoke that proceeds
downwards and a little to the right of vertical in the image.

It is convenient to describe the angle of the spokes on the spiral
auditory image in terms of minutes on an analogue clock, since the
orientation and scale of the analogue clock are more widely used than
degrees, radians, or musical cents, and the twelve numbers on the
clock occur at intervals that correspond on the spiral to the notes of
the chromatic, equal-tempered scale. For example, the two spokes
delineated by C3 are at 0 and 25 minutes past the hour. A musical
interval of 100 cents is 5 minutes on the clock and so, if the
vertical spoke is C, the 25-minute spoke corresponds to F.
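
The clock positions quoted here follow from the base-2 mapping. The
sketch below is an illustration only: the function names are
hypothetical, and the sign convention (longer intervals turn
counterclockwise from the one-period bar at 0 minutes) is inferred
from the 25-minute figure given above.

```python
import math

def spoke_minutes(cycles):
    """Clock position, in minutes past the hour, of the bar for an
    interval of `cycles` periods; the one-period bar sits at 0."""
    return (-60.0 * math.log2(cycles)) % 60.0

def cents_to_minutes(cents):
    """100 cents correspond to 5 minutes on the clock face."""
    return cents / 20.0

for n in (1, 2, 3, 4):
    print(n, round(spoke_minutes(n), 1))

print(cents_to_minutes(500))       # interval from C up to F
```

Intervals of 1, 2, and 4 cycles share the spoke at 0 minutes, while
the 3-cycle bar lands within a fraction of a minute of the 25-minute
mark, the position of F when the vertical spoke is C.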

    A pitch glide in the spiral auditory image

The spiral auditory image, like its rectangular counterpart, is
not limited to periodic sounds. When the pitch of a sound glides
smoothly from one note to another the pattern on the spiral auditory
image rotates smoothly from one position to another, and when the
pitch changes abruptly from one note to another, the spiral pattern
dissolves at the end of the first note and forms again in a different
orientation at the start of the next note. There are glides between
the notes of cegc and they illustrate the dynamic behaviour of the
spiral auditory image. A spiral cartoon of cegc can be generated with

> genspl input=cegc bitmap=on   or

> genspl input=cegc_br bitmap=on

and it can be reviewed using

> review cegc   or      > review cegc_br

> xreview cegc  or      > xreview cegc_br

As the sound in cegc proceeds from C3 to E3, G3 and C4, the primary
spoke of the pattern on the spiral rotates clockwise from 0 to 20, 35,
and 60 minutes, completing one revolution as the pitch rises an
octave. Note, however, that each of the spokes has been extended by
one circuit towards the centre of the spiral and there are now eight
bars rather than four, occupying four spokes rather than two. Thus,
in AIM, octaves are perceived to be similar because they produce spoke
patterns with the same orientation on the spiral auditory image. Note
also, that when the note is G3, the secondary spoke of the spiral
pattern is in the vertical position (C). The interaction of spokes on
the spiral is extended into a theory of musical consonance in
Patterson (1986, 1987a). It is shown that the notes of the major
triad are those that have a spoke that coincides with the main spoke
of the tonic, and the notes of the minor triad are mirror images of
those of the major triad.
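
The rotation figures can be reproduced from the equal-tempered
semitone steps of the four notes. This is a sketch under stated
assumptions: 5 minutes per semitone as described earlier, nominal
equal-tempered intervals above C3 rather than values measured from
cegc, and a 3-cycle secondary spoke offset of about 25 minutes.

```python
import math

# Nominal equal-tempered semitone steps above C3 (assumed, not measured).
SEMITONES_ABOVE_C3 = {"C3": 0, "E3": 4, "G3": 7, "C4": 12}
MINUTES_PER_SEMITONE = 5.0         # 100 cents = 5 minutes

# Clockwise rotation of the primary spoke for each note.
primary = {note: MINUTES_PER_SEMITONE * n
           for note, n in SEMITONES_ABOVE_C3.items()}

# Offset of the secondary (3-cycle) spoke from the primary spoke.
three_cycle_offset = (-60.0 * math.log2(3)) % 60.0

# Position of the secondary spoke of G3 and its distance from vertical.
g3_secondary = (primary["G3"] + three_cycle_offset) % 60.0
off_vertical = min(g3_secondary, 60.0 - g3_secondary)

print(primary)
print(round(off_vertical, 2))
```

The primary spoke moves through 0, 20, 35, and 60 minutes, and the
secondary spoke of G3 falls within a small fraction of a minute of the
vertical (C) position.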

REFERENCES

Allerhand, M., Patterson, R.D., Robinson, K., and
Rice, P. (1991). Spiral VOS Interim Report V: Application of the SVOS
algorithm. APU Contract Report. (Appendix contains mathematics of
spirals).
Patterson, R.D. (1986). "Spiral detection of periodicity
and the spiral form of musical scales," Psychology of Music 14,
44-61.
Patterson, R.D. and Nimmo-Smith, I. (1986).
"Thinning periodicity detectors for modulated pulse streams," In
B.C.J. Moore and R.D. Patterson (Eds.) Auditory Frequency Selectivity
(NATO ASI Series A: Life Sciences, Vol. 19), New York:Plenum, 299-307.
Patterson, R.D. (1987a). "A pulse ribbon model of
peripheral auditory processing," In William A. Yost and Charles S.
Watson, (Eds.) Auditory Processing of Complex Sounds. Hillsdale,
N.J., Erlbaum, 167-169.
Patterson, R.D. (1987b). "A pulse ribbon model of
monaural phase perception," J. Acoust. Soc. Am. 82, 1560-1586.
Patterson, R.D. (1988). "Timbre cues in
monaural phase perception: Distinguishing within-channel cues and
between-channel cues," In H. Duifhuis, J.W. Horst and H.P. (Eds.),
Basic Issues in Hearing. Proceedings of the 8th International
Symposium on Hearing. London: Academic Press, 351-358.
Patterson, R.D. (1990) "The tone height of
multi-harmonic sounds," Music Perception 8, 203-214.
Patterson, R.D., Allerhand, M., Holdsworth, J., and
Rice, P. (1990) Spiral VOS interim report IV: Optimisation of the SVOS
algorithm. APU contract report. (Appendices contain mathematics of
spirals).
Patterson, R.D., Milroy, R. and Allerhand, M. (1993b).
"What is the octave of a harmonically rich note?" Contemporary Music
Review Vol. 9, Harwood, Switzerland, 69-81.
Robinson, K.L. and Patterson, R.D. (1995a)
"The duration required to identify the instrument, the octave, or the
pitch-chroma of a musical note," Music Perception 13, (in press).
Robinson, K.L. and Patterson, R.D. (1995b)
"The stimulus duration required to identify vowels, their octave, and
their pitch-chroma," J. Acoust. Soc. Am. 98, (in press).

BUGS

FILES

.gensplrc The options file for genspl.

SEE ALSO

gensai

COPYRIGHT

Copyright (c) Applied Psychology Unit, Medical Research Council, 1995

Permission to use, copy, modify, and distribute this software without fee
is hereby granted for research purposes, provided that this copyright
notice appears in all copies and in all supporting documentation, and that
the software is not redistributed for any fee (except for a nominal
shipping charge). Anyone wanting to incorporate all or part of this
software in a commercial product must obtain a license from the Medical
Research Council.

The MRC makes no representations about the suitability of this
software for any purpose. It is provided "as is" without express or
implied warranty.

THE MRC DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE, INCLUDING
ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS, IN NO EVENT SHALL
THE A.P.U. BE LIABLE FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES
OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS,
WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION,
ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS
SOFTWARE.

ACKNOWLEDGEMENTS

The AIM software was developed for Unix workstations by John
Holdsworth and Mike Allerhand of the MRC APU, under the direction of
Roy Patterson. The physiological version of AIM was developed by
Christian Giguere. The options handler is by Paul Manson. The revised
SAI module is by Jay Datta. Michael Akeroyd extended the postscript
facilities and developed the xreview routine for auditory image
cartoons.

The project was supported by the MRC and grants from the U.K. Defence
Research Agency, Farnborough (Research Contract 2239); the EEC Esprit
BR Programme, Project ACTS (3207); and the U.K. Hearing Research Trust.


SunOS 5.6 GENSPL (1) 29 August 1995
