Quite a few years ago I wrote an overview article on the use of sound for representing geographic data, including a series of sound variables for mapping I developed. The article was titled “Sound and Geographic Visualization” and was published as a chapter in the now out-of-print book Visualization in Modern Cartography (MacEachren & Taylor eds., 1994).
Sound is used to convey information all the time, but less so in the realm of mapping where the visual dominates. The article explores the possibilities of making maps with sound, or using sound in tandem with a visual display to add additional layers of information.
Some work on tactile mapping had occurred at the time the article was published, as well as a few dozen articles on sound for representing data in general (not geographic data). Since then, research on multi-sensory mapping has expanded, but not as much as I expected. We still can’t hear data with Google Earth.
For an updated bibliography of related work, see the articles and books that cite “Sound and Geographic Visualization” at Google Scholar.
The article is below as originally published. It holds up ok, although technology has changed quite a bit.
Denis Elder emailed me (Feb 6, 2012) and asked about the “manuscript videotape” cited in the paper below. The video was made to accompany my 1993 Association of American Geographers (AAG) conference presentation on using sound with maps. Back then, showing the examples (which were created with the software Director on a Mac) live at the conference would not have been easy, so I made a video of the maps being used (and making sounds). This presentation was an early form of the work that would be published as “Sound and Geographic Visualization.”
I managed to find the video and had our media center create a digital version (in Quicktime / .mov format).
The video and the notes for the presentation (“Mapping with Sound”) are below. This is old stuff, so don’t laugh!
“Mapping with Sound.” (PDF) Presented at the 1993 Association of American Geographers Conference, Atlanta, Georgia.
“Mapping with Sound.” (Quicktime Movie, 9 minutes, 84mb) to accompany paper. Explanation of this video is in the above PDF of the paper presented at the conference.
Sound and Geographic Visualization
“Who the hell wants to hear actors talk?”
Harry Warner on being confronted with the prospect of the sound movie.
The issue of sound in the context of visualization may at first seem incongruous. There is, however, evidence to support the claim that sound is a viable means of representing and communicating information and can serve as a valuable addition to visual displays. Abstracted two-dimensional space and the visual variables – the traditional purview of cartography – may not always be adequate for meeting the visualization needs of geographers and other researchers interested in complex dynamic and multivariate phenomena. The current generation of computer hardware and software gives cartographers access to a broadened range of design options: three-dimensionality, time (animation), interactivity, and sound. Sound – used alone or in tandem with two- or three-dimensional abstract space, the visual variables, time, and interactivity – provides a means of expanding the representational repertoire of cartography and visualization.
This chapter discusses the use of realistic and abstract sound for geographic visualization applications. Examples of how and why sound may be useful are developed and discussed. Uses of sound in geographic visualization include sound as vocal narration, as a mimetic symbol, as a redundant variable, as a means of detecting anomalies, as a means of reducing visual distraction, as a cue to reordered data, as an alternative to visual patterns, as an alarm or monitor, as a means of adding non-visual data dimensions to interactive visual displays, and for representing locations in a sound space. The chapter concludes with research issues concerning sound and its use in geographic visualization.
Experiencing and Using Sound to Represent Data
Our sense of vision often seems much more dominant than our sense of hearing. Yet one only has to think about the everyday environment of sound surrounding us to realize that the sonic aspects of space have been undervalued in comparison to the visual (Ackerman 1990, Tuan 1993). Consider the experience of the visually impaired to appreciate the importance of sound and how it aids in understanding our environment. Also consider that human communication is primarily carried out via speech and that we commonly use audio cues in our day-to-day lives – from the honk of a car horn to the beep of a computer to the snarl of an angry dog as we approach it in the dark (Baecker and Buxton 1987).
There are several perspectives which can contribute to understanding the use of sound for representing data. Acoustic and psychological perspectives provide insights into the physiological and perceptual possibilities of hearing (Truax 1984, Handel 1989). An environmental or geographical perspective on sound can be used to examine our day-to-day experience with sound and to explore how such experiential sound can be applied to geographic visualization (Ohlson 1976, Schafer 1977, Schafer 1985, Porteous and Mastin 1985, Gaver 1988, Pocock 1989). Understanding how sound and music are used in non-western cultures may inform our understanding of communication with sound (Herzog 1945, Cowan 1948). Knowledge about music composition and perception provides a valuable perspective on the design and implementation of complicated, multivariate sound displays (Deutsch 1982). Many of these different perspectives have coalesced in the cross-disciplinary study of sound as a means of data representation, referred to as sonification, acoustic visualization, auditory display, and auditory data representation (Frysinger 1990). Within this context both realistic and abstract uses of sound are considered.
Using Realistic Sounds
Vocal narration is an obvious and important use of realistic sound. (note 2) Details about the physiological, perceptual, and cognitive aspects of speech are well known (Truax 1984, Handel 1989) and film studies offer insights into the nature and application of vocal narration (Stam, Burgoyne, and Flitterman-Lewis 1992).
Another use of realistic sounds is as mimetic sound icons, or “earcons” (Gaver 1986, Gaver 1988, Gaver 1989, Blattner et al. 1989, Mountfort and Gaver 1990). Earcons are sounds which resemble experiential sound. Gaver, for example, has developed an interface addition for the Macintosh computer which uses earcons. An example of an earcon is a “thunk” sound when a document is successfully dragged into the trash can in the computer interface.
Using Abstract Sounds
Abstract sounds can be used as cues to alert or direct the attention of users or can be mapped to actual data. Early experiments by Pollack and Ficks (1954) were successful in revealing the ability of sound to represent multivariate data. Yeung (1980) investigated sound as a means of representing the multivariate data common in chemistry after finding few graphic methods suitable for displaying his data. He designed an experiment in which seven chemical variables were matched with seven variables of sound: two with pitch, one each with loudness, damping, direction, duration, and rest (silence between sounds). His test subjects (professional chemists) were able to understand the different patterns of the sound representations and correctly classify the chemicals with a 90% accuracy rate before training and a 98% accuracy rate after training. Yeung’s study is important in that it reveals how motivated expert users can easily adapt to complex sonic displays.
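Yeung’s design can be sketched as a simple data-to-sound-parameter mapping. The function below follows the spirit of his scheme (two pitches, plus loudness, damping, direction, duration, and rest); all numeric ranges are illustrative stand-ins, not Yeung’s actual settings.

```python
def record_to_sound(values):
    """Map a record of seven normalized values (0.0-1.0) to seven sound
    parameters, after Yeung (1980). Ranges are illustrative, not Yeung's."""
    v1, v2, v3, v4, v5, v6, v7 = values
    return {
        "pitch_1_hz":  220.0 * 2 ** (v1 * 2),  # two octaves starting at A3
        "pitch_2_hz":  220.0 * 2 ** (v2 * 2),
        "loudness_db": 40.0 + v3 * 30.0,       # 40-70 dB
        "damping":     v4,                     # 0 = sustained, 1 = sharply damped
        "direction":   v5 * 2 - 1,             # -1 = full left, +1 = full right
        "duration_s":  0.2 + v6 * 0.8,         # 0.2-1.0 s tone
        "rest_s":      0.1 + v7 * 0.4,         # silence before the next tone
    }
```

Each database record then becomes one distinctive multidimensional sound, which is what allowed Yeung’s chemists to classify samples by ear.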
Bly ran three discriminant analysis experiments using sound and graphics to represent multivariate, time-varying, and logarithmic data (Bly 1982a). In the first experiment she presented subjects with two sets of multivariate data represented with different variables of sound (pitch, volume, duration, attack, waveshape, and two harmonics) and asked subjects to classify a third, unknown set of data as being similar to either the first or second original data set. The test subjects were able to successfully classify the sound sets. In a second part of the experiment she tested three groups in a similar manner but compared the relative accuracy of classification among sound presentation only (64.5%), graphic presentation only (62%), and a combination of sound and graphic presentation (69%). She concluded that sound is a viable means of representing multivariate, time-varying, and logarithmic data – especially in tandem with graphic displays.
Mezrich, Frysinger, and Slivjanovski confronted the problem of representing multi-variable, time-series data by looking to sound and dynamic graphics (Mezrich et al. 1984). They had little success finding the graphic means to deal with eight-variable time series data. An experiment was performed where subjects were presented with separated static graphs, static graphs stacked atop each other (small multiples), overlaid static graphs, and redundant dynamic visual and sound (pitch) graphs. The combination of dynamic visual and sound representation was found to be the most successful of the four methods.
An ongoing project at the University of Massachusetts at Lowell seeks to expand the use of sound for representing multivariate and multidimensional data. The “Exvis” project uses a one-, two-, and three-dimensional sound space to represent data (Smith and Williams 1989, Smith et al. 1990, Williams et al. 1990, Smith et al. 1991). The project is based upon the idea of an icon: “an auditory and graphical unit that represents one record of a database” (Williams et al. 1990, 44). The visual attributes of the icon are “stick-figures” which can vary in “length, width, angle, and color” (Williams et al. 1990, 45). The sonic attributes of the icons are “pitch, attack rate, decay rate, volume, and depth of frequency modulation” (Williams et al. 1990, 45). An experimental Exvis workstation has been set up to run various human factors experiments, and initial tests of subjects have been completed. The results reveal that using visual and sonic textures together improves performance.
Two-dimensional sound displays, which locate sounds up/down and right/left via stereo technology, and three-dimensional sound displays, which add front/back to two-dimensional displays, are also being developed. A three-dimensional virtual sound environment has been developed at the NASA-Ames Research Center (Wenzel et al. 1988a, Wenzel et al. 1988b, Wenzel et al. 1990). The ability to locate sound in a multidimensional “sound space” will undoubtedly be important for representing spatial relationships.
Almost all of the above studies and applications which use abstract sound to represent data rely upon a set of basic and distinct elements of sound – pitch, loudness, timbre, etc. These abstract elements can be called “sound variables” (figure 1). Most of these abstract sound variables naturally represent nominal and ordinal levels of measurement. (note 3) As such, a “variables” approach, analogous to that developed by Bertin (1983) for visual variables, can serve as a useful heuristic for incorporating sound in geographic visualization displays. This approach can be contrasted with one based on music theory and composition (Weber 1993a, Weber and Yuan 1993). Visual map symbolization and design have been approached from many different perspectives – psychophysics, cognitive psychology, Arnheim’s art theory, and Bertin’s semiotics to name a few – and all have added to our knowledge of cartographic design. The same multiplicity of approaches will undoubtedly underpin our approaches to the use of sound.
Using Abstract Sounds in Geographic Visualization: The Sound Variables
The following discussion reviews a basic set of abstract sound variables – not a complete taxonomy – which are viable for geographic visualization applications. This set of abstract sound variables can be used in tandem with voice narration and mimetic earcons as discussed above. The term “variable” is used loosely and does not imply that the elements of sound are wholly separable from each other. Abstracted elements of sound, like those of vision, interact and affect each other (Lunney and Morrison 1990, Kramer and Ellison 1992). However, abstract sound variables, as with the visual variables, serve to clarify initial design choices and can serve as a viable starting point for incorporating sound into visual displays.
Data display applications using realistic and abstract sounds require a temporal dimension. This is in part due to the need to compare different sounds in order to glean information from the sounds. For example, the use of relative pitch – comparison with other pitches – is a key factor in using pitch to represent data (Kramer 1992). A tone of a certain pitch heard alone means less than when that same tone is heard in comparison to an array of varying pitches. In addition, a temporal dimension is required for certain variables of sound which must vary in some way over time for their character to be identified. Duration, for example, only exists when there is some beginning and end of a sound over time.
The Abstract Sound Variables
Location: the location of a sound in a two- or three-dimensional sound space. Location is analogous to location in the two-dimensional plane of the map. As a sound variable, location requires stereo or three-dimensional sound displays. Two- and three-dimensional sound allows for the mapping of left/right, up/down, and (in 3-D) forward/backward locations. Location can represent nominal and ordinal data. For example, a two-dimensional stereo sound map could use location to direct attention to a specific area of the graphic map display where the fastest change is occurring in a spatial data set over time.
Loudness: the magnitude of a sound. Loudness is measured in decibels and implies an ordinal difference. The average human can just detect a one-decibel sound, can detect differences in loudness of about three decibels, and can tolerate up to approximately 100 decibels (the loudness of a jet taking off). We would like to avoid 100-decibel maps. Loudness is inherently ordered and thus seems appropriate for representing ordinal level data. Loudness may be used to imply direction and can be varied over time to represent ordinal change in data over time (e.g., to alert one to important but infrequently occurring phenomena). It is known that humans usually become unconscious of constant sounds (Buxton 1990, 125). For example, although the hum of a computer’s fan becomes inaudible soon after switching it on, even a slight variation in the fan will be instantly noticed. This effect can be used to represent information where a quiet tone represents a steady state and any variation represents change.
Pitch: the highness or lowness (frequency) of a sound. Pitch is highly distinguishable and is one of the most effective ways of differentiating order with sound. Judgements of pitch will vary somewhat from person to person. Western music has traditionally employed a scale of eight octaves comprised of twelve pitches each; extreme pitches, however, are hard to distinguish. On average, individuals can easily distinguish 48 to 60 pitches over at least four or five octaves, and this implies that pitch (divided up by octaves) can be used to represent more than a single variable in a sonic display (Yeung 1980, 1121). Mapping with pitch is appropriate for ordinal data. In addition, pitch may imply direction, where, for example, an increasing pitch represents upward movement. Tonal sharps and flats can be used to some effect also, possibly to represent a second variable such as variations in data quality. Every twelfth pitch has the same pitch color (chroma) and this may serve to represent nominal or ordinal data (Weber 1993b). Pitch, then, can represent quantitative data, primarily ordinal. Time can be added to pitch to create a sound graph which tracks ordinal change in data over time.
Register: the relative location of a pitch in a given range of pitches. Register describes the location of a pitch or set of pitches within the range of available pitches. Register is a more general case of pitch, where one can specify a high, medium, and low register, each retaining a full set of chromatic pitches. (note 4) It can add to pitch as a broader ordinal distinction. An application which uses register and pitch is discussed later in this chapter.
Timbre: the general prevailing quality or characteristic of a sound. Timbre describes the character of a sound and is best described by the sound of different instruments: the brassy sound of the trumpet, the warm sound of the cello, the bright sound of the flute, etc. Timbre, then, implies nominal differences (Risset and Wessel 1982, Kramer and Ellison 1992). For example, a brassy sound could be used to represent an urban phenomenon while a warm or mellow sound could be used to represent a rural phenomenon. Such an example draws attention to the evocative nature of sound.
Duration: the length of time a sound is (or isn’t) heard. Duration refers to the length of a single sound (or silence) and can represent some quantity mapped to that duration. Silence must be used in tandem with duration if one is to distinguish the duration of multiple sounds (Yeung 1980, 1122). Duration is naturally ordinal.
Rate of Change: the relation between the durations of sound and silence over time. Rate of change is primarily a function of the varying (or unvarying) durations of sounds/silences in a series of ordered sounds over time and can represent consistent or inconsistent change in the phenomena being represented.
Order: the sequence of sounds over time. The order in which sounds are presented over time can be “natural” – such as the progression from a low pitch to a high pitch – and this means that it should be easy to detect general trends (patterns) in data presented with sound variables such as pitch or loudness. The “natural order” of sounds can be manipulated to represent data “disorder” or different orders. For example, if a natural order of sounds (say pitch from low to high) is matched to chronological temporal order, any non-ordered sound will be recognizable as an indication that data are out of chronological order. An example will be discussed later in this chapter.
Attack/Decay: the time it takes a sound to reach its maximum/minimum. The attack of a sound is the time it takes for a sound to reach a specific level of loudness; the decay is the time it takes to reach quiet. Attack has been found to be much more successful in conveying information than decay (Lunney and Morrison 1990, 144). Attack/decay could be used to represent the spread of a specific data variable in a given unit: for example, pitch may represent an average value for the income in a county and attack/decay the spread of values; a long attack and decay would represent, then, a wide range of incomes in that county. Attack/decay may also be used to represent rates of diffusion or recession.
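The pitch variable described above can be sketched as a mapping from an ordinal data value onto a chromatic scale. The MIDI-style note numbering and the standard equal-temperament conversion below are my assumptions for the sketch, not part of the original chapter; the three-octave range stays well within the 48 to 60 distinguishable pitches cited above.

```python
def ordinal_to_pitch_hz(value, vmin, vmax, low_note=48, high_note=84):
    """Map a data value onto a chromatic pitch scale.
    MIDI note numbers 48-84 span three octaves (C3-C6), an illustrative
    range; higher data values yield higher pitches (ordinal mapping)."""
    frac = (value - vmin) / (vmax - vmin)          # normalize to 0-1
    note = round(low_note + frac * (high_note - low_note))
    # equal-temperament conversion: A4 (MIDI note 69) = 440 Hz
    return 440.0 * 2 ** ((note - 69) / 12)
```

Played over time, a sequence of such tones becomes the “sound graph” of ordinal change mentioned above.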
Thus far this chapter has described the use of realistic sound (vocal narration and mimetic earcons) and the use of abstract sound (summarized as a basic set of abstract sound variables) for representing data. The next section describes a series of geographically-oriented applications of sound in geographic visualization.
Sound and Geographic Visualization: Applications
Animation and Sound
Sound is an inherently temporal phenomenon. As a result, it is particularly suited to use with map animation. Recent work on cartographic animation has led to the derivation of a set of dynamic variables – duration, rate of change, and order – and some suggestions for their application (DiBiase et al. 1992). Sound can be closely linked to the dynamic variables and their applications and may be used to enhance the comprehension of information presented in a dynamic display. In addition, potential uses of the dynamic variables may be suggested by examining temporal issues in sound and music.
At least three distinct kinds of change can be visualized by a map animation. Spatial change, often called a “fly-by,” is visualized by changing the observer’s viewpoint of some static object. Computerized flight simulators provide an excellent example of visualized spatial change. Voice-over has been used with fly-by applications to provide an explanation of what is being seen (Jet Propulsion Laboratories 1987, DiBiase et al. 1991). Vocal narration is, then, an important way for using sound to enhance dynamic geographical visualizations. Mimetic sounds – earcons – can also be used to enhance dynamic geographic visualizations. Thus sound can be used as a mimetic symbol. The sound of fire and wind, for example, has been incorporated into an animation of forest growth to cue the viewer into what is happening in the animation (Krygier 1993). In this case sound serves as a redundant variable with which to enhance certain key events in the dynamic display.
Chronological change, or “time-series,” may be visualized by mapping chronologically-ordered phenomena onto an animated series. A map of the diffusion of AIDS over time and space is an example of visualized chronological change (Gould, Kabel, Gorr, and Golub 1991, Kabel 1992). Such spatial and chronological change is intuitive and minimal explanation is needed to make such representations understandable for most users. Sound has been used to add additional information to the chronological display of AIDS (Krygier 1993). Loudness is used to represent total cases of AIDS for each of the years displayed in the animation. The increasing loudness adds both a dimension of information (increasing number of AIDS cases) as well as a sense of impending disaster. Pitch is also used in the same AIDS animation to represent the percent increase of new cases for each year. The pitch can be heard “settling down” as the percent increase drops and steadies in the late 1980s. An anomaly can be heard in 1991 where the animation switches from actual AIDS cases to model predicted AIDS cases. Thus sound can be used to detect anomalies in data.
Initially less intuitive but valuable for expert users of visual displays is a third kind of change which can be visualized with map animation. Attribute change, or “reexpression,” is visualized by mapping attribute-ordered phenomena onto an animated series. Such a visualization of change in attribute may be very useful for enhancing or revealing patterns not evident in the original time-series. Graphic methods have been used to alert the animation viewer to the fact that the animation is ordered in terms of attribute change. For example, a time scale can be included at the bottom of the animation and a pointer can indicate the year of each animation scene. The problem with this graphic solution is that the viewer’s attention can be focused on the map or on the time bar but not both at the same time. This is obviously a situation where sound may provide a better solution since it is possible to watch the map and listen to it at the same time. Thus sound can be used to replace a distracting visual element on a map display. Pitch is used to replace the time bar in an animation of presidential election landslides (Krygier 1993). The animation is shown in chronological order with pitch mapped to years (increasing pitch = increasing years). This familiarizes the user with the meaning of pitch in this animation. The same data is then reexpressed in terms of an attribute – magnitude of the landslide – and shown again. The fact that the pitches are heard out of order cues the viewer that the visual sequence is out of chronological order. Patterns noticed in the visual or the sonic display – or both – may then be more carefully examined. Sound patterns may be more easily distinguished than visual patterns, and may be especially valuable for dealing with cyclic temporal data (Weber 1993b). Such an application of sound is more important as the amount of data being visualized increases and it becomes necessary to isolate the few interesting patterns from the many uninteresting ones.
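The out-of-order pitch cue amounts to a simple check on the frame sequence: wherever a frame’s year drops below the preceding frame’s year, the year-mapped pitch is heard to fall out of its “natural” rising order. A minimal sketch:

```python
def out_of_order_frames(years):
    """Indices of animation frames whose year breaks chronological order --
    exactly the frames at which a listener hears the year-mapped pitch
    jump backward in a reexpressed (attribute-ordered) animation."""
    return [i for i in range(1, len(years)) if years[i] < years[i - 1]]
```

For a chronologically ordered sequence the list is empty; in a reexpression it marks every audible break.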
Interactivity and Sound
Interactive multimedia displays have begun to attract the attention of cartographers (Andrews and Tilton 1993, Armenakis 1993, Buttenfield 1993, DiBiase et al. 1993, Huffmann 1993, Shiffer 1993). Voice-over, realistic sounds, and abstract sound used as cues and mapped to data can be incorporated into such displays. A prototype interactive display which uses graphics and sound has been developed to display up to four data variables simultaneously (Krygier 1993). The prototype is based on 1990 U.S. Census data from Pennsylvania. A choropleth map is used to display percent population not in the labor force. A graduated circle map displaying median income is then added to the choropleth map. At this point one could add a third data variable to the display by changing the choropleth map into a bivariate choropleth map, by adding a data variable as a fill for the graduated circles, or by going to a second map. All of these have problems: bivariate maps are somewhat difficult to interpret and understand (Olson 1981); the third variable in the fill of the graduated circle will be hard to see in the small circles; and multiple maps may lead to comparison problems. Sound can provide an alternative to these visual methods. The prototype uses a single pitch in three different octaves (register) to display a “drive to work index.” The index is either high, medium, or low and refers to the relative distance workers have to drive to their places of work. When one points and clicks on Pike County with the mouse, a high octave pitch is heard representing a long drive to work. Thus two variables are seen and one is heard. A fourth data variable can be added by using the range of pitches within each of the three octaves. In the case of the prototype, this was done with another high/medium/low index, that relating to the percent poor in each county.
For example, when one points and clicks on Pike County, a high octave pitch is heard followed by a low pitch within that octave, representing a long drive to work and a low rate of poverty. After a short period of using such a “quad-variate” display it becomes relatively easy to extract the four data variables. Such a supposition will, of course, have to be more carefully evaluated, but experience with the prototype suggests that sound is a viable way to add more data dimensions to visual displays.
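The prototype’s register-plus-pitch scheme can be sketched as two sequential tones per clicked county. The base frequency and semitone offsets below are my assumptions for illustration; the prototype’s actual tones are not documented here.

```python
A3 = 220.0  # Hz; illustrative base frequency for the lowest register

def county_sound(drive_index, poverty_index):
    """Two sequential tones for a clicked county: register (octave)
    encodes the drive-to-work index, pitch within that octave encodes
    the percent-poor index. Both indices are 'low'/'medium'/'high'."""
    levels = {"low": 0, "medium": 1, "high": 2}
    octave = levels[drive_index]            # which octave the tone sits in
    base = A3 * 2 ** octave                 # low / middle / high register
    # place the second tone at the bottom, middle, or top of that octave
    semitones = levels[poverty_index] * 6   # 0, 6, or 12 semitones up
    return base, base * 2 ** (semitones / 12)
```

Pike County in the example above would thus sound as a high-register tone followed by the bottom pitch of that octave.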
Sound and Geographic Visualization: Some Research Issues
This chapter has thus far reviewed various ways that sound can be incorporated into visualization displays. These methods include the use of realistic and abstract sounds. A basic set of abstract sound variables has been defined and illustrated with some geographic examples and applications. Many issues, obviously, remain to be investigated.
Learning and Sonic Legends
If sound maps are to work then the design of effective sound legends will have to be investigated. Because sound is not a traditional mapping variable, the user of a map display which incorporates sound will have to be acclimated to the idea of sound as a data presentation method as well as to what the sound variables used in the display represent. How best to design a sonic legend is unclear: should it be “all sound” and set up as an interactive tutorial before the use of the display begins, or should it be akin to a traditional map legend, available if and when needed? The idea of sequencing may be useful in helping sound map users to understand the elements of a multivariate sound display (Slocum et al. 1990).
A solid body of knowledge exists (primarily in acoustics, psychology, and music) detailing the sound perception capabilities of the human physiological system. This knowledge can be used to underpin our understanding of the possibilities and limitations of the sound variables as visual design elements. We must also be aware of the problem of “sonic overload,” of barraging the user with too many different variables and dimensions of sound (Blattner et al. 1989, 12, O’Connor 1991). Attendants at the Three Mile Island Nuclear Power Plant were addled by more than sixty different auditory warning systems during the Plant’s 1979 crisis (Buxton 1990, 125).
More difficult issues of identification, problem solving, judgment, remembering, and understanding of sound displays await attention. The sequential nature of sound raises questions of knowledge acquisition and memory. There are also questions of how much information people can deal with. A combined visual and sonic display may be one way to deal with the ever increasing complexities that geographers want to approach; such complex displays, however, have few precedents and may be more confusing than enlightening, especially for non-expert users. However, one of the goals of visualization is the construction of representations which can serve the needs of motivated expert users who are dealing with complex data and thus require sophisticated display methods. Evaluations of such methods must consider the capabilities of these users. One promising way to make the sonic display of complex information feasible is to adapt sound structures we are adept at dealing with – primarily those from music – to display design (Weber and Yuan 1993). Musical structures such as rhythm, melody, and harmony are consensual, defined elements of music and must be differentiated from arbitrary and abstract sound representations such as those discussed above in Yeung’s (1980) study (two pitches, loudness, damping, direction, duration, and silence). To what degree a familiarity with common musical structures will help people to distinguish and recognize sonic patterns is, however, unclear. Indeed, it should be expected that the sound variables – based on common musical structures or arbitrarily based on duration, rate of change, and order – will interact, interfere, and affect each other (Kramer and Ellison 1992, Lunney and Morrison 1990).
Location of Sound
The ability to locate sounds in a two- or three-dimensional “sound space,” analogous to the two- or three-dimensions of the map, is an important aspect of sound which is particularly applicable to the display of spatial data. The location of sound can be used in an abstract manner, as a cue to direct attention to a specific area of a visual display, or can be used to represent the actual location of phenomena in a display. Such applications of sound have been investigated (Blauert 1983, Wenzel et al 1987, Wenzel et al. 1988a, Wenzel et al. 1988b, Wenzel et al. 1990, Begault 1990, Smith et al. 1990) but not in terms of geographically referenced data. Questions concerning hardware and software requirements (for two- or three-dimensional sound generation) and issues of the human ability to adequately locate sounds in a sound space need to be investigated.
Sound Maps for the Visually Impaired
The use of sound displays has been explored in the context of communicating scientific data to visually impaired students. Lunney and Morrison have used high/low pitches and pitch duration to map out “sound graphs” and have found that visually impaired users are able to comprehend the graphs and understand the patterns with relative ease (Lunney and Morrison 1981, Lunney 1983). Mansur, Blattner, and Joy compared tactile graphs to sound graphs (created with continuously varying pitch) and evaluated subjects based upon judgements of line slopes, curve classification, monotonicity, convergence, and symmetry. They found comparable accuracy of information communication between tactile and sound graphs, yet sound graphs were found to be a quicker way of communicating information and were easier to create (Mansur 1984, Mansur et al. 1985).
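A sound graph of the kind described above can be sketched as a mapping from a data series to a sequence of (frequency, duration) pairs, higher values sounding as higher pitches. The frequency range and note length below are illustrative, not taken from the Lunney/Morrison or Mansur work.

```python
def sound_graph(series, lo_hz=200.0, hi_hz=800.0, note_s=0.25):
    """Render a data series as a 'sound graph': each value becomes a
    (frequency_hz, duration_s) pair, with higher values mapped to higher
    pitches. Ranges are illustrative assumptions for this sketch."""
    vmin, vmax = min(series), max(series)
    span = (vmax - vmin) or 1.0  # avoid division by zero for flat series
    return [(lo_hz + (v - vmin) / span * (hi_hz - lo_hz), note_s)
            for v in series]
```

Played in order, the resulting tones trace the rises, falls, and symmetries of the curve by ear alone.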
One can speculate on the use of sound as a means of representation for the visually impaired. While there is a body of cartographic research on tactile maps (Andrews 1988), there is no cartographic research on sound maps for the visually impaired. The nature of the map – with its two graphic dimensions and one or more data variables – complicates the matter and makes sound maps more difficult to create than simple sound graphs. Is there any way to construct spatial representations using a one-dimensional sound? If a high/low pitch is used to represent high/low location, can this (or other similar sonic metaphors) be used to map with one dimension of sound? Or will we have to look to stereo (two dimensions) and three-dimensional sound? If we can create a two- or three-dimensional sound space, how will maps be represented in that space? How finely can the sound variable location be specified? Can both dimensions of the plane and a data variable be represented? How easy is it to comprehend, remember, and use a sonic spatial display? The ability to create and locate a sound in two or three dimensions remains a major problem hindering the use of sound for spatial data representation.
A hybrid of tactile materials and sound may prove more useful than either alone: the research carried out by Yeung, Bly, and Williams has shown that complex, multivariate, single-dimension sounds can be detected and understood. Thus a map display for the visually impaired could use a tactile display for base map information and sound to represent single or multivariate data located at points or areas on the map. The sonic representations could be roughly located in a two (or three) dimensional sound space or they could be selected by an interactive tactile display. These approaches would allow the communication of complex, multivariate data to the visually impaired – something to which tactile maps are not particularly well suited. The tactile display, in turn, would carry the base locational information, which may be more difficult to create and interpret with sound.
Sound Maps of Data Uncertainty and Quality
Maps often impose strict points, lines, and areas where no strict structures actually exist or where the certainty of their location or magnitude is low. In addition, maps are often compiled from multiple data sources, and these data vary in quality and reliability. Maps tend to be “totalizing” creatures: variations in uncertainty and quality are smoothed over to create an orderly, homogeneous graphic. On one hand, this is why maps are so useful, and it is obvious that maps enable us to deal with our uncertain and messy world by making it look more certain and tidy. Yet it seems important that some sense of the uncertainty or quality of the represented data be available. The cartographer’s reflex is to conceive of uncertainty as a statistical surface and to represent it graphically. There is a rich history of graphical presentations of uncertainty – many historical atlases, for example, show past migration of peoples in a manner which stresses that what is known about the migration is “fuzzy” and not well established. On the other hand, taken to its logical extreme, a map which visually displays uncertainty may become a blurred mess. The purpose of maps, remember, is to impose order, not to accurately represent chaos. Further, there is only so much visual “headroom” on a display: using visual variables to display uncertainty may have the effect of limiting the display of other data variables. A final problem with visual representations of uncertainty is that it is difficult to model visually the composite uncertainty of two or more map overlays – the realm of multivariate data displays.
An alternative approach to “visualizing” uncertainty takes advantage of sound (Fisher 1994). A “sound map” can be created which underlies the visual map and can be accessed if and when necessary. This sound map may be multivariate: register and pitch could be used to distinguish different layers of information. The click of a mouse at any position on the visual map would cause the specific sound mapped to that point, line, or area to be heard; dragging the mouse would reveal a variation of sound as the sound-mapped data varied. A variable pitch (low to high) can represent the level of uncertainty. By dragging the mouse – the sonic equivalent of Monmonier’s “brushing” (Monmonier 1989) – one begins to move toward a representation of a two dimensional space; but the effect would be like having a small “window” through which you could only see a small portion of a map at one time. Can people build up a sound “image” of the entire sound map from these small glimpses? The creation of a two dimensional sound space would allow a fuller representation of the uncertainty surface. However, if the entire uncertainty surface needs to be known, a visual representation may be more appropriate. Using sound to represent data quality or uncertainty has the advantage of preserving the sharp image of the map while allowing for the extraction of quality or uncertainty information if and when it is needed or if it passes a predetermined threshold. Sound, in this case, serves as an invisible source of information and may be one solution to the problem of representing the quality and uncertainty of data in an already crowded visual display.
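The mouse-probe idea reduces to a lookup and a rescaling: the cursor position indexes the uncertainty surface, and the value found there is mapped to a pitch, sounding only when it crosses the predetermined threshold. A rough sketch, assuming a gridded surface with uncertainty values in [0, 1] (the names, ranges, and threshold behavior are illustrative assumptions):

```python
def uncertainty_pitch(grid, row, col, f_low=200.0, f_high=800.0, threshold=0.0):
    """Return the frequency (Hz) for the uncertainty value at (row, col),
    or None if the value is at or below the audibility threshold.
    Uncertainty values are assumed to lie in [0, 1]."""
    u = grid[row][col]
    if u <= threshold:
        return None  # certain enough: stay silent, keep the display uncluttered
    return f_low + u * (f_high - f_low)

surface = [[0.0, 0.5],
           [0.9, 1.0]]
print(uncertainty_pitch(surface, 0, 0))  # None (fully certain cell)
print(uncertainty_pitch(surface, 1, 1))  # 800.0 (maximally uncertain cell)
```

Dragging the mouse would simply repeat this lookup along the cursor's path, producing the continuously varying tone described above.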
Applicability and Viability
Sound has been only minimally used for data display to date, in part because of the limitations and costs of producing and using sound. Such limitations are rapidly diminishing as computers incorporate sophisticated sound capabilities. The Musical Instrument Digital Interface (MIDI) standard for sound and music generation in computers is well accepted across all computer platforms, and MIDI-compatible software is currently available which can create and manipulate all of the sound variables this paper has discussed. It is possible to incorporate sound into visual displays with commonly available hardware and off-the-shelf software. (note 5)
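On a MIDI system, driving such a display comes down to converting data values into note numbers (integers 0-127) and sending the corresponding note-on messages. A minimal sketch of the conversion step, assuming values normalized to [0, 1] and an arbitrary illustrative note range:

```python
def value_to_midi_note(v, note_low=36, note_high=96):
    """Map a normalized data value in [0, 1] to a MIDI note number.
    MIDI note numbers are integers 0-127; 60 is middle C."""
    v = max(0.0, min(1.0, v))  # clamp out-of-range input
    return round(note_low + v * (note_high - note_low))

print(value_to_midi_note(0.0))  # 36
print(value_to_midi_note(0.5))  # 66
print(value_to_midi_note(1.0))  # 96
```

Other sound variables map onto other message fields in the same way: loudness to note-on velocity (0-127), timbre to a program-change number, and so on.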
Finally, it seems reasonable to approach the use of sound in visual displays with a critical sense of its viability and value in terms of actual applications which have real (expert and/or motivated and interested) users. Evaluations can be made using traditional quantitative methods as well as qualitative methods such as focus groups (Monmonier and Gluck 1993). While sound is interesting in and of itself as a possible addition to visual displays, it is important to avoid using it just for the sake of its novelty.
This chapter reviews the possibilities of using sound as a design variable for geographic visualization. It describes how and why sound may be used as vocal narration, as a mimetic symbol, as a redundant variable, as a means of detecting anomalies, as a means of reducing visual distraction, as a cue to reordered data, as an alternative to visual patterns, as an alarm or monitor, as a means of adding non-visual data dimensions to interactive visual displays, and for locating sounds in a “sound space.”
In general the exploration of sound as a design method for geographic visualization is important for two interrelated reasons. It is necessary to explore the ways in which we can take full advantage of human perceptual and cognitive capabilities in our visualization designs. Much of the inspiration behind the surging interest in visualization lies in the desire to exploit the tremendous and often unappreciated visual capabilities of humans in order to cope with increasing amounts of data about our physical and human worlds. Our sense of hearing, which has until recently been unappreciated as a means of representing data, can be used to expand the representational repertoire of cartographic design. At the same time, it is important to realize that the ideas and phenomena geographers wish to represent may not always be best represented by static, two-dimensional visual displays. Sound offers a way to represent information for map users who lack the sense of vision. Sound, in tandem with time, offers a way to enhance the comprehension of non-chronological uses of time. Sound offers a way to expand the limited possibilities of representing multivariate data with graphics. Sound, in other words, provides us with more choices for representing ideas and phenomena and thus more ways in which to explore and understand the complex physical and human worlds we inhabit.
1. Harry Warner (of Warner Brothers fame) on being confronted with the prospect of the sound movie. Quoted in A. Walker, (1979) The Shattered Silents: How the Talkies Came to Stay, William Morrow and Co., New York.
2. This assumes the content and meaning of the language used in a narration is unproblematical, which is, of course, an oversimplification. The use of vocal narration in visualization displays opens up interesting possibilities for investigating the relations between spoken and visual languages.
3. I have collapsed interval and ratio levels of measurement into the category of ordinal pending further research on the capacity of sound to represent these finer distinctions.
4. In music, distinctions between soprano, alto, tenor, and bass are more commonly used.
5. The applications created for this paper were constructed on a Macintosh. Sounds were generated or digitized with MacroMind™ SoundEdit Pro and were combined with animations in MacroMind™ Director.
Acknowledgements: For constructive criticism and ideas thanks to Sona Andrews, Mark Detweiler, David DiBiase, Roger Downs, Gregory Kramer, Alan MacEachren, Mark Monmonier, David Tilton, and Chris Weber.
Ackerman, D. (1990) “Hearing,” In: Ackerman, D, A Natural History of the Senses, Random House, New York, pp. 173-226.
Ammer, C. (1987) The Harper Collins Dictionary of Music, Harper Collins, New York.
Andrews, S., and D. Tilton. (1993) “How Multimedia and Hypermedia are Changing the Look of Maps,” Proceedings: Auto-Carto 11, Minneapolis, pp. 348-366.
Armenakis, C. (1993) “Hypermedia: An Information Management Approach for Geographic Data,” Proceedings: GIS/LIS 1993 Annual Conference, Minneapolis, pp. 19-28.
Baecker, R. and W. Buxton (1987) “The Audio Channel,” In: Baecker, R. and W. Buxton, Readings in Human-Computer Interaction, Morgan Kaufman Publishers, Los Altos CA, pp. 393-99.
Begault, D. (1990) “The Composition of Auditory Space: Recent Developments in Headphone Music,” Leonardo, Vol. 23 (1), pp. 45-52.
Bertin, J. (1983) Semiology of Graphics, University of Wisconsin Press, Madison.
Blattner, M., D. Sumikawa, and R. Greenberg (1989) “Earcons and Icons: Their Structure and Common Design Principles,” Human-Computer Interaction, Vol. 4(4), pp. 11-44.
Blauert, J. (1983) Spatial Hearing: The Psychophysics of Human Sound Localization, MIT Press, Cambridge.
Bly, S. (1982a). Sound and Computer Information Presentation, Unpublished PhD, University of California – Davis.
Bly, S. (1982b) “Presenting Information in Sound,” CHI ’82 Conference on Human Factors in Computer Systems, pp. 371-375.
Buttenfield, B. (1993) “Proactive Graphics and GIS: Prototype Tools for Query, Modeling, and Display,” Proceedings: Auto-Carto 11, Minneapolis, pp. 377-385.
Buxton, W. (1985) “Communicating With Sound,” Proceedings of CHI ’85, pp. 115-119.
Buxton, W. (1989) “Introduction to This Special Issue on Nonspeech Audio,” Human-Computer Interaction, Vol. 4(4), pp. 1-9.
Buxton, W. (1990) “Using our Ears: An Introduction to the Use of Nonspeech Audio Cues,” In: Farrell, E., (ed.), op cit., pp. 124-127.
Cohen, M. and L. Ludwig. (1991) “Multidimensional Audio Window Management,” International Journal of Man-Machine Studies, Vol. 34(3), pp. 319-336.
Cowan, G. (1948) “Mazateco Whistled Speech,” Language, Vol. 24, pp. 280-286.
Deutsch, D. (ed.) (1982) The Psychology of Music, The Academic Press, New York.
DiBiase, D., A. MacEachren, J. Krygier, C. Reeves, and A. Brenner. (1991) “Uses of the Temporal Dimension in Cartographic Animation + Visual and Dynamic Variables,” Computer Animated Videotape, Deasy Geographics Lab, Penn State University, University Park, PA.
DiBiase, D., A. MacEachren, J. Krygier, C. Reeves. (1992) “Animation and the Role of Map Design in Scientific Visualization,” Cartography and GIS, Vol. 19 (4), pp. 201-14.
DiBiase, D., C. Reeves, A. MacEachren, J. Krygier, M. von Wyss, J. Sloan, and M. Detweiler. (1993) “A Map Interface for Exploring Multivariate Paleoclimate Data,” Proceedings: Auto-Carto 11, Minneapolis, pp. 43-52.
Evans, B. (1990) “Correlating Sonic and Graphic Materials in Scientific Visualization,” In: Farrell, E., (ed.), op cit., pp. 154-162.
Farrell, E., (ed) (1990) Extracting Meaning from Complex Data: Processing, Display, Interaction, Proceedings of The International Society for Optical Engineering, Vol. 1259, SPIE, Bellingham WA.
Farrell, E., (ed.) (1991) Extracting Meaning from Complex Data: Processing, Display, Interaction II. Proceedings of The International Society for Optical Engineering, Vol. 1459, SPIE, Bellingham WA.
Fisher, P. (1994) “Hearing the Reliability in Classified Remotely Sensed Images,” Cartography and Geographical Information Systems, Vol. 21 (1), pp. 31-36.
Frysinger, S. (1990) “Applied Research in Auditory Data Representation,” In: Farrell, E., (ed.), op cit., pp. 130-139.
Gaver, W. (1986) “Auditory Icons: Using Sound in Computer Interfaces,” Human-Computer Interaction, Vol. 2 (2), pp. 167-77.
Gaver, W. (1988) Everyday Listening and Auditory Icons, Unpublished PhD, University of California – San Diego.
Gaver, W. (1989) “The Sonic Finder: An Interface that uses Auditory Icons,” Human-Computer Interaction Vol. 4 (4), pp. 67-94.
Gould, P., J. Kabel, W. Gorr, and A. Golub. (1991) “AIDS: Predicting the Next Map.” Interfaces Vol. 21(3), pp. 80-92.
Handel, S. (1989) Listening: An Introduction to the Perception of Auditory Events, The MIT Press, Cambridge.
Herzog, G. (1945) “Drum Signaling in a West African Tribe,” Word, Vol. 1, pp. 217-38.
Huffmann, N. (1993) “Hyperchina: Adventures in Hypermapping,” Proceedings: 16th International Cartographic Conference, Cologne, pp. 26-45.
Jet Propulsion Laboratories. (1987) “LA: The Movie,” Computer animated videotape, Jet Propulsion Laboratories, Pasadena.
Kabel, J. (1992) “AIDS in the United States,” Computer Animated Videotape, Deasy Geographics Lab, Penn State University, University Park, PA.
Kendall, G. (1991) “Visualization by Ear: Auditory Imagery for Scientific Visualization and Virtual Reality,” Computer Music Journal, Vol. 15 (4), pp. 70-73.
Kramer, G. (1992) Personal communication, April 13.
Kramer, G. and S. Ellison. (1992) “Audification: The Use of Sound to Display Multivariate Data,” Unpublished manuscript.
Krygier, J. (1993) “Sound and Cartographic Design,” Manuscript Videotape, Deasy Geographics Lab, Penn State University, University Park, PA.
Iverson, W. (1992) “The Sound of Science,” Computer Graphics World, January, pp. 54-62.
Lunney, D. (1983) “A Microcomputer-based Laboratory Aid for Visually Impaired Students,” IEEE Micro, Vol 3 (4), pp. 19-31.
Lunney, D. and R. Morrison. (1981) “High Technology Laboratory Aids for Visually Handicapped Chemistry Students,” Journal of Chemical Education, Vol. 8 (3), pp. 228-231.
Lunney, D. and R. Morrison. (1990) “Auditory Presentation of Experimental Data,” In: Farrell, E., (ed.), op cit., pp. 140-146.
McCormick, B., T. DeFanti, and M. Brown. (1987) “Visualization in Scientific Computing,” Computer Graphics, Vol. 21(6).
Mansur, D. (1984) Graphs in Sound: A Numerical Data Analysis Method for the Blind, Lawrence Livermore National Laboratory report UCRL-53548.
Mansur, D., M. Blattner, and K. Joy. (1985) “Sound Graphs: A Numerical Data Analysis Method for the Blind,” Journal of Medical Systems, Vol. 9 (3), pp. 163-174.
Mezrich, J., S. Frysinger, and R. Slivjanovski. (1984) “Dynamic Representation of Multivariate Time Series Data,” Journal of the American Statistical Association, Vol. 79 (385), pp. 34-40.
Moellering, H. (1991) “The Background and Concepts of Dynamic Cartography,” Paper presented at the meeting of the Association of American Geographers, Miami FL.
Monmonier, M. (1989) “Geographic Brushing: Enhancing Exploratory Analysis of the Scatterplot Matrix,” Geographical Analysis, Vol. 21, pp. 81-84.
Monmonier, M. (1991) “Ethics and Map Design – Six Strategies for Confronting the Traditional One-Map Solution,” Cartographic Perspectives No. 10, pp. 3-8.
Monmonier, M. and M. Gluck. (1993) “Focus Groups for Design Improvement in Dynamic Cartography,” Cartography and Geographical Information Systems, Vol. 21 (1), pp. 37-47.
Mountford, S., and W. Gaver. (1990) “Talking and Listening to Computers,” In: Laurel, B. (ed.) The Art of Human-Computer Interface Design, Addison-Wesley, Reading, pp. 319-334.
O’Connor, R. (1991) “Workers in Close Quarters May Not Be Ready for Noisy Computers,” Centre Daily Times, Monday, March 18, State College, PA, p. 9E.
Ohlson, B. (1976) “Sound Fields and Sonic Landscapes in Rural Environments,” Fennia, Vol. 148, pp. 33-45.
Olson, J. (1981). “Spectrally Encoded Two-Variable Maps,” Annals of the Association of American Geographers, Vol 71(2), pp. 259-276.
Peterson, I. (1985) “Some Labs are Alive with the Sound of Data,” Science News, Vol. 27(June 1), pp. 348-350.
Pocock, D. (1989) “Sound and the Geographer,” Geography, Vol. 74 (3), pp. 193-200.
Pollack, I. and L. Ficks. (1954) “Information of Elementary Multidimensional Auditory Displays,” Journal of the Acoustical Society of America, Vol. 6, pp. 155-158.
Porteous, J. and J. Mastin. (1985) “SoundScape,” Journal of Architectural and Planning Research, Vol. 2, pp. 169-186.
Rabenhorst, D. (1990) “Complementary Visualization and Sonification of Multi-Dimensional Data,” In: Farrell, E., (ed.), op cit., pp. 147-153.
Risset, J. and D. Wessel. (1982) “Exploration of Timbre by Analysis and Synthesis,” In: Deutsch D., (ed.), op cit.
Scaletti, C. and A. Craig. (1991) “Using Sound to Extract Meaning from Complex Data,” In: Farrell, E., (ed.), op cit., pp. 207-219.
Schafer, R. (1977) The Tuning of the World, Knopf, New York.
Schafer, R. (1985) “Acoustic Space,” In: Seamon, D. and R. Mugerauer, (eds.) Dwelling, Place, and Environment, Martinus Nijhoff, Dordrecht, pp. 87-98.
Shiffer, M. (1993) “Augmenting Geographic Information with Collaborative Multimedia Technologies,” Proceedings: Auto-Carto 11, Minneapolis, pp. 367-376.
Slocum, T., and S. Egbert. (1991) “Cartographic Data Display,” In: Taylor, D. (ed) Geographic Information Systems: The Microcomputer and Modern Cartography, Pergamon Press, Oxford, pp. 167-199.
Slocum, T., W. Roberson, and S. Egbert. (1990) “Traditional versus Sequenced Choropleth Maps: An Experimental Investigation,” Cartographica, Vol. 27 (1), pp. 67-88.
Smith, S. and M. Williams. (1989) “The Use of Sound in an Exploratory Visualization Environment,” Department of Computer Science, University of Lowell, Technical Report No. R-89-002, Lowell MA.
Smith, S., R. Bergeron, and G. Grinstein. (1990) “Stereophonic and Surface Sound Generation for Exploratory Data Analysis,” Proceedings of the Association for Computing Machinery Special Interest Group on Computer Human Interfaces, pp. 125-32.
Smith, S., G. Grinstein, and R. Pickett. (1991) “Global Geometric, Sound, and Color Controls for Iconographic Displays of Scientific Data.” In: Farrell, E., (ed.), op cit., pp. 192-206.
Stam, R., R. Burgoyne and S. Flitterman-Lewis. (1992). New Vocabularies in Film Semiotics: Structuralism, Post-Structuralism and Beyond, Routledge, New York.
Szlichcinski, K. (1979) “The Art of Describing Sounds,” Applied Ergonomics, Vol. 10 (3), pp. 131-138.
Truax, B. (1984) Acoustic Communication, Ablex Publishing Co, Norwood, NJ.
Tuan, Y. (1993) “Voices, Sounds and Heavenly Music,” In: Passing Strange and Wonderful: Aesthetics, Nature, and Culture, Island Press, Washington D.C., pp. 70-95.
Tukey, J. (1977). Exploratory Data Analysis, Addison-Wesley, Reading MA.
Weber, C. (1993a). “Sonic Enhancement of Map Information: Experiments Using Harmonic Intervals.” Unpublished PhD dissertation, State University of New York at Buffalo, Department of Geography.
Weber, C. (1993b). Personal communication.
Weber, C. and M. Yuan. (1993) “A Statistical Analysis of Various Adjectives Predicting Consonance/Dissonance and Intertonal Distance in Harmonic Intervals,” Technical Papers: ACSM/ASPRS Annual Meeting, New Orleans, Vol. 1, pp. 391-400.
Wenzel, E., F. Wightman, and S. Foster. (1987) “Development of a Three-Dimensional Auditory Display System.” In: Yost, W. and G. Gourevitch, (eds.) Directional Hearing, Springer-Verlag, New York.
Wenzel, E., F. Wightman, and S. Foster. (1988a) “Development of a Three-Dimensional Auditory Display System,” SIGCHI Bulletin, Vol. 20 (2), pp. 52-57.
Wenzel, E., F. Wightman, and S. Foster. (1988b) “A Virtual Display System for Conveying Three-Dimensional Acoustic Information,” Proceedings of the Human Factors Society, Vol. 32, pp. 86-90.
Wenzel, E., S. Fisher, P. Stone, and S. Foster. (1990) “A System for Three-Dimensional Acoustic ‘Visualization’ in a Virtual Environment Workstation,” Visualization ’90: First IEEE Conference on Visualization, IEEE Computer Society Press, Washington, pp. 329-337.
Williams, M. (1989) “The Architecture of the Exvis Kernel,” Department of Computer Science, University of Lowell, Technical Report No. R-89-004, Lowell MA.
Williams, M., S. Smith, and G. Pecelli. (1990) “Computer-Human Interface Issues in the Design of an Intelligent Workstation for Scientific Visualization,” SIGCHI Bulletin, Vol. 21(4), pp. 44-49.
Yeung, E. (1980) “Pattern Recognition by Audio Representation of Multivariate Analytical Data,” Analytical Chemistry, Vol. 52 (7), pp. 1120-1123.