Our experience of the world is seamlessly multimodal. We do not perceive separate visual, auditory, and tactile worlds — we perceive objects and events defined by correlated information across senses. Watching someone speak involves integrating lip movements (vision), speech sounds (audition), and perhaps tactile information (feeling vibrations). Multisensory integration refers to the neural and cognitive mechanisms that bind information from different senses into unified percepts.
Principles of Multisensory Integration
Barry Stein and Alex Meredith identified three fundamental principles governing multisensory integration in individual neurons (primarily studied in the superior colliculus). The spatial rule states that stimuli from different modalities must originate from approximately the same location to be integrated. The temporal rule requires that stimuli occur within a temporal window (typically a few hundred milliseconds). The inverse effectiveness rule states that multisensory enhancement is greatest when the individual unisensory signals are weak or ambiguous — precisely when combining information is most useful.
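Inverse effectiveness is commonly quantified in the single-neuron literature with an enhancement index, which expresses the response to the combined stimulus as a percentage gain over the strongest unisensory response. The sketch below illustrates that calculation; the function name and the spike counts are hypothetical and chosen only for illustration, not taken from any dataset.

```python
def enhancement_index(combined, best_unisensory):
    """Percent gain of the multisensory response over the best unisensory response."""
    if best_unisensory <= 0:
        raise ValueError("best unisensory response must be positive")
    return 100.0 * (combined - best_unisensory) / best_unisensory


# Hypothetical mean spike counts per trial (illustrative numbers, not data).
# Weak unisensory drive: the same absolute gain is a large proportional gain.
weak = enhancement_index(combined=6.0, best_unisensory=2.0)
# Strong unisensory drive: the same absolute gain is a small proportional gain.
strong = enhancement_index(combined=24.0, best_unisensory=20.0)

print(f"weak stimuli:   {weak:.0f}% enhancement")
print(f"strong stimuli: {strong:.0f}% enhancement")
```

With these made-up numbers, the neuron driven by weak stimuli shows the larger proportional enhancement, which is the pattern the inverse effectiveness rule describes.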
w_V = (1/σ²_V) / (1/σ²_V + 1/σ²_A)
w_A = (1/σ²_A) / (1/σ²_V + 1/σ²_A)
Ŝ = w_V · S_V + w_A · S_A
Each modality is weighted by its reliability (inverse variance); the combined estimate is more precise than either sense alone.
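The weighting scheme can be sketched in a few lines of Python. This is a minimal illustration that assumes independent Gaussian noise in each modality, in which case the variance of the combined estimate is 1 / (1/σ²_V + 1/σ²_A). The function name and the example values are hypothetical.

```python
def combine_cues(s_v, var_v, s_a, var_a):
    """Reliability-weighted average of a visual and an auditory estimate.

    Assumes independent Gaussian noise in each modality.
    Returns (combined estimate, combined variance, w_v, w_a).
    """
    if var_v <= 0 or var_a <= 0:
        raise ValueError("variances must be positive")
    r_v, r_a = 1.0 / var_v, 1.0 / var_a   # reliability = inverse variance
    w_v = r_v / (r_v + r_a)
    w_a = r_a / (r_v + r_a)
    s_hat = w_v * s_v + w_a * s_a
    var_hat = 1.0 / (r_v + r_a)           # never larger than the smaller input variance
    return s_hat, var_hat, w_v, w_a


# Illustrative values: vision reports 0 deg (variance 1), audition 10 deg (variance 4).
s_hat, var_hat, w_v, w_a = combine_cues(s_v=0.0, var_v=1.0, s_a=10.0, var_a=4.0)
print(f"w_V={w_v:.2f}, w_A={w_a:.2f}, estimate={s_hat:.2f}, variance={var_hat:.2f}")
```

In this example the more reliable visual cue receives most of the weight, so the combined estimate lies much closer to the visual location, and the combined variance falls below both input variances.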
Bayesian Models of Integration
The Bayesian framework has been remarkably successful in predicting multisensory integration behavior. When localizing a sound accompanied by a flash of light, observers weight visual and auditory information in proportion to their relative reliability — more weight to vision in good visibility, more to audition when the visual signal is degraded. The resulting combined estimate is statistically optimal, achieving lower variance than either sense alone. Ernst and Banks (2002) provided a compelling demonstration of this optimal integration for visual-haptic size estimation.
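The variance reduction can be checked with a small simulation. The sketch below assumes Gaussian, independent sensory noise and an observer who knows each modality's noise level; the standard deviations, trial count, and function name are illustrative assumptions rather than values from any experiment.

```python
import random
import statistics


def simulate(true_location=0.0, sd_v=1.0, sd_a=2.0, n_trials=20000, seed=0):
    """Simulate noisy visual and auditory estimates and their weighted combination.

    Returns (w_v, variance of visual, variance of auditory, variance of combined).
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    w_v = (1 / sd_v**2) / (1 / sd_v**2 + 1 / sd_a**2)
    w_a = 1.0 - w_v
    vis, aud, comb = [], [], []
    for _ in range(n_trials):
        s_v = rng.gauss(true_location, sd_v)
        s_a = rng.gauss(true_location, sd_a)
        vis.append(s_v)
        aud.append(s_a)
        comb.append(w_v * s_v + w_a * s_a)
    return (w_v,
            statistics.pvariance(vis),
            statistics.pvariance(aud),
            statistics.pvariance(comb))


# Good visibility: vision is the more reliable cue and dominates the estimate.
w_clear, var_v, var_a, var_comb = simulate(sd_v=1.0, sd_a=2.0)
# Degraded vision: the weight shifts toward audition.
w_degraded, _, _, _ = simulate(sd_v=4.0, sd_a=2.0)

print(f"visual weight, clear vs degraded: {w_clear:.2f} vs {w_degraded:.2f}")
print(f"variances: visual={var_v:.2f}, auditory={var_a:.2f}, combined={var_comb:.2f}")
```

Under these assumptions the combined variance should come out near 0.8, below both the visual variance (about 1) and the auditory variance (about 4), and the visual weight drops when the visual noise is increased.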
Illusions of Integration
When the senses provide conflicting information, striking illusions reveal the integration process. The ventriloquist effect demonstrates visual capture of auditory localization: a sound is perceived as coming from a simultaneously moving visual object (the puppet's mouth) rather than from its true source (the ventriloquist). The McGurk effect shows audiovisual integration in speech perception: an auditory syllable such as /ba/ dubbed onto a face articulating /ga/ is often heard as a third syllable, /da/. The rubber hand illusion demonstrates visual-tactile integration in body ownership: synchronized visual and tactile stimulation can cause a rubber hand to feel like one's own.
The Temporal Binding Window
Multisensory integration requires that stimuli from different modalities fall within a temporal binding window, the range of onset asynchronies over which cross-modal stimuli are perceived as simultaneous. This window is not symmetric: auditory stimuli can follow visual stimuli by up to about 200 ms and still be perceived as simultaneous, whereas the tolerance is narrower when the sound comes first. The asymmetry matches real-world conditions: because light travels faster than sound, the sound of a distant event naturally arrives after the sight of it. The width of this window varies across individuals, can be narrowed by training, and is wider in some clinical populations, including those with autism spectrum disorder.
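An asymmetric window can be expressed as a simple check on stimulus onset asynchrony (SOA). The sketch below is illustrative only: the sign convention (SOA = auditory onset minus visual onset), the function name, and the two limits are assumptions. The limits are placeholders chosen to make the window asymmetric, and measured limits for a given observer should be substituted.

```python
def within_binding_window(soa_ms, audio_lead_limit_ms, audio_lag_limit_ms):
    """Return True if an audiovisual pair falls inside the binding window.

    soa_ms: auditory onset minus visual onset, in milliseconds.
            Positive values mean the sound arrives after the visual event.
    audio_lead_limit_ms: largest tolerated asynchrony when the sound comes first.
    audio_lag_limit_ms:  largest tolerated asynchrony when the sound comes second.
    """
    if audio_lead_limit_ms < 0 or audio_lag_limit_ms < 0:
        raise ValueError("limits must be non-negative")
    return -audio_lead_limit_ms <= soa_ms <= audio_lag_limit_ms


# Placeholder limits (hypothetical): more tolerance when the sound lags.
limits = dict(audio_lead_limit_ms=100, audio_lag_limit_ms=200)

for soa in (-150, -50, 0, 150, 250):
    bound = within_binding_window(soa, **limits)
    print(f"SOA {soa:+4d} ms -> {'bound' if bound else 'not bound'}")
```

A narrower window, as reported after training, or a wider one, as reported in some clinical populations, corresponds to shrinking or enlarging the two limits.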
Neural Substrates
While classical neuroscience treated the sensory cortices as strictly unisensory, it is now clear that multisensory interactions occur at virtually every level of processing. The superior colliculus is a well-studied site of audiovisual integration important for orienting behavior. In cortex, the superior temporal sulcus, intraparietal sulcus, and frontal regions contain multisensory neurons. Even primary sensory cortices show cross-modal influences: auditory stimulation can modulate activity in primary visual cortex, and vice versa.
Development and Plasticity
Multisensory integration develops gradually over the first years of life. While newborns show some cross-modal matching abilities (preferring faces that match speech sounds), the adult-like principles of integration — particularly the spatial and temporal rules — take years to mature. In individuals who are blind or deaf from birth, cortical areas normally devoted to the missing sense are recruited for the remaining senses, demonstrating the remarkable plasticity of multisensory cortical organization.