Multimedia Use 15 Minute Read

Principles of Multimedia Learning

Richard Mayer’s seminal book Multimedia Learning details his extensive research on how to structure multimedia materials effectively to maximize learning. Relying on numerous experiments, he distills his findings into 12 principles that constitute (in part) what he refers to as the “cognitive theory of multimedia learning.” This theory and its principles provide guidance on how to create effective multimedia presentations for learning.

This article introduces the cognitive psychology foundation upon which Mayer’s principles are built and then summarizes each principle. Let’s begin by discussing Mayer’s assumptions on how people learn.

Information Processing

Mayer’s cognitive theory of multimedia learning makes three assumptions about how humans process information: the dual-channel assumption, the limited-capacity assumption, and the active-processing assumption.

The Dual-Channel Assumption

According to Mayer (2009), the dual-channel assumption dictates that “humans possess separate channels for processing visual and auditory information” (p. 63). The first is the visual–pictorial channel, which processes images seen through the eyes (including words displayed on a screen). The other channel is the auditory–verbal channel, which processes spoken words.

The Limited-Capacity Assumption

The limited-capacity assumption suggests that humans have a hard limit on the amount of information they can process at any given moment. This is probably intuitive to anyone who’s sat in a sports bar and tried to watch several games at the same time or tried to listen to the news while having a conversation.

Although the exact figure is hard to pin down, Mayer suggests that most people can hold about five to seven “chunks” of information in working memory at a given time (p. 67). He also indicates that individuals at the higher end of that range may have stronger metacognitive strategies, which allow them to manage their limited cognitive resources more efficiently.

The Active-Processing Assumption

The active-processing assumption asserts that humans don’t learn by just passively absorbing information. Instead, they need to engage in active cognitive processes, namely identifying and selecting relevant material, organizing it into visual and/or verbal models, and integrating those new models with prior knowledge (p. 70). The cognitive theory of multimedia learning fundamentally argues against a “knowledge transmission” approach to learning in favor of a student-centered “knowledge construction” model. Students, Mayer argues, are not “empty vessels” waiting to be filled up with information but must instead work to synthesize words and pictures into meaningful information that is stored in long-term memory.

Cognitive Load Theory

Mayer’s cognitive theory of multimedia learning also relies heavily on cognitive load theory. While we discuss it in greater depth elsewhere, the underlying premise is that the kind of information we encounter during learning leads to one of three different types of processing in the brain.

Extraneous load (also known as “extraneous processing”) refers to wasted cognitive effort on material or details that don’t support the learning outcomes. Instructors can minimize extraneous load by focusing narrowly on the essential material and eschewing everything that could distract learners (such as needless animations or irrelevant information).

Intrinsic load (also known as “essential processing”) refers to the cognitive effort required to represent the material in working memory and is based on the complexity or difficulty inherent to the learning materials. Instructors should aim to manage intrinsic load by chunking their materials and identifying technical terms in advance.

Germane load (also known as “generative processing”) is the effort required of learners to actually understand the material and is strongly affected by their motivation. Instructors should optimize germane load by scaffolding learning and pacing material appropriately.

In some ways, we can see cognitive load theory as an extension of the limited-capacity assumption. Given that we have a limited ability to process information in real time, instructors should aim to construct multimedia messages that manage intrinsic load, optimize germane load, and minimize extraneous load to ensure maximum storage in long-term memory. So while Mayer’s principles provide insight on how to effectively construct multimedia messages for learning, each also maps to a best practice in managing cognitive load.

In short, the cognitive theory of multimedia learning assumes that the human mind is a dual-channel, limited-capacity, active-processing system, and that presenters must construct multimedia messages to manage all three types of cognitive load accordingly. Mayer adopts a constructivist view of learning in which multimedia are not simply information delivery systems, but rather cognitive aids for knowledge construction (p. 14).

Principles of Multimedia Learning

Now that we’ve established the cognitive psychology foundation, let’s move on to summarizing each principle.

Principles That Minimize Extraneous Load

The Coherence Principle

“People learn better when extraneous material is excluded rather than included.” (p. 89)

The coherence principle is about minimizing extraneous processing. Instructors should not include information in their multimedia messages that will not be assessed, is merely intended to “spice up” the presentation, or distracts from learning goals overall.

Mayer also warns against including seductive details (interesting but irrelevant material that the presenter might include to re-engage the audience or create emotional responses), which the audience often retains better than the intended core message (p. 97). Given that learning is an active process, these extraneous details may interfere with learners’ construction of mental models to represent the material.

To address this principle:

  • Include only graphics, text, and narration that support learning goals (i.e., don’t use decorative images or supplemental materials).
  • Don’t use background music.
  • Use simple visuals (as opposed to realistic or detailed visuals).

The Signaling Principle

“People learn better when cues that highlight the organization of the essential material are added.” (p. 108)

Particularly when multiple pieces of information are on-screen, learners need to know what to pay attention to, where they are in the presentation, and how to integrate the information to construct their own mental models. Accordingly, the signaling principle recommends that instructors add cues that direct learners’ attention to salient material. Mayer is careful to point out that this can be overdone, so presenters should use signals sparingly.

To address this principle:

  • Use arrows, highlighting, and other signals to draw attention to important information.
  • Include an advance organizer (content that presents the organizational structure of your multimedia presentation) and refer back to it when you advance to a new section.

The Redundancy Principle

“People learn better from graphics and narration than from graphics, narration, and printed text.” (p. 118)

Many multimedia presentations involve a combination of spoken words, graphics, and on-screen text. However, the redundancy principle suggests that multimedia messages are most effective when learners encounter just spoken words and graphics. When instructors include text on-screen, they risk overwhelming their learners’ visual channels with both pictures and words, and inadvertently direct their cognitive processes to resolving differences between the spoken text and the printed text.

To address this principle:

  • When delivering a narrated presentation, use either graphics or text, but not both.
  • Minimize the use of text during a narrated presentation.
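
As a rough illustration, the rule above can be turned into a simple check over a slide deck. This is only a sketch: the `Slide` record, its fields, and the `max_words` tolerance are hypothetical names invented for this example, not anything Mayer specifies.

```python
from dataclasses import dataclass


@dataclass
class Slide:
    """Hypothetical description of one slide in a presentation."""
    title: str
    has_narration: bool
    has_graphics: bool
    on_screen_words: int  # rough count of body-text words shown on screen


def redundancy_warnings(slides, max_words=0):
    """Return titles of narrated slides that pair graphics with on-screen text.

    max_words is an assumed tolerance for this sketch, not a figure from Mayer.
    """
    return [
        s.title
        for s in slides
        if s.has_narration and s.has_graphics and s.on_screen_words > max_words
    ]


deck = [
    Slide("How lightning forms", True, True, 120),  # narration + graphics + text
    Slide("Key terms", True, False, 40),            # narration + text only
    Slide("Charge separation", True, True, 0),      # narration + graphics only
]
print(redundancy_warnings(deck))  # ['How lightning forms']
```

Only the first slide is flagged, because it is the only one that combines narration, graphics, and printed text.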

The Spatial Contiguity Principle

“Students learn better when corresponding words and pictures are presented near rather than far from each other on the page or screen.” (p. 135)

The specifics of the spatial contiguity principle may be somewhat more intuitive than Mayer’s other principles. In short, it suggests that instructors should keep text (such as labels or captions) near to the graphics that they describe. If they do so, they minimize the cognitive effort that learners must expend to align the meaning of text and images themselves. Thus, instead of scanning the screen to make such connections, learners can devote that cognitive effort to integration and connection building.

To address this principle:

  • Place text in close proximity with the graphics it refers to.
  • Provide feedback close to the questions or answers it refers to.
  • Present directions on the same screen as an activity.
  • Have people read any text before beginning an animated graphic.

The Temporal Contiguity Principle

“Students learn better when corresponding words and pictures are presented simultaneously rather than successively.” (p. 153)

To maximize learning, the temporal contiguity principle dictates that narration and animation should be delivered concurrently. For example, students shouldn’t hear about a process and then watch an animation of it afterward; instead, instructors should time the narration to play along with the animation.

To address this principle:

  • Time narration appropriately to play along with animations.
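
One way to sanity-check timing is to measure how much of an animation plays while its narration is audible. The sketch below assumes each clip is described by a hypothetical `(start, end)` pair in seconds on a shared timeline; the function name is likewise made up for illustration.

```python
def overlap_fraction(narration, animation):
    """Fraction of the animation's duration during which narration is playing.

    Both arguments are (start, end) tuples in seconds on the same timeline.
    """
    n_start, n_end = narration
    a_start, a_end = animation
    overlap = max(0.0, min(n_end, a_end) - max(n_start, a_start))
    duration = a_end - a_start
    return overlap / duration if duration > 0 else 0.0


# Narration and animation play together: full overlap.
print(overlap_fraction((0, 10), (0, 10)))   # 1.0
# Narration finishes before the animation starts: no overlap.
print(overlap_fraction((0, 10), (10, 20)))  # 0.0
```

A value near 1.0 suggests the words and pictures are presented simultaneously; a value near 0.0 suggests they are presented successively, which is the arrangement this principle warns against.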

Principles That Manage Intrinsic Load

The Segmenting Principle

“People learn better when a multimedia message is presented in user-paced segments rather than as a continuous unit.” (p. 175)

Mayer’s experiments involved presenting asynchronous multimedia messages to research subjects (messages that largely focused on describing processes, such as how lightning forms). He determined that when students had the ability to control the pace of the lesson, they performed better on recall and transfer tests. Thus, the segmenting principle has two implications: (a) users should have control over the pace of the multimedia lesson, and (b) instructors should chunk material appropriately to allow for adequate processing on each slide or screen.

To address this principle:

  • Allow users to control the pace of the lesson, for example with speed controls or “next” buttons.
  • Break down long segments of material into smaller pieces.
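
Breaking material into smaller pieces can be sketched as a simple chunking step. The function below is a minimal illustration: the name `segment` and the default of three items per segment are assumptions made for this example, not recommendations from Mayer.

```python
def segment(items, max_per_segment=3):
    """Split a list of narration sentences (or steps) into consecutive segments."""
    if max_per_segment < 1:
        raise ValueError("max_per_segment must be at least 1")
    return [
        items[i:i + max_per_segment]
        for i in range(0, len(items), max_per_segment)
    ]


steps = [f"step {n}" for n in range(1, 8)]  # placeholder content
for number, chunk in enumerate(segment(steps), start=1):
    # In a real lesson, each chunk would sit behind its own "next" button.
    print(number, chunk)
```

Each resulting segment would then be shown on its own slide or screen, with the learner deciding when to advance.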

The Pre-Training Principle

“People learn more deeply from a multimedia message when they know the names and characteristics of the main concepts.” (p. 189)

The necessity of managing essential (or intrinsic) load suggests that it’s easy for novice learners to become overwhelmed by the quantity or complexity of the information in a multimedia message. The pre-training principle accordingly recommends that instructors define key terms or concepts before diving into descriptions of processes. Otherwise, students will be stuck trying to learn a process’s component parts while also attempting to build a mental model of the process itself, which may hinder learning. In essence, pre-training is about scaffolding learning and helping students establish appropriate prior knowledge before beginning a multimedia lesson.

To address this principle:

  • Define key terms (such as names, definitions, locations, and characteristics) before beginning a process-based presentation, for example in a separate presentation, handout, or similar material.
  • Ensure people know how to use a tool (such as Excel) before asking them to perform learning activities within it.

The Modality Principle

“People learn more deeply from pictures and spoken words than from pictures and printed words.” (p. 200)

The dual-channel and limited-capacity assumptions lead in part to the modality principle, which recommends that instructors use narration instead of on-screen text when pictures are present. If multimedia messages contain pictures and on-screen text, the combination may overwhelm learners’ visual channels. Instead, instructors should speak the words (rather than include them on-screen), which spreads the load across both the visual and the auditory channels (also known as “modality offloading”; p. 204).

To address this principle:

  • During a narrated presentation with graphics, avoid using on-screen text, unless it:
    • Lists key steps
    • Provides directions
    • Provides references
    • Presents important information to non-native English speakers

Principles That Optimize Germane Load

The Multimedia Principle

“People learn better from words and pictures than from words alone.” (p. 223)

You could argue that the multimedia principle is a starting point for all the other principles, given that it indicates that learners perform better when exposed to words and pictures rather than just words. Given that multimedia presentations may or may not be narrated, it’s important to underscore that the “words” in this case should be either printed or spoken, but not both (in keeping with the other multimedia principles). Effectively leveraging pictures and words together fosters generative processing.

To address this principle:

  • Include images to illustrate key points.
  • Ensure that all images enhance or clarify meaning (rather than being purely decorative).
  • Favor static images over animations (with some exceptions).

The Personalization Principle

“People learn better from multimedia presentations when words are in conversational style rather than formal style.” (p. 242)

According to the personalization principle, having a more relaxed tone in an online class can actually positively impact learning. Thus, instructors should avoid stiff, academic language, and instead use more approachable colloquial language. Try to think of the presentation as a one-on-one conversation with each student. Informal language has the effect of creating social cues within the presentation that “prime the activation of a social response in the learner—such as the commitment to try to make sense out of what the speaker is saying” (p. 247).

To address this principle:

  • Use contractions.
  • Use first and second person (“I,” “you,” “we,” “our,” etc.).
  • If using a script, aim for a delivery that sounds extemporaneous.
  • Use polite speech (“please,” “you might like to,” “let’s,” etc.).

The Voice Principle

“People learn better when narration is spoken in a human voice rather than in a machine voice.” (p. 242)

The voice principle is perhaps the oddest of the group, but it is still worthy of mention, particularly given the speed at which technology is developing. This principle suggests that narration is better done by a human than a computer. Mayer stresses that the research on this principle is still preliminary.

To address this principle:

  • Include narration that’s performed by a human rather than a computer.

The Image Principle

“People do not necessarily learn better when the speaker’s image is added to the screen.” (p. 242)

The image principle is the only multimedia principle that’s not affirmative in its phrasing. It states that including an image of an instructor’s “talking head” during a multimedia presentation doesn’t necessarily improve learning outcomes. Just as with the voice principle, Mayer is careful to point out that the research on the image principle is still preliminary. Nonetheless, early results suggest that you don’t necessarily add value by showing your face during a narrated presentation.

To address this principle:

  • Avoid including a video of yourself during an asynchronous multimedia presentation containing pictures and words.
  • Consider including your face when:
    • There are no words or pictures.
    • You wish to establish instructor or social presence.

Boundary Conditions

Mayer is careful to set “boundary conditions” for his multimedia principles—situations in which the principles may not apply as strongly. For example, with respect to the segmenting principle (which advises multimedia designers to chunk their materials and allow users to control pacing), Mayer’s research suggested that its effects may not be as strong when the material is simple, when the material is slow paced, or when learners are experienced with the material.

Although each principle has its own set of these conditions (which we encourage you to read about in Mayer’s book if you’re interested), there is at least one high-level qualification that’s worth mentioning. Mayer proposes an overarching individual differences principle, which suggests that “certain of the twelve design principles reviewed in this book may help low-experience learners but not help high-experience learners” (pp. 271–272). This speaks strongly to the role of prior knowledge in multimedia learning—indeed, in learning overall. Mayer argues, in fact, “prior knowledge is the single most important individual difference dimension in instructional design. If you could know just one thing about a learner, you would want to know the learner’s prior knowledge in the domain” (p. 193).

You may be wondering about what kind of media Mayer is addressing in his research. Although generally his experiments involved asynchronously produced multimedia presentations (that is, there were no live lectures), they were presented across a variety of media. Accordingly, he believes these principles embody best practices across a variety of media:

The cognitive theory of multimedia learning is based on a knowledge-construction view in which learners actively build mental representations in an attempt to make sense out of their experiences. Instead of asking which medium makes the best deliveries, we might ask which instructional techniques help guide the learner’s cognitive processing of the presented material. (p. 231)

With these points in mind, it’s worth summing up the boundary conditions that apply to the cognitive theory of multimedia learning:

  • The principles apply to low-knowledge learners.
  • The principles apply to multimedia messages that describe processes.
  • The principles are medium agnostic.


Mayer’s overarching thesis—that people learn better when you use pictures and words together—may be intuitive to many instructors. What may be less intuitive, however, is how to maximize the efficacy of multimedia messages based on the specifics of how humans process information during learning. Mayer’s theories are a rejection of multimedia learning as knowledge transmission (transplanting information from instructor to learner) and response strengthening (promoting recall through drill and practice methods). Instead, the theory embraces a knowledge construction perspective: “that multimedia learning is a sense-making activity in which the learner seeks to build a coherent mental representation from the presented material” (p. 17).

Here are the big takeaways from this article:

  • When it comes to learning, the human mind is a dual-channel, limited-capacity, active-processing system.
  • Instructors should manage their learners’ essential processing, optimize their generative processing, and minimize their extraneous processing through thoughtful construction of multimedia presentations.
  • These principles are most applicable when the multimedia messages describe processes and when learners are inexperienced.
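
For quick reference, the way this article groups the twelve principles by load type can be written as a small lookup table. This is only a convenience sketch: the names `PRINCIPLES_BY_LOAD`, `GOAL`, and `goal_for` are invented for this example.

```python
# Grouping of the twelve principles as presented in this article.
PRINCIPLES_BY_LOAD = {
    "extraneous": [
        "coherence", "signaling", "redundancy",
        "spatial contiguity", "temporal contiguity",
    ],
    "intrinsic": ["segmenting", "pre-training", "modality"],
    "germane": ["multimedia", "personalization", "voice", "image"],
}

# What the instructor is trying to do with each type of load.
GOAL = {"extraneous": "minimize", "intrinsic": "manage", "germane": "optimize"}


def goal_for(principle):
    """Return the load-management goal that a given principle serves."""
    for load, names in PRINCIPLES_BY_LOAD.items():
        if principle in names:
            return f"{GOAL[load]} {load} load"
    raise KeyError(principle)


print(goal_for("modality"))   # manage intrinsic load
print(goal_for("coherence"))  # minimize extraneous load
```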

Clearly, Mayer’s multimedia principles provide quite a few guidelines for the design of multimedia presentations. For convenience, we’ve summarized them in a table in a separate document, which you can download here.

Mayer’s theory aligns with contemporary thinking on effective learning, which embraces a constructivist perspective: Students learn most effectively when they have to construct their own knowledge structures and mental models. As Mayer tells us, “instructional design involves not just presenting information, but also presenting it in a way that encourages learners to engage in appropriate cognitive processing” (p. 168). By following the principles of the cognitive theory of multimedia learning, instructors can help ensure that their multimedia presentations will enhance student learning.


Mayer, R. E. (2009). Multimedia learning (2nd ed.). Cambridge, England: Cambridge University Press.

Posted July 19, 2016
Author Galen Davis and Marie Norman