6.1.09

The challenge of making dynamic dialogue in a sim

A first-person perspective is critical in educational simulations. It adds intensity and context. But creating a satisfactory, dynamic dialogue between the player and any avatars is incredibly difficult. Designers have to make a lot of decisions, all of them imperfect.

From an output perspective, here are some choices:

  • Recorded voice: We can write a script (including many variations), and have actors speak it in a recording studio. The advantage is a very high-quality, emotional set of dialogue. The bad news is that it is expensive, very hard to change, and the file size gets very large very quickly.
  • Text, no voice: We can use word balloons (or some variation). The advantages are that they are easy to change, we can have a lot of them because there is no recording expense, and their file size is very small. The problem is that they don't have much of an emotional punch. For some people, having to process text also increases the cognitive load. The result is a more intellectual and less emotional experience.
  • Text, mumble voice: We can also do a variation of the above, with text, but with avatars speaking in a mumbled voice, very similar to The Sims. The mumbled quotes can provide emotional shading, like angry or excited. Despite the massive popularity of The Sims, however, this mumbled voice throws off many people, especially when it is in the forefront of the action, as with a first-person perspective sim.
  • Computer rendered voice: If I didn't mention this one, someone else would. It is possible to have computer generated voices in a sim. In theory, this would have all the advantages of options one and two. However, the technology is not very good, and the effect sounds cheap and distracting.
  • Icon, no text, no voice: Another Sims-inspired approach is to have icons representing what the person is saying rather than words. This is nice because it forces the conversation to a strategic level, where players often enough should be anyway. This can get around some of the distractions and costs of having specific words. Players are also more tolerant of seeing the same icon flash again and again than of hearing the same quote. The trouble is that a lot of people have a hard time translating the icons; this can significantly disrupt the play experience of a person in the simulation and, more importantly, the transferability of the skills outside of the simulation.

Even more importantly, here are some of the structural approaches, with corresponding pros and cons:

  • First, one can use a traditional branching structure. Here, players respond to dialogue with typically one of three options. Based on the option they take, they hear a response and are given another set of decisions. The nice part about this approach is that the designers have total control over the experience, minimizing any weirdness. The bad news is that it's very expensive, cumbersome, and limits the amount of dynamic interaction a player has. It also can feel manipulative and capricious to the player if not implemented rigorously.
  • Second, one can use a structure of syntax and buckets. This is similar to what we did with Virtual Leader. We created a high level syntax for all dialogue, and pulled from the appropriate buckets of quotes as they were engaged. This system created a framework that enabled potentially infinitely long and infinitely varied dynamic conversations. Both idea-specific (custom) and generic dialogue could be played. The bad news is that some of the combinations of dialogue sound a little off. It also required a huge number of quotes, which was fine across a 10-hour experience but may not be possible for an experience of one hour or less.
  • Third, dialogue can be structured to be system driven. When there's an underlying open-ended system (or systems) to the conversation, such as a conceptual map, then dialogue can be triggered or otherwise generated at milestones or positions. If a player switches states, such as from angry to despondent, then a specific quote would be played ("It seems hopeless"). Or, if the conversation was probing and uncovered some fog of war, the right quote would be played ("I know, we could use a treatment combining two chemicals"). Dialogue could be played every turn, or only at key junctures.
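
As a rough illustration of how these three structures differ, here is a minimal Python sketch. Everything in it (the function and class names, the sample lines, the moods, the "budget" idea) is hypothetical and invented for this example; it is not how Virtual Leader or any other sim was actually built.

```python
import random

# 1. Branching: every reply and every follow-up choice is authored by hand.
#    (Hypothetical three-option node; all lines are made up.)
BRANCHES = {
    "start": {
        "line": "The project is behind schedule.",
        "choices": {"blame": "defensive", "ask": "explains", "ignore": "end"},
    },
    "defensive": {"line": "That is not my fault.", "choices": {}},
    "explains": {"line": "We lost two engineers last month.", "choices": {}},
    "end": {"line": "Fine. Forget I said anything.", "choices": {}},
}


def branch_step(node_id, choice):
    """Follow one authored edge; return (next node id, line spoken there)."""
    next_id = BRANCHES[node_id]["choices"][choice]
    return next_id, BRANCHES[next_id]["line"]


# 2. Syntax and buckets: a high-level "syntax" (here, act + idea) selects a
#    bucket, and any quote in that bucket may be played.
BUCKETS = {
    ("support", "generic"): ["I agree.", "That makes sense to me."],
    ("oppose", "generic"): ["I don't think so.", "That won't work."],
    ("support", "budget"): ["Cutting the budget is the right call."],
}


def bucket_line(act, idea, rng):
    """Prefer an idea-specific quote; fall back to the generic bucket."""
    quotes = BUCKETS.get((act, idea)) or BUCKETS[(act, "generic")]
    return rng.choice(quotes)


# 3. System driven: the underlying state changes on its own terms, and a
#    quote fires only when a transition counts as a milestone.
TRANSITION_QUOTES = {
    ("angry", "despondent"): "It seems hopeless.",
    ("calm", "angry"): "I have had enough of this.",
}


class Character:
    def __init__(self, mood):
        self.mood = mood

    def set_mood(self, new_mood):
        """Change state; return a quote if this transition has one, else None."""
        quote = TRANSITION_QUOTES.get((self.mood, new_mood))
        self.mood = new_mood
        return quote


if __name__ == "__main__":
    rng = random.Random()
    print(branch_step("start", "ask")[1])
    print(bucket_line("oppose", "budget", rng))
    print(Character("angry").set_mood("despondent"))
```

The trade-offs show up in the data, not the code: the branching table needs a hand-written node for every path, the buckets need many quotes per bucket to avoid repetition, and the system-driven table only needs quotes for the transitions that matter.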

I don't have an easy solution here. Each approach has its pros and cons. But as with many things, sometimes just understanding the trade-offs makes the relationship between sim sponsor, designer, and user much more productive.

6 comment(s):

Steven Egan said...

There is another possibility with the text, adding the pros and cons of icons. Dynamic formatting such as color, size, background and more can give the emotional flavoring.

Clark Aldrich said...

That's a great point, Steven.

Justin Gibbs said...

Coming from a screenwriting background, I believe a lot of it will come down to how the dialogue is written. Screenwriters are trained to show emotion through their characters' words and actions. Given that the actions might be a bit limited in this format, a good screenwriter should still be able to craft the words to express emotion. Of course, this would be no trivial task either.

Clark Aldrich said...

Justin,

I agree with you, and thank you for a great point. I wonder if writing for non-linear environments will ever become a discipline. I also would like to better understand how projects at different budget levels productively engage real writers.

Peter Shea said...

Clark,

Have you ever used music in your simulations? Something subtle, but indicative of the player's status?

Clark Aldrich said...

Peter,

No I have not, but hmmmmmmm, I never thought about that..... maybe.....