AdvancedDialogueSystem

Filed in: Ideas.AdvancedDialogueSystem · Modified on : Mon, 17 Jan 11

The Advanced Dialogue System is a new system for presenting dialogue and other textual information, primarily in video games (as per usual).

Evolution of Presentation of Dialogue in Video Games

Note: I'm largely assuming this as the natural progression; it's not based on fact.

  • --More-- : In the beginning, dialogue was little more than the UNIX more command, in that it was a stream of text, paginated and word-wrapped. This is evident in text adventure games like Zork, et al., but it is still around in the form of Zelda.
  • Typewriter: In order to make the dialogue seem more like it was happening in real time, the text was "typed out", as if the character speaking were typing onto the screen. In my opinion, this is just a waste of time: the effect is robotic, and basically means I have to press the button twice to get past a page of dialogue.
  • Voice-overs: In the age of CD-ROM, we were introduced to dialogue spoken by professional voice actors and played back via the DAC. Really, this is an ideal situation: what better way to present characters' speech than to have them actually speak it? There are several problems though:
    • It requires an actual human being to recite the lines. This is expensive in both time and money. For all intents and purposes, it also rules out dynamically generated conversations.
    • It requires vast amounts of storage for the audio data. This is prohibitive on platforms where storage is at a premium.
    • It requires a different set of audio data for each locale.
  • Text effects: Standard typographical effects, like font weight, size, and family, as well as animated effects like shaking text to show a character being nervous.
  • Variable speed typewriter: The speed at which text is streamed from the typewriter varies in time, adding some humanity. For example, Phoenix Wright: Ace Attorney. This effect is especially useful because it provides something sorely lacking in previous systems: pacing. However, it does add some significant complexity to the writing.
  • Gibberish-speak: Instead of having prerecorded human recitations of the dialogue, we capture various syllables and phonemes, and randomly play them during the speech. This is evident in Animal Crossing and The Sims. This adds another few layers to the dialogue, like tone, speed, and vocal characterisation. It alleviates the problems of voice-overs almost entirely, but adds a new one, namely:
    • It's gibberish, not actual words. The only way to work around this problem is fully synthesised speech.
  • Background sound: Sound effects to jazz up the speech.
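
To make the variable speed typewriter idea above concrete, here is a minimal sketch of how the timing might be computed. It is not taken from any actual game: the function name typewriter_schedule and the delay values are assumptions made up for illustration. Each character costs a base delay, and punctuation adds an extra pause after it, which is about the simplest form of pacing I can think of.

```python
# Hypothetical per-character delays in seconds; the values are illustrative only.
BASE_DELAY = 0.03
PAUSES = {",": 0.15, ";": 0.20, ":": 0.20, ".": 0.40, "!": 0.40, "?": 0.40}


def typewriter_schedule(text, base_delay=BASE_DELAY, pauses=PAUSES):
    """Return (time, char) pairs: when each character should appear on screen."""
    schedule = []
    clock = 0.0
    for char in text:
        schedule.append((round(clock, 3), char))
        # Every character costs the base delay; punctuation adds a pause after it.
        clock += base_delay + pauses.get(char, 0.0)
    return schedule


if __name__ == "__main__":
    for when, char in typewriter_schedule("Hold it! That's wrong."):
        print(f"{when:6.3f}  {char!r}")
```

A renderer would walk this schedule and reveal each character once its time has passed. The added writing complexity mentioned above shows up as soon as the pauses need to be overridden by hand for a particular line.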

Prior Art

The dialogue presentation system in the Mario and Luigi series of Nintendo RPGs has always seemed very flexible and capable of adding a lot of character to the text. I believe this is part of the key to the well-executed humour in the series. After all, apparently the key to comedy is timing.

Proposal

The Advanced Dialogue System (ADZ) adds to the above with the following:

  • Finer pacing control: Lines of dialogue are broken into sentences, words, and syllables, each of which can be spoken at a different rate. I believe this is one of the features generally lacking in games today. (It's possible that existing systems do have the ability to control the speech rate, but it seems to me very underutilised.) Some of this can be automated to a certain extent by a simple text parser. For example, separating words is fairly automatable, but separating syllables should generally be done by a human.
  • Gibberish control: The ability to annotate parts of speech with different types of gibberish. For example, a question sentence could have a rising inflexion at the end, which can be emulated easily with audio filters. Some of this could be automated too: detecting a question should be simple enough.
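
The automatable parts of the two items above could be sketched roughly as follows. This is only an assumption about what the "simple text parser" might look like: the Sentence class, the parse_line function, and the "rising"/"flat" labels are all hypothetical names. It splits a line into sentences and words and flags questions. It deliberately does not attempt syllables, since those are better marked by a human.

```python
import re
from dataclasses import dataclass


@dataclass
class Sentence:
    words: list
    is_question: bool
    # Hypothetical annotation that a gibberish renderer could consume.
    inflexion: str = "flat"


def parse_line(line):
    """Split a line of dialogue into sentences and words, tagging questions."""
    # A sentence is a run of text followed by any ., ! or ? characters.
    chunks = re.findall(r"[^.!?]+[.!?]*", line)
    sentences = []
    for chunk in chunks:
        chunk = chunk.strip()
        if not chunk:
            continue
        is_question = chunk.endswith("?")
        inflexion = "rising" if is_question else "flat"
        sentences.append(Sentence(chunk.split(), is_question, inflexion))
    return sentences


if __name__ == "__main__":
    for sentence in parse_line("Where are you going? I don't know."):
        print(sentence)
```

This naive splitting treats an ellipsis as the end of a sentence and knows nothing about abbreviations, so a real system would still need a way for the writer to override the parser.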

MIDI SMF Extension

After much thought, I think there might be a better way to do this. A character's voice is really just a synthesizer with karaoke. Luckily, synthesizer+karaoke has already got a file format: the Lyric extension to the Standard MIDI File. Using MIDI files, we can encode pitch, slides, and all sorts of vocal cues without actual waveforms.
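
As a rough sketch of what that could look like, the following writes a format 0 Standard MIDI File by hand, pairing each syllable with a Lyric meta event (FF 05) and a note. The helper names vlq and build_smf, the choice of channel 0 and velocity 64, and the example pitches are my assumptions, not part of any prototype. Slides would need pitch-bend events, which are left out here.

```python
import struct


def vlq(value):
    """Encode a non-negative integer as a MIDI variable-length quantity."""
    out = [value & 0x7F]
    value >>= 7
    while value:
        out.append((value & 0x7F) | 0x80)
        value >>= 7
    return bytes(reversed(out))


def build_smf(syllables, ticks_per_beat=480):
    """Build SMF bytes from (text, midi_note, duration_in_ticks) tuples."""
    track = b""
    for text, note, duration in syllables:
        data = text.encode("ascii")
        # Lyric meta event at delta 0, so it lines up with the note that follows.
        track += vlq(0) + b"\xFF\x05" + vlq(len(data)) + data
        # Note on: channel 0, velocity 64 (both arbitrary choices).
        track += vlq(0) + bytes([0x90, note, 64])
        # Note off after `duration` ticks.
        track += vlq(duration) + bytes([0x80, note, 0])
    track += vlq(0) + b"\xFF\x2F\x00"  # end-of-track meta event
    # Header chunk: length 6, format 0, one track, ticks per quarter note.
    header = b"MThd" + struct.pack(">IHHH", 6, 0, 1, ticks_per_beat)
    return header + b"MTrk" + struct.pack(">I", len(track)) + track


if __name__ == "__main__":
    # A question: the second syllable is pitched higher for a rising inflexion.
    with open("line.mid", "wb") as f:
        f.write(build_smf([("Hel", 60, 240), ("lo?", 67, 240)]))
```

The nice part of this design is that the gibberish renderer and the text renderer can both be driven from the same file: the notes say how a syllable should sound, and the lyric events say what to show on screen at that moment.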

The Project

As I'm working on a prototype, I've created a page in the Projects section for this.

Other People's Work

  • Ren'Py: A "visual novel" system in Python.
