The quest for the holo-grail

Andy Finney is looking beyond simple red-button interactivity. He is looking forward to the day that television is as immersive as the Star Trek holodeck.
Article first published: Autumn 2005

Marc Price from BBC R&D’s 3D Interactive Media Lounge. Image: BBC.

Streams and timing on interactive TV. Courtesy of MECiTV Project.
A couple of years ago, I boldly went on an exploration of the exhibition floors at IBC in Amsterdam to see what kind of things were being produced for the emerging interactive television platforms: satellite with oodles of channels, cable with a return path and digital terrestrial with potential (TV over the internet – IPTV – was not then on the agenda).

There was talk of middleware, carousels and red buttons – strategies for placing quarter-screen video nicely alongside textual information – and several ingenious companies wringing the last drops of functionality out of a variety of set-top boxes. It was all strangely familiar: the kinds of problems and solutions being faced by interactive media pioneers in the 1980s were being addressed yet again by interactive TV producers in the new millennium. Surely we’ve come further down the technological road than this?

Well, maybe we haven’t as yet. While the average home computer (not to mention games console) has a NASA’s worth of computing under the lid, the processing and storage available to a digital TV set-top box seems positively puny by comparison. Yes, they can unzip compressed video and audio without breaking sweat, but their interactive abilities are not so impressive. Information has to cycle round on a carousel (just like good old-fashioned teletext) and many of the most interesting interactive applications in the past few years have relied on the ability to switch channels without really seeming to do so.

I can’t blame the manufacturers for this: interactive functionality probably doesn’t sell boxes and the public want to switch to digital without spending very much – and if they do want to spend more then they probably want a PVR. Add to this the imperative of switchover and you can see the current consumer market has enough on its plate.

Let’s take a step back from this. Just as the famous researchers at Xerox PARC assumed everyone had a supercomputer of their own and so developed things like mice and WYSIWYG text editing, what can we imagine if we temporarily forget the limitations of today’s set-top boxes? What kind of interactive experiences should we be aiming for? How do we get from the red button to a fully immersive interactive entertainment experience like the fictional Star Trek holodeck?

The challenge

A holodeck is a fictional entertainment and information system used in the future world of the Star Trek TV series (you knew that, didn’t you?). In the holodeck, the user/player/inhabitant is presented with seemingly real and solid objects and people (human or otherwise) with whom to interact. It’s a three-dimensional interactive environment into which users can enter, and they influence the development of the scenario by their actions and, occasionally, by giving verbal instructions to the controlling computer. It is programmable by its users before entering and while using it, or they can enter a scenario programmed by someone else.

Lawrence M Krauss’ engaging book The Physics of Star Trek tells us that the holodeck is a development of a replicator machine that can build solid objects from patterns held in a computer. Just as Xerox turned a photocopier into a laser printer, the fictional Federation engineers turned a replicator into the holodeck. The concept first appeared in the Next Generation series, which debuted in America in 1987 and was famously exploited by the crew to inhabit fictions. Commander Data became Sherlock Holmes, Captain Picard programmed himself into something like a Raymond Chandler plot and Captain Janeway (of Voyager) even played at being a Jane Eyre clone.

Science fiction has occasionally given us glimpses of future media and the way we might use them, so this aspect of Star Trek was continuing a noble tradition. Isaac Asimov came up with one in a 1955 short story called Dreaming is a Private Thing. In this story, Asimov suggests that a future industry would sell dreams as entertainment and that the key dreamers would be stars to be exploited by this new medium. Several SF movies have explored applications of what we would call virtual reality, of which The Matrix is a notable example.

Another writer who explored territory close to our holodeck paradigm is Greg Bear. His 1985 novel Eon describes a future society where people can upload copies of themselves into a database. In this society it has become acceptable to send a partial version of yourself, derived from this database, to carry out tasks both in the real world and in the ‘cyberspace’ of the societal database at large. They could even do your job for you when you went on holiday. These ‘partials’ are agents who are empowered to a certain level of authority, and this concept is something potentially useful for us in our quest.

Software agents were developed originally as extensions of object-oriented programming. Program objects are set up to carry out well defined tasks and, rather like engineering black boxes, you put certain information into the box and it will either do something or it will reply (a somewhat simplified but, I hope, appropriate explanation). If the object is also programmed with some degree of flexibility whereby it can modify its behaviour depending on circumstances, and if it can be left to get on with a task while your main program does something else, then it’s becoming an autonomous agent. Remember the true definition of delegation: set the ground rules, delegate the responsibility and the limits of authority and – most importantly – accept the consequences. Telcos were looking into agents as a way of carrying out maintenance tasks in highly complex communications networks but the idea has been taken up by researchers in interactive narrative.
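To make the idea a little more concrete, here is a minimal sketch in Python of an object that behaves like an agent in the sense described above. It is an illustration only, not a description of any telco's system: the class MaintenanceAgent, the function delegate and the link names are all hypothetical. The agent is given a task and a limit of authority (max_repairs), changes its behaviour when that limit is reached, and runs on its own thread while the main program gets on with something else.

```python
import threading
from dataclasses import dataclass, field


@dataclass
class MaintenanceAgent:
    """Hypothetical agent that checks network links and repairs them within limits."""
    max_repairs: int                                # limit of delegated authority
    log: list = field(default_factory=list)
    repairs_done: int = 0

    def handle(self, link: str, healthy: bool) -> str:
        """Black-box behaviour: information goes in, an action or a reply comes out."""
        if healthy:
            return f"{link}: ok"
        if self.repairs_done < self.max_repairs:
            self.repairs_done += 1
            return f"{link}: repaired"
        # Flexibility: beyond its authority the agent changes behaviour and escalates.
        return f"{link}: escalated"

    def run(self, reports):
        """Work through the delegated task without further instructions."""
        for link, healthy in reports:
            self.log.append(self.handle(link, healthy))


def delegate(agent, reports):
    """Start the agent on its own thread so the main program can carry on."""
    worker = threading.Thread(target=agent.run, args=(reports,))
    worker.start()
    return worker


if __name__ == "__main__":
    agent = MaintenanceAgent(max_repairs=1)
    worker = delegate(agent, [("link-a", True), ("link-b", False), ("link-c", False)])
    # The main program is free to do something else here.
    worker.join()   # come back later and accept the consequences
    print(agent.log)
```

The three parts of delegation map onto the sketch quite directly: the ground rules are in handle, the limit of authority is max_repairs, and accepting the consequences is reading the log afterwards.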

Content authoring

How do you author something for the holodeck? Commander Data could set up his program by just telling the system to use the Holmes canon as inspiration and also a character pool, but to derive a new story – in just about so many words. In this instance, the holodeck built a story involving Professor Moriarty, but managed to create a character that was able to exist outside the story being programmed and to interact with the outside world to some degree. This Moriarty was presumably an autonomous software agent. We can learn from these fictional incidents. Data’s first attempt at being Sherlock Holmes was something of a damp squib because he simply programmed the holodeck to follow the plot of the Conan Doyle stories; so he already knew ‘who did it’ – and where’s the cyber-fun in that? This seemingly simple issue is actually a very important one. Is it possible to have a truly interactive drama without literally losing the plot? Philip Pullman touched on this issue in a piece in the Guardian newspaper (23 July 2005). The His Dark Materials author was referring to video games when he wrote:

“It’s a question of preserving the integrity of the story. And the integrity of the story depends on there being a controlling intelligence thinking out every point of view. In the programmed waywardness and digitized serendipity of a video game, the shape of a story can get subverted, weakened, undermined, distorted and abandoned.”

It’s gratifying to have such distinguished (not to say eloquent) support for my thesis here. The best drama has an element of surprise that makes you want to follow the story to find out more. They’re called ‘Who Done It?’ plots for a reason. Once upon a time, there was a prolific thriller writer (my memory tells me it was Francis Durbridge) who would write regular mystery series for the fledgling BBC Television service. The series were broadcast live and, so the story goes, the intrepid author would listen carefully to everyone else’s theories about how the story would end and then do his best to make sure it happened another way – just to increase the suspense. Comedy also relishes surprises: pratfalls and slapstick have only a limited appeal.

Interactive storytelling

So the characters in an interactive drama could be software agents who work within a set of rules and have tasks to perform and goals to achieve. By interacting with them you influence their behaviour and can change the plot to some extent, but it should be possible to design the agent behaviour such that the main arc of the plot can continue. Of course, if you (as player) are in some kind of subsidiary role, this would be more easily achieved: in other words, it would be easier if you were Dr Watson than if you were Sherlock Holmes. But one of our participants in the IBC holodeck session, Marc Cavazza from the University of Teesside, has done some fascinating work on interactive storytelling. One strand involves mixed realities, in which you become not only a character in a James Bond ‘movie’, but are placed inside the scene to some degree, rather than merely being an observer of it. You become the villainous ‘professor’ and the way you behave will influence the way the Bond character behaves and the plot develops. Bond himself, in the meantime, has his own agenda.
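As a toy illustration of that design principle, and emphatically not a model of Marc Cavazza's actual system, the Python sketch below gives a character an ordered list of goals that stands in for the main arc of the plot. The player's actions change how the character behaves, but not which goal comes next. The class CharacterAgent, the function play and the action names are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class CharacterAgent:
    """Hypothetical character whose ordered goals together form the plot arc."""
    name: str
    goals: list              # remaining plot goals, in order
    mood: str = "calm"

    def react(self, player_action: str) -> str:
        """Let the player colour the scene without removing the next goal."""
        if player_action == "threaten":
            self.mood = "wary"
        elif player_action == "cooperate":
            self.mood = "friendly"
        if not self.goals:
            return f"{self.name} has nothing left to do"
        # The agent always pursues its next goal; only the manner changes.
        goal = self.goals.pop(0)
        return f"{self.name} ({self.mood}) {goal}"


def play(agent, player_actions):
    """Run one scene per player action and return the resulting story beats."""
    return [agent.react(action) for action in player_actions]


if __name__ == "__main__":
    bond = CharacterAgent("Bond", ["enters the lab", "finds the plans", "escapes"])
    for beat in play(bond, ["threaten", "cooperate", "threaten"]):
        print(beat)
```

A real system would need far richer planning than a fixed list, but even this shows the trade-off: the more rigid the list of goals, the safer the plot and the less the player's choices matter.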

I asked Marc to outline the areas he’ll be covering in his contribution to the ‘holodeck’ session. “Interactive films that reconcile user involvement with the traditional form of narrative have been a long-term endeavour. However, research in interactive storytelling has developed significantly over the past few years: the background for such development being not only long-awaited media convergence, but also recent technical advances in synthetic actor technology and new interfaces, such as mixed reality systems.

“Current interactive storytelling prototypes can be described as comprising two main components: an interface/medium component in charge of the immersive presentation of the narrative, and an artificial intelligence component which controls the overall story evolution and computes the consequences of user interaction. Interactive storytelling is strongly based on artificial intelligence technology, which underlies synthetic actor behaviour, user interaction (language understanding), narrative control and real-time display (virtual camera control). This constitutes a specific challenge of its own, as real-time artificial intelligence techniques have to be integrated with the imaging component.” Marc will make use of the recent research he’s been involved in to introduce the main concepts of interactive storytelling and present various examples of interactive storytelling systems. He will also discuss ways in which the viewer gets involved in the scenarios, which affect the kinds of interaction as well as the kinds of interactive media that enliven the viewer’s experience.

The switching concept

With coverage of real events, rather than drama, your ability to influence events could be nonexistent. Carmen Mac Williams from NOMADS Lab in Cologne – it stands for Non-linear Media, Art, Development and Science – has produced an interactive documentary (for an EC-funded project called MECiTV) called Vision Europe, which literally explores an event, in this case a Hungarian festival. As a viewer, you can choose to follow people who pass by and, as you do, you follow different strands of the documentary.

Carmen likens the staging of interactive TV to composing and conducting a masterpiece of music for an orchestra. “The music composer has the parameter of time and the possibility of using several different instruments playing parallel at the same time to create a masterpiece of music,” she says. “By staging for iTV, the video composer is also restricted by the time limit of the broadcast, and they can use parallel videos to create a story composition. The composer can synchronize and link the parallel video streams at special points in time to allow the viewer to switch channels due to their interests. The added value for the viewer is a deeper emotional and intellectual experience by intuitive mood-based navigation and the serving of their individual interests.” But, as with dramatic narrative, we must tread carefully. As Carmen says: “The key question for the switching concept is: how can the author stimulate the consumer to actively switch the video stream at the right time without disturbing her/his immersion into the story?” Carmen is now looking towards the 2008 Olympic Games, which should provide rich territory for interactive exploration. “We have to create the missing methods and tools for the live video conductor to stage real-time parallel broadcast video streams to create a multi-perspective show around one live media event such as the Olympics, which changes due to the interests of the viewer.”
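One way to picture the switching concept is as a timetable of synchronization points at which a change of stream is allowed. The Python sketch below is an assumption-laden illustration, not the MECiTV implementation: the function names, the stream names and the rule that a request simply waits for the next synchronization point are all invented here. It turns a viewer's switch requests into the sequence of segments they would actually see.

```python
from bisect import bisect_left


def next_switch_time(sync_points, request_time):
    """Return the first sync point at or after the request, or None if none is left."""
    i = bisect_left(sync_points, request_time)
    return sync_points[i] if i < len(sync_points) else None


def playback_plan(sync_points, requests, start_stream, duration):
    """Build (start, end, stream) segments from viewer switch requests.

    sync_points is a sorted list of times (in seconds) where streams are linked.
    requests is a list of (request_time, wanted_stream) pairs, sorted by time.
    """
    segments = []
    current, segment_start = start_stream, 0
    for request_time, wanted in requests:
        switch_at = next_switch_time(sync_points, request_time)
        if switch_at is None or switch_at >= duration or wanted == current:
            continue  # nothing to do: no sync point left, or already on that stream
        if switch_at > segment_start:
            segments.append((segment_start, switch_at, current))
        current, segment_start = wanted, switch_at
    segments.append((segment_start, duration, current))
    return segments


if __name__ == "__main__":
    plan = playback_plan(
        sync_points=[30, 60, 90],
        requests=[(10, "dancer"), (65, "musician")],
        start_stream="overview",
        duration=120,
    )
    for segment in plan:
        print(segment)
```

Holding each request until the next synchronization point is what lets the author keep every strand coherent, at the cost of a short wait for the viewer.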

Many of the issues being researched relate to the user interface. Sometimes it’s a question of how you inform the viewer that an interactive decision can be made, such as when to follow one of Carmen’s documentary subjects. Sometimes it’s how the viewer actually tells the system what they want to do, and Marc Cavazza’s Bond project includes the ability to interpret gestures as one way of moving the story forwards: “Sit over there Mr Bond!”

Collaborative environment

An interesting third option to explore is that of a collaborative user environment. This is beamed into the IBC holodeck session by Marc Price from BBC R&D. Marc’s ‘3D Interactive Media Lounge’ project embeds conventional broadcast media into a networked 3D game environment. It’s a virtual space in which several viewers/players can interact with the programmes they are watching as well as with games and with each other, so in this case the future interactive experience can be shared. It’s the cyberspace equivalent of sitting down, playing networked games and watching TV with your friends, and broadband internet allows the participants to be anywhere on the planet. Watching television is no longer the shared experience it was in the ‘bad old days’ when tens of millions of people regularly watched exactly the same programmes. But just as digital technology fragmented our viewing habits, it can also provide a means for us to share the experience in a different way. Partly this is because new generations of set-top boxes – for games – do have supercomputing capabilities. Marc makes the point that broadcasters have long ignored games consoles.

His take on the future of interactive entertainment is this: “Interactive broadcasting has the potential to revolutionize the way we use our home platforms. This can only be achieved by providing a means for the audience to interact with the content itself – rather than with the platform (as is done now), or with the broadcaster directly (via the return path).

“This paradigm involves the exploitation of highly interactive home platforms, such as PCs and games consoles, empowered by the connectivity of the home network. To move forward with this vision, the broadcast industry must focus its R&D efforts towards the necessary authoring tools for production of ‘intelligent content’.”

3D imaging

The final jigsaw piece that the Towards the Holodeck session at IBC will include is a state-of-the-nation presentation on three-dimensional imaging.

Our SF sources have long espoused three-dimensional imaging. Asimov’s 1957 novel The Naked Sun envisaged a society on a sparsely-populated but affluent planet where people just don’t want to be near each other. So, they use 3D imaging to communicate and this leads to a seismic shift in the language where the term ‘to see’ has personal and real connotations that the mere ‘to view’ does not: you view an image but you see the person for real.

In a 1980 TV drama on the BBC called The Flipside of Dominick Hide, the protagonist is a time traveller coming back in time to the present day, if I remember correctly, to observe our traffic chaos! He is familiar in his own time with a form of 3D video entertainment that projects apparently solid objects. In the play, there is a drawn-out piece of business set in the present day where he walks straight through a real musician in a restaurant thinking it’s a 3D image.

Part of the holodeck illusion is that of real people and objects inhabiting the same space as you. Of all the elements we are discussing in the session this is the most established. That’s the good news. The bad news is that, so far, it doesn’t seem to have achieved its potential.

We’ll be hearing from a Network of Excellence – 3DTV, represented by Ralf Schäfer from the Heinrich Hertz Institut in Berlin – set up with European funding to coordinate European research efforts in this field. The 3DTV team believes things are starting to look up:

“Recent achievements in research and development and demands from the application perspective have triggered an increasing interest in 3D audio-visual technologies. From a technological point of view this includes improvements over the whole processing chain, starting from image acquisition, over signal processing, 3D representation, compression, transmission, to interactive rendering and 3D display. Applications include any area where classical multimedia is employed today, such as 3DTV broadcast, 3D cinema, DVD.”

In fact, the D-Cinema stream at IBC will also be waving the 3D banner, with screenings on Friday and Sunday, organized by Doremi Technologies.

We have had the technology for many years to record and reproduce a scene in three dimensions: to give it the illusion of depth and even solidity. Primarily, our visual system builds a depth picture of what we see by means of binocular vision parallax, since our two eyes see a scene slightly differently. How do you produce the two images for the two eye-viewpoints? The simplest way is to use two cameras side by side, or a single camera with a special lens system that places the two images next to each other on a single photo or video frame. With computer graphics you can produce the two images of a stereoscopic pair in a similar way, by generating pairs of images with the viewpoints a little way apart.
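For the computer graphics case, the arithmetic is simple enough to sketch. The Python below assumes an idealized pinhole camera looking straight ahead and two viewpoints separated horizontally; the function names and the 65 mm default eye separation are illustrative assumptions, not values from any particular system. It projects the same 3D points for a left and a right viewpoint, and the horizontal difference between the two projections is the parallax that the visual system reads as depth.

```python
def project(point, camera_x, focal_length):
    """Pinhole projection of (x, y, z) for a camera at (camera_x, 0, 0) looking along +z."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must be in front of the camera")
    return (focal_length * (x - camera_x) / z, focal_length * y / z)


def stereo_pair(points, eye_separation=0.065, focal_length=1.0):
    """Project the same points from two viewpoints a little way apart (units: metres)."""
    half = eye_separation / 2
    left = [project(p, -half, focal_length) for p in points]
    right = [project(p, +half, focal_length) for p in points]
    return left, right


if __name__ == "__main__":
    scene = [(0.0, 0.0, 1.0), (0.0, 0.0, 2.0)]   # one near point, one far point
    left, right = stereo_pair(scene)
    for (lx, _), (rx, _), point in zip(left, right, scene):
        print(f"depth {point[2]} m: parallax {lx - rx:.4f}")
```

In this model the parallax works out as focal length times eye separation divided by depth, so it halves when the distance doubles, which is why distant objects look flat in a stereoscopic pair.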

There are several techniques for presenting our two eyes with the two separate images for left and right views of a scene. A century ago, stereoscopic pairs of photos of scenes were a popular form of home entertainment. You used a special viewer to force each eye to see the appropriate one of the image pair or you just crossed your eyes and stared until the two images merged. A modern version of this is the so-called ‘eye phone’, which is basically a pair of glasses with a small TV screen over each eye. Other systems use different coloured lenses (anaglyph), polarized lenses or shuttered lenses.

But do we want to force our holodeck users to wear special glasses?

If the two parts of a stereoscopic image are split into narrow vertical strips, and these strips are then placed alternately across the field of view, and then some method such as narrow prisms is used to ‘feed’ alternate strips in opposite directions, a 3D image results. This is called a lenticular image.
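The interleaving step of a lenticular image can be sketched in a few lines of Python. This is a simplified illustration: images are plain lists of rows, the function name interleave_columns is invented for the example, and a real lenticular print would also have to match the strip width to the pitch of the lens sheet. Even-numbered strips are taken from the left-eye image and odd-numbered strips from the right-eye image.

```python
def interleave_columns(left, right, strip_width=1):
    """Alternate vertical strips from two equally sized images (lists of rows)."""
    if len(left) != len(right) or any(len(a) != len(b) for a, b in zip(left, right)):
        raise ValueError("images must have the same dimensions")
    out = []
    for row_left, row_right in zip(left, right):
        row = []
        for x in range(len(row_left)):
            # Even-numbered strips come from the left view, odd-numbered from the right.
            use_left = (x // strip_width) % 2 == 0
            row.append(row_left[x] if use_left else row_right[x])
        out.append(row)
    return out


if __name__ == "__main__":
    left_image = [["L"] * 6 for _ in range(2)]
    right_image = [["R"] * 6 for _ in range(2)]
    for row in interleave_columns(left_image, right_image, strip_width=2):
        print("".join(row))
```

Note that each eye only ever sees half of the columns, so this kind of display trades horizontal resolution for the absence of glasses.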

All of these techniques can suffer from a depth-perception dilemma that can result in discomfort for the viewer over long periods and which becomes especially important in moving images. It can be uncomfortable because the brain normally receives some depth cues from the way the eye focuses on the objects, and in any photographed image there is only one place to focus and that is the surface of the image itself. So binocular cues are saying one thing about the relative positions of objects and the focus may be saying something else. Fortunately, there are other ways of producing and displaying 3D.

Volumetric displays

Volumetric displays are a kind of display that builds an apparently solid object in space. Instead of pixels you build an image/object using voxels. This can be done by rapidly rotating a flat or helical screen. Then we can rely on persistence of vision to make the viewer see a solid object in the space through which the screen is spinning. Or you can use a pair of lasers to excite molecules in a gas or transparent block: the molecules only glow where the laser beams intersect. The inverse of this is the 3D scanner, which literally maps the surface of an object point by point.

And last, but by no means least, is the hologram. In many ways this is the simplest concept of all: you capture and then recreate all the light coming off a scene that passes through the holographic plate. This involves playing subtle and sophisticated tricks with light and lasers, but when it works well you have to stop yourself reaching behind the plate to feel the object you see. Holograms have become much easier to make and view since Dennis Gabor first made one in 1948, but making them move has proved more difficult.

In any case, whereas volumetric and holographic objects appear to be there so that you can even walk around them, they are also likely to look transparent.

Once you have several ways of capturing or making a 3D image and several ways of displaying it, there is also the question of how to store and transmit the images. In such a world, depth and distance information can be considered to be metadata, so that a 3D image can be a two-dimensional, flat one with added depth metadata. A 3D object can be considered to be real and solid or it can be considered to be a skeletal wire frame with its surface wrapped around it. As a final thought, will it be possible to make all the imaging and display methods interoperable so that the producer can choose how to shoot the 3D and the viewer can choose how to display it? It’s a fascinating subject.
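The idea of a flat image plus depth metadata can be illustrated with a small Python sketch that builds a second viewpoint from one row of pixels and its depth values. It is a deliberately crude example under stated assumptions: the function name render_view, the 0.0 (nearest) to 1.0 (farthest) depth scale and the whole-pixel shifts are all choices made for the illustration, not part of any standard. Near pixels are shifted further than far ones, and the gaps they leave behind are marked with a fill value.

```python
def render_view(row, depth_row, max_shift, fill=None):
    """Shift each pixel of one image row by a disparity derived from its depth.

    depth_row holds values from 0.0 (nearest) to 1.0 (farthest); the nearest
    pixels move by max_shift columns and the farthest do not move at all.
    """
    out = [fill] * len(row)
    # Paint far pixels first so that nearer pixels overwrite them.
    order = sorted(range(len(row)), key=lambda x: depth_row[x], reverse=True)
    for x in order:
        shift = round(max_shift * (1.0 - depth_row[x]))
        target = x + shift
        if 0 <= target < len(row):
            out[target] = row[x]
    return out


if __name__ == "__main__":
    pixels = ["a", "b", "c", "d"]
    depths = [1.0, 1.0, 0.0, 1.0]      # "c" is a near object against a far background
    print(render_view(pixels, depths, max_shift=1))
```

The gap left behind the shifted object is the part of the background that the original camera never saw, and filling such gaps convincingly is one of the practical problems of this approach.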

I should add that movies and television have developed techniques that mimic the holodeck as seen from outside: chromakey and virtual studios make the viewer see real presenters inhabiting a completely artificial world on the other side of the screen (sometimes those presenters are computer-generated as well, with varying degrees of realism). For those presenters, some holodeck-like techniques are being developed to make them feel more at home; seeing the scene they are inhabiting with correct sight-lines for example.

So we already have the ability to produce 3D images that could line the walls of the holodeck and we can generate seemingly solid, if translucent, objects inside. We can add realistic surround sound (a subject for another holodeck session perhaps) and we can go some way towards producing the characters with whom our holodeck viewer/player interacts. What we can’t do is make them feel solid – short of forcing the viewer to wear a servo-controlled diving suit! That’s even worse than glasses, so maybe we can go towards the holodeck but never actually reach it.

The last word must go to one of the three laws that SF writer Arthur C Clarke gave us in his Profiles of the Future (1962 and 1973).

“Any sufficiently advanced technology is indistinguishable from magic.”

To which we could add writer Dave Langford’s version, applying this to science fiction: “Any sufficiently advanced technology is indistinguishable from a completely ad-hoc plot device”.

Andy Finney

Andy Finney is an interactive media consultant and developer with a background in radio and television broadcasting. He has been bridging the gap between computers and audio/video for around 20 years since he first hooked up a BBC Micro and a videodisk player during his time at the BBC. Andy is a past-chair of the British Interactive Media Association and co-author of Managing Multimedia, published by Addison-Wesley.