Narrative Navigation:

Intersections at Third and Indiana

Benjamin O’Connor

Advanced Undergraduate Project

Spring 2000

Prof. Brian Smith and Heidi Gitelman

Explanation Architecture, MIT Media Lab

Introduction

During the IAP and spring terms of 2000, I worked with the Explanation Architecture Group on the Narrative Navigation and Multilinear Narrative project.  Heidi Gitelman is creating a multilinear, interactive documentary prototype for digital television.  The documentary’s knowledge representation, story structure, and navigation are based on a viewer’s interaction with the story and are created dynamically from viewers’ personal reactions to, and understandings of, the program’s content, in this case the multilinear documentary “Intersections at Third and Indiana.”

During IAP and the spring term, I worked on interactive hardware construction and setup, test subject interviewing, and the collection, evaluation, and analysis of test subject data.  Working with the evaluation and data collection software gave me an opportunity to learn the Isis programming language, which was also developed at the Media Laboratory.  Prototype versions of the interactive hardware have been set up, and test subjects have watched the documentary while their responses were tracked and recorded by the project’s hardware and software modules.  In the future, the results of these evaluations will be used to create algorithms for navigating the documentary based on the subject’s interaction with our hardware.

A large part of this project was constructing and setting up the project hardware.  The project also called for software engineering work on the data collection and analysis software.  The culmination of the project will be applying artificial intelligence algorithms to the dynamic video navigation software.


Project Overview

The purpose of this project is to create a multilinear, interactive documentary prototype for digital television.  The documentary’s knowledge representation, story structure, and navigation are based on a viewer’s interaction with the story and are created dynamically from viewers’ personal reactions to, and understandings of, the multilinear documentary, entitled “Intersections at Third and Indiana.”

Developing Narrative Navigation has required a great deal of software.  The project currently runs on the Isis platform, which is designed specifically for multimedia and video manipulation.  A force-sensitive, egg-shaped device held by the viewer constantly monitors the viewer’s personal reactions to the documentary and continuously transmits a signal to a computer.  Software running on this computer analyzes the received signal in real time and makes any necessary adjustments to the sequence of video and audio being shown to the viewer.  This project deals with the analysis of human test subjects interacting with prototype systems, and with the development of the system software that manipulates the viewed video and audio streams.


System Design

The Isis platform allows these sounds and video images to be manipulated instantaneously.  A major challenge will be to recognize various signal patterns and then act on them appropriately, by dynamically adjusting the video playback according to the viewer’s reactions.  This can be accomplished by segmenting the documentary footage into “pieces.”  Some viewers will connect to certain types of pieces, and other viewers will connect to other types.

Viewer testing has already given us information about when different viewers have “connected” to the documentary, and when they have not.  This information is being analyzed to identify the elemental “pieces” of this documentary, and to categorize them, label them, and eventually correlate them with other pieces.  The end result will be that, if a certain viewer connects to piece X, we will adjust the video stream so that the rest of the documentary is told by showing piece Y, which has been linked positively to X, and by not showing piece Z, which has been linked negatively to X.

To accommodate this linking and labeling, a separate database must be kept on the computer and accessed by the program in real time, in step with the documentary footage.  The database will hold a label for each segment, a code number needed to retrieve that segment from our footage, and another set of numbers used to link that segment with others.

The first, and longest, step in this project thus far has been collecting viewer data from human test subjects interacting with prototype versions of the hardware and software, which are being developed simultaneously.  The computer used to record signals from the handheld force-feedback device and to control the storage and playback of the documentary video is a Compaq workstation based on an Alpha processor.  The Isis-OpenGL software platform and the project’s code, written in Isis, reside on this computer.  The computer is connected to the receiver of the handheld device via a standard serial port.

Following the collection of the prototype evaluation information, the next step is to complete the analysis and correlation of the viewer data collected thus far.  This allows us to identify the documentary’s “pieces” and to correlate them as much as possible.  This is the most difficult aspect of the project.  These correlations have to be developed by observing how groups of viewers react to the documentary as a whole, that is, which parts have provoked distinct reactions from viewers in general.

I believe that the most difficult software task in implementing this system is designing and building the database, along with the means to access it and use the retrieved data “on the fly.”  The database correlates segments of documentary video and provides the links between segments that allow the software to determine which “groups” of video segments are played or skipped.

This database must keep track of a number of correlations.  First, the software needs to know the frame numbers for the beginning and end of each “piece” of the video.  This way, the software always knows which piece it is currently playing and where to go in the video to jump to a particular piece.  The database must also correlate pieces with one another: it has a line for each piece listing which pieces are very similar to it and which are not.  In other words, each line lists the pieces that would be called upon if the viewer had a strong connection to that piece, and the pieces that would be called upon if the viewer had a weak connection, or none.  This determines which scenes are kept or deleted based on the viewer’s reactions to a section (Figure 1).  A database will be constructed individually for each documentary to be used interactively, based on the analysis of gathered user tests.

 

Sec#   Similars   Differents   Beg. Frame#   End Frame#
1      4          3            0             3507
2      4          None         3507          8003
3      None       6, 1         8003          13905
4      2, 1       None         13905         17201
5      6          None         17201         20091
6      5          3            20091         23023

Figure 1:  Example Database
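
As a rough illustration, the rows of Figure 1 could be held in memory along the following lines.  This is a Python sketch, not the project's Isis code; the names Piece, DATABASE, and piece_at_frame are hypothetical, and treating each end frame as exclusive is an assumption made here because adjacent rows share a boundary frame.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Piece:
    """One row of the example database in Figure 1 (field names are hypothetical)."""
    section: int
    similars: Tuple[int, ...]
    differents: Tuple[int, ...]
    begin_frame: int
    end_frame: int  # assumed exclusive, since the next row begins on this frame


# The whole table is small, so it is built once and kept in memory.
DATABASE = {
    1: Piece(1, (4,), (3,), 0, 3507),
    2: Piece(2, (4,), (), 3507, 8003),
    3: Piece(3, (), (6, 1), 8003, 13905),
    4: Piece(4, (2, 1), (), 13905, 17201),
    5: Piece(5, (6,), (), 17201, 20091),
    6: Piece(6, (5,), (3,), 20091, 23023),
}


def piece_at_frame(frame: int) -> Optional[Piece]:
    """Return the piece containing the given frame, or None if no row covers it."""
    for piece in DATABASE.values():
        if piece.begin_frame <= frame < piece.end_frame:
            return piece
    return None
```

With this layout, looking up the piece currently on screen or the frame to jump to for a given section is a dictionary access or a short scan, with no disk reads during playback.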

When the program begins execution, it constructs a play list that initially contains every section.  As the video plays, a loop in the program monitors the values arriving on the computer’s serial port from the handheld force-feedback device.  This data is analyzed throughout an entire section of video, as defined by the begin frame and end frame entries in the database.  The software recognizes peaks and troughs in the data, as well as the average value of the user’s response for a given section.  Whether a user “connected to” a section or had no connection to it is determined not only by the average force applied to the handheld device, but also by the presence or absence of a certain number of sharp peaks or lengthy plateaus, which signify a marked response to the piece in question.

If the current piece is associated with a negative response, all of the pieces listed in the similars column of that piece’s database row are removed from the play list, and all of the pieces listed in the differents column are added back into the play list if they have been removed.  If the current piece is associated with a positive response, all of the pieces listed in the similars column of that piece’s database row are added back into the play list if they have been removed, and the pieces listed in the differents column are removed from the play list.  So, using Figure 1 as an example, if the overall response during section 1 is negative, section 4 will be removed from the play list.  However, if section 2 then causes a positive response, section 4 will be added back into the play list and ultimately played.  As each section nears its end, the play list is consulted to determine which section is played next.  To enable real-time processing of all of this data, the database is small enough to be loaded into memory before playback begins, which eliminates the time needed to retrieve data repeatedly from disk.
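
The per-section analysis described above might be sketched as follows.  This is an illustrative Python sketch, not the project's Isis code: every threshold value is a hypothetical placeholder, the force samples are assumed to be normalized to the range 0 to 1, and combining the average with the peak and plateau checks by a simple "or" is an assumption, since the exact rule is still to be derived from the viewer data.  Troughs are not used in this sketch.

```python
from statistics import mean

# All thresholds below are hypothetical placeholders, not measured values.
MEAN_THRESHOLD = 0.5      # average force that counts as a connection
PEAK_HEIGHT = 0.8         # minimum height of a "sharp peak"
MIN_PEAKS = 2             # number of peaks that signals a marked response
PLATEAU_LEVEL = 0.6       # force level that must be sustained for a plateau
MIN_PLATEAU_LENGTH = 5    # consecutive samples that make a "lengthy plateau"


def count_peaks(samples, height):
    """Count local maxima whose value is at least `height`."""
    count = 0
    for i in range(1, len(samples) - 1):
        rising = samples[i - 1] < samples[i]
        not_rising_after = samples[i] >= samples[i + 1]
        if samples[i] >= height and rising and not_rising_after:
            count += 1
    return count


def longest_plateau(samples, level):
    """Length of the longest run of consecutive samples at or above `level`."""
    longest = run = 0
    for sample in samples:
        run = run + 1 if sample >= level else 0
        longest = max(longest, run)
    return longest


def connected(samples):
    """Decide whether the viewer connected to a section, given its force samples."""
    if not samples:
        return False
    marked_response = (
        count_peaks(samples, PEAK_HEIGHT) >= MIN_PEAKS
        or longest_plateau(samples, PLATEAU_LEVEL) >= MIN_PLATEAU_LENGTH
    )
    return mean(samples) >= MEAN_THRESHOLD or marked_response
```

In this sketch, a viewer who squeezes hard twice during an otherwise quiet section is still counted as connected, even though the average force for the section stays low.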
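
The play list update can be sketched in a few lines of Python using the similars and differents columns of Figure 1.  The names LINKS, update_play_list, and next_section are hypothetical, and two details are assumptions: the play list is held as a set (so it cannot contain duplicates), and sections are played in ascending order without replaying earlier ones.

```python
# section -> (similars, differents), copied from Figure 1
LINKS = {
    1: ([4], [3]),
    2: ([4], []),
    3: ([], [6, 1]),
    4: ([2, 1], []),
    5: ([6], []),
    6: ([5], [3]),
}


def update_play_list(play_list, section, positive):
    """Apply the viewer's reaction to `section` to the play list, in place."""
    similars, differents = LINKS[section]
    if positive:
        to_add, to_remove = similars, differents
    else:
        to_add, to_remove = differents, similars
    play_list.difference_update(to_remove)
    play_list.update(to_add)


def next_section(play_list, current):
    """Return the lowest-numbered section after `current`, or None if none remain."""
    later = [s for s in sorted(play_list) if s > current]
    return later[0] if later else None


# Worked example from the text: negative on section 1, then positive on section 2.
play_list = set(LINKS)
update_play_list(play_list, 1, positive=False)   # section 4 is removed
update_play_list(play_list, 2, positive=True)    # section 4 is added back
```

After the two calls in the worked example, the play list again contains all six sections, matching the description above.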


Conclusion

The project team is still scheduling and carrying out viewer and producer evaluations.  These evaluations will enable us to analyze the viewer data and then begin the process of making the documentary interactive.  I hope that the planning and software structure put forth in this document will be of some assistance to those who write the software needed to make Narrative Navigation possible in the future.


Pseudocode

 

DB <= Load database into memory as 2-dimensional array
      ;; Column indices follow Figure 1, counting from 0:
      ;; 1 = Similars, 2 = Differents, 3 = Beg. Frame#, 4 = End Frame#
N <= Number of lines in DB
P <= list defined to contain no duplicates
P <= initialize to contain all sections 1,2…N
Reaction <= Boolean value for positive or negative reaction
X <= current section to play, initialize as 1

LABEL: PlaySection

Start video play at frame(DB[X,3])

WHILE (current frame playing < DB[X,4])
{
     (analyzedata)   ;; This procedure modifies value Reaction
}

IF Reaction == “negative” THEN
{
     FOR each element J in DB[X, 1]
     {
           P = P - J  ;; Remove similars from play list
     }
     FOR each element K in DB[X, 2]
     {
           P = P + K  ;; Add differents back to play list
     }
}

IF Reaction == “positive” THEN
{
     FOR each element J in DB[X, 1]
     {
           P = P + J  ;; Add similars back to play list
     }
     FOR each element K in DB[X, 2]
     {
           P = P - K  ;; Remove differents from play list
     }
}

IF (there is a next element in list P) THEN
{
     X <= next element in list P
     GOTO PlaySection
}

END