Narrative Navigation:
Intersections at Third and Indiana
Advanced Undergraduate Project
Spring 2000
Prof. Brian Smith and Heidi Gitelman
Explanation Architecture, MIT Media Lab
Introduction
During the IAP and spring terms of 2000, I worked with the Explanation Architecture Group on the Narrative Navigation and Multilinear Narrative project. Heidi Gitelman is creating a multilinear, interactive documentary television prototype for digital television. The documentary’s knowledge representation, story structure, and navigation are based upon a viewer’s interaction with the story and are dynamically created from viewers’ personal reactions to, and understandings of, the program’s content, in this case the multilinear documentary “Intersections at Third and Indiana.”
During IAP and spring, I worked on interactive hardware construction and setup, test subject interviewing, and the collection, evaluation, and analysis of test subject data. In working with the evaluation and data collection software, I had an opportunity to learn and study the Isis programming language, also developed at the Media Laboratory. Prototype versions of the interactive hardware have been set up, and test subjects have watched the documentary. Their responses were tracked and recorded by the project’s software and hardware modules. In the future, the results of these evaluations will be used to create algorithms for navigating through the documentary based on the subject’s interaction with our hardware.
A large part of this project was constructing and setting up the project hardware. The project also called for software engineering skills in working on the data collection and analysis software. The culmination of this project will be applying artificial intelligence algorithms to the dynamic video navigation software itself.
Project Overview
The purpose of this project is to create a multilinear, interactive documentary television prototype for digital television. The documentary’s knowledge representation, story structure, and navigation are based upon a viewer’s interaction with the story and are dynamically created from viewers’ personal reactions to, and understandings of, the multilinear documentary, entitled “Intersections at Third and Indiana.”
Developing the Narrative Navigation system has required a great deal of software. The current software platform used for the project is the ISIS platform, which is specifically designed for multimedia and video manipulation. A force-sensitive, egg-shaped device held by the viewer constantly monitors the viewer’s personal reactions to the documentary and continuously transmits a signal to a computer. Software running on this computer analyzes the received signal in real time and makes any necessary adjustments to the sequence of video and audio being shown. This project deals with the analysis of human test subjects interacting with prototype systems, and with the development of the system software that manipulates the video and audio streams.
System Design
The ISIS platform allows for the instantaneous manipulation of these sounds and video images. A major challenge will be to recognize various signal patterns and then act upon them appropriately by dynamically adjusting the video playback according to the viewer’s reactions. This can be accomplished by segmenting the documentary footage into “pieces.” Some viewers will connect to certain types of “pieces,” and other viewers will connect to other types.
Viewer testing has already given us information about when different viewers have “connected” to the documentary, and when they have not. This information is being analyzed to identify the elemental “pieces” of this documentary, and to categorize them, label them, and eventually correlate them with other “pieces.” The end result is a set of links between pieces: if a viewer connects to piece X, we will adjust the video stream so that the rest of the documentary is told by showing piece Y, which has been linked positively to X, and by not showing piece Z, which has been linked negatively to X.
To accommodate this linking and labeling, a separate database must be kept on the computer and accessed by the program in real time, in step with the documentary footage. The database will hold a label for each segment, a code number needed to retrieve that segment from our footage, and another set of numbers used to link that segment with others.
The first, and longest, step in this project thus far has been to collect viewer data from human test subjects interacting with prototype versions of the hardware and software while both are still under development. The computer used to record signals from the handheld force-feedback device and to control the storage and playback of the documentary video is a Compaq Alpha-based workstation. On this computer reside the ISIS-OpenGL software platform and the project’s code, written in ISIS. The computer is connected to the receiver of the handheld force-feedback device via a standard serial port.
Following the collection of the prototype evaluation information, the next necessary step is to complete the analysis and correlation of the viewer data collected thus far. This allows us to identify the documentary’s “pieces” and to correlate them as much as possible. This is the most difficult aspect of this project. These correlations have to be developed by observing how groups of viewers react to the documentary as a whole, that is, which parts have provoked distinct reactions from viewers in general.
I believe that the most difficult software aspect of the implementation of this system is the design and implementation of a database system and the means to access it and use the retrieved data “on the fly.” This database is used to correlate segments of documentary video and provide the necessary links between segments, which allow the software to determine the “groups” of video segments to be played or skipped.
This database must keep track of a number of correlations. First, the software needs to know the frame numbers for the beginning and end of each “piece” of the video. This way, the software always knows which piece it is currently playing and where to go in the video to jump to a particular “piece.” The database must also correlate certain “pieces” with one another: it has a line for each “piece” listing which pieces are very similar to that one and which are not. In other words, each line lists the pieces that would be called upon if the viewer had a strong connection to this particular piece, and the pieces that would be called upon if the viewer had a weak connection, or none, to it. This keeps track of which scenes will be kept or deleted based on the viewer’s reactions to a section (Figure 1). A database will be constructed individually for each interactive documentary, based on the analysis of gathered user tests.
| Sec# | Similars | Differents | Beg. Frame# | End Frame# |
| 1    | 4        | 3          | 0           | 3507       |
| 2    | 4        | None       | 3507        | 8003       |
| 3    | None     | 6, 1       | 8003        | 13905      |
| 4    | 2, 1     | None       | 13905       | 17201      |
| 5    | 6        | None       | 17201       | 20091      |
| 6    | 5        | 3          | 20091       | 23023      |
Figure 1: Example Database
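As a minimal sketch of how the rows of Figure 1 might be held in memory, the Python below stores one record per section and looks up which section a given frame belongs to. It is an illustration only: the project’s code is written in ISIS, and the names Piece, FIGURE_1, and piece_at_frame are hypothetical. Because each section’s end frame equals the next section’s begin frame, the sketch assumes that a section covers the frames from its begin frame up to, but not including, its end frame.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Piece:
    """One database row (hypothetical layout mirroring Figure 1)."""
    section: int
    similars: tuple      # sections linked positively to this one
    differents: tuple    # sections linked negatively to this one
    begin_frame: int
    end_frame: int       # assumed exclusive: the next section starts here


# The example database from Figure 1, keyed by section number.
FIGURE_1 = {
    1: Piece(1, (4,), (3,), 0, 3507),
    2: Piece(2, (4,), (), 3507, 8003),
    3: Piece(3, (), (6, 1), 8003, 13905),
    4: Piece(4, (2, 1), (), 13905, 17201),
    5: Piece(5, (6,), (), 17201, 20091),
    6: Piece(6, (5,), (3,), 20091, 23023),
}


def piece_at_frame(db, frame):
    """Return the section number containing `frame`, or None if out of range."""
    for piece in db.values():
        if piece.begin_frame <= frame < piece.end_frame:
            return piece.section
    return None
```

With this layout, jumping to a particular piece amounts to reading its begin_frame, and the whole table is small enough to keep in memory for the length of a viewing.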
When the program begins its execution, a play list, initially containing every section, is constructed. As the video is playing, a loop executes in the program that monitors the values received on the computer’s serial port from the handheld force-feedback device. This data is analyzed throughout an entire section of video, as defined by the begin frame and end frame entries in the database. The software recognizes peaks and troughs in the data as well as average values of user response for a given section. Whether a user “connected to” a section or had no connection to it is determined not only by the average value of force applied to the user’s handheld device, but also by the presence or absence of a certain number of sharp peaks or lengthy plateaus, which signify a marked response to the piece in question.
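The kind of per-section analysis described above could be sketched in Python as follows. This is not the project’s ISIS code. It assumes the force readings for one section have already been collected into a list and scaled to the range 0 to 1, and every threshold value is a placeholder chosen for illustration. How the average, the peaks, and the plateaus are combined is also an assumption: here any one of the three signals counts as a connection.

```python
def count_peaks(samples, threshold):
    """Count local maxima that rise above `threshold`."""
    peaks = 0
    for prev, cur, nxt in zip(samples, samples[1:], samples[2:]):
        # A flat top is counted once, at its first sample.
        if cur > threshold and cur > prev and cur >= nxt:
            peaks += 1
    return peaks


def longest_plateau(samples, threshold):
    """Length of the longest run of consecutive samples at or above `threshold`."""
    longest = run = 0
    for value in samples:
        run = run + 1 if value >= threshold else 0
        longest = max(longest, run)
    return longest


def classify_reaction(samples, mean_threshold=0.5, peak_threshold=0.8,
                      min_peaks=3, min_plateau=20):
    """Label one section's readings "positive" or "negative".

    All thresholds are hypothetical placeholders, not measured values.
    """
    if not samples:
        return "negative"
    mean = sum(samples) / len(samples)
    connected = (
        mean >= mean_threshold
        or count_peaks(samples, peak_threshold) >= min_peaks
        or longest_plateau(samples, peak_threshold) >= min_plateau
    )
    return "positive" if connected else "negative"
```

For example, a section with a low average force but three sharp squeezes would still be labeled "positive" under these placeholder settings, which matches the idea that the average alone does not decide the outcome.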
If the current piece is associated with a negative response, all of the pieces listed in the similars column of that piece’s database row are removed from the play list, and all of the pieces listed in the differents column are added back into the play list if they have been removed. If the current piece is associated with a positive response, all of the pieces listed in the similars column of that piece’s database row are added back into the play list if they have been removed, and the pieces listed in the differents column are removed from the play list. So, using Figure 1 as an example, if a negative response is received overall during the playing of section 1, section 4 will be removed from the play list. However, if section 2 then causes a positive response, section 4 will be added back into the play list and ultimately played. As each section nears its end, the play list is consulted to determine which section gets played next. To enable real-time processing and handling of all of this data, the database, which is small, is loaded into memory before accesses begin, eliminating the time needed to retrieve data repeatedly from disk.
DB <= Load database into memory as 2-dimensional array
      ;; Columns: 1 = Similars, 2 = Differents, 3 = Beg. Frame#, 4 = End Frame#
N <= Number of lines in DB
P <= list defined to contain no duplicates
P <= initialize to contain all sections 1,2…N
Reaction <= "positive" or "negative" reaction to the current section
X <= current section to play, initialize as 1
LABEL: PlaySection
Start video play at frame(DB[X,3])
WHILE (current frame playing < DB[X,4])
{
    (analyzedata) ;; This procedure modifies value Reaction
}
IF Reaction == "negative" THEN
{
    FOR each element J in DB[X, 1]
    {
        P = P - J ;; Remove similar pieces from play list
    }
    FOR each element K in DB[X, 2]
    {
        P = P + K ;; Add different pieces back to play list
    }
}
IF Reaction == "positive" THEN
{
    FOR each element J in DB[X, 1]
    {
        P = P + J ;; Add similar pieces back to play list
    }
    FOR each element K in DB[X, 2]
    {
        P = P - K ;; Remove different pieces from play list
    }
}
IF (there is a next element after X in list P) THEN
{
    X <= next element after X in list P
    GOTO PlaySection
}
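To check the play-list rules against the Figure 1 example, the pseudocode can be approximated by the runnable Python sketch below. It replaces video playback and signal analysis with a dictionary of pre-decided reactions, so it exercises only the list bookkeeping. The function names are hypothetical, and two behaviors are assumptions: a section with no recorded reaction leaves the play list unchanged, and the “next” section is taken to be the lowest-numbered section in the play list after the current one.

```python
# Similars and Differents columns of Figure 1, keyed by section number.
SIMILARS = {1: [4], 2: [4], 3: [], 4: [2, 1], 5: [6], 6: [5]}
DIFFERENTS = {1: [3], 2: [], 3: [6, 1], 4: [], 5: [], 6: [3]}


def update_play_list(play_list, section, reaction):
    """Return a new play list after the viewer's reaction to `section`."""
    updated = set(play_list)
    if reaction == "negative":
        updated -= set(SIMILARS[section])     # drop similar pieces
        updated |= set(DIFFERENTS[section])   # restore different pieces
    elif reaction == "positive":
        updated |= set(SIMILARS[section])     # restore similar pieces
        updated -= set(DIFFERENTS[section])   # drop different pieces
    return updated


def next_section(play_list, current):
    """Lowest-numbered section after `current` still in the play list."""
    later = [s for s in sorted(play_list) if s > current]
    return later[0] if later else None


def simulate(reactions):
    """Return the sections played, given a {section: reaction} mapping."""
    play_list = set(SIMILARS)   # initially every section
    current = 1
    played = []
    while current is not None:
        played.append(current)
        play_list = update_play_list(play_list, current,
                                     reactions.get(current))
        current = next_section(play_list, current)
    return played


print(simulate({1: "negative"}))                    # section 4 is skipped
print(simulate({1: "negative", 2: "positive"}))     # section 4 is restored
```

A set is used for the play list because the pseudocode defines P as containing no duplicates, so adding a section that is already present has no effect.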