The SNIF-ACT Model
The goal of our modeling effort is to develop a computer program that simulates the user in enough detail to reproduce the user data. SNIF-ACT (see figure below) is the model that we are currently developing to simulate WWW users. SNIF-ACT is an extension of the ACT-R theory and simulation environment, a general production system architecture designed to model human psychology.
By using this system to model WWW behavior, we link our analysis to the same principles used to model cognitive behavior in general. ACT-R contains principles concerning (1) knowledge representation, (2) knowledge deployment (performance), and (3) knowledge acquisition (learning). The ACT-R architecture has two major components, a declarative knowledge component and a procedural knowledge component, and each has its own kind of memory.

Declarative Knowledge
Declarative knowledge corresponds to things that we are aware we know and that can be easily described to others, such as the content of WWW links or the functionality of browser buttons. Declarative knowledge is represented formally as chunks in ACT-R. Declarative chunks in ACT-R have sub-symbolic activation. Activation represents the log-odds that a piece of knowledge will be needed at a particular time, and may be interpreted metaphorically as a kind of mental energy that drives cognitive processing. Activation spreads from the current focus of attention, including goals, through associations among chunks in declarative memory. These associations are built up from experience, and they reflect how ideas co-occur in cognitive processing.
Generally, activation-based theories of memory predict that more activated knowledge structures will receive more favorable processing. Chunks with higher activation values take less time to use and have a greater chance of affecting behavior. Activation is a way of quantifying the degree of relevance of declarative information to the current focus of attention (mathematically, it reflects the posterior probability, expressed as log-odds, that each piece of declarative information is needed given the current focus of attention). At any point in time, there is a stack of goals encoding the user’s intentions. Goals are also represented as chunks. ACT-R always tries to achieve the goal on top of that stack, so at any point in time it is focused on a single goal.
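The log-odds reading of activation can be made concrete with a small sketch. It assumes the additive form commonly used in ACT-R, in which a chunk's activation is its base-level (prior) log-odds plus a weighted sum of association strengths from the chunks currently in the focus of attention. The function names and the numbers are illustrative assumptions, not part of SNIF-ACT itself.

```python
import math


def log_odds(probability):
    """Convert a probability (strictly between 0 and 1) to log-odds."""
    return math.log(probability / (1.0 - probability))


def need_probability(activation):
    """Invert log_odds: map an activation value back to a probability."""
    return 1.0 / (1.0 + math.exp(-activation))


def chunk_activation(base_level, sources):
    """Base-level activation plus activation spread from attended chunks.

    `sources` is a list of (attention_weight, association_strength) pairs,
    one pair per chunk in the current focus of attention.
    """
    return base_level + sum(weight * strength for weight, strength in sources)


# Hypothetical numbers: a chunk that is needed 10% of the time a priori,
# with two attended chunks that are both positively associated with it.
prior = log_odds(0.10)
posterior = chunk_activation(prior, [(0.5, 2.0), (0.5, 1.0)])
print(need_probability(prior), need_probability(posterior))
```

Under these assumptions, positive associations from the focus of attention raise the chunk's activation above its base level, which corresponds to a higher estimated probability that the chunk is needed.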

Procedural Knowledge
Procedural knowledge is knowledge (skill) that we display in our behavior without conscious awareness, such as knowledge of how to ride a bike or how to point a mouse at a menu item. Procedural knowledge specifies how declarative knowledge is transformed into active behavior. It is represented as condition-action pairs, or productions. For instance, our SNIF-ACT simulation contains the production rule Use-Search-Engine. The production applies in situations where the user has a goal to go to a WWW site, has processed a task description, and has a browser in front of them. The production rule specifies that a subgoal will be set to use a search engine. The condition (IF) side of the production rule is matched to the current goal and the active chunks in declarative memory, and when a match is found, the action (THEN) side of the production rule is executed.
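A minimal sketch of a production as a condition-action pair, using the Use-Search-Engine rule described above. The `Production` class, the dictionary-based state, and field names such as `task_processed` and `browser_visible` are assumptions made for illustration; they are not the representation used by ACT-R or SNIF-ACT.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Production:
    name: str
    condition: Callable[[dict], bool]  # IF side: tested against the current state
    action: Callable[[dict], None]     # THEN side: modifies the state


def wants_site_and_is_ready(state):
    """True when the top goal is to reach a site and the preconditions hold."""
    top_goal = state["goal_stack"][-1]
    return (top_goal["type"] == "go-to-site"
            and state["task_processed"]
            and state["browser_visible"])


def push_search_subgoal(state):
    """Set a subgoal to use a search engine by pushing it on the goal stack."""
    state["goal_stack"].append({"type": "use-search-engine"})


use_search_engine = Production(
    "Use-Search-Engine", wants_site_and_is_ready, push_search_subgoal
)


def fire_if_matched(production, state):
    """Execute the action only when the condition matches; report whether it fired."""
    if production.condition(state):
        production.action(state)
        return True
    return False


state = {
    "goal_stack": [{"type": "go-to-site"}],
    "task_processed": True,
    "browser_visible": True,
}
fired = fire_if_matched(use_search_engine, state)
```

After the rule fires, the new subgoal sits on top of the goal stack, so it becomes the single goal the model is focused on.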
At any point in time, a single production fires. When there is more than one match, the matching rules form a conflict set, and a mechanism called conflict resolution is used to decide which production to execute. The conflict resolution mechanism is based on a utility function: the expected utility of each matching production is calculated, and the production with the highest expected utility is selected. In modeling WWW users, the utility function is provided by information foraging theory, and specifically the notion of information scent. This constitutes a major extension of the ACT-R theory and is described in greater detail below.
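Conflict resolution as described here can be sketched as a filter followed by an argmax. The dictionary layout, the Follow-Link rule, and the `search_scent` and `link_scent` fields are hypothetical and exist only to make the example runnable; only Use-Search-Engine is named in the text.

```python
def resolve_conflict(productions, state):
    """Return the matching production with the highest expected utility.

    Returns None when no production matches the current state.
    """
    conflict_set = [p for p in productions if p["condition"](state)]
    if not conflict_set:
        return None
    return max(conflict_set, key=lambda p: p["utility"](state))


productions = [
    {
        "name": "Use-Search-Engine",
        "condition": lambda s: s["goal"] == "go-to-site",
        "utility": lambda s: s["search_scent"],
    },
    {
        # Hypothetical second rule, included so the conflict set has two members.
        "name": "Follow-Link",
        "condition": lambda s: s["goal"] == "go-to-site" and s["link_scent"] is not None,
        "utility": lambda s: s["link_scent"],
    },
]

state = {"goal": "go-to-site", "search_scent": 1.2, "link_scent": 2.5}
winner = resolve_conflict(productions, state)
```

In this sketch both rules match, so the conflict set has two members and the rule with the larger utility value is chosen. The utility values are simply read from the state here; in SNIF-ACT they would come from the information scent computation.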

Utility: Information Scent
As users browse the WWW, they make judgments about the utility of the different courses of action available to them. Typically, they must use local cues, such as link images and text, to make navigation decisions. Information scent refers to the local cues that users process in making such judgments. The analogy is to organisms that use local smell cues to decide where to go next (for instance, in pursuing prey). The model of users’ judgments of information scent is based on spreading activation. The basic idea is that a user’s information goal activates one set of chunks in the user’s memory, and text on the display screen activates another set. Activation spreads from these chunks to related chunks in a spreading activation network. The amount of activation that accumulates on the goal chunks and display chunks through this network indicates their mutual relevance. The spreading activation network is therefore content-based: the mutual relevance of user goals and page contents is recalculated each time the display changes. The amount of activation is used to evaluate and select productions. The activation of the content-dependent chunks matched by a production rule can be used to determine, dynamically, the utility of selecting that rule.
The spread of activation from one cognitive structure to another is determined by the strengths, or weights, of the associations among chunks. These weights determine the rate and amount of activation that flows from one chunk to another. In the context of WWW browsing, we assume that activation spreads from the user’s goal, which is the focus of attention, through memory associations to the words and images that the user sees on WWW pages. If the user reads some link text on a WWW page, and the link text is strongly associated with the user’s goal, then we expect the user to judge the link as highly relevant to the goal.
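The link judgment described above can be sketched as a weighted sum of association strengths between goal words and link words. The strength values, the even split of attention across goal words, and the link labels are all assumptions chosen for illustration.

```python
def link_scent(goal_words, link_words, strength):
    """Sum the activation that flows from each goal word to each link word.

    `strength` maps (goal_word, link_word) pairs to association strengths;
    missing pairs are treated as unassociated (strength 0.0). Attention is
    assumed to be divided evenly across the goal words.
    """
    if not goal_words:
        return 0.0
    weight = 1.0 / len(goal_words)
    return sum(
        weight * strength.get((goal, word), 0.0)
        for goal in goal_words
        for word in link_words
    )


# Hypothetical association strengths for a goal of finding a review of "Antz".
strength = {
    ("antz", "movie"): 2.0,
    ("antz", "review"): 1.5,
    ("review", "review"): 3.0,
}
goal_words = ["antz", "review"]
links = {
    "Movie reviews": ["movie", "review"],
    "Stock quotes": ["stock", "quotes"],
}

scores = {label: link_scent(goal_words, words, strength) for label, words in links.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
```

With these made-up strengths, the link whose text is strongly associated with the goal words ranks first, which matches the expectation that the user would judge it the more relevant link.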
The association strengths among words in human memory are assumed to be related to the probabilities of word occurrences and of word co-occurrences. Consequently, the spreading activation computation of information scent in SNIF-ACT requires estimates of these probabilities. In past research, we derived these estimates from the Tipster corpus. The resulting database contained statistics relevant to setting the base-level activations of 200 million word tokens and the inter-word association strengths of 55 million word pairs. Unfortunately, the Tipster corpus does not contain many of the novel words that arise in popular media such as the WWW. For instance, the movie title "Antz" does not occur in the Tipster corpus. Consequently, we augment the statistical database derived from Tipster by estimating word frequency and word co-occurrence statistics from the WWW itself, using a program that calls on the AltaVista search engine to provide data. The spreading activation networks needed to perform the scent computations are stored in a scent database that is accessed when SNIF-ACT computes production evaluations.
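One way to turn occurrence and co-occurrence counts into the quantities mentioned above is sketched below. It assumes a log-odds base level, log(P(i) / (1 - P(i))), and a log-likelihood-ratio association strength, log(P(i | j) / P(i)); the text does not state the exact formulas, so both are assumptions. The two dictionaries stand in for the corpus-derived database and the Web-derived statistics, and their contents are invented.

```python
import math


def base_level(count_i, total):
    """Log-odds of word i occurring, estimated from its count in `total` tokens."""
    p_i = count_i / total
    return math.log(p_i / (1.0 - p_i))


def association_strength(count_i, count_j, count_ij, total):
    """Log of how much more likely word i is when word j is present.

    Zero means i and j are statistically independent; positive values mean
    they co-occur more often than chance.
    """
    p_i = count_i / total
    p_i_given_j = count_ij / count_j
    return math.log(p_i_given_j / p_i)


def word_stats(word, primary, fallback):
    """Return (count, total) from the primary source, else from the fallback.

    Returns None when neither source has seen the word.
    """
    for source in (primary, fallback):
        if word in source["counts"]:
            return source["counts"][word], source["total"]
    return None


# Invented counts: the primary corpus lacks "antz", the fallback source has it.
primary = {"total": 1000, "counts": {"movie": 10}}
fallback = {"total": 500, "counts": {"antz": 5, "movie": 50}}

antz = word_stats("antz", primary, fallback)
movie = word_stats("movie", primary, fallback)
```

Keeping each count paired with the total of the source it came from avoids mixing frequencies estimated over collections of different sizes.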
