I am very sympathetic to Pickering and Garrod's central message, namely the need to develop a detailed, computationally oriented account of the mechanisms underlying language use in dialogue, and moreover the claim that these mechanisms, rather than mechanisms developed for monologue, should be regarded as the primary mechanisms of the language faculty (see footnote 2). On a more technical front, I think that one key claim of the authors is an important one: dialogue participants do not explicitly keep track of their interlocutors' information states; rather, this knowledge is emergent from the dynamic alignment of each other's information states.
Nonetheless, in seeking to intrinsically couple dialogue participants via alignment, they seem to create a virtually symmetrical view of the information states of speaker and addressee. Thus, a key component of P&G's accounts of collaborative utterances, as well as of self-monitoring, is the claim that, given alignment, who is speaking at a given point does not, in some sense, make a difference: the addressee can take over, or the speaker can `change voice' and self-correct. As I show in this commentary, this claim is incorrect: there is significant evidence that the contexts available to the conversationalists are not identical. There is, in fact, intrinsic contextual misalignment between conversationalists, and it can persist across turns.
That a common context (cf. the common ground prominent in work by [Stalnaker1978, Lewis1979, Clark Marshall1981]) emerges in dialogue is an important insight. It yields a better picture of, for instance, querying than classical speech act views [Searle1969]. In asking a question, a speaker puts up a question for discussion, and whoever takes over the turn, whether the original asker or the original addressee, can address it:
(1) A: Who should we invite to the conference? A/B: Would Phil be a good idea?
And yet the actual situation is not as symmetrical as all this: the speaker's options for self-repair, or indeed for other follow-ups, are quite distinct from the addressee's options. This can be illustrated succinctly by a phenomenon I have dubbed the Turn Taking Puzzle [Ginzburg1997a, Ginzburg1997b]. Questions of the form `Why?' involve radical context dependence: pretheoretically, the context supplies a propositional referent of some kind [Moore1995]. Interestingly, (2a,b) show that the resolution accorded to the bare `Why?' changes according to who keeps or takes over the turn. The resolution that can be associated with `Why?' if A keeps the turn is unavailable to B if s/he takes over, and vice versa:
(2a) A: Which members of our team own a parakeet? A: Why? (= Why own a parakeet?)
(2b) A: Which members of our team own a parakeet? B: Why? (= Why are you asking which members of our team own a parakeet?)
(2c) A: Which members of our team own a parakeet? A: Why am I asking this question?
(2c) shows that these facts cannot be reduced to coherence or plausibility: the resolution unavailable to A in (2a) yields a coherent follow-up to A's initial query if it is expressed by means of a non-elliptical form. In other words, it is the context that is responsible for these interpretational asymmetries, or rather the fact that distinct contexts are associated with the two conversationalists.
Similarly, a common strategy for requesting clarification is the reprise fragment, a word or constituent of the previous utterance (see [Purver et al.2002, Purver et al.2003] for corpus and experimental evidence on clarification requests, particularly reprise fragments). Reprise fragments have two prominent understandings [Ginzburg Cooper(in press)], exemplified in (3a). It is, however, quite strange for a speaker to follow up her own utterance with a reprise fragment, as in (3b). This becomes felicitous only if followed by an additional correction such as ``Wait, did I say Bo, no I mean Lou'' or some such. Even then, the readings that arise in (3a), whose resolution is radically context dependent, are not manifested:
(3a) A: Did Bo leave? B: Bo? (= Either: Are you asking if BO of all people left? Or: Who were you referring to as `Bo'?)
(3b) A: Did Bo leave? A: #Bo?
It is worth noting that contextual asymmetries of this kind can persist for quite a number of turns, essentially for as long as a given discourse topic remains under discussion. (4) is an extract from the British National Corpus in which Chris's `Why?' is naturally understood as querying Norrine's utterance five turns back, an utterance which seems to be viewed as grounded [Clark1996]:
(4) Norrine(1): When is the barbecue, the twentieth? (pause) Something of June / Chris(2): Thirtieth. / Norrine(3): A Sunday. / Chris(4): Sunday. / Norrine(5): Mm. / Chris(6): Why? (= Why do you ask when is the barbecue?) / Norrine(7): Becau Because I forgot (pause) That was the day I was thinking of having a proper lunch party but I won't do it if you're going out.
Note that the resolution associated with Chris's `Why?' is simply unavailable to Norrine at all subsequent points, as illustrated in the constructed variant of (4) in (5a). As with previous examples, this cannot be explained on "pragmatic" grounds, since the speaker can fairly coherently express the requisite reading in non-elliptical fashion, as in (5b):
(5a) Norrine(1): When is the barbecue, the twentieth? (pause) Something of June / Chris(2): Thirtieth. / Norrine(3): A Sunday. /
Phenomena such as this suggest that the different roles conversationalists play with respect to a given utterance (speaker vs. addressee) are not something that gets neutralized in the utterance's aftermath. The contextual possibilities available to one conversationalist differ from those available to the other. (I am referring to dialogue here; as P&G point out, multilogue is a genre with various properties distinct from those of two-person dialogue.) In other words, a single context is not fully adequate to describe dialogue, even when talking about "public" context, which results from overtly registered conversational actions. Instead, one needs to view dialogue as involving updates by each conversationalist of some type of publicly accessible domain which is relative to each conversationalist, and so parametrizable by unpublicized factors such as individual goals and intentions (cf. Hamblin's individual commitment slate [Hamblin1970]). A framework which spells out this view and develops theoretical accounts as well as computational implementations of illocutionary and metacommunicative acts, including a detailed account of puzzles like the Turn Taking Puzzle exemplified above, is KOS [Ginzburg1996, Ginzburg2002, Ginzburg(forthcoming), Cooper et al.2000, Larsson2002].
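The role-relative view of context can be made concrete with a toy sketch. The following Python fragment is purely illustrative: it is not KOS, nor any published implementation, and all names in it (`Query`, `InformationState`, `resolve_bare_why`) are hypothetical. It assumes only what (2a,b) show, namely that both participants register the same overt move while the reading of a bare `Why?' depends on the role each played in that move.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Query:
    """A question put up for discussion (hypothetical, heavily simplified)."""
    asker: str      # who uttered the question
    content: str    # e.g. "which members of our team own a parakeet"
    predicate: str  # the property queried, e.g. "own a parakeet"


@dataclass
class InformationState:
    """One participant's own view of the public context."""
    owner: str
    queries: List[Query] = field(default_factory=list)

    def register(self, query: Query) -> None:
        # Every participant records the same overt conversational move.
        self.queries.append(query)

    def resolve_bare_why(self) -> Optional[str]:
        # The reading of a bare `Why?' depends on the owner's role
        # (asker vs. addressee) in the most recent query.
        if not self.queries:
            return None
        latest = self.queries[-1]
        if self.owner == latest.asker:
            return f"Why {latest.predicate}?"
        return f"Why are you asking {latest.content}?"


# Example (2): A asks, and both A and B register the query.
q = Query(asker="A",
          content="which members of our team own a parakeet",
          predicate="own a parakeet")
a, b = InformationState("A"), InformationState("B")
for state in (a, b):
    state.register(q)

print(a.resolve_bare_why())  # Why own a parakeet?
print(b.resolve_bare_why())  # Why are you asking which members of our team own a parakeet?
```

The sketch never retires a query, which crudely mimics the persistence across turns seen in (4); a serious model would need some notion of when a discourse topic is closed.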
1 The research described here is funded by grant number RES-000-23-0065 from the Economic and Social Research Council of the United Kingdom and by grant number GR/R04942/01 from the Engineering and Physical Sciences Research Council of the United Kingdom.
2 The authors point out the dearth of work in mechanistic psychology and theoretical linguistics (primarily by syntacticians) on dialogue. Since the late 1990s, there has, however, been work by formal and computational semanticists precisely on developing theories of information states and their dynamics in dialogue; see e.g. work within the EU TRINDI project [Consortium2000], and the annual series of conferences on the semantics and pragmatics of dialogue (MUNDIAL, TWENDIAL, AMSTELOGUE, GOTALOG, BIDIALOG, EDILOG).