Bruce,
I think it’s important to distinguish
between models in general and the kinds of models that exhibit the set of three
characteristics I was discussing earlier. I don’t disagree that much about
models in general; we have to approach things via analogies, and having a clear
analogy is much better than having a vague one. I have not argued that we can
do without models, and you’ve pointed out a crucial weakness with the way I
initially phrased my first characteristic – I didn’t distinguish between
the metalanguage and what it was describing in that. What I was trying to argue
against with that point was the idea that, since the model has to use objects
and rules as part of its metalanguage, what it’s describing does too – in other
words, the reification you rightly caution against. I’ve encountered that kind
of reification often enough that I think I’ve developed a knee-jerk reaction whenever
“symbols” or “primitives” come up. As long as they’re viewed as features of the
metalanguage only, I have no trouble with them. If someone tries to tell me
that human DNA encodes for “+/- N,” I break out in hives.
Those other two characteristics,
however, I think should still be in play. I keep focusing on what I’m calling “atemporal”
and “temporal” simplicity metrics mainly because I think it’s one useful way
to go about approaching the difference between functionalist approaches and some
(but not all) formalist ones. I started thinking about it when doing background
research on psycholinguistic studies of lexical storage (this is going to
repeat some stuff from a post a while back, but I need it for context). There
was a tug of war for a while between proponents of the idea that complex words
were stored as single units, and those who thought they were stored as separate
morphemes and assembled via rules. Over time, the research seemed to show a kind
of “pox on both your houses” result: people appeared to use both kinds
of storage, even for the same words, and use whichever was best suited for the
task they were asked to do. From the standpoint of the traditional “rules and
objects” simplicity metric, which views the system as a static entity divorced
from actual processing, that was a hopelessly messy result. But from another
standpoint, it was simpler. Given a range of tasks in real time, dual
storage allows the subject to use whichever representation allows the task to
be accomplished with the least processing. And there’s no a priori
reason why processing considerations can’t be part of a simplicity metric – in fact,
the really squirrely thing about the use of simplicity metrics in most
linguistic theory is that the theorists don’t really discuss them, at least not
to the point where they try to tackle the issue of why one simplicity metric
might be better than another. Occam’s Razor simply says you shouldn’t posit
more stuff than you need to explain what you’re trying to explain – it doesn’t
limit what counts as valid stuff (unless you take it strictly in its original
medieval form, in which case rules don’t count either). And if you do incorporate
things like processing into your simplicity metric, then there went the
competence/performance distinction in its usual sense. Functionalists usually
claim that processing and suitedness-to-task considerations in part determine
the form of language, so temporal metrics fit nicely with functionalism.
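To make the contrast concrete, here is a toy sketch of what a "temporal" metric rewards (every cost figure, word, and dictionary entry below is invented for illustration; none of it is drawn from an actual psycholinguistic model). The idea is just that a lexicon storing a complex word both as a whole unit and as morphemes can pick whichever route finishes a given task in fewer steps:

```python
# Toy illustration of dual-route lexical storage. All cost figures
# are invented for the sake of the example.

# Whole-word route: one stored unit per word; cheap to recognize,
# but its internal structure must be computed on demand.
WHOLE_WORDS = {"unhappiness": {"recognize_cost": 1, "decompose_cost": 5}}

# Morpheme route: the word assembled from stored morphemes by rule;
# recognition takes more steps, but the parts are already available.
MORPHEMES = {"unhappiness": (["un-", "happy", "-ness"],
                             {"recognize_cost": 3, "decompose_cost": 1})}

def processing_cost(word, task, route):
    """Steps a given route needs for a given task."""
    if route == "whole":
        return WHOLE_WORDS[word][f"{task}_cost"]
    _parts, costs = MORPHEMES[word]
    return costs[f"{task}_cost"]

def best_route(word, task):
    """Temporal metric: pick whichever representation accomplishes
    the task in the fewest steps."""
    return min(["whole", "morpheme"],
               key=lambda r: processing_cost(word, task, r))

print(best_route("unhappiness", "recognize"))  # whole-word route is cheaper
print(best_route("unhappiness", "decompose"))  # morpheme route is cheaper
```

An atemporal metric counts the stored objects and rules, so the dual storage above looks redundant; a metric that counts processing steps across a range of tasks rewards it.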
I should point out here that I’m not arguing that atemporal metrics are invalid
– in fact, I’m not sure there’s a way to go about arguing conclusively that
either type is valid or invalid. However, I think there is a problem if one makes
an argument using an atemporal metric against a theory that uses a temporal one
– or vice versa.
The third characteristic I
mentioned, the explicit or implicit claim that rules are invariant, creates
some related issues. It’s part of the conceptual underpinnings of the
competence/performance dichotomy, and again, functionalists, in general, do not
see that as a strict dichotomy – performance, in real time, to accomplish real
tasks, partly structures competence. Going further, though, models in which
variance is incorporated as an inherent feature are hard to assess
mathematically in some ways (my knowledge may be too dated here, but that has
been the case in the past, at least); since many formalists determine the value
of a model partly on the basis of whether it’s mathematically evaluable,
inherently variable models always lose in those terms. But if language really is
fundamentally variable, then a nicely mathematically evaluable model of a determinate
system is… a well-behaved wrong model. If we had accurate observations
of planetary movements, but we did not yet have the mathematics to deal with
complex orbits and ellipses, Ptolemaic astronomy would still be wrong. It’s one
thing to require perfectly circular orbits when the only observations you have
seem to indicate they are circular, but it’s another thing entirely when
you know they’re not but demand they be anyway. Some of the work on “learnability
theory” (in Gold’s sense) that has been used to criticize functionalist accounts
seemed to operate along these lines – “it’s not modellable in a mathematically
evaluable way, so it’s not as good.”
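Since Gold's framework comes up here, a toy sketch may help show what "identification in the limit" asks of a learner. The hypothesis class and strings below are invented purely for illustration; real Gold-style results concern infinite classes of languages:

```python
# Toy sketch of Gold-style identification in the limit. The learner
# sees an ever-growing stream of positive examples and, at each step,
# guesses the *smallest* hypothesis consistent with everything seen.

HYPOTHESES = {
    "ab-words": {"ab", "aabb", "aaabbb"},
    "a-then-b": {"ab", "aabb", "aaabbb", "aab", "abb"},
}

def guess(seen):
    """Smallest hypothesis (by size) containing every example seen so far."""
    consistent = [(len(strings), name)
                  for name, strings in HYPOTHESES.items()
                  if seen <= strings]
    return min(consistent)[1] if consistent else None

# "Identification in the limit" means the guesses eventually stabilize
# on the right hypothesis and never change again.
seen = set()
for example in ["ab", "aabb", "aab"]:
    seen.add(example)
    print(guess(seen))  # ab-words, ab-words, then a-then-b
```

The kind of setup sketched here is exactly what is "mathematically evaluable"; the contention in the text is whether a theory's failure to fit such a setup should count against the theory.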
Bill Spruiell
Dept. of English
Central Michigan University
From: Assembly for the
Teaching of English Grammar [mailto:[log in to unmask]] On Behalf Of Bruce
Despain
Sent: Friday, October 19, 2007 9:56 AM
To: [log in to unmask]
Subject: Re: Rules ad nauseam
Bill,
You
must remind yourself of the distinction between a language and a meta-language
used to model it. Science does in fact depend on a model of the most
primitive kind. It is modeling itself. There are symbols made
to represent phenomena. There are beginning (primitive) symbols and there
is a path defined by which to come up with derived expressions. These
expressions must be true to the phenomena symbolized and hence
falsifiable. I believe that the idea of model can be used to apply to the
workings of science itself, to the workings of language, to the workings
of anything that can be isolated as phenomenal in nature. This is made
abundantly clear in the work of John Casti, whom I referred to in my last
post.
The
statement that "noun" or "+N" is part of
language itself is confusing to me. Words, sounds, and meanings are
usually considered to be the stuff of language. These descriptive terms
used to refer to elements of language have to be part of a
metalanguage. Their applications are not confined to English, just their
expressional form. Most people would admit that "+N"
is not an element of the English language. Try using it in a sentence
without the quote marks around it. It doesn't work for me. True, the
"+" and the "N" are familiar parts of English, but they are symbols
in different derived systems. This expression "+N" denotes an
object of a mathematical nature. The plus sign signifies the positive
value of a binary functor. It contrasts with a "-N" that
characterizes the opposite attribute that must occur with "+V"
(in some models).
There is indeed a sense in which a neural network model of
language has to presuppose objects as part of language. It has
to do with the concept of an object. The object may well be the phenomena
being observed and described by the model. In this sense it has to
be part of language; the model requires it. The model has other
objects that it uses in the description (symbols, expressions) that
are not part of the language being described. The word "object"
in a trivial sense is part of the English language. Linguistic
investigators, in the process of making observations, abstract the phenomena
from nature. This allows them to make the phenomena part of their primitive set
to be symbolized as what they choose to refer to as objects. Certainly
this process makes them neither real nor concrete. But it does bring
them to a part of consciousness that allows them to be treated in similar
ways. Here again we must be careful not to let the metaphor of language
reify them beyond the confines of the model. Maybe this is the problem
that you were getting at.
Bruce
>>> "Spruiell, William C" <[log in to unmask]>
10/18/07 4:32 PM >>>
Dear All:
I tried to post the message
clipped below at around 2:30 p.m. today, but as far as I can tell, it didn’t
make it – It doesn’t show up on my list and I never got a receipt note. This
may simply be a case of the internet exercising aesthetic judgment, but in case
it’s accidental, I’m trying again.
One addition – I’d like to make it
clear that what I’m kvetching about are three specific *characteristics*
of some model-theoretic frameworks. Many frameworks don’t have full determinacy
or atemporal metrics – but a lot of the ones in linguistics do, and those are
the ones I’m talking about.
__
Bruce,
Science does not, in
fact, depend on that kind of model. Quite a number of scientific theories are
couched in terms of such models, but that’s not why they’re scientific –
rather, it’s because they make predictions that can be falsified by reference
to mutually observable phenomena. Conflating that kind of model with
science itself is precisely the kind of thing I was whinging about.
We don’t even have to go far
afield to find scientific theories (or at least, theories recognized as such
at least as widely as those in linguistics) that don’t use models with the
three characteristics I was discussing. Biology, by and large, doesn’t use
atemporal simplicity metrics. The operation of a system, in real time,
to perform real tasks, is an integral part of biological models. If a system
with fewer rules and fewer primitives nevertheless requires far more steps to
achieve a particular goal than another system, its inefficiency counts for at
least as much against it as its simplicity (in terms of an atemporal metric)
counts for it.
I would also argue that neural
networks do not “use” objects in the sense that a standard grammatical model
uses a symbol like “noun,” although I did not make enough of a distinction in
my original post. You’re certainly right that people talk about neural networks
in terms of objects (nodes) and their properties (activation thresholds, etc.).
However, they’re not talking about language when they do that. The nodes
and thresholds are not within the domain itself that the network is modeling.
In an “objects and rules” model, the objects are themselves part of the
“content” the system is modeling; for example, “noun,” or a value
like “+N,” in a standard grammatical model is considered to be part of language
itself. In a neural network model of language, “threshold value” is not
considered to be a linguistic phenomenon being modeled, but rather part of the
description of the system by which a linguistic phenomenon is modeled.
So, you’re entirely right that you can consider neural networks to have
objects, but those are transparently presented as features of the model
not features of language. Saying something about a threshold
value is saying something about the device language is conjectured to be
running on, not saying something about language. While the network can be
described in terms of objects and rules, the network itself is not
manipulating objects when it operates. There’s no sense in which a neural
network model of language has to presuppose objects as part of
language.
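A deliberately trivial sketch may help here (all the numbers are invented): the weights and threshold below are objects in the model's description — the metalanguage — while nothing in the unit's operation manipulates a symbol like "+N".

```python
# A single threshold unit. The weights and the threshold are features
# of how the model is described, not claims about objects inside the
# phenomenon (language) being modeled.

def threshold_unit(inputs, weights, threshold):
    """Fire (1) if the weighted sum of the inputs reaches the threshold."""
    activation = sum(i * w for i, w in zip(inputs, weights))
    return 1 if activation >= threshold else 0

# The inputs might encode some linguistic stimulus (invented here);
# the unit itself only sums and compares numbers.
stimulus = [1.0, 0.0, 1.0]
weights = [0.6, -0.2, 0.5]
print(threshold_unit(stimulus, weights, threshold=1.0))  # 0.6 + 0.5 = 1.1, so it fires: 1
```

Saying the unit "has a threshold of 1.0" is a statement about the device, not a statement that 1.0 is part of language.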
Bill Spruiell
Dept. of English
Central Michigan University.
From: Assembly for the Teaching of English Grammar
[mailto:[log in to unmask]] On Behalf Of Bruce Despain
Sent: Wednesday, October 17, 2007 5:18 PM
To: [log in to unmask]
Subject: Re: Rules ad nauseam
Bill,
Thanks for the notice, though I was stultified (proved to be
of unsound mind?).
The set of symbols used as primitives and a set of rules, laws, or
principles, is not unique to linguists. In fact science itself consists
of this kind of model. To attack such a model is to attack science, which
has a much longer track record than any linguist or English maven.
The objects to be manipulated by language are sounds. (disputable?)
The objects to be manipulated by language are meanings. (disputable?)
The objects to be manipulated by language are signs written down in symbols. (disputable?)
The objects to be manipulated by language are combinations of the above. (disputable?)
1) What seems to be the nature of neural nets doesn't mean that no
objects exist in the model. It is indisputable that there are
potentials, thresholds, charges, etc. involved. The model-theoretic
framework says nothing about the reality of the objects posited. The
model is no more than a metaphor for what is being modeled.
2) A fully determinate system does not come to mind when the modeling
is of phenomena like the weather. There are simply too many elements
in the determinate version. Even the particle theory of matter must be
abandoned though it is clearly determinate. The sheer number of
particles involved often becomes so great as to make the model
impractical for making predictions of any but the roughest statistical
kind. Certainly the determinate nature of the phenomena described
ought to be paralleled by that of the model that describes it. The
investigator who wants to put the former in doubt has no need for a
model with full predictive power.
3) The number of objects and rules has been a condition since the
acceptance of Occam's razor by most scientists. I would think that the
simple nature of these objects and the statement of these rules in the
accepted vernacular of mathematics would be another consideration.
I think that there is a likely confusion between a mathematical
model and a particular model for linguistic descriptions. The
model-theoretic principles of mathematics are unavoidable to any
formalization. Whether a specific mathematical model actually serves to
describe the phenomena it claims to can always be disputed and refuted with
appropriate data. But we don't throw away the language because of what people
can say with it. If the phenomena we are trying to describe are
indeterminate, then it doesn't make much sense to use a model that
requires determinate phenomena. But the model itself had better use
a determinate framework, or it would hardly be able to explain anything (have
any predictive power).
I would wonder whether failures in the operation (?) of the model of
language could ever be made out to be performance issues. The performance
of the model would have to be faultless, in order to support itself. Maybe
people expect it to make predictions of the performance variety?
(Present-day minimalists are often far too broad in their expectations in
this area.) You say, "The problem is not the framework itself, but
rather its hegemonic status." I think that what many linguists call
the model-theoretic framework is their own model for natural language.
This is the area of hegemony, I believe. This then pits one camp against
another based on their own modeling principles established for their own
purposes, not the two elements you gave for a mathematical model in
science. Certain models can in fact be shown to be faulty based on the
way in which the nature of the elements and mathematical relationships posited
do not match in principle the behavior of the objects being modeled.
To do so does require a bit of sophistication, which, sorry to say, I do
not possess. (Cf. John Casti, Reality Rules)
Bruce
To join or leave this LISTSERV list, please visit the list's web
interface at: http://listserv.muohio.edu/archives/ateg.html and select
"Join or leave the list"
Visit
ATEG's web site at http://ateg.org/