Here are more interesting phenomena in English that make us question or wonder about "rules" without exception.  

In describing the gradable adjectives in my last post I did not mention how to handle "tired."  This was based on the fact that for me it has two syllables, enough reason to use "more."  However, the Brits pronounce it with just one syllable.  So for them it is exceptional, unless they are able to skip their phonetic analysis and allow it to be taken as phonemically structured as having two syllables.  

There are more cases of suppletion that could be pointed out and taught in some depth, particularly for the (astute) foreign student.  The change of intervocalic -s- to -r- in some Indo-European languages is apparent in the plural of Latin "corpus" -- "corpora."  In English we have the change (a rule) of "is" to "are" and "was" to "were" in the plural.  Here we see a present tense phenomenon in a verb with past meaning.  A verb in its present tense form has historically taken the place of a present tense in the paradigm.  (Suppletion has also happened with the paradigmatic past tense of "go" -- "went."  In this case we still have the present form of "wend" in the language, even with a new past form "wended.")  In this regard we may consider also the present meaning of the past form "must."  The other modal auxiliary verbs have both forms with present meaning (can, could; shall, should; will, would).  This reminds me also of the plural form in "news" that is used with singular meaning: "This news is not good."  

I believe that these kinds of phenomena are quite commonly taught in ESL classes, but maybe not so much in the High School setting.  Is that maybe because we want to be rule based so that irregular paradigms do not have a place?  How do grammar teachers (and writers) approach suppletion?  With embarrassment or avoidance?

Bruce

  ----- Original Message ----- 
  From: Bruce D. Despain 
  To: [log in to unmask] 
  Sent: Friday, October 19, 2007 8:21 PM
  Subject: Re: Rules ad nausiam


  Maybe I should have given a language example in contrasting the algorithmic and table approaches to solving a mapping (modeling) problem.  The single syllable gradable adjectives in English add an -er to form their comparative.  The multiple syllable gradable adjectives use the degree adverb "more" to express the same thing.  There are also certain two syllable adjectives ending in -y, -ow, or -le which might use either means.  Some of these optionally change rules when there are additional syllables, e.g., tidy, tidier; untidy, untidier, or more untidy (= less tidy).  But there are a small number of gradable adjectives that form no comparative, e.g., "little," for which people prefer to use "small" instead, and "good" where "better" is always preferred (also: far, farther or further).  Then there are the single syllable gradable adjectives that use "more": like, real.  Clearly the algorithm (rule) is elegant, but the table can't be easily avoided where suppletion or an exception is involved.  
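
  The contrast above can be put into a few lines of Python.  This is only a rough sketch under assumptions of my own: the names EXCEPTIONS and comparative are invented for the illustration, the syllable count is handed in by the caller rather than computed, and spelling details such as consonant doubling (big, bigger) are ignored.  Where the description allows either means, the sketch arbitrarily picks the -er form.

```python
# Hypothetical sketch: a rule (algorithm) for regular comparatives,
# overridden by a lookup table for suppletive or exceptional items.

# Table of exceptions; the entries follow the examples in the message above.
EXCEPTIONS = {
    "good": "better",      # suppletion
    "far": "farther",      # "further" is also used
    "like": "more like",   # one syllable, yet takes "more"
    "real": "more real",
}


def comparative(adjective: str, syllables: int) -> str:
    """Return a comparative form; the caller supplies the syllable count."""
    # The table overrides the rule wherever an entry exists.
    if adjective in EXCEPTIONS:
        return EXCEPTIONS[adjective]
    # Rule: one-syllable adjectives add -er (just -r after a final e).
    if syllables == 1:
        return adjective + ("r" if adjective.endswith("e") else "er")
    # Rule: two-syllable adjectives in -y, -ow, or -le may also take -er.
    if syllables == 2:
        if adjective.endswith("y"):
            return adjective[:-1] + "ier"
        if adjective.endswith("ow"):
            return adjective + "er"
        if adjective.endswith("le"):
            return adjective + "r"
    # Everything else uses the degree adverb "more".
    return "more " + adjective
```

  The rule covers the open class of regular adjectives in a few lines, while each suppletive or exceptional item costs one table entry; neither part replaces the other.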

  The above explication of a "rule" in English is not conducive to exceptionless (invariant?) formulation by its very nature.  What would be invariant would be the means by which it would be formalized.  We would have to set up a consistent set of features that categorized adjectives, classified some of them as gradable, and registered their syllable count and the semivowel-like nature of their second syllable.  A rule formulated in these kinds of terms would still need to allow certain exceptions as noted above.  Perhaps "like" could be reclassified as a conjunction or preposition, but still be gradable(??).  Suppletion would have to be described by other kinds of overriding rules that allowed certain items to take the place of others in a final seemingly unavoidable ad hoc fashion.  (A different algorithm for each item in a table is hardly simpler than the table itself as the solution.  The paradigms of traditional grammar are no more than tables from which the student is expected to deduce rules to derive similar forms for items of the same class.)  

  These are just some ideas that may shed light on the mechanics, if not the philosophy of modeling.  I think the method used to describe the phenomena is not as important as the consistency and coherence of the formalization.  

  Bruce
    ----- Original Message ----- 
    From: Bruce Despain 
    To: [log in to unmask] 
    Sent: Friday, October 19, 2007 3:11 PM
    Subject: Re: Rules ad nausiam


    Bill,

    Thank you for clarifying your stance on models.  I do want to point out that it is entirely possible for a model to be capable of modeling itself.  It only needs to contain the objects and relationships that will allow it to.  Goedel needed this realization when he proved his incompleteness theorem in mathematics.  But maybe we could say that he has reified in some sense numbers themselves.  I suppose this is the argument that still keeps two camps of mathematicians at loggerheads: the constructivists maintain that a proof is not a proof if it requires a mapping (model) that is infinite. . . .

    Your division between temporal and atemporal sounds to me like what I've come to call the algorithmic solution vs. the table solution.  In programming we often have the choice, and we make it on empirical grounds.  Sometimes it is easier to write a routine to change one expression to another -- say change a given date on one calendar to the corresponding date on another.  At other times, it's better to look it up in a table, say when the relationship is arbitrary, like Christmas and Dec 25.  Both ways are often available. 
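
    To make the two solutions concrete, here is a small Python sketch.  The names HOLIDAYS, holiday_date, and to_iso_week_date are made up for the illustration, and the ISO week calendar merely stands in for "another calendar"; the conversion itself relies on the standard library's date.isocalendar().

```python
from datetime import date

# Table solution: the pairing of a holiday with its date is arbitrary,
# so it is simply looked up.  (Hypothetical table with a single entry.)
HOLIDAYS = {"Christmas": (12, 25)}


def holiday_date(name: str, year: int) -> date:
    """Look up the month and day for a named holiday in the table."""
    month, day = HOLIDAYS[name]
    return date(year, month, day)


# Algorithmic solution: the mapping from a Gregorian date to an ISO week
# date is systematic, so a routine computes it instead of storing it.
def to_iso_week_date(d: date) -> tuple:
    """Convert a Gregorian date to (ISO year, ISO week, ISO weekday)."""
    iso = d.isocalendar()
    return (iso[0], iso[1], iso[2])


print(holiday_date("Christmas", 2007))
print(to_iso_week_date(date(2007, 10, 19)))
```

    A table of every date paired with its ISO week date would work too, but it would be enormous; a routine for Christmas, on the other hand, would amount to the table entry written out as code.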

    I've got to go now, Bill, but I'll get back to this if it has been helpful.  

    Bruce



    >>> "Spruiell, William C" <[log in to unmask]> 10/19/07 2:18 PM >>>

    Bruce,

     

    I think it’s important to distinguish between models in general and the kinds of models that exhibit the set of three characteristics I was discussing earlier. I don’t disagree that much about models in general; we have to approach things via analogies, and having a clear analogy is much better than having a vague one. I have not argued that we can do without models, and you’ve pointed out a crucial weakness with the way I initially phrased my first characteristic – I didn’t distinguish between the metalanguage and what it was describing in that. What I was trying to argue against with that point was the idea that, since the model has to use objects and rules as part of its metalanguage, what it’s describing does too – in other words, the reification you rightly caution against. I’ve encountered that kind of reification often enough that I think I’ve developed a knee-jerk reaction whenever “symbols” or “primitives” come up. As long as they’re viewed as features of the metalanguage only, I have no trouble with them. If someone tries to tell me that human DNA encodes for “+/- N,” I break out in hives.

     

    Those other two characteristics, however, I think should still be in play. I keep focusing on what I’m calling “atemporal” and “temporal” simplicity metrics mainly because I think it’s one useful way to go about approaching the difference between functionalist approaches and some (but not all) formalist ones.  I started thinking about it when doing background research on psycholinguistic studies of lexical storage (this is going to repeat some stuff from a post a while back, but I need it for context). There was a tug of war for a while between proponents of the idea that complex words were stored as single units, and those who thought they were stored as separate morphemes and assembled via rules. Over time, the research seemed to show a kind of “pox on both your houses” result: people appeared to use both kinds of storage, even for the same words, and use whichever was best suited for the task they were asked to do. From the standpoint of the traditional “rules and objects” simplicity metric, which views the system as a static entity divorced from actual processing, that was a hopelessly messy result. But from another standpoint, it was simpler. Given a range of tasks in real time, dual storage allows the subject to use whichever representation allows the task to be accomplished with the least processing. And there’s no a priori reason why processing considerations can’t be part of a simplicity metric – in fact, the really squirrely thing about the use of simplicity metrics in most linguistic theory is that the theorists don’t really discuss them, at least not to the point where they try to tackle the issue of why one simplicity metric might be better than another. Occam’s Razor simply says you shouldn’t posit more stuff than you need to explain what you’re trying to explain – it doesn’t limit what counts as valid stuff (unless you take it strictly in its original medieval form, in which case rules don’t count either). And if you do incorporate things like processing into your simplicity metric, then there went the competence/performance distinction in its usual sense. Functionalists usually claim that processing and suitedness-to-task considerations in part determine the form of language, so temporal metrics fit nicely with functionalism. I should point out here that I’m not arguing that atemporal metrics are invalid – in fact, I’m not sure there’s a way to go about arguing conclusively that either type is valid or invalid. However, I think there is a problem if one makes an argument using an atemporal metric against a theory that uses a temporal one – or vice versa. 

     

    The third characteristic I mentioned, the explicit or implicit claim that rules are invariant,  creates some related issues. It’s part of the conceptual underpinnings of the competence/performance dichotomy, and again, functionalists, in general, do not see that as a strict dichotomy – performance, in real time, to accomplish real tasks, partly structures competence. Going further, though, models in which variance is incorporated as an inherent feature are hard to assess mathematically in some ways (my knowledge may be too dated here, but that has been the case in the past, at least); since many formalists determine the value of a model partly on the basis of whether it’s mathematically evaluable, inherently variable models always lose in those terms. But if language really is fundamentally variable, then a nicely mathematically evaluable model of a determinate system is….a well-behaved wrong model. If we had accurate observations of planetary movements, but we did not yet have the mathematics to deal with complex orbits and ellipses, Ptolemaic astronomy would still be wrong. It’s one thing to require perfectly circular orbits when the only observations you have seem to indicate they are circular, but it’s another thing entirely when you know they’re not but demand they be anyway. Some of the work on “learnability theory” (in Gold’s sense) that has been used to criticize functionalist accounts seemed to operate along these lines – “it’s not modellable in a mathematically evaluable way, so it’s not as good.”

     

    Bill Spruiell

    Dept. of English
    Central Michigan University

      

     

     

    From: Assembly for the Teaching of English Grammar [mailto:[log in to unmask]] On Behalf Of Bruce Despain
    Sent: Friday, October 19, 2007 9:56 AM
    To: [log in to unmask]
    Subject: Re: Rules ad nausiam

     

    Bill,

     

    You must remind yourself of the distinction between a language and a meta-language used to model it.  Science does in fact depend on a model of the most primitive kind.  It is modeling itself.  There are symbols made to represent phenomena.  There are beginning (primitive) symbols and there is a path defined by which to come up with derived expressions.  These expressions must be true to the phenomena symbolized and hence falsifiable.  I believe that the idea of model can be used to apply to the workings of science itself, to the workings of language, to the workings of anything that can be isolated as phenomenal in nature.  This is made abundantly clear in the work of John Casti, whom I referred to in my last post.  

     

    The statement that "noun" or the use of "+N" being part of language itself is confusing to me.  Words, sounds, meanings are usually considered to be the stuff of language.  These descriptive terms used to refer to elements of language have to be part of a metalanguage.  Their applications are not confined to English, just their expressional form.   Most people would admit that "+N" is not an element of the English language.  Try using it in a sentence without the quote marks around it.  It doesn't work for me.  True, the "+" and the "N" are familiar parts of English, but symbols in different derived systems.  This expression "+N" denotes an object of a mathematical nature.  The plus sign signifies the positive value of a binary functor.  It contrasts with a "-N" that characterizes the opposite attribute that must occur with "+V" (in some models).  

     

    There is indeed a sense in which a neural network model of language has to presuppose objects as part of language.  It has to do with the concept of an object.  The object may well be the phenomena being observed and described by the model.  In this sense it has to be part of language; the model requires it.  The model has other objects that it uses in the description (symbols, expressions) that are not part of the language being described.  The word "object" in a trivial sense is part of the English language.  Linguistic investigators, in the process of making observations, abstract the phenomena from nature. This allows them to make the phenomena part of their primitive set to be symbolized as what they choose to refer to as objects.  Certainly this process makes them neither real nor concrete.  But it does bring them to a part of consciousness that allows them to be treated in similar ways.  Here again we must be careful not to let the metaphor of language reify them beyond the confines of the model.  Maybe this is the problem that you were getting at.  

     

    Bruce

     


    >>> "Spruiell, William C" <[log in to unmask]> 10/18/07 4:32 PM >>>

    Dear All:

     

    I tried to post the message clipped below at around 2:30 p.m. today, but as far as I can tell, it didn’t make it – It doesn’t show up on my list and I never got a receipt note. This may simply be a case of the internet exercising aesthetic judgment, but in case it’s accidental, I’m trying again.

     

    One addition – I’d like to make it clear that what I’m kvetching about are three specific *characteristics* of some model-theoretic frameworks. Many frameworks don’t have full determinacy or atemporal metrics – but a lot of the ones in linguistics do, and those are the ones I’m talking about. 

     

    __

     

    Bruce,

    Science does not, in fact, depend on that kind of model. Quite a number of scientific theories are couched in terms of such models, but that’s not why they’re scientific – rather, it’s because they make predictions that can be falsified by reference to mutually observable phenomena.  Conflating that kind of model with science itself is precisely the kind of thing I was whinging about.

     

    We don’t even have to go far afield to find scientific theories (or at least, theories recognized as scientific as widely as those in linguistics, or more widely) that don’t use models with the three characteristics I was discussing. Biology, by and large, doesn’t use atemporal simplicity metrics. The operation of a system, in real time, to perform real tasks, is an integral part of biological models. If a system with fewer rules and fewer primitives nevertheless requires far more steps to achieve a particular goal than another system, its inefficiency counts for at least as much against it as its simplicity (in terms of an atemporal metric) counts for it. 

     

    I would also argue that neural networks do not “use” objects in the sense that a standard grammatical model uses a symbol like “noun,” although I did not make enough of a distinction in my original post. You’re certainly right that people talk about neural networks in terms of objects (nodes) and their properties (activation thresholds, etc.). However, they’re not talking about language when they do that. The nodes and thresholds are not within the domain itself that the network is modeling. In an “objects and rules” model, the objects are themselves part of the “content” the system is modeling;  for example,  “noun,” or a value like “+N,” in a standard grammatical model is considered to be part of language itself.  In a neural network model of language, “threshold value” is not considered to be a linguistic phenomenon being modeled, but rather part of the description of the system by which a linguistic phenomenon is modeled. So, you’re entirely right that you can consider neural networks to have objects, but those are transparently presented as features of the model not features of language.   Saying something about a threshold value is saying something about the device language is conjectured to be running on, not saying something about language. While the network can be described in terms of objects and rules, the network itself is not manipulating objects when it operates.  There’s no sense in which a neural network model of language has to presuppose objects as part of language. 

    Bill Spruiell

    Dept. of English

    Central Michigan University.

     

    From: Assembly for the Teaching of English Grammar [mailto:[log in to unmask]] On Behalf Of Bruce Despain
    Sent: Wednesday, October 17, 2007 5:18 PM
    To: [log in to unmask]
    Subject: Re: Rules ad nauseam

     

    Bill,

     

    Thanks for the notice, though I was stultified (proved to be of unsound mind?).  

     

    The set of symbols used as primitives and a set of rules, laws, or principles, is not unique to linguists.  In fact science itself consists of this kind of model. To attack such a model is to attack science, which has a much longer track record than any linguist or English maven.  


    The objects to be manipulated by language are sounds.  (disputable?)

    The objects to be manipulated by language are meanings.  (disputable?)

    The objects to be manipulated by language are signs written down in symbols.  (disputable?)

    The objects to be manipulated by language are combinations of the above.  (disputable?)

     

    1) What seems to be the nature of neural nets doesn't mean that no objects exist in the model.  Indisputable is the fact that there are potentials, thresholds, charges, etc. involved.  The model-theoretic framework says nothing about the reality of the objects posited.  The model is no more than a metaphor for what is being modeled.  2) A fully determinate system does not come to mind when the modeling is of phenomena like the weather.  There are simply too many elements in the determinate version.  Even the particle theory of matter must be abandoned though it is clearly determinate.  The sheer number of particles involved often becomes so great as to make the model impractical in making predictions of any but the roughest statistical kind.  Certainly the determinate nature of the phenomena described ought to be paralleled by that of the model that describes it.  The investigator who wants to put the former in doubt, has no need for a model with full predictive power.   3) The number of objects and rules has been a condition since the acceptance of Occam's razor by most scientists.  I would think that the simple nature of these objects and the statement of these rules in the accepted vernacular of mathematics would be another consideration.  

     

    I think that there is a likely confusion between a mathematical model and a particular model for linguistic descriptions.  The model-theoretic principles of mathematics are unavoidable to any formalization.  Whether a specific mathematical model actually serves to describe the phenomena it claims to can always be disputed and refuted with appropriate data.  But we don't throw away the language because of what people can say with it.  If the phenomena we are trying to describe are indeterminate, then it doesn't make much sense to use a model that requires determinate phenomena.  But the model itself had better use a determinate framework, or it would hardly be able to explain anything (have any predictive power).   

     

    I would wonder that failures in the operation (?) of the model of language could ever be made out to be performance issues.  The performance of the model would have to be faultless, in order to support itself.  Maybe people expect it to make predictions of the performance variety?  (Present-day minimalists are often far too broad in their expectations in this area.)  You say, "The problem is not the framework itself, but rather its hegemonic status."  I think that what many linguists call the model-theoretic framework is their own model for natural language.  This is the area of hegemony, I believe.  This then pits one camp against another based on their own modeling principles established for their own purposes, not the two elements you gave for a mathematical model in science.  Certain models can in fact be shown to be faulty based on the way in which the nature of the elements and mathematical relationships posited do not match in principle the behavior of the objects being modeled.  To do so does require a bit of sophistication, which, sorry to say, I do not possess.  (Cf. John Casti, Reality Rules)

     

    Bruce

     

     

    To join or leave this LISTSERV list, please visit the list's web interface at: http://listserv.muohio.edu/archives/ateg.html and select "Join or leave the list" 

    Visit ATEG's web site at http://ateg.org/ 


----------------------------------------------------------------------------

    NOTICE: This email message is for the sole use of the intended recipient(s) and may contain confidential and privileged information. Any unauthorized review, use, disclosure or distribution is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy all copies of the original message.

