see, this is the problem with syntax

Here I am, desperately scouring as many textbooks as I can lay my hands on, for tomorrow’s Sentence Processing lecture.

My eye falls on this sentence:

The fireman told the man that he had risked his life for to install a smoke detector.

And I scan the surrounding text for something to do with the “for to V” construction. Maybe the exposition concerns the effect of nonstandard syntax on reading times? But that’s got nothing to do with it. It’s a garden path sentence, so you’re not even meant to be able to entertain the “for to V” option, because it’s meant to be an ungrammatical sequence making you reanalyse ([The fireman told [the man that he had risked his life for] to install a smoke detector]).

So, I have reanalysed. But since [The fireman told the man [that [he had risked his life for to install a smoke detector]]] is not “ungrammatical”, the sentence is only ambiguous, not a true garden path. I think.

And this is why phonology is so much better.


speaker as hearer

[Language post]

I’ve been impressed by the boldness of Fernández and Smith Cairns in devoting a chapter to “The Speaker” ahead of the chapter on “The Hearer” in their textbook Fundamentals of Psycholinguistics.

It’s one of the great fundamentals that there isn’t really a good model of how speech production works from a psycholinguistic perspective. The best established and most influential models of speech production certainly deal with linguistic units such as syllables or phonemes, but they don’t go any closer to articulation than that. These units serve as the input to whatever motor processes generate speech movements, but the motor processes themselves are generally treated as quite separate, if not trivial. (Fernández and Smith Cairns’s diagram of speech production, at the start of the chapter, has “articulatory system” well outside the box of interesting processes.) The ‘perception’ side is generally much better understood than the ‘production’ side of things, so tackling production ahead of perception/comprehension is an interesting step.

But more striking – there’s a whole section of the Speaker chapter devoted to Producing Speech After It Is Planned. So might this be a place to find new insights linking mentally represented symbols to articulation? even if only tentatively, as befits an introductory text?

Well, no – the section is acoustic, not articulatory. Shame! There’s a head diagram with the articulators labelled, but the diagrams are waveforms and spectrograms, not x-ray pellet tracings or EPG outputs. Not even so much as a diagram of a mass on a spring to help the reader feel warm and fuzzy.

It’s a perfectly fine section on the acoustic properties of consonants and vowels, I should add, but it does make you wonder what they’re going to talk about in the “Hearer” chapter now that all this talk of sine waves and formant transitions is out of the way.

looking back

“Few researchers feel that they have direct access to all of the truth that is worth seeking. Naturally, one looks to one’s contemporaries for help, but unless we hold with particular rigidity to the view that historical development is a matter of monotonically nondecreasing progress, with the present always ipso facto more enlightened than the past, there is no reason not to treat our intellectual ancestors with similar respect.”

SR Anderson (1985), Phonology in the Twentieth Century: Theories of Rules and Theories of Representations. University of Chicago Press. (p3)


English can allow the word-initial sequence ʃr. English can allow word-final sequences like -lfths. English words can be multisyllabic without being morphologically complex. English prose sentences can conform to highly regular rhythmic patterns.

shred /ʃrɛd/
twelfths /twɛlfθs/
military territory

Three cream scones please – Kate Snow will share Paul’s.
Thirty happy laddies whistled loudly all the way to Portmahomack.
Innocent mineral magazines reappeared yesterday.
Don’t you want a fundamental macaroni explanation?

But these are all rarities – these phonotactic sequences, this morpho/phonological fact, the extended consistency of these rhythmical patterns – they illustrate what is unusual about the forms of English, not what is typical.

So I’m wondering – we can play around with things like these (and a noted psycholinguist and a renowned anglicist are in print with apparently independent and beautifully elaborate manipulations of prose rhythm) and presumably there is something to be learned from the exercise – but what is that something?

“Our understanding of the complex and ‘irregular’ structure of ordinary prose can be sharpened,” says Angus McIntosh, “by exposure to … simple but abnormally iterative structures…” But how? Read this aloud, with a Tum-ti-ti rhythm:

Note, in a triangle having an angle of ninety degrees that the square that is made with its base the hypotenuse equals in area the sum of the squares that are made on the sides which are forming the right angle.

Examples like this show that the rhythms of ordinary prose include the raw materials for artfully constructed ‘abnormal’ structures, but doesn’t that just creatively exaggerate or parody the characteristics of naturally occurring text, rather than also providing much basis for insight into these forms?

Abercrombie, D. (1965). Studies in Phonetics and Linguistics. Oxford: OUP
Breen, M. & Clifton, C. (in press). Stress matters: Effects of anticipated lexical stress on silent reading.
Cutler, A. (1994). The perception of rhythm in language. Cognition, 50: 79-81
Davies, M. (1986). Literacy and Intonation. In Couture, Barbara (ed.) Functional Approaches to Writing: Research Perspectives. Norwood, N.J.: Ablex, 199–230.
McIntosh, A. (1990). Some elementary rhythmical exercises and experiments. Anglo-American Studies, X (1): 5-19.

segment sceptics

Selected, in chronological order

• Paul (1886), according to Abercrombie (1991): “In contrast to Pike’s view that a stretch of speech has a natural segmentation is the view that it is an indissoluble continuum, with no natural boundaries within it. This view is at least a hundred years old. It is clearly stated, for example, by Hermann Paul in his Principien der Sprachgeschichte in 1886. The word, he says, is ‘eine continuerliche reihe von unendlich vielen lauten,’ ‘a continuous series of infinitely numerous sounds,’ as HA Strong translates it in Principles of the History of Language. … As he puts it, ‘… A word is not a united compound of a definite number of sounds, and alphabetical symbols do no more than bring out certain characteristic points of this series in an imperfect way.’” (Abercrombie 1991: 29-30)

• Twaddell (1935) – the phoneme is “a fiction, defined for the purpose of describing conveniently the phonological relations among the elements of a language, its forms,” p53; “it is meaningless to speak of ‘the third phoneme … of the form sudden’, or to speak of ‘an occurrence of a phoneme’. What occurs is not a phoneme, for the phoneme is defined as the term of a recurrent differential relation. What occurs is a phonetic fraction or a differentiated articulatory complex correlated to a micro-phoneme. A phoneme, accordingly, does not occur; it ‘exists’ in the somewhat peculiar sense of existence that a brother, qua brother, ‘exists’ – as a term of a relation,” p49.

• Firth (1935) – “It is all rather like arranging a baptism before the baby is born. In the end we may have to say that a set of phonemes is a set of letters. If the forms of a language are unambiguously symbolised by a notation scheme of letters and other written signs, then the word ‘phoneme’ may be used to describe a constituent letter-unit of such a notation scheme” (Firth 1957 [1935]: 21)

• Firth (1948) – on using literacy-inspired transcriptions as a basis for phonological analysis (from the 1930s onwards, the writings of JR Firth show him distancing himself from over-reliance on transcriptions in alphabetic notation, for phonological analysis): “The linearity of our written language and the separate letters, words, and sentences into which our lines of print are divided still cause a good deal of confused thinking due to the hypostatization of the symbols and their successive arrangement. The separateness of what some scholars call a phone or an allophone, and even the ‘separateness’ of the word, must be very carefully scrutinized” (Firth 1957 [1948]: 147).

• Ladefoged (1959) – quoted by Lüdtke (1969: 151): “The ultimate basis for the belief that speech is a sequence of discrete units is the existence of alphabetic writing. This system of analysing speech and reducing it to a convenient visual form has had a considerable influence on western thought about the nature of speech. But it is not the only possible, nor necessarily the most natural, form of segmentation.”

• Lyons (1962), commenting on Firth: “the practical advantages of phonemic description for typing and printing should not of course be allowed to influence the theory of phonological structure. It has been argued that phonemic theory has been built on the ‘hypostatisation’ of letters of the Roman alphabet: cf Firth, [‘Sounds and Prosodies,’ 1948], p134”

• Abercrombie (1965) is quoted by Lüdtke (1969: 151) as saying, “The phoneme … is not something which has a ‘real existence’.”

• Lüdtke (1969) – abstract, “the phoneme segment is not a natural item but a fictitious unit based on alphabetic writing”

• Householder (1971), summarised by Vachek (1989: 25): “[Householder] formulates the question whether, instead of postulating Chomskyan artificial underlying forms, it would not be more realistic to regard the graphical shapes of words as starting points from which the language user obtains their spoken, phonological shapes.”

• Linell (1982) – a whole book providing comprehensive, detailed coverage of the topic, Written Language Bias in Linguistics.

• Kelly and Local (1989) – the question of notation – aim to avoid doing phonetic transcription with the same symbols as are then used for doing phonological transcription/analysis.

• Abercrombie (1991) – “Segment, then, is the name of a fiction. It is a transient moment treated as if it was frozen in time, put together with other segments to form a ‘chain’ rather than a ‘stream’ of speech. Methodologically it is a very useful fiction. A segment, isolated from the flow of speech, can be taken out of its context, moved into other context, given a symbol to represent it, compared with segments from other languages, placed in systems of various sorts, singled out for special treatment in pronunciation teaching; and used in dialectology, speech therapy, the construction of orthographies. (The same is true, of course, of speech-sound and phone. They do not give rise, however, to the possibility of a word for the process, ‘segmentation.’)” (p30)

• Faber (1992) – “segmentation ability, rather than being a necessary precursor to the innovation of alphabetic writing, was a consequence of that innovation” (p112); “segmentation ability as a human skill may have been a direct result of (rather than an impetus to) the Greek development of alphabetic writing. Thus, the existence of alphabetic writing cannot be taken eo ipso as evidence for the cognitive naturalness of the segmentation that it reflects” (p127)

• Derwing (1992) – “the segment (or phoneme) may not be the natural, universal unit of speech segmentation after all, and that the orthographic norms of a given speech community may play a large role in fixing what the appropriate scope is for these discrete, repeated units into which the semi-continuous, infinitely varying physical speech wave is actually broken down.” p200

• Port & Leary (2005) in Language, 81

• Ladefoged (2005) – “We should even consider whether consonants and vowels exist except as devices for writing down words … [they] are largely figments of our good scientific imaginations,” p186; “We also lose out in that our thinking about words and sounds is strongly influenced by writing. We imagine that the letters of the alphabet represent separate sounds instead of being just clever ways of artificially breaking up syllables,” p190; “the division of the syllable into vowels and consonants is not a natural one. Alphabets are scientific inventions, and not statements of real properties of words in our minds. … vowels and consonants are useful for describing the sounds of languages. But they may have no other existence,” p191; “The alphabet, which regards syllables as consisting of separate pieces such as vowels and consonants, … is a clever invention allowing us to write down words, rather than a discovery that words are composed of segment-size sounds,” p198.

• Port (2006), ‘The graphical basis of phones and phonemes.’

• Silverman (2006) – p6, p11-13, and elsewhere.

• Lodge (2007): “There has been a long history of warnings against the notion of the phonological segment (eg Paul 1890, Kruszewski 1883, Baudoin de Courtenay 1927), as pointed out succinctly by Silverman (2006). Later the concept was criticised by Firthian prosodists (see Palmer 1970) and more recently reviewed by Bird & Klein (1990); the most recent exposé of the misguided acceptance of alphabetic segmentation in phonology can be found in Silverman (2006).”


language acquisition upside down

One place where the thorny problems of linguistic theory become most obvious and demand the most determined engagement is in the area of child language acquisition. (The other, I think, is language variation and change, unless I just say that because these are what I find the most interesting.)

Take the concept of language structure. The belief that language has structure is, naturally, fundamental to the discipline of linguistics. But it is possible to understand this in radically different ways.

According to Hirsh-Pasek and Golinkoff in their book, Origins of Grammar, theorising about child language is generally done according to one of two broad approaches, which they characterise as “outside-in” versus “inside-out”. “Outside-in” includes social-interactional theories and cognitive theories; “inside-out” includes the various permutations of nativism.

One of these approaches, they say, “contends that language structure exists outside the child, in the environment.” If I didn’t tell you any more, would you be able to say which of the two options – ‘interactionist’ or ‘nativist’ – was being described here?

In fact, HP&G are referring to social-interactional/cognitive theories as believing that language structure exists outside the child (nativist theories rely instead on innate language-specific knowledge).

Now it is quite possible that some theorists on the interactionist side do believe in language structure as having some sort of real-if-‘abstract’, independent existence. This would betray itself by, for example, the use of terms like “finding” or “discovering” things like “units” (or the boundaries between units) such as segments, morphemes, phrases, clauses in the ambient language. Such interactionists would then share with nativists the view that (spoken) language embodies or comprises real-if-‘abstract’ units organised in a real-if-‘abstract’ structure, and as the job of speakers is to produce speech with these properties, so the job of the listener is to recognise or calculate the identity of the units in what they hear and the relations between these units.

But a much more interesting prospect is the type of ‘interactionist’ approach that does not impute such reality to language structure at all. That is the view that the raw data of spoken language must be clearly distinguished from the analysis which an observer (lay or specialist) might undertake of it. In other words, there is no implicit structure lurking there in speech, whether phonological or syntactic: structures are inferred by analysts and act as handy descriptive/analytical tools, but they’re not really there. It is a serious criticism of some schools of thought that they treat the analysts’ analysis as being in fact what language is composed of – as though analytical constructs such as noun, verb, IP, DP, etc, actually are somehow or somewhere embodied in utterances. It’s one thing to say that when linguists want to get a handle on what people produce/hear they need to identify units and categorise things – these units and categories are convenient as technical descriptions in order that specialists can spot patterns and talk to each other about them. It’s another thing to say that spoken language consists of these units and categories such that the linguist’s task is to discover them (rather than impose them).*

As Joseph et al (2001: 60) put it, “whereas for the psychologistic structuralist speech comes about through implementation of the speaker’s knowledge of a systematic linguistic structure, for Firth the systematic structure is a linguist’s fiction, resulting from the attempt to understand speech.”** Thus (for example) the nativist scours the child’s productions in order to establish which aspects of linguistic structure must have unfolded in their mind by that point – the more interesting varieties of interactionism make use of structure, on paper, in the analysis, only as a tool to understanding what the child understands.

If both sides in the field of language acquisition, the interactionist and the nativist, share the conceptualisation of the linguist’s task as being one of discovering linguistic structure that actually exists out there/in language, then the differences between the two approaches shrink rather dramatically. But when this conceptualisation is not shared, it makes the ‘interactionist’ approach much harder to evaluate on ‘nativist’ terms, for one thing, and more importantly it keeps the idea of “language structure” where it belongs, in the realm of open questions needing discussion. Linguistic descriptions are convenient (-to-the-linguist) if not indispensable ways of categorising bits of utterances, but they have no life of their own.

*Some books/articles talk about things like Ross’s “discovery” of his island constraints: it would be better to think of things like this as inventions, not discoveries.

** Note the F-word. Amazing chap, obviously, this Firth. I was mightily relieved and heartened to come across that section of Joseph et al (2001) shortly after tortuously labouring to express this very point in an essay many moons ago.

anybody’s guess

Everyone blames phonological representations for language-related impairments, or deficits in phonology-related tasks like nonsense-word repetition. But what is a phonological representation? What do impaired phonological representations look like? In what specific ways do they differ from unimpaired representations, and how can you tell? What does it all mean?!

Munson (2006) in a commentary on Gathercole’s keynote article in Applied Psycholinguistics expatiates thus, and I can only concur:

Although there are many different perspectives on the factors that drive nonword repetition performance, we can all agree that the relationship between nonword repetition and word learning is due to the association of these constructs with phonological representations. The relevant question to ask, then, concerns the nature of phonological representations themselves. What are they? Textbook descriptions of these generally posit that they look something like the strings of symbols that we are taught to transcribe in phonetics classes. However, phonetic transcriptions, even narrow ones, are abstractions of the signals that are being transcribed. The level of detail that they code is ultimately related more to the perceptual abilities of the listener, the degrees of freedom in the symbol system, and a priori assumptions about the quantity of detail that is relevant for transcription than to the signal being transcribed and its associated phonological representation.

What, then, do “real” phonological representations encompass? What is being represented? The answer to that is anyone’s best guess. Representations themselves are latent variables. We can never see them, we can only posit them as explanations for the sensitivity that people have to variation and consistency in the speech signal in different tasks. (p578)

A welcome reality check in perhaps a slightly unexpected place, even though, of course, it still doesn’t solve the fundamental problem. Everybody’s preferred solution for testing the true nature of implicit phonological representations is different, and inadequate to different degrees and in different ways, but given the nature of the concept of phonological representations itself, that is simply how it has to be.


Munson, B. (2006). Nonword repetition and levels of abstraction in phonological knowledge. Applied Psycholinguistics 27: 4

back from baap

And what a fascinating time it was. I went with the expectation of finding out about lots of new ideas, and there were certainly plenty of new findings, new measurement methods, new and refined analyses.

But by far the most engaging sessions (I thought) were the ones that looked back to the early days of phonetics and linguistics. The phonetics crew at UCL have recently discovered some forgotten film reels dating right back to the 1920s, and took the opportunity to show the conference what this collection consisted of. The films showed everything from early x-ray images of the vocal tract, to the first machine which could recognise speech, to the exciting kymography techniques which feature so prominently in some of Firth’s papers. (Wikipedia has an entry on the kymograph; in the 20s they also used the sensitive flame, which Wikipedia describes in its application in the Rubens’ tube.)

There was also a fascinating account of the work that was done in Japan in the 1940s. Somehow the groundbreaking work from the Japanese labs had featured in some of the reading I did for my thesis (completely unconnected to my thesis itself, like lots of the most interesting stuff I read in those years), but Michael Ashby and Kayoko Yanagisawa’s presentation of the London-Tokyo links also brought in some intriguing detective work as they tracked down the source of their collection of glass lantern slides, and threw light too on the development of the stylised “head diagram” used by everyone from Daniel Jones onwards for illustrating the articulators (see here, eg, p79 onwards).

Which got me thinking. On one hand, it was amazing how technologically advanced they all seemed to be in the early days – they had all sorts of innovative techniques for observing and imaging the production of speech, and they had no hesitation in making use of the newest technology available in order to apply it to questions of articulation and acoustics. That spirit, I think it’s fair to say, is still alive and well in phonetics, with people using all sorts of technologies to investigate different aspects of articulation (electropalatography, laryngoscopy, ultrasound, not to mention electromagnetism…), and so we continue to increment our knowledge of what goes on in the vocal tract when people speak.

On the other hand, a lot of the theoretical understandings were also in place about what speech means, or is, or does, in the context of human communication more broadly considered. Knowing what acoustic effects arise from air flowing across articulators arranging themselves in particular ways is one thing – knowing what contribution these sounds make in the enterprise of making each other understood, is a different matter. Yet for people like Firth and his direct intellectual descendants, their views on the phonological system (and other parts of the language system) grew out of the best understanding they had about phonetics, both in terms of their explicitly stated principles and to a large extent also in their descriptive and analytical practice.

Compare this to a talk I was at last week (not at BAAP) where a valiant attempt was made to integrate changing conjunctions of formant values into the generative understanding of what phonology is (ie, to allow phonological grammars to accommodate – even ‘predict’ – sound variation and change). I am tentatively, but increasingly, of the view that there is simply no way to validate the staples of the generative apparatus (is that a mixed metaphor?) on the basis of speech data. It may be possible to tweak a generative grammar so that it becomes something that can handle variation and change, but that’s what it becomes – it doesn’t start, from its first principles, with that capability. If you believe that “sounds”* can be decomposed into distinctive features, what aspects of the speech stream can you offer as evidence for such features? Increasingly, the defence that phonological features need not make reference to the speech stream by virtue of existing on an altogether different plane of being is unconvincing, particularly when it is coupled with an expressed wish to make allowances for phonetic variation within the phonological system.

In one of the presentations, Michael Ashby mentioned that the 1930s was the decade of international congresses (the first three ICPhS’s!) and commented, quite rightly, on what an exciting time it must have been, in terms of who was meeting who, and when, and what ideas influenced who, and the impacts of all of these developments right down to the present day. You can’t help feeling that even though the scientific study of speech sounds is so relatively young, we could be in danger of falling prey already to a sorry historical amnesia. Keep alive the sensitive flame of phonetics, the man said, but keep alive too the story of where we’ve come from, not just to make sense of the present, but to equip us for the future too! (Best read to a particularly jubilant trumpet fanfare, I would suggest.)


* Always bearing in mind Roy Harris’s immortal analogy, “To ask ‘How many sounds are there in this word?’ is to ask a nonsense question (for the same kind of reason as it is nonsense to ask how many movements it takes to stand up).” Precisely.

on phonematic units

Firthian Prosodic Analysis provides a way of thinking about language and phonology which is fundamentally different from approaches in the ‘American’ and/or generative tradition.

As Anderson’s overview points out, “While one might be tempted to compare the phonematic units of the former with the phonemes of the latter [ie phonemicist analyses], for example, this would be a clear mistake. Both are essentially segment-sized units, it is true, and form systems of paradigmatic contrasts, but the similarities end there” (Anderson, 1985: 189).

The extremely helpful (clear and informative) JL article by Ogden and Local (1994) makes the same point very forcefully – it is thoroughly misguided to use the concepts and categories of generative approaches as a way of understanding Firthian ones, as though the differences between the analyses were simply terminological, or as if Firth was merely fumbling, in isolation from the American mainstream and in a quaintly eccentric English gentlemanly way, towards the same understanding as SPE-style analyses ended up with.

“Phonological units are, according to FPA, in syntagmatic and paradigmatic relations with each other. Syntagmatic relations are expressed as prosodies. Prosodies can also be in paradigmatic relations; this is what it means to be ‘in system’. Thus one can talk equally well of a ‘prosodic system’ and a ‘phonematic system’ (such as ‘C-system’ or a ‘V-system’). Both prosodies and phonematic units must also be stated in relation to ‘structure’ which in turn expresses syntagmatic relations” (Ogden & Local, 1994: 480).

“In making a Firthian Prosodic statement, the analyst typically begins by paying attention to the syntagmatic ‘piece’ and stating the prosodies relevant to the description of the piece under analysis; but the information is explicitly not thereby ‘removed’ or ‘abstracted away’, and the phonematic units are not ‘what is left’: in particular, phonematic units are not ‘sounds’ (Goldsmith 1992: 153), since phonological representations according to FPA are not pronounceable; nor are they merely the ‘lowest’ points on which all else hangs, like the skeletal tier. Phonematic and prosodic units serve to express relationships: prosodies express syntagmatic relations, phonematic units paradigmatic relations. All else that can be said about them depends on this most basic understanding” (Ogden & Local, 1994: 481).

It may possibly be worth adding that when Anderson speaks of phonematic units being ‘segment-sized’, this likely needs to be qualified by saying that in a Firthian-inspired approach, establishing the size of a segment is actually part of the analysis – segments and phonemes are emphatically not equivalent – a syllable or a foot could equally well be a “segment” in a Firthian analysis, if descriptive or analytical adequacy called for these units to be the terms in the paradigm. Hear Lodge:

“there is nothing that tells us a priori that paradigmatic relations that establish the meaningful contrasts of a language have to be between segment-sized entities at the phonological level any more than at any other level. In syntax, for example, a ‘segment’ is usually word-length, and certainly morpheme-length; the ‘segment’ is the smallest bit of the speech chain suitable for describing the patterns of a particular level. We segment speech in different ways for different purposes. Such segments include syllable places: onset, rhyme, nucleus and coda, the foot, the intonation group, the morpheme, and so on” (Lodge, 2007: 80).


(Post inspired by the surprising discovery that “phonematic units” is a search term that leads to this blog.)

(Also in the back of my mind being the Friendly Humanist’s talk about silos – phonologically speaking, the Ogden & Local article is superb for such a purpose, not that I would particularly claim to be anything more than firth-sympathetic.)

Anderson, SR (1985). Phonology in the Twentieth Century: Theories of Rules and Theories of Representations.  Chicago: University of Chicago Press.

Lodge, K (2007). ‘Timing, segmental status and aspiration in Icelandic.’ Transactions of the Philological Society 105: 66-104

Ogden, R & Local, JK (1994). ‘Disentangling autosegments from prosodies: a note on the misrepresentation of a research tradition in phonology.’ Journal of Linguistics 30: 477-498

justly stressed

Somebody reached this blog by searching for, “Is the word ‘just’ a stressed syllable?”

No doubt they’re long gone, but what a question!

Stress is inherently relational: you can only identify something as stressed in comparison to something else.

When a word is monosyllabic, there is no question about where its lexical stress is located: on the only syllable there is. So in citation form, I suppose there could just about be a sense in which you could call it a stressed syllable. It’s stressed enough, I suppose, to make it utterable.

But considering citation forms isn’t the best way of going about any phonological analysis. You need to see (for which read: hear) the word in context, so that it can be considered in its relation to the surrounding words. Only then is it possible to decide whether it is stressed (in relation to the surrounding words) or not.

Syntagmatics is the way forward, folks.