coarticulation is not a design flaw

I’m just back from a talk* where it was argued that language is far from a perfect or optimal system: it is something that happens to work, most of the time, in spite of being bodged together in a clumsy and inelegant way. (It could never have been designed to be this way, but with a bit of tinkering it comes to have the properties which make it at least functional.)

The argument itself comes from a background in American generative syntax, so most of the argumentation was directed towards showing that human minds don’t and can’t represent entire trees for complicated syntactic structures. That claim is actually, and thankfully, not even slightly controversial in many linguistics departments today, although apparently not in all of them.

The phon-link, however, came in the discussion session following the talk, when one example of a clumsy solution to the language problem was drawn from speech production. According to the speaker, it’s not ideal that speech is produced via a single-tube system (i.e. the vocal tract) – because it gives rise to problems such as coarticulation.

For a single-sentence tutorial on coarticulation, consider the way that you say the word ‘ten’ on its own, and the way that you say the word sequence ‘ten past’ – the end of the word ‘ten’ becomes more similar to the start of the word ‘past’ when you say them together, particularly in fast speech. It might sound a bit more like ‘tem past’, in other words.

But coarticulation isn’t a problem. It’s not a problem for speakers, and it’s not a problem for hearers. If it’s a problem for anyone, it’s only for people who adopt the troublesome assumption that the components of words have a form of their own, in some sense independent of the words they belong to, and that this form somehow changes to take on the shape of adjacent or nearby segments when the segments are assembled for articulation. The problem in speech analysis is not how coarticulation can happen, but how segmentation can be motivated at all, given that speech is an inherently continuous (non-segmented) stream produced by the overlapping movements of the tongue, lips, jaw, and so on. And it’s in precisely the “transitions” between what could be thought of as “segments” that so much of the information most valuable to hearers is located.

To paraphrase someone else’s slogan – coarticulation in speech is not noise, but information! And the perception of inelegance and clumsiness is very much in the eye of the beholder.

* Actually, I’ve just discovered this wee rant languishing as a draft in a folder somewhere – the talk was so long ago I can barely remember what the speaker looked like. But I need to post it, if only for my own phon-related health. On account of unavoidable weekly commitments I haven’t been to the departmental phonetics/phonology seminar for weeks – months even – and the p-side of my brain (p-centre?) is getting worryingly undernourished.