coarticulation is not a design flaw

I’m just back from a talk* where it was argued that language is far from a perfect or optimal system, but something that happens to work, most of the time, in spite of being bodged together in a clumsy and inelegant way (it could never have been designed to be this way, but with a bit of tinkering it comes to have the properties which make it at least functional).

The argument itself is coming from a background in American generative syntax, and so most of the argumentation was directed towards showing that human minds don’t and can’t represent entire trees for complicated syntactic structures. Which is actually, and thankfully, not even slightly controversial in many linguistics departments today, although apparently not all.

The phon-link, however, came in the discussion session following the talk, when one example of a clumsy solution to the language problem was drawn from speech production. According to the speaker, it’s not ideal that speech is produced via a single-tube system (i.e. the vocal tract) – because it gives rise to problems such as coarticulation.

For a single-sentence tutorial on coarticulation, consider the way that you say the word ‘ten’ on its own, and the way that you say the word sequence ‘ten past’ – the end of the word ‘ten’ becomes more similar to the start of the word ‘past’ when you say them together, particularly in fast speech. It might sound a bit more like ‘tem past’, in other words.

But coarticulation isn’t a problem. It’s not a problem for speakers, it’s not a problem for hearers – if it’s a problem for anyone, it’s only for people who adopt the troublesome assumption that the components of words have their own form in some sense independently of the words they belong to, and that this form somehow changes to take on the shape of adjacent or nearby segments when the segments are all assembled in order to be articulated. The problem in speech analysis is not how coarticulation can happen, but how segmentation can be motivated, for what is an inherently continuous (non-segmented) stream produced by the overlapping movements of the tongue, lips, jaw, and so on – and it’s in precisely the “transitions” between what could be thought of as “segments” that so much of the information that is most valuable for hearers is located.

To paraphrase someone else’s slogan – coarticulation in speech is not noise, but information! And the perception of inelegance and clumsiness is very much just in the eye of the beholder.

* Actually, I’ve just discovered this wee rant languishing as a draft in a folder somewhere – the talk was so long ago I can barely remember what the speaker looked like. But I need to post it, if only for my own phon-related health. On account of unavoidable weekly commitments I haven’t been to the departmental phonetics/phonology seminar for weeks – months even – and the p-side of my brain (p-centre?) is getting worryingly undernourished.

spectrographic artwork

Can I tell you about something I came across on a university website recently? It’s an intriguing combination of speech science and civil liberties: the Tony Benn sonograms, created by an artist called Tracey Moberley.

Being clueless about copyright I don’t want to copy the images to publish here, so you’ll just have to follow the link:

But please do click on it – it’s a stunning image of a spectrogram as phoneticians don’t normally see it. More typically those vivid blues and yellows are simply shades of grey, and the tiny black squiggles at the bottom look like they might be a little confirmatory waveform – with the words at the bottom provided in ordinary writing, unaligned with the acoustic data.

I think most LabPhonistas would like it.

uptalk description & references

‘Uptalk’ is one of the names given to the phenomenon which has recently appeared in English where people use rising intonation at the end of sentences – a pitch pattern that is more usually associated with questions. Some people associate it with American English, others with Australian English – and theories abound as to why it’s emerged at all.

It’s not something I know a huge amount about, but a quick trawl around Google Scholar brings it up in a couple of articles by Paul Foulkes and Gerry Docherty, phoneticians with a sociolinguistic bent at York and Newcastle respectively. I’ll just quote the relevant passages here for information and list the references at the end so that they can be followed up if anyone is sufficiently interested to do so.

One article is titled ‘Phonological variation in the English of England,’ available here in pdf.

“One of the most noticeable innovations in recent years has been the development of rising intonation in the Closed tone category in dialects which traditionally use falls. This has been found in the USA, Australia and New Zealand as well as Great Britain, and has been variously labelled high rising tone (HRT), Australian Question(ing) Intonation (AQI) and uptalk (see Cruttenden 1995, 1997: 129-131, Fletcher, Grabe & Warren in press). The pattern is associated with the upwardly mobile (‘yuppies’) in England (Cruttenden 1997: 130), but lower class and/or female speech elsewhere.

“Because of its perceptual salience, HRT has been the subject of much comment by non-linguists, including the mass media. Some of these comments are highly speculative and empirically untested, for example, that Australian soap operas are responsible for the spread of HRT (Bathurst 1996, Lawson 1998). Others, taking up the mantle of John Walker and others in lamenting change of any kind, identify HRT as a sign of unstoppable decay in modern English (e.g. Bradbury 1996, Norman 2001). Still others draw a logical but naïve conclusion, based on comparison with standard English, that rises indicate questions, and thus the use of rises in declaratives reflects a psychological state of uncertainty. The voice coach Patsy Rodenburg, for example, is quoted by Kennedy (1996) as claiming ‘that rising inflection is about being unsure…you make a question rather than a statement because you are scared’. Such statements are ill-founded in that they equate a particular intonation pattern with a single linguistic function. They thereby fail to take account of issues raised earlier: the form-function problem; the fact that intonational meaning is derived from a complex set of sources; and that social and linguistic evaluation of features may vary from speaker to speaker. It is obvious from examination of intonation patterns in dialects such as Newcastle and Liverpool that rises may be employed in the Closed category without any indication of interrogative meaning or uncertainty. Furthermore, linguists who have analysed HRT have identified its positive discourse functions. It has been shown that HRT serves to track the listener’s comprehension and attention, especially when the speaker is presenting new information. Listeners perceive HRT to be deferential but friendly (Guy & Vonwiller 1984). It also acts as a turn-holding mechanism in narratives (e.g. Warren & Britain 1999).”

The other article is Foulkes and Docherty’s 2006 paper in the Journal of Phonetics (a pre-publication version is available here in pdf); the section I’m quoting is useful mainly for the references to descriptive work in other varieties of English:

“Rising contours in declaratives have begun to emerge recently in English dialects where they are not traditional features, a phenomenon variously referred to as ‘uptalk’ or ‘high rising terminal’ (see Cruttenden, 1995, 1997). This innovation has been observed in the USA (Arvaniti & Garding, 2005), Australia (Guy, Horvath, Vonwiller, Disley, & Rogers, 1986), New Zealand (Britain, 1992; Warren & Britain, 2000), and England (Cruttenden, 1997). In most locations, it is characteristic mainly of young speakers. In the USA, Australia, and New Zealand it is also most common in lower class and/or female speech, but by contrast it seems to be associated with the upwardly mobile in England.”

Selected references in full:

  • Arvaniti, A., & Garding, G. (2005). Dialectal variation in the rising accents of American English. In C. T. Best, L. Goldstein, & D. H. Whalen (Eds.), Laboratory phonology 8. Berlin: Mouton de Gruyter.
  • Britain, D. (1992). Linguistic change in intonation: The use of high rising terminals in New Zealand English. Language Variation and Change, 4, 77–103.
  • Cruttenden, A. (1995). Rises in English. In J. Windsor Lewis (Ed.), Studies in general and English phonetics: Essays in honour of Professor J. D. O’Connor (pp. 155–173). London: Routledge.
  • Cruttenden, A. (1997). Intonation (2nd ed.). Cambridge: Cambridge University Press.
  • Guy, G., Horvath, B., Vonwiller, J., Disley, E., & Rogers, I. (1986). An intonational change in progress in Australian English. Language in Society, 15, 23–51.
  • Warren, P., & Britain, D. (2000). Intonation and prosody in New Zealand English. In A. Bell, & K. Kuiper (Eds.), New Zealand English (pp. 146–172). Amsterdam: John Benjamins.

The 2006 paper in the Journal of Phonetics is useful for a variety of reasons and worth reading if phonetics/phonology interest you at all. The full citation is:
Paul Foulkes & Gerard Docherty (2006), ‘The social life of phonetics and phonology.’ Journal of Phonetics 34: 409-438

when your h = 0 and f = 1

Interesting and useful fact of the week:

If you’re using d′ as a measure of discrimination sensitivity, and if you have small numbers of trials from which to calculate proportions of hits and false alarms, you are likely to end up with a proportion of exactly 0 or 1. The z-transformation of 0 or 1 is infinite, which means that d′ is undefined.

There is however a range of conventions for how to deal with this.

Wickens (2001) says a value can arbitrarily be assigned to the otherwise empty category, e.g. for f, a value corresponding to 1/(N+1), or 1/(2N+1), or 1/(10N+1) can be assigned, where N = number of noise trials.

This chap says, for proportions of 0 use instead 1/N and for proportions of 1 use (N-1)/N, where N is the number of trials used in calculating the proportions.

Macmillan and Creelman make two suggestions. One is to convert proportions of 0 to 1/(2N), and proportions of 1 to 1-1/(2N), where N is the number of trials on which the proportion is based. The other is to add 0.5 to all data cells regardless of whether there are zeroes present.
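For anyone who wants to try these out, here is a minimal Python sketch of three of the conventions above (1/N, 1/(2N), and adding 0.5 to every cell). It assumes the usual definition d′ = z(h) − z(f), and it takes raw counts rather than proportions so that the add-0.5 version can be computed. The function name `d_prime` and the `correction` labels are my own invention for illustration, not anything from Wickens or from Macmillan and Creelman.

```python
from statistics import NormalDist, StatisticsError

# z-transformation: inverse of the standard normal cumulative distribution
z = NormalDist().inv_cdf


def d_prime(hits, misses, false_alarms, correct_rejections, correction="1/(2N)"):
    """Return d' = z(h) - z(f) from raw counts.

    correction (hypothetical labels, chosen for this sketch):
      "none"    - no adjustment; fails when h or f is exactly 0 or 1
      "1/N"     - replace 0 with 1/N and 1 with (N-1)/N
      "1/(2N)"  - replace 0 with 1/(2N) and 1 with 1 - 1/(2N)
      "add0.5"  - add 0.5 to all four cells, zeroes or not
    """
    n_signal = hits + misses                     # trials with a signal
    n_noise = false_alarms + correct_rejections  # trials without one

    def rate(count, n):
        if correction == "add0.5":
            # 0.5 added to both cells of the row, so the row total grows by 1
            return (count + 0.5) / (n + 1)
        if correction in ("1/N", "1/(2N)"):
            floor = 1 / n if correction == "1/N" else 1 / (2 * n)
            if count == 0:
                return floor
            if count == n:
                return 1 - floor
        elif correction != "none":
            raise ValueError(f"unknown correction: {correction!r}")
        return count / n

    h = rate(hits, n_signal)
    f = rate(false_alarms, n_noise)
    return z(h) - z(f)


if __name__ == "__main__":
    # A perfect score on 10 signal and 10 noise trials: h = 1 and f = 0
    for name in ("1/N", "1/(2N)", "add0.5"):
        print(name, round(d_prime(10, 0, 0, 10, name), 3))
    try:
        d_prime(10, 0, 0, 10, "none")
    except StatisticsError:
        print("uncorrected: d' is undefined")
```

The three conventions give noticeably different values for the same perfect score, so whichever one you pick, it is worth reporting which.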

Most usefully of all, this information can all be found online, in Google Books in the case of Wickens and Macmillan and Creelman. Although thanks to the prodigious – the triumphant – efficiency of the note-taking skills of one of my officemates, we didn’t even need to resort to googling in order to have the information at our fingertips. It’s very satisfying when that happens.