Words, roughly speaking, in the psycholinguistic sense of ‘items in the mental lexicon’, consist of a phonological form coupled with semantic content. They mean something, and they have a sound structure, and these two properties can theoretically be analysed and discussed independently of each other. To give a phonological description of a particular word, for example, you would want to discuss what kind of consonants and vowels it was composed of, how many syllables, the structure of the syllables, the stress pattern, and so on; what the word actually means in the language can be treated as a separate question altogether.
You can also manipulate certain characteristics of the phonological properties of the words of a given language. You could, for example, observe that English allows the sequence “pr” at the start of words (prince, press) and “nd” at the end of words (wind, sand), and so construct the sequence “prend”. It sounds a bit like “friend”, and “pretend”, but it isn’t really related to either, and it doesn’t actually mean anything. It’s a pseudo-word, or a non-word – a phonological form which is legitimate according to the rules governing English sound sequences, but which has no meaning associated with it.
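The "prend" recipe above can be sketched in a few lines of Python. This is only a toy, under loud assumptions: it works on spelling rather than sound, the onset, vowel and coda lists go beyond the “pr” and “nd” mentioned in the text and are my own illustrative picks, and `TOY_LEXICON` is a hypothetical stand-in for a real word list.

```python
from itertools import product

# Illustrative inventories (assumptions): only "pr" and "nd" come from the text.
ONSETS = ["pr", "fr", "st"]
VOWELS = ["e", "i", "a"]
CODAS = ["nd", "nt", "st"]

# Hypothetical miniature lexicon; a real screening would need a full word list.
TOY_LEXICON = {"print", "stand", "stint", "stent", "prince", "press", "wind", "sand"}


def build_candidates(onsets, vowels, codas):
    """Combine every onset, vowel and coda into a one-syllable candidate form."""
    return [o + v + c for o, v, c in product(onsets, vowels, codas)]


def drop_real_words(candidates, lexicon):
    """Keep only the candidates that are not listed in the lexicon."""
    return [c for c in candidates if c not in lexicon]


candidates = build_candidates(ONSETS, VOWELS, CODAS)
nonwords = drop_real_words(candidates, TOY_LEXICON)

print(len(candidates))      # 27
print("prend" in nonwords)  # True
print("print" in nonwords)  # False
```

Anything that survives the filter is only a candidate: with a lexicon this small, a real word that happens not to be listed would slip through, so the output would still need checking by hand.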
This would be just so much abstruse blether, except that non-words have been put to use in practical real-life contexts, with intriguing consequences. There exists a particular kind of language impairment in which, out of all a child’s cognitive abilities, only their language development seems to be impaired (in the absence of factors such as brain damage, hearing impairment, and so on). This is called Specific Language Impairment, or SLI. It runs in families. It has a genetic component. And geneticists have demonstrated that there is a linkage between particular regions of particular chromosomes, and particular language-related skills – most interestingly, the ability to accurately repeat lists of “nonsense words”, in tests known as nonword repetition tests.
These tests generally consist of a pre-recorded list of non-words, such as “doppelate” and “ballop”. The child hears the items played one at a time, with enough of a pause after each for them to attempt to repeat what they’ve just heard. Children with SLI not only repeat these items less accurately (dokkelate, toppelate, toppate might be the kind of errors you’d elicit); their performance on this kind of test is also, as they say, a good marker of a heritable phenotype.
The idea behind using nonword tests was, at least originally, that it would allow us to see what the child had really mastered of the English sound system, or what his or her phonological skills were really like, once divorced from the messiness attached to their production of real words (all sorts of factors affect a child’s acquisition of real-language vocabulary, and it’s quite possible for a particular sound to be mis-pronounced in one word but produced accurately in another word). If we’re interested in “pure phonology”, then seeing how children handle phonological forms which have no semantic, pragmatic, or lexical baggage would seem to be the ideal method.
Unfortunately, a large number of practical difficulties emerged as soon as researchers started using nonword repetition tests. One is that you need to control exactly how similar a non-word is to real words: it matters that the nonword “ballop” is really quite reminiscent of both “gallop” and “ballot”. You also need to control what combinations of sound-segments appear in your nonwords: the sequence /mf/ is legal in English (“triumph”), but much rarer than the sequence /st/, and so much harder to repeat accurately. Longer nonwords are of course more difficult to remember and repeat than shorter ones. So if your set of nonwords includes many three-syllable items with rare sound sequences and many four-syllable items which are highly reminiscent of real words, it becomes much more difficult to pin down whether a child’s poor performance is due to specifically phonological issues (such as the rarity of the sound-sequence) or to more general memory-related issues (such as the number of syllables they have to remember).
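Two of these controls are easy to sketch in code. The Python below is a rough illustration, not anyone’s published procedure: it counts one-letter “neighbours” in spelling as a crude proxy for similarity to real words, and it checks whether length and sound-sequence rarity are fully crossed in a stimulus set. The lexicon, the item labels (`nw1`, `nw2`, …) and the “rare”/“common” tags are all hypothetical.

```python
from collections import Counter
from itertools import product

# Hypothetical miniature lexicon, for illustration only.
TOY_LEXICON = {"gallop", "ballot", "wallop", "ballet", "sand", "wind"}


def one_letter_neighbours(item, lexicon):
    """Real words of the same length that differ from `item` in exactly one letter."""
    return sorted(
        word
        for word in lexicon
        if len(word) == len(item)
        and sum(a != b for a, b in zip(word, item)) == 1
    )


def unfilled_cells(stimuli):
    """Return the (syllables, rarity) combinations that no stimulus occupies.

    `stimuli` is a list of (label, syllable_count, rarity_tag) tuples.
    An empty result means length and rarity are fully crossed.
    """
    lengths = {syllables for _, syllables, _ in stimuli}
    rarities = {rarity for _, _, rarity in stimuli}
    counts = Counter((syllables, rarity) for _, syllables, rarity in stimuli)
    return sorted(cell for cell in product(lengths, rarities) if counts[cell] == 0)


# Length and rarity move together here, so their effects cannot be separated.
confounded = [("nw1", 3, "rare"), ("nw2", 3, "rare"),
              ("nw3", 4, "common"), ("nw4", 4, "common")]

# Every length appears with every rarity level.
crossed = [("nw1", 3, "rare"), ("nw2", 3, "common"),
           ("nw3", 4, "rare"), ("nw4", 4, "common")]

print(one_letter_neighbours("ballop", TOY_LEXICON))  # ['ballot', 'gallop', 'wallop']
print(unfilled_cells(confounded))                    # [(3, 'common'), (4, 'rare')]
print(unfilled_cells(crossed))                       # []
```

A serious version would compare phonemic transcriptions rather than spellings, and would take its rarity measure from corpus counts rather than hand-assigned tags.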
This, I think, feeds into a further problem which needs to be addressed, especially in the context of trying to design new sets of nonwords which would steer clear of these early problems and allow hypotheses to be tested to distinguish between what is “phonological” and what is general “memory” (or whatever). That is the question of what, precisely, are the aspects of phonology which are of most interest to researchers investigating language impairments with a genetic component. Taking an overview of the lexicon of, say, a typically developing 7-year-old, what are the specifically phonological properties of the lexical items which we can use to test the phonological competence of language-impaired children and their family members? Or, from the other direction, what are the properties, or hypothesised properties, of the putatively phonological impairments in SLI which would allow nonwords to be designed so as to elicit, or elucidate, error patterns of theoretical importance?
To take some examples: should a good set of nonwords rely on CVCV structures only to the extent that these exist in the two-syllable words in the lexicon? Is it useful to include presumably articulatorily complex sequences such as triconsonantal clusters, or rare consonant sequences across syllable boundaries? What is the relationship between the relative frequency of particular consonants (e.g. “dh”, the first sound of “this”) and their being late-acquired?
And what exactly would a specifically phonological impairment look like? Should errors be predicted mainly in one natural class, such as fricatives (but how would you differentiate a phonological difficulty with a natural class from an articulatory or perceptual difficulty with fricative production or perception?), or mainly in syllable structure, or stress assignment? Would you predict that a nonword where all the consonants were voiceless stops would be easier or harder than one where all the consonants were nasals, and if so, why? Would it be useful to have multisyllabic items with all front vowels, or all back vowels, rather than a mixture?
This matters because, presumably, the usefulness of nonword repetition tests lies in the light which they are supposed to shed on phonology – but of course speech sounds can only be described as phonological to the extent that they mirror the properties of real words as really used in a real language. (You can’t use nonword minimal pairs to demonstrate a phonemic difference, for example: minimal pairs can only be drawn from the lexicon.) So nonwords have to reflect in some way the actual characteristics of the items in a person’s or a population’s actual lexicon. Phonology can’t exist without a lexicon. On the one hand, nonwords that are too similar to real words undermine the rationale behind using non-words in the first place; on the other hand, nonwords that are too dissimilar from the lexicon turn the task into one of attempting to pronounce non-native sound sequences, rather than plausible-but-non-existent native words. Steering between these two extremes will no doubt leave us better off than we are with stimuli which are poorly controlled for phonological properties, but there are still plenty of questions which need an answer.