For Aristotle and Aquinas, voice is produced by, yet distinct from, a speaking subject. It is the medium through which a subject’s internal image—stored in the imaginative faculty—encounters the external world. Because voice in these definitions is not only a linguistic medium, medieval grammarians further distinguished types of voice by their semantic potential. According to Donatus, “Every sound [vox] is either articulate or confused. Articulate sound can be captured in letters, confused sound cannot be written.”14 Priscian describes four different types of vox based on two binary pairs: articulate or inarticulate (depending on whether the sound has intentional meaning) and literate or illiterate (depending on whether it is writable). He acknowledges that unwritable sounds (such as a human groan) may have meaning, and writable sounds (such as a nonsense word or transcribed animal noise) may lack signification.15 Later grammarians drew on these four types to classify sounds ranging from human spoken language, applause, and whistling to animal noises and the sounds of natural phenomena like crashing waves.16
The difference between Donatus’s and Aristotle’s definitions of vox captures a medieval concern with the relationship between voice and speaker. Whereas for Aristotle, the ontological status of the producer of sound determined whether or not it was a “voice,” for Donatus and the early grammarians, a sound’s resolution into the phonetic alphabet was central. This association between writing and voice not only speaks to grammar’s close relationship to performed oratory in the early period (it was a foundational discipline for the study of rhetoric) but also reveals how writing and voice are co-constitutive in premodern culture. Their interdependence is particularly evident in drama: the multivocal drama of ancient Greece owes its conception to a phonetic alphabet that weds sound to letters, for example, and medieval drama borrows many of its conventions from legal rhetoric.17
But while access to writing in the Middle Ages was controlled by institutions of literacy like the schools and the church, voice is a biological attribute shared by humans and animals. As poets have long recognized, human language draws on both aspects of voice: its grammatical and systematic principles and its nonrepresentative sounds. Ezra Pound described these, respectively, as logopoeia (meaning) and melopoeia (sound).18 Later medieval grammarians like John of Garland recognized both aspects of language. In addition to his better-known Poetria Parisiana (discussed in the next chapter), John composed an “equivocal” grammar, a medieval genre of long poem that distinguishes and contextualizes like-sounding words.19 As he puts it in the prologue to this poem, “equivocum celat sub eadem plurima voce, / quorum nomen idem” (an equivocum hides under a voice [word] these many [meanings], which have the same name).20 The treatise (like others in the genre) distinguishes among such words in mnemonic verse:
Augustus, -ti, -to Cesar vel mensis habeto,
Augustus, -tus, -ui vult divinacio dici
Mobile si fiat, augustus nobile signat,
Augeo dat primum, dant gustus avisque secundum.
[Augustus, -ti, -to means Caesar or the month (of August); Augustus, -tus, -ui means divination. If it becomes an adjective, augustus means noble. The verb “augeo” (to grow) gives us the first meaning; “gustus” (taste) and “avis” (bird/omen) give us the second.]21
Just as equivocal grammars unite the melodic and grammatical qualities of voice, rhetorical treatises give pride of place to voice, an essential component of the rhetorical canon of actio, or delivery. Quintilian acknowledges that a “good voice” is among the natural gifts necessary for success in oratory22 and devotes part of his discussion of delivery to the correction of vocal infelicities: “Again our teacher must not tolerate the affected pronunciation of the s, with which we are so familiar, nor suffer words to be uttered from the depth of the throat or rolled out hollow-mouthed, or permit the natural sound of the voice to be over-laid with a fuller sound, a fault fatal to the purity of speech.”23 The ethics of oratory are implicit in the privileged faculty of speech: “If therefore we have received no fairer gift from heaven than speech, what shall we regard as so worthy of laborious cultivation, or in what should we sooner desire to excel our fellow-men, than that in which mankind excels all other living things?”24
Following the newfound popularity of the pseudo-Ciceronian Rhetorica ad Herennium in the twelfth century, delivery became a subject of serious rhetorical discourse in the later Middle Ages, as well as an influence on the broader culture, permeating scholastic treatises and performative practice.25 The Herennium describes tones of voice appropriate for different kinds of material: “Conversational tone comprises four kinds: the Dignified, the Explicative, the Narrative, and the Facetious. The Dignified, or Serious, Tone of Conversation is marked by some degree of impressiveness and by vocal restraint. The Explicative in a calm voice explains how something could or could not have been brought to pass. The Narrative sets forth events that have occurred or might have occurred. The Facetious can on the basis of some circumstance elicit a laugh which is modest and refined.”26 The author then proceeds to describe the physical qualities of each tone. For example: “For the Dignified Conversational Tone it will be proper to use the full throat but the calmest and most subdued voice possible, yet not in such a fashion that we pass from the practice of the orator to that of the tragedian. For the Explicative Conversational Tone one ought to use a rather thin-toned voice, and frequent pauses and intermissions, so that we seem by means of the delivery itself to implant and engrave in the hearer’s mind the points we are making in our explanation.”27 To a later medieval audience, the Herennium’s distinctions among oratorical voices might well have recalled Isidore of Seville’s discussion of song and its varieties of voice: “A song (cantus) is the voice changing pitch, for sound is even-pitched; and sound precedes song. Arsis (arsis) is elevation of the voice, that is, the beginning. Thesis (thesis) is lowering the voice, that is, the end. Sweet (suavis) voices are refined and compact, distinct and high. Clear (perspicuus) voices are those that are drawn out further, so that they continually fill whole spaces, like the blaring of trumpets.”28
In short, the concept of voice integrates theory and practice in several medieval liberal arts. The qualities of vocal expression were understood to be artful, in rhetorical delivery and in singing, as well as meaningful. Thus, discussions of voice in these treatises reveal the performative foundations of medieval knowledge practice. The alliance of knowledge and its delivery is perhaps best expressed in John of Salisbury’s introduction to the Metalogicon:
Just as eloquence, unenlightened by reason, is rash and blind, so wisdom, without the power of expression, is feeble and maimed. Speechless wisdom may sometimes increase one’s personal satisfaction, but it rarely and only slightly contributes to the welfare of human society. Reason, the mother, nurse, and guardian of knowledge, as well as of virtue, frequently conceives from speech, and by this same means bears more abundant and richer fruit. Reason would remain utterly barren, or at least would fail to yield a plenteous harvest, if the faculty of speech did not bring to light its feeble conceptions, and communicate the perceptions of the prudent exercise of the human mind.29
Far from a transparent medium of knowledge, voice here is generative and productive, “fecund.” Indeed, in the twelfth century, eloquentia (communication) came to be regarded as half of knowledge, complementing philosophia