One of my tangential work-related projects has involved developing materials for labelling disfluencies¹ in speech, especially as they interact with prosody. Disfluencies include a number of phenomena such as filled pauses (um, uh), false starts (Hey! Those are my pa- trousers!), and unexpec- -ted … pauses or lengthening of woooords or partsss of words.² Disfluencies occur very frequently in natural speech, especially in spontaneous speech.³
Since they occur so frequently, it should be really easy to find examples of them, right? Well, yes and no. The trouble is that we’ve been looking for examples that we can redistribute, as part of training materials. Some of the materials we have used in previous research have not been licensed in this way. (A lot of people made such recordings before even imagining the web, let alone that their voices might show up there.) So, I have been on the hunt, on and off for several years, for materials that are suitable: high-quality recordings of spontaneous speech produced by native speakers of American English. Those constraints right there limit things more than you might think. And then add on to that the desire to find things that have been released into the public domain, or shared with a Creative Commons license allowing derivative works and redistribution.
I don’t remember exactly when I came across the Internet Archive, and considered it as a potential resource for finding such sound files, but searches had been only moderately fruitful. But when I found the tomato guy, the Internet Archive bore fruit.
Buried among hundreds of podcasts, I found Musings with Sherman Oak, and I found Sherman Oak rambling about eating a tomato. This podcast was not only chock-full of examples of disfluencies, but tagged as public domain. And on top of that, I found it hilarious. If you have a minute, go have a listen to Sherman talking about tomatoes. (The whole episode is only 3 minutes and 17 seconds, but you can get a pretty good idea from the first minute. Or jump ahead to one of my favorite bits, around 1:30: “raw tomatoes are an evil vile thing.”)
I have no idea who Sherman Oak is. Sherman had a blog for a while, also called Musings with Sherman Oak. The blog has very little content, and a suspiciously large number of typos and other quirks. Which is completely in character with Sherman Oak.
But I have this strong suspicion that Sherman Oak is just that: a character. I think there’s a strong possibility that it is an actor or comedian, or comedian-actor, creating Sherman. In fact, I have a candidate: Thomas Lennon. (He was on, and one of the creators of, Reno 911, and has done a lot of stand-up and sketch comedy.)⁴
Listening to Sherman, I had this sense that I recognized his voice. And poking around more through some videos of Thomas Lennon on YouTube, I haven’t yet found anything that dissuades me.⁵ In fact, if Sherman Oak is not a character of Thomas Lennon, then he should be.
¹ Yeah, I just linked to Wikipedia. For a much denser and more academic discussion of disfluencies, see Shriberg’s 1994 dissertation.
² i.e. segmental lengthening that is not obviously in the service of phrasing or pitch accents.
³ Spontaneous speech typically is contrasted with read, elicited, or rehearsed speech.
⁴ He also, as is the case with many actors, lives in L.A. There is a district of L.A. called Sherman Oaks, which could be a coincidence. He is also from a town in Illinois called Oak Park. I only know these things from reading his Wiki page. I never previously had any cause to stalk him.
⁵ See, for example, this interview [youtube] or this stand-up bit [youtube].