Speak & Spell

I hadn't planned that my first post should be about Speak & Spell, but last week I was in contact via LinkedIn with my first ever boss - Irfan Salim. Irfan was always marked for success and after setting up Lotus (Lotus 1-2-3, remember?) in Europe, he has gone on to lead many successful companies and now does the same living in San Francisco.

But back in 1981 he was the Marketing Manager for Texas Instruments' Consumer Products division in Bedford, England and he hired me fresh from university to be marketing product manager for Education Products. My portfolio included a few handheld calculator-type games, plus the much more expensive Speak & Spell and its stablemates. And this led to me becoming recording producer and editor of the British Speak & Spell.

I joined Texas Instruments with a group of about 30 other graduates in September 1981, and just like them I was told that for the next year I wouldn't really have an active role. We were told to watch, listen and acclimatise ourselves to the company. We'd be moved around to see how things worked. That was pretty much how it was for my fellow graduates, but not for me. I was put straight to work - and I liked it. It was a challenge and I had to learn fast.

On my 2nd day I was given the Argos catalogue (a major customer) and asked to write the descriptions for my products to go into the next issue. I was told a good description can raise sales by 25% and there were tight editorial guidelines. In my second week I had to script the TV advert for Speak & Spell. Speak & Spell had been on sale for a year or so and had enjoyed moderate success, but there was a feeling that sales might have already peaked or that it was not going to keep selling for much longer.

After 3-4 weeks Irfan asked me "what are you going to do to make Speak & Spell sell for another year?" (Don't forget I was still a raw recruit.) I mumbled something about making it sound British rather than American, and he agreed. A few days later he came back and said "you've got $112,000 budget and I need it in full production by April next". It took me a moment to work out what he was talking about ... right there and then I had begun my journey to create the 'British' Speak & Spell.

After my initial panic I realised that we needed to re-record everything and that we needed a list of words. Whilst we could easily have used the American word list, I had found letters on file from a few educationalists complaining that the words being used were wrong and suggesting better lists. I took the easy way out and agreed with them, and travelled to Newcastle to meet a Professor (I can't remember his name) to get the new list, making sure that there were no rights issues in using it.

Then I had to select a voice artist, a speaker, to record. This was fun: the advertising agency (McCann Erickson) set up a session so that I could hear recordings of a number of male* speakers reading the test list of words. I chose John McGuinn, who did a lot of voice-overs and was a BBC Radio 2 newsreader.

* At the time a female voice couldn't be rendered by the speech chip. We all knew that a female voice would improve the product, but it wasn't possible back then.

Both John and I travelled to the only 'digital' studio in Europe at the time - it was inside the beautifully positioned Villeneuve Loubet facility of Texas Instruments (near Antibes, France). So we were off to the South of France. This was only my 3rd time on any flight and my first ever business trip. It all seemed so unreal, and that sense became more acute when we arrived in the studio and I realised I was the producer. I was surrounded by recording experts, and John was a man from the BBC, but only John and I knew how to pronounce the words in an English style, and so I had to call the shots in the studio - it was great fun.

The recording was over in a couple of days and so we headed home, job done ... or so I thought!

What we recorded was a few megabytes of digitally recorded speech. It was so easy to write that line. In these days when only gigabytes are mentioned, a few megabytes sounds innocuous, but back then the concept of a megabyte was mind-blowing. You have to recall that the first IBM PC had only just been launched and it cost £1,500 (that's about £4,100 in today's money). It had 16kB of RAM and for storage you could choose from either a cassette tape recorder or a 160kB floppy disk drive (that's 7 floppy disks to record 1MB!). It would be another 2 years before a PC had a 'hard disk'.

As for digital recording, no-one knew what I was talking about. I already said that we used the only digital studio in Europe; in fact, outside TI no-one used them. That will sound unbelievable to today's generation, and when I was writing this my own son asked 'what could it be if not digital?'; analogue is as ancient as the dark ages. Coincidentally, as part of my recently completed degree, I had (in 1981) written a thesis on music recording and playback technology. I had described every part of the process and technology, but at no point in my research did I hear those two words 'digital recording' - for the time it was absolutely cutting-edge.

Speak & Spell really was a technological breakthrough. And, like the IBM PC of that era, it too had only 16kB of memory. In Speak & Spell this was ROM not RAM, but it was the total amount of memory to store the recorded speech. 16kB was massive in those days and it was inconceivable to add more without a huge price rise. Therefore to make the product, and make it affordable even at a premium price point, we had to squash the megabytes of recording into the tiny 16kB ROM. But how?

To give a sense of scale here: Speak & Spell contains enough dialogue for about 4 minutes. As an MP3 file today that might be about 4MB of data. Now MP3 is a fantastic technology which condenses sound files, and delivers way better output than we could achieve. But we had to condense the sound to 1/250th of an MP3.
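For anyone who wants to check the sums, here is a small sketch of the arithmetic behind those figures. It assumes decimal units (1MB = 1,000kB) and takes the sizes quoted in the text at face value; the function names are made up for illustration.

```python
import math

# Figures quoted in the text; decimal units (1 MB = 1000 kB) are an assumption.
KB_PER_MB = 1000
FLOPPY_KB = 160   # floppy disk capacity of the first IBM PC
ROM_KB = 16       # speech ROM inside Speak & Spell
MP3_MB = 4        # rough size of 4 minutes of speech as an MP3


def floppies_needed(megabytes, floppy_kb=FLOPPY_KB):
    """Whole floppy disks needed to hold the given number of megabytes."""
    return math.ceil(megabytes * KB_PER_MB / floppy_kb)


def squeeze_factor(source_mb, target_kb):
    """How many times smaller the target is than the source."""
    return source_mb * KB_PER_MB / target_kb


print(floppies_needed(1))              # disks for one megabyte
print(squeeze_factor(MP3_MB, ROM_KB))  # MP3 size relative to the ROM
```

With these assumptions one megabyte fills seven floppies, and the ROM is 250 times smaller than the MP3.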

Enter the genius of Larry Brantingham who, along with a couple of colleagues, had patented some remarkable technology inside TI's speech chips. The technology, called LPC (Linear Predictive Coding), took samples of the recorded speech at regular intervals and then invented (predicted) the sounds that had happened in between the samples. Sampling sound at regular intervals is how all digital sound systems work today - but today there is no perceptible gap between the samples. CDs use digital samples taken every 1/44,100th of a second but in 1981 the sample rate was much slower. In fact we sampled every 1/10th of a second. To fill the gaps which spanned 9/10 of every second we needed a high level of 'prediction', and of course that prediction could go very wrong. In short the recordings went off to be processed and heavily compressed. They also came back as garbage. Not only that, there was about 24kB of garbage. We were a long way off from a finished product and somehow 24kB needed to be squashed by another third AND turned back into comprehensible speech.
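The 'prediction' idea can be shown with a toy example. The sketch below is not TI's algorithm or the chip's data format; it is a minimal illustration of linear prediction in general, assuming a pure sine wave as a stand-in for speech and using made-up function names (`fit_predictor`, `rebuild`). It fits two numbers so that each sample is predicted from the two samples before it, then regenerates the whole wave from just the first two samples and those two numbers.

```python
import math


def fit_predictor(samples):
    """Least-squares fit of s[n] ~ a1*s[n-1] + a2*s[n-2] (order-2 linear prediction)."""
    r11 = r12 = r22 = b1 = b2 = 0.0
    for n in range(2, len(samples)):
        x1, x2, y = samples[n - 1], samples[n - 2], samples[n]
        r11 += x1 * x1
        r12 += x1 * x2
        r22 += x2 * x2
        b1 += x1 * y
        b2 += x2 * y
    # Solve the 2x2 normal equations by Cramer's rule.
    det = r11 * r22 - r12 * r12
    a1 = (b1 * r22 - b2 * r12) / det
    a2 = (b2 * r11 - b1 * r12) / det
    return a1, a2


def rebuild(first_two, coeffs, length):
    """Regenerate a signal from two starting samples and the predictor."""
    a1, a2 = coeffs
    out = list(first_two)
    while len(out) < length:
        out.append(a1 * out[-1] + a2 * out[-2])
    return out


# Toy "recording": 200 samples of a sine wave (an assumption, not real speech).
original = [math.sin(0.3 * n) for n in range(200)]
coeffs = fit_predictor(original)
rebuilt = rebuild(original[:2], coeffs, len(original))
worst_error = max(abs(a - b) for a, b in zip(original, rebuilt))
print(coeffs, worst_error)
```

Here 200 numbers are replaced by four, and the wave comes back almost exactly. Real speech is far messier than a sine wave, which is why the predictions could go so badly wrong.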

The only way to rescue the sounds and shrink them further was to edit them manually. What do I mean? Well, each 1/10th of a second sample had been converted into a digital code that could be displayed on a computer terminal. As I recall there were 20 numbers on each line, and each line represented 1/10 second. From left to right these numbers described the position of parts of the mouth of someone saying that tiny 1/10 second sound. Smaller numbers indicated that part of the mouth being closed or nearly closed. Therefore a "P" would have low numbers towards the left hand columns to describe the lips being together, but then higher numbers towards the middle of the line to describe the position of the tongue, as it gets ready to launch the plosive p. Each line on the screen was a 1/10 second snapshot describing the shape of the mouth, throat and vocal cords. It was a mass of data.

Detailed, line by line, 1/10 of a second editing had to be done on every single word to make the product work, to crush the data further and to make it sound right. It was a very specialist job, and only two people in the world were trained for it - both were trained linguists. One was Larry, and the other a colleague (a lovely lady of Asian extraction whose name eludes me). The problem was that neither was used to our British accent. It would have been fruitless for them to do the work as we'd just end up with another US sounding product. There had never been any British-accented digital speech product, anywhere; this was going to be a world first.

It was my project, and time was moving on. So there was no other option: I had to give up marketing for 3 months and do it myself. I spent time with Larry Brantingham and his small team. They were great, really helpful despite all my dumb questions. We all knew that I had to succeed. And so I taught myself the rules and locked myself in an anechoic chamber in Antibes for 3 months to get the product ready. They were long hard days. No background sounds, just slogging through trying to get each 1/10 second to sound right. Some words were relatively easy and just needed to be cropped. Some were unintelligible, and took ages to recover. That's why I cannot bear to listen to the product even now. I remember the word 'butcher' took 3 days - I still don't know if it sounds right.

I delivered the product, and I also helped complete the French, German and Italian versions of Speak & Spell. Larry offered me a move to Dallas but I demurred. I had had enough of this - I wanted to get back to marketing. It was quite a first 6 months at TI!

NB: Some words were so mangled they were beyond redemption. But one word was mangled so much it became another. Thus the word 'Ghost' came back from processing as a perfectly enunciated 'Bullshit'. I left it in the test EPROM which I put into the first unit, just as a joke. It nearly backfired when I arrived back at TI in Bedford. The UK MD was touring the plant and came to see the new product... You can guess the first phrase he heard: "can you spell Bullshit?" I am pretty confident it's not in the production model and I can't recall the key sequence I used to select individual words from the list.

And the result? Everyone was delighted by the UK version. It went on to be sold for at least another 5 years (remember: we hoped to add 1 year to its life). What's more, it held its premium price point when other cheaper copy-cat type products arrived on the scene. It had a 'British' accent when all others were 'American'.

Speak & Spell was accepted as a high technology icon and was used by a number of (then) popular bands in their singles - including Depeche Mode and Orchestral Manoeuvres in the Dark.


Speak & Write

And it didn't end there. Just as I was getting ready to leave France after working all winter editing the words in Speak & Spell, my boss, Irfan Salim, visited and delivered bad news. In the US they had launched a cheaper version of Speak & Spell - called the Speak & Spell Compact. It had no display, which meant it was cheaper. But sales were really bad. We had foreseen this. It was clearly a lower cost derivative of the premium product and no-one would want to be seen to buy that! We had forecast zero sales, and we weren't launching in the UK. But US sales were so bad that we'd been 'told' we had to take 50,000 units (as I recall) to help them out. We sat in the bar of my home for that winter, the Novotel in Cap 3000 outside Nice, trying to hatch a plan to shift them. After a few beers we decided that we'd have to rename the product so that it looked like a stablemate to Speak & Spell rather than a poor cousin. We wanted to try and make a virtue of its lack of display and therefore we decided it should become an aid to writing. We'd include a pencil and paper pad and call it Speak & Write. It was a great idea, but there were problems.

First of all it needed to look different to Speak & Spell. We decided upon a blue colour and therefore all the cases had to be remade (or resprayed, I was told). The packaging had to be redesigned too but that was relatively easy. And of course we needed the pencils and pads and new manuals; but that was fairly easy too. The biggest problem was the digital speech. We could re-use much of the dictionary from the new British Speak & Spell but there were phrases and sentences in the 'compact' product which we simply did not have in the British Speak & Spell - of course these included the key phrases about having to 'write'!

I had no time to re-record anything because the whole process would take too long, and cost a lot too. So there was just one alternative. I had to fabricate the missing words and phrases. I'd need to build them up 1/10 second at a time and make digital speech from nothing but data. I used much of the process I had just been using on Speak & Spell, except that I had no starting material to edit. I started with a blank screen and each 20 character row was created entirely by hand to recreate digital speech. There was no actual recording used at all. It was just me, sitting in the studio mouthing the word parts and typing in the values that I thought would approximate my mouth shape. (NB: the digital editor was a 20 character line that described the shape of mouth and tongue and throat for each 1/10 second). I had been doing this editing for 3 months, and so actually I found creating the words not too much harder than editing some of the processed originals. When I think back on it now I was possibly creating some of the first purely digital speech - not based on a recording nor created from stringing phonemes (word sound building blocks). Just pure data.

In all, over the space of 1 final week I manually created about 10% of all the dialogue inside Speak & Write - it's not faultless but it worked. We sold out. I wish I had one now, but at the time all I wanted to do was get rid of them all.

© 2018 Smart-Tactics Ltd. All Rights Reserved.