
[MUSIC PLAYING]

0.07
RAYMOND: Dave is largely responsible for the ARM architecture. I'm going to give
you the briefest possible intro to ARM.
0.14
It's why this is not this big and burning a hole in your pocket.
0.20
That's it. With that, Dave. DAVE JAGGAR: Thanks so much, Raymond.
0.27
I'm going to skip through the first slides fairly quickly, because I hope it's
fairly well-known stuff.
0.34
On that introduction, just a wee detail about ARM is in a lot of products these
days, it is dominant in cell phones, but a lot of other products
0.43
as well. ARM was formed in 1991--
0.48
late 1990, start of 1991-- with 12 engineers from a British company called Acorn.
0.54
We started an intellectual property company with no intellectual property. This is not
something to try at home.
1.00
We got $2.5 million from Apple. Why all that came together will become clear in a
moment.
1.07
Chip designers-- ARM doesn't make chips, but everyone else does, pretty much. And
ARM makes a royalty.
1.13
Just recently the 150 billionth chip shipped. But none shipped in 1991 to 1995.
1.20
And about 23 billion last year. So that's about 20 for everyone on the planet.
1.27
More than 60% now of the world have access, use ARM every day. And that's about the
same as having access
1.32
to basic sanitation. 730 chips per second are manufactured 24/7.
1.38
They're a tiny company compared to Google, of course. Everyone is. But $1.83
billion turnover in 25 years is not too bad.
1.47
And about $1.1 billion is from royalties. 6,000 employees, nearly 1,700 people
manufacturing ARMs.
1.53
And they were bought by SoftBank in, uh-- whether it's for better or for worse,
2.00
the Pope was inaugurated in 2005. And there's a photograph.
2.05
And the change that perhaps ARM had. In 2013 it kind of looked like that.
2.13
I think maybe the Pope thinks the whole world is covered in purple and green
splotches from all the flashes going off.
2.20
Shipments [INAUDIBLE] 150 billion growth curve. That's to the end of 2018.
2.26
So that's why it's not quite 150 billion. I was largely responsible for the yellow
and orange
2.35
and the start of the red. ARM stopped reporting those products as individuals. And
that's why it becomes that beige color.
2.41
That's the combination of those three after that time. And then the Cortex M is
largely the development
2.47
of the yellow stuff that I did. And it's a direct descendant. And then the power--
the big chips that are on your phone-- that's the Cortex A
2.54
series-- the purple at the top. So you can see a lot of ARMs are going into
everything other than the main processor in a cell phone too.
3.02
Annual shipments look like that, approaching 25 billion a year.
3.09
Quarterly shipments are really jumpy, because you fill the pipeline with products
in about the end
3.14
of second and third quarter. And the fourth quarter and the first quarter are
pretty quiet after Christmas for production.
3.22
So a little bit about the background. This processor was originally developed by a
company called Acorn.
3.28
And that's the name. ARM is Acorn RISC Machine.
3.34
They had a lot of success in the early '80s. They were kind of like a British
Apple.
3.39
They had an educational computer. And they sold so many of them that they then decided to
do this computer, which was a follow-on.
3.45
Unusually, they decided to develop everything themselves, right down to the
keyboard and the mouse. They had multiple operating systems, networking,
3.54
file systems, and the core processor, and three support chips-- the memory
controller, the I/O
4.00
controller, and the video controller. Kind of unusual. Unfortunately, it was never
particularly successful.
4.06
It was probably just overtaken by the IBM PC, like most things. In parallel in
1990, Apple were developing the Newton--
4.14
the first PDA handwriting recognition. You can roughly imagine it as a large cell
phone without any connectivity.
4.21
Probably why it wasn't successful is it didn't have any connectivity. But the
Advanced Technology Group led by Larry Tesler
4.28
were building this Newton. And they really wanted a low power 32-bit processor. It
was actually Jony Ive's first job at Apple
4.33
was to design the Newton. And because Apple and Acorn competed in the UK market,
4.40
they decided to spin out this company. The other bit of serendipity was the timing
of this company was it was just the start of the world going
4.47
digital. So just to cast your mind back, about 1990 a lot
4.53
of applications that were developed on PCs wanted to be put into sort of portable
products that ran on batteries. So as far as that was concerned, the ARM processor
5.00
came online just about the right place at the right time. At high school, my math
teacher said I'd never be an engineer.
5.09
This is kind of ironic, because one of the reasons I'm on a tour of the US at the
moment is Dave Flynn and I--
5.15
another senior engineer at ARM-- were awarded the James Clerk Maxwell medal from
the IEEE.
5.21
So I think maybe I can talk to my teacher with my head up now and say, actually,
I'm probably a decent engineer.
5.29
But because of that I actually did computer science instead. So I did every single
paper at my university
5.35
in New Zealand, which was unusual. And again, it was a bit of serendipity.
5.41
We just seemed to have all the right people with-- we only had 10 teaching staff.
But we just seemed to have the right bits of technology.
5.47
We had especially Tim Bell. If you know anything about text compression, the
original bible was written by Bell, Cleary, and Witten.
5.55
And Tim Bell was one of our lecturers. We also had compiler technology. We had the
source to an operating system--
6.00
a good one. But we had almost no hardware expertise. So I spent a lot of time in
the engineering library
6.09
learning about hardware. John Mashey was from MIPS computers.
6.15
He has an esteemed career. And he gave a guest lecture in 1988 and really taught me
that there was such a thing called
6.22
a computer architect. And I worked with lots of old mainframes. If you look at
anything I design and you squint at it
6.28
a little bit, you get a PDP 11. So. So for my master's thesis I looked at this ARM
processor.
6.36
And the word "interesting" is in quotes because it's a slightly crazy architecture.
6.42
But it is interesting. So my thesis was called "A Performance Study of the Acorn
RISC Machine." And I wrote a C Compi--
6.48
sorry, I wrote a C Compiler. I wrote a instruction set simulator called ARMulator.
6.53
I put MIPS-style floating point on the side of it. And put a complete software stack
6.59
on top of that-- compilers, assemblers, the SunOS sources, and ran the whole lot.
And I really think I learned a lot in those couple of years,
7.07
because you really haven't lived until you've debugged a complete stack like that.
There were actually a couple of bugs in the SunOS source
7.13
when it was compiled for a new machine all the way down to a bunch of bugs, of
course, in my code. And later I modeled a 16-bit ARM.
7.19
And this comes in later as a replacement for the teaching simulator.
7.25
A couple of days after handing in my thesis I saw my first copy of Patterson and
Hennessy's "Computer
7.31
Architecture-- A Quantitative Approach." And I remember standing in the university
new book section
7.36
at the library going, well, you could have told me sooner.
7.42
RAYMOND: If only I had known. DAVE JAGGAR: I could have just saved myself so much
time. But yeah.
7.49
So inspiration number one was John Mashey from MIPS and Silicon Valley. And I was
very fortunate to give a talk in Stanford last week.
7.56
And John was there. So after all these years-- 30 years-- I was able to sort of
repay a little bit of the thank
8.02
you's. He walked in with a stack of overhead projector slides
8.07
about this tall, all messy, straight out of his attache case, and said, I won't
have time to present all this,
8.12
and then did it. And it was like drinking from a fire hose for 90 minutes. It was
fantastic.
8.18
And I guess at the end of that I knew what I wanted to be when I grew up. He also
came out with this taxonomy
8.25
of a complex instruction set computer and a reduced instruction set computer, and a
continuum of how you grade
8.31
these things as far as what a processor might be on that continuum.
8.37
And I've never really entered into this argument. ARM is not a pure RISC. Our CTO,
Mike Muller said that way back in 1992.
8.44
It's kind of on the spectrum. I found out recently that mushrooms are closer to
animals than they are to plants.
8.51
And I think it's kind of like that. It's just a different thing. And you shouldn't
try and rule it in too much.
8.59
Maybe it's a MISC for miscellaneous. And maybe you have to backronym the M. And
we'll call it an M for modest.
9.05
It was a modest little chip, mainly because it was designed not to have any on-chip
caches. It was designed to connect directly to DRAM.
9.14
And that has a ton of architectural implications that are good and bad. And it took a
long time to sort of undo some of those.
9.23
Probably the single biggest mistake they made is they had a limited 26-bit
address bus. They artificially limited the address bus.
9.30
And so this thing could only address 64 megabytes of memory. But it was low cost,
it was low heat, and it was low parts.
9.38
They had to fit in a plastic package back then with no heat sink, no fans.
9.43
It was probably no smaller than the early MIPS machines. We used to say it was. But
with hindsight, when I learned more about
9.48
how the MIPS machines were laid out, it was probably no simpler. Anyway. The
implications of having no caches on-chip
9.55
meant a long cycle time. And this meant that the whole instruction pipeline
10.01
of the machine stalls whenever you access memory, because the machine's trying to
load
10.06
an instruction every cycle. That's what RISC machines do. They're always trying to
load an instruction. And as soon as you need to access memory
10.12
for a load or store, you have to stop fetching instructions. And the whole pipeline
stalls. This is kind of unusual.
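One way to picture the stall is to count cycles on the single memory port. The sketch below is a toy model, not ARM2's real timing: it assumes every instruction takes one port cycle for its own fetch and every load or store takes one extra cycle during which nothing can be fetched.

```python
def memory_cycles(instructions):
    """Count memory-port cycles for a list of instruction kinds (toy model)."""
    cycles = 0
    for kind in instructions:
        cycles += 1                      # instruction fetch uses the port
        if kind in ("load", "store"):
            cycles += 1                  # data access uses the port: fetch stalls
    return cycles

program = ["alu", "load", "alu", "store", "alu"]
print(memory_cycles(program))  # 7
```

Five instructions cost seven port cycles here; the two extra cycles are the windows in which the rest of the machine has spare time.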
10.18
Because that whole pipeline stalls, you've actually got time to do other stuff
while you're accessing memory. And they went ahead and baked a lot of the stuff
10.25
into the instruction set. Really quite unusually in an ARM,
10.30
a single instruction can do a shift and an ALU op in a single cycle.
10.36
I don't know any other machines that do that. It also has load and store multiple
instructions, which allow you to get fast DRAM access for data.
10.43
This is not how a computer pipeline should look.
10.49
But this is how ARM2 looks. A trained computer architect will look at this, and I
see an Anaconda that has swallowed a goat.
10.59
So you've got that empty, empty, empty, huge bulge that just doesn't look right,
empty, empty, empty.
11.05
And there's just everything's done in the execute stage of the ARM pipeline. It's
really not pipelined at all-- the back end
11.10
of the machine. Because it's so simple, that thing has to just loop while it's
accessing the single memory
11.17
system. A little bit on code. It's cute and fun to write assembly code for this
thing.
11.25
For example, the top instruction multiplies register 1 by 5, which is kind of a
handy thing to do in one cycle.
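The slide itself is not in the transcript, but the standard ARM idiom for this is an add with a shifted operand, `ADD R1, R1, R1, LSL #2`. A minimal sketch of what that computes, modeling 32-bit registers (the helper name is hypothetical):

```python
MASK32 = 0xFFFFFFFF

def add_shifted(rn: int, rm: int, shift: int) -> int:
    """Model of ADD Rd, Rn, Rm, LSL #shift on 32-bit registers."""
    return (rn + ((rm << shift) & MASK32)) & MASK32

r1 = 7
r1 = add_shifted(r1, r1, 2)   # r1 + (r1 * 4) == r1 * 5
print(r1)  # 35
```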
11.32
You can do other things like move bytes around in and out of registers to do bit
field and certain [INAUDIBLE] in a single cycle.
11.39
The ALU and shifter combination was also used to form addresses for loads and
stores.
11.45
And that meant you can do quite complex addressing modes. So the C programmers in
this world will understand the R5 + + nomenclature.
11.54
You can do auto increment and auto decrement built into every instruction. Register
15 was the program counter.
12.00
And register 14 was the return address. So to return from a subroutine, you just
12.07
put register 14 back into 15. And that returned. The last one is a conditional
return
12.14
from function call all in one hit. So if something is equal to 0, that's EQ bit.
12.20
You load multiple increment after. Register 13 is the stack pointer.
12.25
And the exclamation mark means, I'm going to update the stack pointer after I've
done this operation. It loads register 4, 5, 6, and 7, and the program counter.
12.33
So it does a return all on one instruction. So that's all kind of cute and fun.
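The instruction described is a conditional load-multiple, increment-after, with writeback, commonly written `LDMEQIA R13!, {R4-R7, PC}`. A toy model of its effect on registers and memory, with an illustrative register file and stack contents (skipping writeback when the base register is itself loaded is an arbitrary choice made for this sketch):

```python
def ldm_ia_writeback(regs, memory, base, reg_list, condition_passed):
    """Toy model of a conditional load-multiple, increment-after, with writeback.

    regs maps register number to value; memory maps byte address to a word.
    """
    if not condition_passed:
        return                      # e.g. EQ not satisfied: nothing happens
    address = regs[base]
    for r in sorted(reg_list):      # lowest register loads from lowest address
        regs[r] = memory[address]
        address += 4
    if base not in reg_list:
        regs[base] = address        # the "!" writes the final address back

PC, SP = 15, 13
regs = {4: 0, 5: 0, 6: 0, 7: 0, SP: 0x1000, PC: 0x8000}
memory = {0x1000: 44, 0x1004: 55, 0x1008: 66, 0x100C: 77, 0x1010: 0x2000}
ldm_ia_writeback(regs, memory, SP, [4, 5, 6, 7, PC], condition_passed=True)
print(regs[4], regs[7], hex(regs[PC]), hex(regs[SP]))  # 44 77 0x2000 0x1014
```

Loading the saved return address straight into the program counter is what makes the restore and the return a single instruction.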
12.40
The trouble is, the instruction set also defined all the bottom ones. And they
had to work as well, because they really
12.46
didn't have a concept of, we shouldn't allow people to do this. We'll just let them
do whatever the pipeline can achieve.
12.53
So if you want to multiply by 33,554,431,
12.58
you can do that in a single cycle. It's just not particularly useful. You can
branch to that funny address.
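The odd constant is 2^25 - 1, which falls out of a reverse-subtract with a shifted operand, for example `RSB R0, R0, R0, LSL #25`. The exact instruction on the slide is an assumption; the arithmetic can be checked with a small model:

```python
MASK32 = 0xFFFFFFFF

def rsb_shifted(rn: int, rm: int, shift: int) -> int:
    """Model of RSB Rd, Rn, Rm, LSL #shift: (Rm << shift) - Rn, modulo 2**32."""
    return (((rm << shift) & MASK32) - rn) & MASK32

x = 3
print((1 << 25) - 1)                                        # 33554431
print(rsb_shifted(x, x, 25) == (x * 33_554_431) & MASK32)   # True
```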
13.05
You can load a byte with funny offsets like 75-byte offsets. The load instruction
underneath that-- that LDR R15--
13.12
that takes the program counter and rotates it by 15 bits. And then it adds it to
the program counter.
13.18
Then it accesses that memory address and loads that into the program counter. Now
this is almost completely useless.
13.25
But it still had to work in every implementation we did after that, because the
programmers used weird instructions like this early on.
13.33
The last one loads register 13 and updates register 13 as the stack pointer. So
after that instruction, it's not clear what register 13 is.
13.42
And so this was a lesson about architecture versus implementation. They had a tight
little implementation that worked well for DRAM.
13.49
But the world was moving very quickly towards high level languages. And Steve
Furber, the original implementer of the design,
13.56
said this recently. "We expected to get into the project finding out why it wasn't
a good idea to do it.
14.02
And the obstacle just never emerged from the mist. We just kept moving forward
through the fog." Now we've all been in that situation
14.08
with a new design of something where we just really don't know what we're doing. If
we're honest, we're just progressing through the fog,
14.13
trying to work out where things are. But I really love this very honest
description. It explains why the architecture was more or less missing.
14.21
Because to have a chip architecture you really need to have visibility over lots of
implementations.
14.26
You really need to be able to look forward several implementations to design
14.34
a good architecture that's not going to be costly to implement in the future. I'll
skip this slide.
14.39
It unpacks that a little bit more for those that are interested. It basically says
why those initial interesting things
14.46
in the pipeline become hazardous. So if ARM2 was M for modest, as soon as you add
on-chip caches,
14.55
M has to become something else. And it's muddled or messy or any other kind of sick
adjective.
15.01
Just beyond this, on-chip caches became affordable. And as soon as you do that to
the ARM architecture,
15.07
the whole architecture starts to look a bit strange and silly and hard to
implement. But they went ahead and made one of these things anyway.
15.14
Acorn did ARM3. It was an ARM2 with a 4-kilobyte instruction and data cache. It had
no write buffer.
15.22
And this was due to self-modifying code. They had a habit of writing the
instruction stream directly ahead of executing it.
15.27
This was mainly for bit blit graphics. But they would just spit out instructions,
and then expect that to be loaded straight into the pipeline
15.34
and executed. That means that you really can't have anything buffered in the write
stream, because it needs to come straight back
15.40
into the cache. And it means you can't have separate instruction caches and D
Cache-- data caches, because they become incoherent.
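The incoherence problem can be shown with a toy model. The class below is purely illustrative: it assumes split instruction and data caches with no coherence mechanism, which is the design the self-modifying code ruled out.

```python
class SplitCacheCore:
    """Toy core with separate instruction and data caches and no coherence."""

    def __init__(self, memory):
        self.memory = dict(memory)
        self.icache = {}
        self.dcache = {}

    def fetch(self, addr):
        if addr not in self.icache:        # miss: fill from memory
            self.icache[addr] = self.memory[addr]
        return self.icache[addr]

    def store(self, addr, value):
        self.dcache[addr] = value          # write-through data cache
        self.memory[addr] = value          # the instruction cache is never told

core = SplitCacheCore({0x100: "old instruction"})
core.fetch(0x100)                          # instruction is now cached
core.store(0x100, "new instruction")       # self-modifying code rewrites it
print(core.fetch(0x100))                   # old instruction
```

Memory holds the new instruction, but the core keeps executing the stale copy, which is why code that writes just ahead of the program counter forces a single unified cache.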
15.48
My thesis predicted that this would be a 40% performance loss. And that number
probably got me the job at ARM.
15.54
So yeah. They really had no idea of Gordon Moore and Moore's law
16.02
it was really-- I don't know whether they just ignored it or just missed the memo.
I guess it was somewhere in the fog.
16.09
So the next generation process would have given them twice as much silicon. But
they didn't exploit it.
16.14
So just a quick summary. At the start of 1991, there was this joint venture between
Acorn
16.20
and Apple with 12 engineers. Sophie Wilson, the original ARM instruction set designer,
stayed at Acorn. She did not join ARM.
16.25
Steve Furber took the professorship of computing at Manchester University. So he was
gone. Al Thomas-- this name is going to be important in a moment.
16.32
He was the ARM3 cache designer. He took over all CPU design at ARM. And we had no
patents.
16.37
No coverage at all. Acorn had never filed any patents on this technology. And the
money from Apple.
16.43
And we had fab space from VLSI Technology in San Jose. Robin Saxby joined as CEO.
16.50
He brought a lot of experience. We'll talk about Robin in a minute. And then later on
there was a layout engineer, the office manager. Simon Segars is the current CEO.
16.58
That's the tall one in the picture. I'm the other one in the picture. And then Dave
Flynn joined just after me.
17.04
And he's the co-recipient of the medal. Robin brought a lot of experience to ARM.
17.12
I'm a hoarder of emails. I never throw away an email. And I have all my emails. And
you can look back and see that he completely
17.19
predicted this domination. He has a bunch of sayings there. And you'll hear him say
those things a lot.
17.24
We did work hard. We did have fun. So ARM started in this tiny little barn--
17.31
in a 17th century barn-- in a town called Swaffham Bulbeck-- can that name be any
more English--
17.38
about eight miles Northeast of Cambridge. We added about 10 staff per year. And we
had almost no money.
17.44
We were almost always going bust. In fact, Brexit is kind of funny, because we
would have not been in business
17.50
if it wasn't for the European government funding that we received. So if Europe
hadn't been part-- if England
17.56
hadn't have been part of Europe, ARM wouldn't exist. Acorn and Apple Commitments.
18.01
Had to do an ARM6, and an ARM7, an ARM8 for Acorn plus floating point and video
controller.
18.08
They really wanted high performance workstation processors. Apple really wanted
something that would fit in a thing that looked a wee bit like a phone.
18.15
And that was Robin's balancing act for years. I think, as he's practicing on the
unicycle there.
18.21
First thing ARM did-- this was just before I joined-- was the ARM6 family for
Apple. The nomenclature is if it's a single digit like a 6, that's
18.29
just a processor core. Can't really use it by itself. 60 is a processor core bonded
out. And this is the very first ARM development
18.37
card with an ARM60 right here. At the same time as I got the medal, Dave Flynn
presented this to me as a gift.
18.43
I'm so proud to have the very first ARM development card. I've since promised it
without asking Dave yet-- sorry, Dave,
18.51
if you're watching this-- to give it to the Computer Museum in San Jose, because
it's kind of a start of the revolution.
18.58
So that's an ARM60, which is ARM6 bonded out. An ARM600 or 610 would have caches on
it.
19.04
If there was a 6,000 later, we started going up to four digits, that would be an
SoC. So that's how the naming worked.
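The naming rule described comes down to counting digits. A small sketch of that rule as stated in the talk (the function and its wording are illustrative, not an official ARM scheme):

```python
def describe_arm_part(number: int) -> str:
    """Classify an early ARM part number by how many digits it has."""
    meanings = {
        1: "processor core only",
        2: "processor core bonded out as a chip",
        3: "processor core with caches",
    }
    digits = len(str(number))
    return meanings.get(digits, "system on chip")

for part in (6, 60, 600, 6000):
    print(f"ARM{part}: {describe_arm_part(part)}")
```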
19.09
They put a write buffer on this for Apple when they pushed it out to a 32-bit-wide
address bus for Apple. And hey, the write buffer produced a 40% performance
19.17
increase. So that was handy. I've only ever had one job.
19.22
I worked at ARM for nine years, and then I retired. So straight out of university I
joined. I sent them my thesis by post and heard not a thing.
19.31
Not a single sausage was heard in New Zealand. And of course, postage back then
from New Zealand to England,
19.36
I didn't actually know the means by which my parcel would travel, whether it was on
an airplane or a boat. So I waited patiently.
19.43
And on the 2nd of May I sent an email to Jamie Urquhart, who was running the VLSI
group,
19.48
asking if he'd got my thesis. And John Mashey at MIPS, asking if they had any jobs.
19.53
John came back and said contact HR, which made me kind of think probably not.
19.59
But on the 3rd of May I heard from somebody called Lee Smith at
advancedRISCMachines.co.uk. And he said the following, I have your CV.
20.05
I've been impressed by it. And he's currently looking for a software person to start
around the end of June.
20.10
So this was another piece of good timing. I got a telephone interview on the tenth
of May. And I had a job offer on the 17th of May.
20.17
And I arrived in the UK on the 20th of June. As part of those emails going
backwards and forwards,
20.22
I had the following paragraph. Lee said, "Over the past few days it has come to my
attention that our understanding of ARM
20.28
at the software level is insufficient." This really troubled me. I couldn't quite
understand how that sentence could exist.
20.35
These were ARM. How did they not understand their processor at a software level? He
was actually talking about doing HDL and VHDL models--
20.43
Verilog and VHDL models of the ARM. And I went on to do some of that too.
20.49
So I joined about two months after ARM600 taped out. Robin Saxby lived two hours
away. And he didn't want to move his family up
20.55
to Cambridge at the time. This was a start-up, so he didn't want to disrupt his
whole family. So we ended up renting an apartment together in Cambridge.
21.01
We have a lot in common, including the same birthday, 20 years apart. And we're
both Cambridge outsiders.
21.06
So we got on really well. We still do. We see each other a lot. I had a very modern
software development background then.
21.13
I was used to symbolically [INAUDIBLE] C and Unix. Acorn's way of hand-coding
things and they
21.19
used a lot of interpreted Basic was kind of archaic to me. I certainly knew that
the ARM processor
21.24
was too slow to compete with the big boys. I knew that we had a decent modest
implementation. But the architecture was pretty much non-existent.
21.31
And ARM really didn't understand the concept of architecture back then. And I knew
that we didn't have John Mashey's experience.
21.38
So day one was write an instruction set simulator. Day two, I handed in my thesis
code.
21.43
That was the easiest day's work I've ever done, cause I pretty much had that
written. [LAUGHTER] [INAUDIBLE] Actually, I spent about three months
21.49
fighting X86 compilation back then. And then as I said, Dave Flynn and I developed
this development card.
21.54
I did the software, he did the hardware. And then we did Verilog and VHDL models by
wrapping that C code.
22.01
I was made the head of technical marketing because I was the only one in technical
marketing. So therefore I was the head and the body.
22.09
Because I knew how to benchmark code and had a good experience with this, I was
just flying around the world benchmarking
22.15
code for people. Just so you know what a high tech startup looked like in 1991, we
called Cambridge once an hour
22.22
at five minutes to the hour with a 2,400 baud modem to send and receive all the
email for the entire company.
22.29
So if you had an important email to send and it was 10 to the hour, you were typing
very quickly to catch that dial-up.
22.36
We had no wireless. Wireless really hadn't been invented then. 10-megabit Ethernet
everywhere. A few Sun workstations.
22.42
A few Acorn-based workstations. But all pretty crude.
22.48
So a summary is, we had two low volume customers with very different needs. We had
one CPU designer.
22.55
We had a modest ARM6. We had the ARM600 with cache, MMU, and write buffer. We had
some software tools.
23.00
But we had no experienced architect or complete CPU design team. We had no
development cards. We had no HDL models.
23.06
We had no general purpose operating system. No way to debug an ARM6 if it was
buried in an SoC.
23.13
And as I said before, no volume customers. But most shockingly, we had no patents.
23.20
Shockingly, in 1992, as I said, Sophie stayed at Acorn. Steve went to Manchester,
took a professorship.
23.27
And Al Thomas passed away halfway through 1992. The other half of the company--
about half the company--
23.33
were working on Acorn parts, another quarter on software tools, and the remaining
quarter were support, sales, and marketing.
23.39
It turned out, 12 months after leaving university, I was the only one in the
company that really had an in-depth knowledge of the ARM.
23.46
And I had absolutely no clue about processor design. So I was really thrown in at
the deep end.
23.52
I did point out that maybe perhaps it would be a good idea if we had some patents.
So they immediately made me the chairman
23.58
of the patent committee. RAYMOND: Were you the entire patent committee? DAVE
JAGGAR: Yeah. I was the chair of the patent committee.
24.03
So I had to walk around and bribe people into writing things up as patents.
24.09
So I was the entire CPU team. I understood bits of their design,
24.14
because it was written in C. Other bits to instantiate it into their timing
simulator I did not understand at all.
24.20
We needed a follow-on processor quickly. I did have a lot of background with
software architectures
24.26
in general. And this was really the rebirth of ARM. Back then RISC was very
popular.
24.34
All the big guys were doing a RISC processor in some way. Intel had the i960, the
I860 going on.
24.39
Motorola had the 88K. And all the old MIPS and Sun really started all this. But
everyone followed.
24.45
Down the bottom there was a bunch of small embedded cores. And in between there
actually wasn't much.
24.50
The Motorola 68k really owned that market back then. There was a little bit of X86,
but not much.
24.56
I remember Robin rented my room Monday to Friday. And we had some pretty candid
talks every evening.
25.01
I need to rewrite this line. I think we convinced each other pretty quickly we
couldn't compete with the big fish, and we should just go somewhere else.
25.08
Richard Feynman has that term, there's plenty of room at the bottom. And I really
like that term. There's plenty of room at the bottom. I think there's still plenty
of room
25.14
at the bottom of this market. So that's what we-- we started to go down into the
embedded side of things.
25.22
And RISC was the buzz of the industry. It was much better than CISC. So we kept
calling it a RISC.
25.27
But I'm really sticking for the MISC. M is now for the embedded instruction set
computer.
25.34
I did a really quick spin of the ARM6, made it go faster. And there's a big
critical path I knew about.
25.40
I learned about transistors real quick. You can't have big stacks of them if you
wanted to run at low voltage. So you rearrange a few things to get it down to 3.3
volts.
25.48
I put a tiny bit of debug in. I removed a reset from the return address
25.53
after the processor was reset. I removed the reset wire from that latch. The
hardware guys go nuts when you
25.59
do this, because they point at it and say things like hi-Z. And I didn't even know
what hi-Z was. Sounded like an energy drink that hadn't even
26.06
been invented then. But what they let you do is you reset the processor. And then
at least you knew where it was when you pressed reset.
26.13
That's how crude our debug was at the start. And I filed a very narrow patent on
that,
26.18
because it's quite an unusual thing to do to not reset part of your chip when you
hit the reset button. And that was, I think-- that was my first patent.
26.27
We called it ARM7. Those changes gave it enough to give it a new name. I then went
on and started to get into DSP.
26.34
So at this time we were looking at MP3 code for doing digital audio players like
the iPod.
26.40
And so we added a faster multiplier. And I did proper integrated debug so that we
could debug the processor when
26.47
it was buried under an SOC. I think I'll skip this. I did multiply properly to get
us into DSP.
26.55
Simon Segars, who's the current CEO, freed up from the video controller. And he
did most of ARM7DMI.
27.02
It's really great to have a CEO with a technical background. It was very well
received. The debug interface really revolutionized
27.09
a lot of the design tools. Because I'm a software guy, interfaces kind of come
naturally to me-- well-defined interfaces.
27.16
And so that really started the ARM ecosystem where people could write a debugger
once and interface
27.23
to a lot of different chips, because it was a proper interface at that level. I was
traveling a lot at this time,
27.29
doing a lot of benchmarking. The performance was great. The power consumption was
great. The die size was fantastic. I was spending a lot of time in America.
27.36
So the weather was much better than England. But code density bit us, and it bit us
hard.
27.41
We were trying to replace eight and 16-bit controllers.
27.49
And obviously the reason you're putting a new microprocessor in your product is you
want-- either it's a brand new product, or you're trying to put
27.55
a bunch of new features in. And we ended up having code size that was bigger than
the original products.
28.00
We originally thought we would be smaller. But it turned out being bigger. And of
course, the way memory works
28.06
is you don't go from 12 kilobytes to 13 kilobytes. If you go from 12 kilobytes to
17 kilobytes,
28.12
you then probably need a 32-kilobyte memory system. That's the first problem.
28.17
We blew the memory budget such that they really needed to double their memory size.
The other problem is a 32-bit RISC instruction set computer
28.24
wants 32 bits every cycle to hit full speed. It wants to swallow instruction as
much as it can.
28.30
And a 32-bit wide memory system then was two or even four chips. So this was
painful for everyone to maybe quadruple
28.39
the size of their memory system. What really drove this home to us-- a lot of
people think that the chip that this became
28.46
was for Nokia. It actually wasn't. It was for Nintendo. And back then games
cartridges plugged in.
28.51
And they were basically a bit of plastic, a tiny little bit of brass, and a stack
of memory.
28.57
And so if we made the ROM cartridge twice as expensive or four times as
expensive, that really ate all their profit
29.04
at Nintendo. So this was against the industry.
29.11
Now Mark Horowitz-- the quote here-- was at Stanford last week. So I'm slightly OK
that I've told this joke to his face.
29.19
But it's unusual to see the word "ridiculed"
29.24
in a technical document. But the thinking at the time was very much this, that you
shouldn't try and do code density.
29.31
You should do simple decode. And that's absolutely the correct thinking for a high
performance workstation.
29.36
And it's just the wrong thinking for embedded. So simple decode, simple decode,
simple decode
29.42
was the way everyone thought. And you'd be ridiculed if you tried to do anything
else. And so to swim against that tide was hard work back then.
29.51
But any instruction set with fixed length is wasteful. And as we saw on that code
slide earlier,
29.58
not all combinations are very useful. So if you can get rid of them somehow, it's
good. So on a train from Nintendo to a ski weekend at Matsumoto
30.07
in 1994, and literally on a napkin, I started writing the 16-bit instruction set.
30.14
It was pretty much the same one that I used in my thesis. I'd learned a few more
tricks by then.
30.20
And so I crippled the C compiler. And what I did was I made the C compiler only
produce 32-bit instructions that weren't too complicated that I
30.28
knew I could compress down into 16-bit instructions. So that, because I was not
using the full power
30.34
of the instruction set, the programs actually got bigger. Because it was still 32-
bit instruction sets. But they had instructions-- but they had gaps in them.
30.42
And I knew that I could then take all those and squish them down to 16 bits. So
when the program size only went up by about 40%,
30.48
I smiled, because I knew I could halve that immediately back down to 70% when I re-
encoded them in 16 bits.
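The arithmetic in that step can be checked with a few lines of Python. This is only a sketch of the reasoning as described in the talk: the function name and its parameter are made up for illustration, and it assumes every restricted 32-bit instruction maps to exactly one 16-bit instruction.

```python
def thumb_relative_size(restricted_growth: float) -> float:
    """Size of the 16-bit program relative to the original 32-bit program.

    restricted_growth is the fractional growth in instruction count when the
    compiler is limited to the compressible subset (0.40 means 40% more).
    Assumes a one-to-one re-encoding of each instruction into 16 bits.
    """
    instruction_count = 1.0 + restricted_growth  # still 32 bits each, just more of them
    return instruction_count * (16 / 32)         # re-encode each one in half the bits


relative = thumb_relative_size(0.40)
print(f"{relative:.0%} of the original size")
```

With the 40% figure quoted above, the program ends up at about 70% of its original size.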
30.55
The real light bulb moment, though, was when I realized that this processor should
have two instruction sets.
31.01
Now at the time, remember, we're talking about reduced instruction set computers.
You should have one instruction that does one thing at all times.
31.08
So having a machine that has two completely different instruction sets and codings
and two instructions
31.14
that do exactly the same thing was really weird. It's about as unRISC as you can
possibly get.
31.21
So I called this thing Thumb, because that's the useful bit on the end of your arm.
31.28
It's a second instruction set, more compact than the original one. I recoded the
instructions. As I said, programs end up being about 70% of the size.
31.35
And if you're running from narrow memory, the code runs faster because you get a
16-bit instruction
31.42
every cycle instead of having to halve the memory bandwidth to get a 32-bit
instruction.
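A rough model of why narrow memory favours the 16-bit encoding, written as a Python sketch. The function name is hypothetical, and the model assumes one bus transfer per cycle, no cache, and no packing of two short instructions into one wide transfer.

```python
def fetch_cycles(instruction_count: int, instruction_bits: int, bus_bits: int) -> int:
    """Bus cycles needed to fetch a straight-line run of instructions.

    Simplifying assumptions: one bus transfer per cycle, no cache, and each
    instruction is fetched on its own (no packing into wider transfers).
    """
    transfers_per_instruction = -(-instruction_bits // bus_bits)  # ceiling division
    return instruction_count * transfers_per_instruction


# 1,000 ARM instructions versus 1,400 Thumb instructions (the roughly 40%
# growth in instruction count quoted in the talk), both over 16-bit memory.
arm_cycles = fetch_cycles(1000, 32, 16)
thumb_cycles = fetch_cycles(1400, 16, 16)
print(arm_cycles, thumb_cycles)
```

Under these assumptions the 32-bit code needs two transfers per instruction, so the 16-bit version finishes its fetches sooner even though it contains more instructions.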
31.48
I added some support for 16-bit data. I left in the ARM instruction set, so you can
still do full speed if you want to, especially
31.55
from on-chip memory. I also defined something called TOM. Tom Thumb, right?
32.01
A 32-bit data path with only the 16-bit instruction set. And that's what's called
Cortex M0 and M1 today.
32.07
The other really big volume chips. And I also defined and put all the hooks in TOM
16
32.13
with a full 16-bit data path. We never did that, and we really should have. A lot
of people don't know--
32.18
Unix runs really nicely on a 16-bit machine. It started life on a 16-bit machine.
32.23
And one of my few regrets is that bit.
32.29
So Thumb really put us on this different curve, this red curve where we could have
more performance and less cost.
32.34
And depending on which-- how you encoded your program, if it was an important bit
of code, you
32.40
encoded in the 32-bit instruction set. If it was a less important bit of code-- for
example, all the GUI--
32.47
you ran all that in 16-bit code. So you had the best of both worlds. It was really
on a different curve.
32.53
And it was really the breakthrough for ARM and embedded. I left all the original
stuff in because it
32.58
was a really easy sell to say you've got the best of both worlds. And I never would
have got away with replacing
33.03
the entire instruction set. Remember, by this stage I'm only a couple of years out
of university. So although it's exactly what I was doing,
33.10
I put a back door in that we later used for Thumb-2. I put in a prefix instruction. And no
one spotted that, fortunately.
33.17
It was smart politically, because it looked like a relatively small change for the
chip. And for those who called it architecturally ugly,
33.23
I said, yeah, it's ugly. But gee, it works well. Sophie Wilson, who was the
original architect that
33.29
stayed at Acorn, she hated it. She wrote to ARM's board and said, to be brief, I
don't like Thumb. As a short-term hack it might be survivable.
33.36
As a long-term architectural component, it is in my view a disaster of enormous
proportions. It represents a backward step.
33.42
Now the first chip sold 30 billion units. So maybe not quite as backward as she
was expecting.
33.48
But it was a big deal. There was an emergency board meeting. Robin Saxby's bonus
was cut by 20% if he chose to do this.
33.56
They really tried to stop it. Steve Furber was called in from Manchester as the
judge and jury.
34.02
And narrowly sided with ARM. Steve recently said, "ARM addressed the code density
issue with an imaginative leap.
34.07
They introduced the Thumb 16-bit instruction set." So it went from a backward step
to an imaginative leap.
34.13
So that's a pretty good U-turn. And this is why I say my part in ARM's downfall. It
was downward in market position.
34.19
But it was very much upward in success. I will say it's much harder to simplify
something like this
34.24
than you think. Looking back on it, it looked so easy at the time. It was just, how
do I take this big complex problem
34.30
and make a simple solution? And RISC in general is a little like that. It's often
hard to look across.
34.38
It still looks like an Anaconda that swallowed a goat. But there's this little
Thumb decode in the front. There was fresh air in there.
34.44
And I could slip the decoder in so that we just decoded 16-bit instructions to 32-
bit instructions.
34.50
And the rest of the pipeline just thought it was being fed 32-bit instructions.
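The expansion step can be sketched in Python for a single case. This is not the actual decoder logic from the chip: it handles only the register form of the Thumb add/subtract format, the function and constant names are made up, and the bit layouts are the ones published for the Thumb and ARM data-processing encodings as I understand them.

```python
ARM_COND_ALWAYS = 0xE << 28     # condition field: execute unconditionally
ARM_SET_FLAGS = 1 << 20         # S bit: this Thumb format always updates the flags
ARM_OPCODE_ADD = 0b0100 << 21
ARM_OPCODE_SUB = 0b0010 << 21


def expand_thumb_add_sub(halfword: int) -> int:
    """Expand a Thumb register-form ADD/SUB into the equivalent 32-bit ARM word."""
    if (halfword >> 11) != 0b00011:
        raise ValueError("not a Thumb add/subtract instruction")
    if (halfword >> 10) & 1:
        raise ValueError("immediate form is not handled in this sketch")
    is_sub = (halfword >> 9) & 1
    rm = (halfword >> 6) & 0b111    # second source register
    rn = (halfword >> 3) & 0b111    # first source register
    rd = halfword & 0b111           # destination register
    opcode = ARM_OPCODE_SUB if is_sub else ARM_OPCODE_ADD
    return ARM_COND_ALWAYS | opcode | ARM_SET_FLAGS | (rn << 16) | (rd << 12) | rm


# 0x1888 is the Thumb encoding of ADD r0, r1, r2.
print(hex(expand_thumb_add_sub(0x1888)))
```

The point of the design is visible even in this small case: the 16-bit form has no condition field, no shifter field, and only 3-bit register numbers, and the expander fills those in with fixed values so the rest of the pipeline sees an ordinary 32-bit instruction.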
34.56
I fixed quite a lot of other things that were wrong with the architecture. I hid a
lot of the ugliness. And I really thought no one noticed.
35.01
But in the latest version of Patterson and Hennessy there's the statement at the
bottom. "In many ways, the simplified Thumb architecture
35.07
is more conventional than ARM." So someone actually noticed that I did a bunch of
cleaning up in there.
35.12
And Lee Smith, the guy that originally gave me the job, just said last year,
"Thumb was essential to our success."
35.18
That's his summary of it. 32-bit ARM sealed the deal, getting to 2/3 of the code
size
35.23
took 10 years, but they could see we're on a trajectory to an asymptote. Nokia were
driving round Finland with a van
35.31
full of equipment testing cell phones at the time. They looked at Thumb, realized
how much
35.38
it outperformed the competitors, and were sold on it. And so Ericsson and Motorola
were the other big names
35.44
in phones. Then they had to follow. And so we ended up selling an ARM license to
Motorola. So this was-- wow.
35.50
We've actually sold a license to the big guys. Texas Instruments loved it. They
combined it with a lot of their DSPs.
35.56
The chip was called MAD-- microprocessor and DSP. And I think it's fair to say it
really rewrote
36.02
the rulebook on what an embedded processor should look like. MIPS followed fairly
quickly with MIPS 16.
36.08
The latest RISC-V, if you're familiar with it out of Berkeley and Stanford, has
36.14
the C optional 16-bit instruction set that you can bolt on it for embedded control.
36.20
These are the two big patents. Notice that actually MISC is not a bacronym. Right
on the patent title back then,
36.26
multiple instruction sets. Multiple instruction set mapping. So MISC isn't a
bacronym really.
36.32
Multiple Instruction Set Computer. They were filed early 1994. I'm the inventor.
No, I do not get all the money.
36.38
Everyone asks that. That would be nice. But ARM is the assignee.
36.44
The patent people in the audience might like to read this one at their leisure. We
had some narrow patents and some wide patents.
36.50
ARM7TDMI, the processor that came out of this, was never cloned successfully. The
little guy, ARMs 2, 6, and 7 when
36.56
they had less patent coverage, or almost no patent coverage, were cloned a lot.
37.03
I was flying a lot by this time. I was just selling this thing and benchmarking
this thing. We're still a pretty small company-- maybe 40 people.
37.11
And all the big names getting into printers, getting into hard drives, getting into
37.16
headless terminals and all this sort of stuff. Cars, of course. The printer and
camera guys really liked it.
37.25
We had some weird customers. NKK Steel, who were just a big steel company, took a
license.
37.31
I still don't know why. We had some on the Eurofighter. That scared me. I didn't
want to be anywhere near a Eurofighter, 'cause
37.36
I knew how many bugs we'd seen over the years. But anyway. There was one on the
Eurofighter. And I accidentally visited the NSA.
37.42
They wanted me to put a backdoor on the processor. I thought I was honestly
visiting the National
37.50
Semiconductor of America. National Semiconductor used to be a firm. And "of
America" used to be a thing you put on your end of your title.
37.57
My boss gave me a bollocking for why I didn't return with any business cards.
And later I worked out what the NSA was.
38.05
That became that Skipjack Clipper program that came on much later.
38.11
In parallel we had a big project running at ARM-- ARM8 and 810 was using up about
half our resource
38.18
to try and do a fast processor for Acorn as best we could. We had a single
instruction data cache
38.24
for the self-modifying code problem. But we didn't put Thumb and debug on that. And
the floating point was difficult too.
38.30
But we did that to their specification. But it used up an enormous amount of
resource. So that's what most of the company were doing.
38.37
So ARM7TDMI was really successful. I was traveling a lot-- I'm starting to think
about how to go faster-- when Digital Equipment Corporation,
38.44
who were the third or fourth biggest computer company in the world then, came
along and said,
38.51
we'd like to do a fast ARM for Apple. Now Digital had about four--
38.58
well, they had exactly four that I know of-- reduced instruction set programs going
on at the company at that time. They had the Hudson RISC.
39.04
They had Titan. They had Prism. And lately, Alpha.
39.10
And Alpha was originally called EV, because their programs kept getting canned
because VAX
39.16
was everything at Digital. And if your program had nothing to do with VAX, when the
cutbacks came, they were just canned.
39.23
So the Prism architecture was a beautiful little architecture. But it got canned.
So they started a new architecture, which
39.28
they called Extended VAX-- EV. And it didn't get canned, even though it had nothing
to do with VAX.
39.33
It just had VAX in the title. And I really learned about that. I thought, well,
that's kind of-- hide that from the board.
39.39
Later, by the way, the marketing people got hold of it. And they called it the
Alpha AXP. And the joke in the engineers was,
39.45
AXP stood for Almost Exactly Prism. They blew the doors off the industry,
39.51
they were running at 200 megahertz when everyone else was about 66. It actually
turned out to be too late to save Digital.
39.56
But probably the best design team on the planet. Quite a lot of these people are
still active.
40.03
I went to Texas for eight weeks and wrote the ARM ARM-- the Arm Architecture
Reference Manual.
40.09
I just cleaned up the whole architecture and said, don't do this. We promise not to
halt and catch fire if you do do this.
40.15
We promise not to get privileged if you do do this. Otherwise, don't do this. And I
learned a whole lot about how to design
40.21
a chip from these guys. They were a very friendly bunch of people. I didn't
downplay Thumb.
40.27
But I didn't talk it up either. I basically said, you guys do the high end where
you've got 32-bit memory systems, 32-bit on-chip caches.
40.33
We'll stay at the low end. And that could be our differentiation. We all agreed
that was quite a good idea.
40.39
So the StrongARM processor came out. They basically cut an Alpha in half.
40.44
It was so fast that Apple started rewriting their self-modifying code. But it was--
did I say Apple?
40.50
Acorn started rewriting their self-modifying code, which
40.55
was the nail in ARM8's coffin, really, at the ARM company. But nothing could
save Acorn by then.
41.02
It was just too late for them. But I snuck back to Cambridge having learned
everything about how the StrongARM was designed,
41.09
and told ARM we should do an ARM8E. And this was that lesson about don't call
anything new
41.15
because it may well get canned by the board if it's not in line with the product
roadmap.
41.20
So I called it ARM8E, even though it had absolutely nothing to do with ARM8. It was
the StrongARM pipeline.
41.25
A direct rip-off. I added Thumb and debug to it. And a tiny little design team, again
including Simon Segars.
41.32
And it was launched at ARM as ARM9TDMI. There's the pipe.
41.38
It's starting to look a lot less like an Anaconda full of goat.
41.43
It's pretty streamlined, that machine. Digital taught us how to do that.
41.49
Those two chips together are still responsible for about 80% of ARM shipped today.
So they've been tremendously successful.
41.57
That TOM32 machine did get built as that Cortex M0 and Cortex M1. The little ARMs
have no 32-bit instruction set at all anymore.
42.08
Then we decided to do-- it was silly for ARM and Digital to be designing chips
together.
42.13
And we particularly-- I particularly-- wanted to do floating point properly. And
they had a lot of floating point experience.
42.19
So we decided to do a joint design center in Austin, Texas. We'd employed just about
everybody in England
42.26
that could spell microprocessor backward, let alone design one. So we really needed
to tap into another market.
42.32
America was a lot more expensive for salaries than Cambridge was. But we had to
bite that bullet.
42.38
So we did this design center in Austin, Texas. So I went to Austin in late 1996.
42.44
My oldest daughter Catherine was born. She's here today too. She's just finished
her EE degree.
42.50
So that's what I've been doing in between by the way, is raising my children.
42.55
But this program ran into some huge unforeseen problems-- unavoidable problems.
First of all, I noticed on the ARM9,
43.02
we didn't have a great debug strategy. And they were booting operating systems. We
had Windows CE, the Symbian operating system,
43.09
and Linux were all running on ARM at the time. And our first silicon ran about
10,000 instructions
43.15
and fell over. And so we spun the silicon. Got the silicon back. It ran about
10,000 more instructions and fell over.
43.22
And they did this four times from memory. And this is a very expensive long loop to
be going around.
43.28
Digital had enough performance that they were booting the operating system on the
netlist. They had enough compute in the Digital company
43.35
that they were getting about 100 instructions per second. And they were booting
Unix up to the command prompt.
43.42
I really thought that was cool, and really wanted to exploit that somehow. The next
thing that happened was that Digital sued Intel.
43.49
And Intel looked at the price of the lawsuit
43.54
and the price of Digital and went, let's just buy Digital. No one at the Digital
design center in Austin
44.00
wanted to work for Intel, so they all quit. And they didn't want to work for ARM.
They wanted to do their own startup.
44.07
And then we were using Compass design tools at the time. They were bought by
Avant!, and I believe overnight the licenses just stopped working.
44.17
So we had no design flow. So obviously big problems are a big opportunity.
44.23
I went back to the emergency board meeting in Cambridge. Do we stop this now and
fire the four or five people we've hired
44.29
and apologize profusely? Or do we change gear and do our own design center? That's
what we decided to do.
44.35
They gave me headcount for 50. I only ever used about 20. But I didn't get to spend
much more time at home
44.40
with my family. So we had a new chip, a new team, new tools, new flow, new country.
44.46
So obviously I had to get infrastructure, buildings, admin, the time zones are a
pain. There was a hiring frenzy.
44.52
I borrowed the support people from elsewhere. I just didn't have time to put all
that together. But this was really a startup in Austin.
44.59
And I didn't want to back off on the deliverable. I wanted ARM10 to be about twice
as fast as ARM9.
45.05
By the time you add floating point to that and new support for operating systems,
it ended up being about four or five times more complex
45.11
than ARM9. I was really worried about the ARM9 long loop
45.17
around booting code. I really wanted to find some way of getting much better
validation in these silicon chips.
45.25
We didn't do superscalar on ARM10. But we set up for the next chip to be superscalar.
scalar. And that group in Austin went on to do
45.32
the start of all those Cortex A series that were all in the phones. There was another
group in Sophia, in France; they had it also.
45.39
They ping-ponged back between the two designs. But the ARM10 was a decent chip.
45.44
It had an 8-stage pipe. It ran fast. It did indeed become twice as fast as the
ARM9. We fixed up everything we could that we knew about in ARM9.
45.52
And so ARM10 was very successful. We did brand new floating point from the ground
45.58
up with little short vectors in it. That architecture is still in use today in
ARMv8.
46.04
So that architecture is 23 years old already. So that probably says it was half
decent.
46.09
That floating point also got back-ported to ARM9 and ARM7.
46.15
So it really was a broad architecture. We put in some proper software hooks.
46.20
By this time people were actually debugging code that was running on the ARM. So
we'd sort of come full circle.
46.26
People were making something powerful enough to actually develop on the machine. We
completely reworked our validation methodology.
46.34
We started including random instruction set generation to just throw random
instructions at the core to see
46.41
if we can make it blow up. I had a small brainwave: that code that I wrote way back
for my thesis.
46.48
I brought it up to ARM10 specification. And I made that code record an instruction
trace.
46.54
When we booted an operating system, I saved every instruction that got pulled into
the core. And I recorded every data transfer to and from the core.
47.02
And I played that back to the transistor model. And wherever they were different,
we sat around the table and worked out
47.08
whether it was their fault or mine. It was about 50/50. But we got it to the point
where we could make this instruction trace of the three operating
47.15
systems booting. And we could run that on a simple Sun workstation.
47.20
And as soon as we saw a problem, we could stop and go and look, really pinpoint
where the problems were.
47.26
And of course, we'd fix the transistors manually. Run a regression test up to that
point to make sure
47.32
we hadn't broken anything with the fix. And then kept on booting. So we were able
to boot whole operating systems that way.
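The record-and-replay comparison described here can be sketched in Python. Everything in the sketch is hypothetical: the BusEvent record, the function name, and the sample values are made up for illustration, and the real flow compared a software model against a transistor-level model rather than two lists.

```python
from dataclasses import dataclass
from itertools import zip_longest
from typing import Iterable, Optional


@dataclass(frozen=True)
class BusEvent:
    kind: str      # "fetch", "read" or "write"
    address: int
    data: int


def first_divergence(reference: Iterable[BusEvent],
                     candidate: Iterable[BusEvent]) -> Optional[int]:
    """Return the index of the first event where the traces differ, or None.

    zip_longest pads the shorter trace with None, so a trace that stops
    early is reported as a divergence too.
    """
    for index, (expected, actual) in enumerate(zip_longest(reference, candidate)):
        if expected != actual:
            return index
    return None


# Made-up sample traces: the candidate writes the wrong data on its third event.
reference_trace = [
    BusEvent("fetch", 0x0000, 0x11111111),
    BusEvent("fetch", 0x0004, 0x22222222),
    BusEvent("write", 0x8000, 0x00000001),
]
candidate_trace = reference_trace[:2] + [BusEvent("write", 0x8000, 0x00000002)]

print(first_divergence(reference_trace, candidate_trace))
```

Stopping at the first mismatch is what makes the approach useful: the event index points at the exact instruction or data transfer to go and look at, and the same recorded trace can be replayed as a regression test after each fix.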
47.37
And we ended up finding every bug in ARM10 that way. Worked beautifully, along with
the random instruction
47.42
generation. ARM10200 was very successful. And as I said, it was the start of that
Austin design center.
47.51
In the year 2000 we had that silicon back again. We knew what we were going to do
for Rev 1. There's always a few tweaks. You don't get it perfect when you type it
out.
47.58
You get something that's very close, and then make some silicon, bring it up.
48.03
The Austin office was about 45 people by then, a pretty experienced team. And
they've gone on to be a wonderful set of CPU designers.
48.10
I bailed. I was from New Zealand, remember. So I bailed back to New Zealand in
early June 2000 just after nine years at ARM.
48.16
Technically I was on sabbatical. And I've been ever since working on much more
powerful CPUs.
48.22
I spent the next few years actually wading through patents because there was a
lawsuit over ARM7TDMI.
48.27
But I was pretty happy with what I've achieved. I did work hard, but had fun.
48.34
We got a few things wrong. We backed-- we backed-- the people we backed were mostly
wrong.
48.41
And the people we didn't back were mostly right. So we really got it-- we were one
inverter away from success, one NOT
48.47
gate away from success. If we had of backed any of the other games consoles, we
probably would have been fine.
48.52
If we had of backed the Palm Pilot, we probably would have been a little better
off. We didn't see Nokia coming.
48.57
I personally did not see cell phones coming at all. I looked at the possibility of
the cell phone infrastructure and thought, wow.
49.03
They're really going to dig up every road and put aerials on top of buildings. And
this just seemed so unlikely.
49.09
But I actually thought the Iridium cell ph-- the satellite stuff was going to work
better.
49.16
I often look back at my life. And I don't know if you know the movie "Slumdog
Millionaire." It's quite well-known. It's the Indian fellow who's had a hard life.
49.24
But just through serendipity he just happens to know-- he's only asked the questions
he knows the answers to somehow.
49.30
He doesn't know much. But he knows the answers to the questions he's asked. And I
always feel that my career has
49.37
been a little bit like that. If any of the things on the bottom were missing, I
just don't think much of this would have come together.
49.45
Certainly Lee-- ARM was 25 years old in 2015. And Lee wrote me this lovely email
49.51
saying that he had been asked as one of the four founders that were still at ARM
what their most
49.59
significant milestone was. And he said it was hiring me on the telephone. I love
the quote.
50.04
He said, starting with memorable moments, starting with returning to the Barn, the
beautiful old Barn at quarter to 9 to phone you in New Zealand.
50.13
Robin Saxby, [INAUDIBLE] was just leaving the pub and offered to buy me a pint-- of
beer, obviously.
50.19
If I'd ever accepted and missed the interview, history might have been very
different. Ended with picking me up, taking to Mike Muller's place
50.25
for a shower. I'd been on an airplane for 24 hours. I'm glad he did that. And then
into the Barn and out for lunch
50.30
to a curry house in Bottisham-- another cute wee town. I'll never forget your
comment when your food arrived. "Gee, Mom. I flew halfway around the world to eat
lamb and potatoes."
50.38
Great time with great people. But yeah, a lot of serendipity in there. And that's--
Robin's 70 and I'm 50 in that photograph.
50.44
A couple of years ago. We still get together for our birthdays. So with a few
minutes to go, I've
50.49
got time for any questions. Sorry if that was a little rushed. But it's hard to
pack nine years into 45 minutes.
50.56
RAYMOND: You did an amazing job. You can go to the microphones for questions.
AUDIENCE: So what would you recommend
51.01
for somebody who is interested in learning about CPU design
51.06
and implementation nowadays, even as just a hobby? Or even just any silicon chip in
general.
51.15
DAVE JAGGAR: Is there any ARM snipers? No. I would Google RISC-V and find out all
about it.
51.23
They've done a fine instruction set, a fine job. And they're explaining it. This is
Berkeley and Stanford are behind this.
51.28
There are obviously commercial companies like [INAUDIBLE] doing things. But it's
the state of the art now for 32-bit general purpose
51.35
instruction sets. And it's got the 16-bit compressed stuff. So you're learning
about that, learning from the best.
51.41
Still. AUDIENCE: All right. Thank you. AUDIENCE: Hello.
51.46
So it seems like if you're programming in the '80s, you would know a lot more
things kind of down below,
51.52
like, the lower levels. And now things are so complicated that if someone's coming
out of school, they're not
51.58
going to be able to really understand everything that's going on below them. So do
you think that's sort of making it harder for us to have a full view?
52.04
Or maybe that's just the way that things are now. And we're just going to have to
accept that? What do you think about that?
52.10
DAVE JAGGAR: I certainly agree with you. There's so much going on. I mean, I'm
still very active. What was I doing the other day?
52.15
I have a-- first of all, let's talk about the Raspberry Pi. That whole program was
trying to address exactly what you're
52.21
talking about. It's giving something simple enough where you can look at a software
stack top to bottom. Well, that's still complicated.
52.26
Even I look at the boot process and go, man, this is hard work to keep in your
head. So there's a lot going on.
52.32
I absolutely agree with you. I personally hate programming languages like Python,
52.37
because I look at inserting something into the list. And just know how many
bazillion instructions are going on
52.43
to support that piece of code. I just can't quite get my head around doing all that
stuff.
52.49
I know it's productivity. I think the best we can probably do is things like a
Raspberry Pi.
52.57
I was recently looking at the hostapd code 'cause it didn't work at 5
gigahertz.
53.04
And you can burrow down into that a bit and learn a lot. I think, to come back to
that statement about fog too.
53.11
When I started out I remember being-- I think scared is the right word. When you're
in that fog and you know nothing,
53.16
and you really feel like you're a dumb idiot. And you go, other people understand
this, but I don't.
53.21
I've sort of embraced that over time and gone, I know tomorrow I will know more
than I do today.
53.27
I always feel like I'm kind of groping around in a dark room trying to find the
furniture. But I think that's also the thing
53.33
is not to be afraid of that situation and know that you're a bright person if
you're in this room, let's face it.
53.41
Other people understand this stuff. But not to be afraid to grope around in the
dark like that and just try and get one more piece of information
53.47
than you got yesterday. And then slowly start-- stuff comes together, and you can
build on that.
53.53
But yeah. It's complicated now. I mean, look at-- well, I don't want to say Android
on top of Linux
53.58
on top of ARM. But man, there's a stack. There's a stack of code in there. I mean,
I've hacked around in that quite a lot.
54.04
And it takes a lot of understanding, even with my background. So yeah, it is. It's
complicated. It's hard.
54.10
Maybe there will be-- with machine learning-- maybe there'll be another big
revolution. I'm pretty sure it's coming.
54.16
Where we really look at what an algorithm is now in the modern world, and reinvent
hardware to support that top down.
54.23
So I really think that's coming. I've got a pretty good idea of how that will shake
down, I think. But yeah.
54.28
Yeah. AUDIENCE: How do you think it will shake down? DAVE JAGGAR: Pardon? AUDIENCE:
How do you think it will shake down?
54.36
DAVE JAGGAR: If-- and this is a really interesting experiment. I think everyone
should do this at some point.
54.42
Open a messenger session to a friend, have them use a different service provider to
you,
54.50
send them-- hit the 1 key. And run everything in between on a simulator.
54.57
And then just watch how much data gets sucked in and sucked out to send the one key
through all
55.03
that networking, all the fonts, all the graphics and everything. What's going on
is--
55.08
and I think it's-- there's this wonderful one analogy on YouTube.
55.15
It's a comedian. And he says, the difference between male brains and female
brains-- and this strikes a chord with me.
55.21
He basically says, men's brains put everything in little boxes, and the boxes
mustn't touch.
55.26
And female brains go [BUZZING SOUNDS] all the time. And I really think we have to
design hardware that's much closer to [BUZZING SOUNDS].
55.34
I think a lot of engineering has got this data based in blocks.
55.39
And we call them buffers. And we have these interfaces where you call a piece of
code,
55.45
and that passes back a nice buffer. And then that code must never touch that code.
And that code must-- and they're all separated.
55.52
I really think we're going to end up with a machine where you put the data on the
top. And the data is going to fall out the bottom.
55.57
And it's working in a much more integrated way. If you YouTube that comedian,
you'll sort of understand. I'm not telling the story very well.
56.03
But it really comes across as I really think we have to be thinking in a much more
holistic way than generally engineers have in the past.
56.11
I think it's a limited way that we think when we partition data.
56.16
And it means that, of course, think about-- let me give an easy example.
56.23
Think about inserting a character into the middle of a string. So this should be
kind of easy, right?
56.30
If I'm a character of a string, and you're the next character of the string, and I
want to put a character in between us,
56.35
I say, well, I'm going to just not hold your hand anymore. You're going to hold his
hand. And away we go.
56.40
And it's easy. It's all local. And we understand exactly what's going on. When you
convert that into a computer,
56.45
you've got a 64-bit address. I've got a 64-bit address. I might be 0. You might be
1.
56.51
But I'm stored on a 64-bit number. I have absolutely no idea of what the locality
of you is in the program related to me.
56.59
But if you build on this hardware, and it's easy to do when you think about it, if
I want you to move along the array more,
57.04
I just pull on a wee line to you that says, increment your index, and slot that new
guy in. That's really easy to do.
57.10
If I want to delete you from the queue, I just say, remove yourself. And you say,
everyone north of you, decrement your index.
57.17
And everything will sort of close up. You get all that in a program. It's called
the directed flow graph.
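The contrast between the two pictures can be shown in software with a short Python sketch. It does not model the hardware idea from the talk, only the difference between a local, neighbour-to-neighbour insert and an insert into an indexed array. All names are made up for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    char: str
    next: Optional["Node"] = None


def insert_after(node: Node, char: str) -> Node:
    """Local insert: only the links of the two neighbours change."""
    new_node = Node(char, node.next)
    node.next = new_node
    return new_node


def array_insert(chars: List[str], index: int, char: str) -> int:
    """Insert into a flat array and return how many elements had to shift up."""
    moved = len(chars) - index
    chars.insert(index, char)
    return moved


def to_string(head: Optional[Node]) -> str:
    out = []
    while head is not None:
        out.append(head.char)
        head = head.next
    return "".join(out)


# Linked form: "a" lets go of "c" and holds "b", which holds "c".
head = Node("a", Node("c"))
insert_after(head, "b")

# Array form: everything from the insertion point onward gets a new index.
flat = list("ac")
shifted = array_insert(flat, 1, "b")
print(to_string(head), "".join(flat), shifted)
```

In the linked form nothing beyond the two neighbours is touched. In the array form the cost grows with the number of elements after the insertion point, which is the "increment your index" step in the description above.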
57.22
That's all there. And we sort of throw that away with the stack of software we put
on top. That's the crazy thing.
57.28
All the layers that we've put in between, all the differences between hardware and
software and assembler and linkages and operating systems.
57.34
With layers and layers and layers. And you actually lose the meaning of the
program. And the hardware then works very hard to try and put
57.41
that meaning back together. Anyway. Sorry that was a long answer to a simple
question. RAYMOND: That's a great answer.
57.46
One thing, I want to say that I found-- I was really glad to hear you say was,
57.51
you'll be smarter tomorrow. DAVE JAGGAR: Yeah. RAYMOND: One of the things I always
tell myself, gets me through every day is, you know those smart gals and guys?
57.58
They're just meat and bone like you. DAVE JAGGAR: They are. Yeah, yeah, yeah.
RAYMOND: Thank you.
58.03
AUDIENCE: Great story. Thank you. I was particularly struck by one of the quotes
58.09
on the slide where Robin Saxby says that you will never
58.15
manufacture chips. DAVE JAGGAR: Yeah. AUDIENCE: And I was wondering if you could
talk more about those when you decide not to take a path.
58.27
I mean, was that a courageous decision? DAVE JAGGAR: He is incredibly--
58.33
there's this buzz word-- global. He was always global. He said, we have this
partnership business model.
58.41
We'll do this. And they do that. And we're not going to compete. And the very
broad--
58.47
and it's nowhere near this defined-- but the very broad thought was, if we design
the chip once
58.53
and sell it three times, we can afford to sell it for about half or a third of what
it would cost them to develop it.
59.00
They're getting a deal. We are getting a business. And there's just no need for us
to sell any product.
59.07
Our product is just going to be design. And it was a very successful intellectual
property company. I mean, as I said, it's tiny compared to Google.
59.14
But it really has no real product. And so his foresight was very strong about that.
59.20
We challenged it a few times. We should make a few embedded controllers to go on
development cards. No.
59.26
We should make some SOCs as demonstrators. No. And we did do some SOCs in the end.
59.31
But we never bought any fab space. Always done through partnership. And that clear
distinction was incredibly beneficial.
59.40
And he was absolutely rigid in that decision, and absolutely right in that
decision. AUDIENCE: It's slightly different from say, what
59.48
Qualcomm has done, for example. DAVE JAGGAR: That whole industry is sort of on
its-- on its--
59.53
AUDIENCE: Ear. DAVE JAGGAR: On its ear now. So now there's fabs that don't design
anything. So yeah, absolutely.
59.58
The TSMCs and the GlobalFoundries of this world, you can just buy fab space. So
there's this other product family in it, knitted in
1.00.04
very well with what ARM-- you know, you've got a designer of a chip, somebody that
integrates the rest of the IP and then fabs it.
1.00.13
And so they're all quite separate things now. But yeah. The TSMCs and
GlobalFoundries of this world are almost exactly the other way around.
1.00.20
AUDIENCE: Right. DAVE JAGGAR: Yeah. Again, they don't compete. AUDIENCE: Thank you.
DAVE JAGGAR: Yeah. RAYMOND: We're over. But I'm gonna say one more question.
1.00.25
I want to say host privilege. One more question. AUDIENCE: It's an incredible talk.
DAVE JAGGAR: Thanks. AUDIENCE: Very briefly, Spectre and Meltdown.
1.00.32
DAVE JAGGAR: Yeah. AUDIENCE: So how much has that changed your thinking? And do you
feel like there's a future for CPUs where they
1.00.40
solve the problem in some way? Or there are more secure CPUs that have completely
rigorous, predictable performance,
1.00.47
and others that can have variable performance, but a risk of side channels. Thank
you. DAVE JAGGAR: The answer is, in 1996
1.00.53
I wrote a patent that said if you bring anything speculatively into the chip, make
sure you take it all the way back out again.
1.00.58
I guess they lost that patent down the back of the couch, right? AUDIENCE: They
did. DAVE JAGGAR: I always hassle them about that.
1.01.05
Again, that's thinking like a software guy. Bringing that stuff in speculatively.
You've got to take it out again, guys.
1.01.10
You can't leave it in the processor. How to handle that in the future.
1.01.15
I think we're all basically nice engineers that just don't expect people to stab us
in the back with a--
1.01.20
we're just-- and now we're all a little less innocent and probably looking at how
can we break this thing.
1.01.26
But we're always going to be chasing our tails, you know. It's impossible to find
every single backdoor into the processor.
1.01.32
We're always going to be chasing our tails as far as trying to spot where some
sneaky little person might-- and so they should, by the way.
1.01.38
You know, if they don't do it, someone that's really nefarious will. But I don't
know where there's a good solution in the--
1.01.46
my patent, while smart, was a whole lot easier back then when the chips were a
whole lot simpler.
1.01.53
But I think now the side-channel attacks [INAUDIBLE] better known. Or we're better
able to handle them.
1.01.58
Yeah. AUDIENCE: Thank you. DAVE JAGGAR: Yeah. RAYMOND: All right. With that, thank
you all, and thank you, Dave.
1.02.04
[APPLAUSE]
