Carl Hamacher et al.: Computer Organization, 5E (ISBN 0-07-112218-4).

Copyright © 2002 by The McGraw-Hill Companies, Inc. All rights reserved. Jointly published by China Machine Press/McGraw-Hill. This edition may be sold in the People's Republic of China only. This book cannot be re-exported and is not for sale outside the People's Republic of China.

China Machine Press edition: ISBN 7-111-10346-7.
ABOUT THE AUTHORS

Carl Hamacher received his B.A.Sc. degree in engineering physics from the University of Waterloo, Canada, an M.Sc. degree in electrical engineering from Queen's University, Kingston, Canada, and a Ph.D. degree in electrical engineering from Syracuse University, New York. From 1968 to 1990 he was at the University of Toronto, where he was a Professor in the Departments of Electrical Engineering and Computer Science. He served as director of the Computer Systems Research Institute during 1984 to 1988, and as chairman of the Division of Engineering Science during 1988 to 1990. Since January 1991 he has been a Professor of Electrical and Computer Engineering at Queen's University. He served as the dean of the Faculty of Applied Science from 1991 to 1996.
During 1978 to 1979, he was a visiting scientist at the IBM Research Laboratory in San Jose, California. In 1986, he was a research visitor at the Laboratory for Circuits and Systems associated with the University of Grenoble in France. In 1996 to 1997, he was a visiting professor in the Computer Science Department at the University of California at Riverside and in the LIP6 Laboratory of the University of Paris VI, France. His research interests are in multiprocessors and multicomputers, focusing on their interconnection networks.

Zvonko Vranesic received his B.A.Sc., M.A.Sc., and Ph.D. degrees, all in electrical engineering, from the University of Toronto. From 1963 to 1965 he worked as a design engineer with the Northern Electric Co. Ltd. in Bramalea, Ontario. In 1968, he joined the University of Toronto, where he is now a Professor in the Department of Electrical and Computer Engineering and the Department of Computer Science. During 1978 to 1979, he was a senior visitor at the University of Cambridge, England, and during 1984 to 1985 he was at the University of Paris VI, France. In 2000 to 2001, he was a principal software engineer at Altera Corporation in Toronto. From 1995 to 2000, he served as chair of the Division of Engineering Science at the University of Toronto. His current research interests include computer architecture, field-programmable VLSI technology, and multiple-valued logic systems. He is a coauthor of three other books: Fundamentals of Digital Logic with VHDL Design, Microcomputer Structures, and Field-Programmable Gate Arrays. In 1990, he received the Wighton Fellowship for "innovative and distinctive contributions to undergraduate laboratory instruction."

Safwat Zaky received his B.Sc. degree in electrical engineering and B.Sc. in mathematics, both from Cairo University, Egypt, and his M.A.Sc. and Ph.D. degrees in electrical engineering from the University of Toronto.
From 1969 to 1972 he was with Bell Northern Research, Bramalea, Ontario, where he worked on applications of electro-optics and magnetics in mass storage and telephone switching. In 1973 he joined the University of Toronto, where he is now a Professor in the Department of Electrical and Computer Engineering and the Department of Computer Science. Presently, he serves as chair of the Department of Electrical and Computer Engineering. From 1980 to 1981, he was a senior visitor at the Computer Laboratory, University of Cambridge, England. His research interests are in the areas of computer architecture, reliability of digital circuits, and electromagnetic compatibility. He is a coauthor of the book Microcomputer Structures and is a recipient of the IEEE Third Millennium Medal.

CONTENTS

Chapter 1  BASIC STRUCTURE OF COMPUTERS
1.1 Computer Types
1.2 Functional Units
    1.2.1 Input Unit
    1.2.2 Memory Unit
    1.2.3 Arithmetic and Logic Unit
    1.2.4 Output Unit
    1.2.5 Control Unit
1.3 Basic Operational Concepts
1.4 Bus Structures
1.5 Software
1.6 Performance
    1.6.1 Processor Clock
    1.6.2 Basic Performance Equation
    1.6.3 Pipelining and Superscalar Operation
    1.6.4 Clock Rate
    1.6.5 Instruction Set: CISC and RISC
    1.6.6 Compiler
    1.6.7 Performance Measurement
1.7 Multiprocessors and Multicomputers
1.8 Historical Perspective
    1.8.1 The First Generation
    1.8.2 The Second Generation
    1.8.3 The Third Generation
    1.8.4 The Fourth Generation
    1.8.5 Beyond the Fourth Generation
    1.8.6 Evolution of Performance
1.9 Concluding Remarks
Problems
References

Chapter 2  MACHINE INSTRUCTIONS AND PROGRAMS
2.1 Numbers, Arithmetic Operations, and Characters
    2.1.1 Number Representation
    2.1.2 Addition of Positive Numbers
    2.1.3 Addition and Subtraction of Signed Numbers
    2.1.4 Overflow in Integer Arithmetic
2.2 Memory Locations and Addresses
    2.2.1 Byte Addressability
    2.2.2 Big-Endian and Little-Endian Assignments
    2.2.3 Word Alignment
    2.2.4 Accessing Numbers, Characters, and Character Strings
2.3 Memory Operations
2.4 Instructions and Instruction Sequencing
    2.4.1 Register Transfer Notation
    2.4.2 Assembly Language Notation
    2.4.3 Basic Instruction Types
    2.4.4 Instruction Execution and Straight-Line Sequencing
    2.4.5 Branching
    2.4.6 Condition Codes
    2.4.7 Generating Memory Addresses
2.5 Addressing Modes
    2.5.1 Implementation of Variables and Constants
    2.5.2 Indirection and Pointers
    2.5.3 Indexing and Arrays
    2.5.4 Relative Addressing
    2.5.5 Additional Modes
2.6 Assembly Language
    2.6.1 Assembler Directives
    2.6.2 Assembly and Execution of Programs
    2.6.3 Number Notation
2.7 Basic Input/Output Operations
2.8 Stacks and Queues
2.9 Subroutines
    2.9.1 Subroutine Nesting and the Processor Stack
    2.9.2 Parameter Passing
    2.9.3 The Stack Frame
2.10 Additional Instructions
    2.10.1 Logic Instructions
    2.10.2 Shift and Rotate Instructions
    2.10.3 Multiplication and Division
2.11 Example Programs
    2.11.1 Vector Dot Product Program
    2.11.2 Byte-Sorting Program
    2.11.3 Linked Lists
2.12 Encoding of Machine Instructions
2.13 Concluding Remarks
Problems

Chapter 3  ARM, MOTOROLA, AND INTEL INSTRUCTION SETS
Part I  The ARM Example
3.1 Registers, Memory Access, and Data Transfer
    3.1.1 Register Structure
    3.1.2 Memory Access Instructions and Addressing Modes
    3.1.3 Register Move Instructions
3.2 Arithmetic and Logic Instructions
    3.2.1 Arithmetic Instructions
    3.2.2 Logic Instructions
3.3 Branch Instructions
    3.3.1 Setting Condition Codes
    3.3.2 A Loop Program for Adding Numbers
3.4 Assembly Language
    3.4.1 Pseudo-Instructions
3.5 I/O Operations
3.6 Subroutines
3.7 Program Examples
    3.7.1 Vector Dot Product Program
    3.7.2 Byte-Sorting Program
    3.7.3 Linked-List Insertion and Deletion Subroutines
Part II  The 68000 Example
3.8 Registers and Addressing
    3.8.1 The 68000 Register Structure
    3.8.2 Addressing
3.9 Instructions
3.10 Assembly Language
3.11 Program Flow Control
    3.11.1 Condition Code Flags
    3.11.2 Branch Instructions
3.12 I/O Operations
3.13 Stacks and Subroutines
3.14 Logic Instructions
3.15 Program Examples
    3.15.1 Vector Dot Product Program
    3.15.2 Byte-Sorting Program
    3.15.3 Linked-List Insertion and Deletion Subroutines
Part III  The IA-32 Pentium Example
3.16 Registers and Addressing
    3.16.1 IA-32 Register Structure
    3.16.2 IA-32 Addressing Modes
3.17 IA-32 Instructions
    3.17.1 Machine Instruction Format
3.18 IA-32 Assembly Language
3.19 Program Flow Control
    3.19.1 Conditional Jumps and Condition Code Flags
    3.19.2 Unconditional Jump
3.20 Logic and Shift/Rotate Instructions
    3.20.1 Logic Operations
    3.20.2 Shift and Rotate Operations
3.21 I/O Operations
    3.21.1 Memory-Mapped I/O
    3.21.2 Isolated I/O
    3.21.3 Block Transfers
3.22 Subroutines
3.23 Other Instructions
    3.23.1 Multiply and Divide Instructions
    3.23.2 Multimedia Extension (MMX) Instructions
    3.23.3 Vector (SIMD) Instructions
3.24 Program Examples
    3.24.1 Vector Dot Product Program
    3.24.2 Byte-Sorting Program
    3.24.3 Linked-List Insertion and Deletion Subroutines
3.25 Concluding Remarks
Problems
References

Chapter 4
    4.2.1 Interrupt Hardware
    4.2.2 Enabling and Disabling Interrupts
    4.2.3 Handling Multiple Devices
    4.2.4 Controlling Device Requests
    4.2.5 Exceptions
    4.2.6 Use of Interrupts in Operating Systems
4.3 Processor Examples
    4.3.1 ARM Interrupt Structure
    4.3.2 68000 Interrupt Structure
    4.3.3 Pentium Interrupt Structure
4.4 Direct Memory Access
    4.4.1 Bus Arbitration
4.5 Buses
    4.5.1 Synchronous Bus
    4.5.2 Asynchronous Bus
    4.5.3 Discussion
4.6 Interface Circuits
    4.6.1 Parallel Port
    4.6.2 Serial Port
4.7 Standard I/O Interfaces
    4.7.1 Peripheral Component Interconnect (PCI) Bus
    4.7.2 SCSI Bus
    4.7.3 Universal Serial Bus (USB)
4.8 Concluding Remarks

Chapter 5
5.1 Some Basic Concepts
5.2 Semiconductor RAM Memories
    5.2.1 Internal Organization of Memory Chips
    5.2.2 Static Memories
    5.2.3 Asynchronous DRAMs
    5.2.4 Synchronous DRAMs
    5.2.5 Structure of Larger Memories
    5.2.6 Memory System Considerations
    5.2.7 Rambus Memory
5.3 Read-Only Memories
    5.3.1 ROM
    5.3.2 PROM
    5.3.3 EPROM
    5.3.4 EEPROM
    5.3.5 Flash Memory
5.4 Speed, Size, and Cost
5.5 Cache Memories
    5.5.1 Mapping Functions
    5.5.2 Replacement Algorithms
    5.5.3 Example of Mapping Techniques
    5.5.4 Examples of Caches in Commercial Processors
5.6 Performance Considerations
    5.6.1 Interleaving
    5.6.2 Hit Rate and Miss Penalty
    5.6.3 Caches on the Processor Chip
    5.6.4 Other Enhancements
5.7 Virtual Memories
    5.7.1 Address Translation
5.8 Memory Management Requirements
5.9 Secondary Storage
    5.9.1 Magnetic Hard Disks
    5.9.2 Optical Disks
    5.9.3 Magnetic Tape Systems
5.10 Concluding Remarks
Problems
References

Chapter 6
6.1 Addition and Subtraction of Signed Numbers
    6.1.1 Addition/Subtraction Logic Unit
6.2 Design of Fast Adders
    6.2.1 Carry-Lookahead Addition
6.3 Multiplication of Positive Numbers
6.4 Signed-Operand Multiplication
    6.4.1 Booth Algorithm
6.5 Fast Multiplication
    6.5.1 Bit-Pair Recoding of Multipliers
    6.5.2 Carry-Save Addition of Summands
6.6 Integer Division
6.7 Floating-Point Numbers and Operations
    6.7.1 IEEE Standard for Floating-Point Numbers
    6.7.2 Arithmetic Operations on Floating-Point Numbers
    6.7.3 Guard Bits and Truncation

Chapter 7
7.2 Execution of a Complete Instruction
    7.2.1 Branch Instructions
7.3 Multiple-Bus Organization
7.4 Hardwired Control
    7.4.1 A Complete Processor
7.5 Microprogrammed Control
    7.5.1 Microinstructions
    7.5.2 Microprogram Sequencing
    7.5.3 Wide-Branch Addressing
    7.5.4 Microinstructions with Next-Address Field
    7.5.5 Prefetching Microinstructions
    7.5.6 Emulation
7.6 Concluding Remarks
Problems

Chapter 8
8.1 Basic Concepts
    8.1.1 Role of Cache Memory
    8.1.2 Pipeline Performance
8.2 Data Hazards
    8.2.1 Operand Forwarding
    8.2.2 Handling Data Hazards in Software
    8.2.3 Side Effects
8.3 Instruction Hazards
    8.3.1 Unconditional Branches
    8.3.2 Conditional Branches and Branch Prediction
8.4 Influence on Instruction Sets
    8.4.1 Addressing Modes
    8.4.2 Condition Codes
8.5 Datapath and Control Considerations
8.6 Superscalar Operation
    8.6.1 Out-of-Order Execution
    8.6.2 Execution Completion
    8.6.3 Dispatch Operation
8.7 UltraSPARC II Example
    8.7.1 SPARC Architecture
    8.7.2 UltraSPARC II
    8.7.3 Pipeline Structure
8.8 Performance Considerations
    8.8.1 Effect of Instruction Hazards
    8.8.2 Number of Pipeline Stages
8.9 Concluding Remarks

Chapter 9
9.1 Examples of Embedded Systems
    9.1.1 Microwave Oven
    9.1.2 Digital Camera
    9.1.3 Home Telemetry
9.2 Processor Chips for Embedded Applications
9.3 A Simple Microcontroller
    9.3.1 Parallel I/O Ports
    9.3.2 Serial I/O Interface
    9.3.3 Counter/Timer
    9.3.4 Interrupt Control Mechanism
9.5 I/O Device Timing Constraints
    9.5.1 C Program for Transfer via a Circular Buffer
    9.5.2 Assembly Language Program for Transfer via a Circular Buffer
9.6 Reaction Timer: An Example
    9.6.1 C Program for the Reaction Timer
    9.6.2 Assembly Language Program for the Reaction Timer
    9.6.3 Final Comments
9.7 Embedded Processor Families
    9.7.1 Microcontrollers Based on the Intel 8051
    9.7.2 Motorola Microcontrollers
    9.7.3 ARM Microcontrollers
9.8 Design Issues
9.9 System-on-a-Chip
    9.9.1 FPGA Implementation
9.10 Concluding Remarks
Problems
References

Chapter 10  COMPUTER PERIPHERALS
10.1 Input Devices
    10.1.1 Keyboard
    10.1.2 Mouse
    10.1.3 Trackball, Joystick, and Touchpad
    10.1.4 Scanners
10.2 Output Devices
    10.2.1 Video Displays
    10.2.2 Flat-Panel Displays
    10.2.3 Printers
    10.2.4 Graphics Accelerators
10.3 Serial Communication Links
    10.3.1 Asynchronous Transmission
    10.3.2 Synchronous Transmission
    10.3.3 Standard Communications Interfaces
10.4 Concluding Remarks
Problems

Chapter 11  PROCESSOR FAMILIES
11.1 The ARM Family
    11.1.1 The Thumb Instruction Set
    11.1.2 Processor and CPU Cores
11.2 The Motorola 680X0 and ColdFire Families
    11.2.1 68020 Processor
    11.2.2 Enhancements in 68030 and 68040 Processors
    11.2.3 68060 Processor
    11.2.4 The ColdFire Family
11.3 The Intel IA-32 Family
    11.3.1 IA-32 Memory Segmentation
    11.3.2 Sixteen-Bit Mode
    11.3.3 80386 and 80486 Processors
    11.3.4 Pentium Processor
    11.3.5 Pentium Pro Processor
    11.3.6 Pentium II and III Processors
    11.3.7 Pentium 4 Processor
    11.3.8 Advanced Micro Devices IA-32 Processors
11.4 The PowerPC Family
    11.4.1 Register Set
    11.4.2 Memory Addressing Modes
    11.4.3 Instructions
    11.4.4 PowerPC Processors
11.5 The Sun Microsystems SPARC Family
11.6 The Compaq Alpha Family
    11.6.1 Instruction and Addressing Mode Formats
    11.6.2 Alpha 21064 Processor
    11.6.3 Alpha 21164 Processor
    11.6.4 Alpha 21264 Processor
11.7 The Intel IA-64 Family
    11.7.1 Instruction Bundles
    11.7.2 Conditional Execution
    11.7.3 Speculative Loads
    11.7.4 Registers and the Register Stack
    11.7.5 Itanium Processor
11.8 A Stack Processor
    11.8.1 Stack Structure
    11.8.2 Stack Instructions
    11.8.3 Hardware Registers in the Stack
11.9 Concluding Remarks
Problems
References

Chapter 12
12.1 Forms of Parallel Processing
    12.1.1 Classification of Parallel Structures
12.2 Array Processors
12.3 The Structure of General-Purpose Multiprocessors
12.4 Interconnection Networks
    12.4.1 Single Bus
    12.4.2 Crossbar Networks
    12.4.3 Multistage Networks
    12.4.4 Hypercube Networks
    12.4.5 Mesh Networks
    12.4.6 Tree Networks
    12.4.7 Ring Networks
    12.4.8 Practical Considerations
    12.4.9 Mixed Topology Networks
12.5 Memory Organization in Multiprocessors
12.6 Program Parallelism and Shared Variables
    12.6.1 Accessing Shared Variables
    12.6.2 Cache Coherence
    12.6.3 Need for Locking and Cache Coherence
12.7 Multicomputers
    12.7.1 Local Area Networks
    12.7.2 Ethernet (CSMA/CD) Bus
    12.7.3 Token Ring
    12.7.4 Network of Workstations
12.8 Programmer's View of Shared Memory and Message Passing
    12.8.1 Shared Memory Case
    12.8.2 Message-Passing Case
12.9 Performance Considerations
    12.9.1 Amdahl's Law
    12.9.2 Performance Indicators
12.10 Concluding Remarks
Problems
References

APPENDIX A: LOGIC CIRCUITS
A.1 Basic Logic Functions
    A.1.1 Electronic Logic Gates
A.2 Synthesis of Logic Functions
A.3 Minimization of Logic Expressions
    A.3.1 Minimization Using Karnaugh Maps
    A.3.2 Don't-Care Conditions
A.4 Synthesis with NAND and NOR Gates
A.5 Practical Implementation of Logic Gates
    A.5.1 CMOS Circuits
    A.5.2 Propagation Delay
    A.5.3 Fan-In and Fan-Out Constraints
    A.5.4 Tri-State Buffers
    A.5.5 Integrated Circuit Packages
A.6 Flip-Flops
    A.6.1 Gated Latches
    A.6.2 Master-Slave Flip-Flop
    A.6.3 Edge Triggering
    A.6.4 T Flip-Flop
    A.6.5 JK Flip-Flop
    A.6.6 Flip-Flops with Preset and Clear
A.7 Registers and Shift Registers
A.8 Counters
A.9 Decoders
A.10 Multiplexers
A.11 Programmable Logic Devices (PLDs)
    A.11.1 Programmable Logic Array (PLA)
    A.11.2 Programmable Array Logic (PAL)
    A.11.3 Complex Programmable Logic Devices (CPLDs)
A.12 Field-Programmable Gate Arrays
A.13 Sequential Circuits
    A.13.1 An Example of an Up/Down Counter
    A.13.2 Timing Diagrams
    A.13.3 The Finite State Machine
    A.13.4 Synthesis of Finite State Machines
A.14 Concluding Remarks
Problems
References

APPENDIX B: ARM INSTRUCTION SET
B.1 Instruction Encoding
    B.1.1 Arithmetic and Logic Instructions
    B.1.2 Memory Load and Store Instructions
    B.1.3 Block Load and Store Instructions
    B.1.4 Branch and Branch with Link Instructions
    B.1.5 Machine Control Instructions
B.2 Other ARM Instructions
    B.2.1 Coprocessor Instructions
    B.2.2 Versions v4 and v5 Instructions
B.3 Programming Experiments

APPENDIX C: MOTOROLA 68000 INSTRUCTION SET

APPENDIX D: INTEL IA-32 INSTRUCTION SET
D.1 Instruction Encoding
    D.1.1 Addressing Modes
D.2 Basic Instructions
    D.2.1 Conditional Jump Instructions
    D.2.2 Unconditional Jump Instructions
D.3 Prefix Bytes
D.4 Other Instructions
    D.4.1 String Instructions
    D.4.2 Floating-Point, MMX, and SSE Instructions
D.5 Sixteen-Bit Operation
D.6 Programming Experiments

APPENDIX E: CHARACTER CODES AND NUMBER CONVERSION
E.1 Character Codes
E.2 Decimal-to-Binary Conversion

INDEX

PREFACE

This book is intended for use in a first-level course on computer organization in electrical engineering, computer engineering, and computer science curricula. The book is self-contained, assuming only that the reader has a basic knowledge of computer programming in a high-level language.
Many students who study computer organization will have had an introductory course on digital logic circuits. Therefore, this subject is not covered in the main body of the book. However, we have provided an extensive appendix on logic circuits for those students who need it.

The book reflects our experience in teaching computer organization to three distinct groups of undergraduates: electrical and computer engineering undergraduates, computer science specialists, and engineering science undergraduates. We have always approached the teaching of courses in this area from a practical point of view. Thus, a key consideration in shaping the contents of the book has been to illustrate the principles of computer organization using examples drawn from commercially available computers. Our main examples are based on the following processors: ARM, Motorola 680X0, Intel Pentium, and Sun UltraSPARC.

It is important to recognize that digital system design is not a straightforward process of applying optimal design algorithms. Many design decisions are based largely on heuristic judgment and experience. They involve cost/performance and hardware/software tradeoffs over a range of alternatives. It is our goal to convey these notions to the reader.

We have endeavored to provide sufficient details to encourage the student to dig beyond the surface when dealing with ideas that seem to be intuitively obvious. We believe that this is best accomplished by giving real examples that are adequately documented. Block diagrams are a powerful means of describing organizational features of a computer. However, they can easily lead to an oversimplified view of the problems involved. Hence, they must be accompanied by the details of implementation alternatives.

The book is aimed at a one-semester course in engineering or computer science programs. It is suitable for both hardware- and software-oriented students.
Even though the emphasis is on hardware, we have addressed a number of software issues, including basic aspects of compilers and operating systems related to instruction execution performance, coordination of parallel operations at the system level, and real-time applications. An understanding of hardware/software interaction and tradeoffs is necessary for computer specialists.

THE SCOPE OF THE BOOK

We now review the topics covered in sequence, chapter by chapter. The first eight chapters cover the basic principles of computer organization, operation, and performance. The remaining four chapters deal with embedded systems, peripheral devices, processor family evolution patterns, and large computer systems.

Chapter 1 provides an overview of computer hardware and software and informally introduces terms that are dealt with in more depth in the remainder of the book. This chapter discusses the basic functional units and the ways they are interconnected to form a complete computer system. The role of system software is introduced and basic aspects of performance evaluation are discussed. A brief treatment of the history of computer development is also provided.

Chapter 2 gives a methodical treatment of machine instructions, addressing techniques, and instruction sequencing. Basic aspects of 2's-complement arithmetic are introduced to facilitate the discussion of the generation of effective addresses. Program examples at the machine instruction level, expressed in a generic assembly language, are used to discuss loops, subroutines, simple input-output programming, sorting, and linked-list operations.

Chapter 3 illustrates implementation of the concepts introduced in Chapter 2 on three commercial processors: ARM, 68000, and Pentium. The ARM processor illustrates the RISC design style, the 68000 has an easy-to-teach CISC design, while the Pentium represents the most successful commercial design that combines elements of both the CISC and RISC styles.
The material is organized into three independent and complete parts. Each part includes all of the examples from Chapter 2 implemented in the context of the specific processor. It is sufficient to cover only one of the three parts to provide the continuity needed to follow the rest of the book. If laboratory experiments using one of the three processors are associated with the course, the relevant part of Chapter 3 can be covered in parallel with Chapter 2.

Input-output organization is developed in Chapter 4. The basics of I/O data transfer synchronization are presented, and a series of increasingly complex I/O structures are explained. Interrupts and direct-memory access methods are described in detail, including a discussion of the role of software interrupts in operating systems. Bus protocols and standards are also presented, with the PCI, SCSI, and USB standards being used as representative commercial examples.

Semiconductor memories, including SDRAM, Rambus, and Flash memory implementations, are discussed in Chapter 5. Caches and multiple-module memory systems are explained as ways for increasing main memory bandwidth. Caches are discussed in some detail, including performance modeling. Virtual-memory systems, memory management, and rapid address translation techniques are also presented. Magnetic and optical disks are discussed as components in the memory hierarchy.

Chapter 6 treats the arithmetic unit of a computer. Logic design for fixed-point add, subtract, multiply, and divide hardware, operating on 2's-complement numbers, is described. Lookahead adders and high-speed multipliers are explained, including descriptions of the Booth multiplier recoding and carry-save addition techniques. Floating-point number representation and operations, in the context of the IEEE Standard, are presented.

Chapter 7 begins with a register-transfer-level treatment of the implementation of instruction fetching and execution in a processor. This is followed by a discussion of processor implementation by both hardwired and microprogrammed control.
Chapter 8 provides a detailed coverage of the use of pipelining and multiple function units in the design of high-performance processors. The role of the compiler and the relationship between pipelined execution and instruction set design are explored. Superscalar processors are discussed, and the Sun Microsystems UltraSPARC II processor organization is used to illustrate the concepts.

Today there are many more processors in use in embedded systems than in general-purpose computers. This increasingly important subject, where a single chip integrates the processing, I/O, and timer functionality needed in a wide range of low-cost applications, is treated in Chapter 9. System integration issues, interconnections, and real-time software are discussed.

Chapter 10 presents peripheral devices and computer interconnections. Typical input/output devices are described and hardware needed to support computer graphics applications is introduced. Commonly used communication links, such as DSL, are discussed.

The evolution of the ARM, Motorola, and Intel processor families is discussed in Chapter 11. This chapter highlights the design changes that led to higher performance. The PowerPC, SPARC, Alpha, and Intel IA-64 families are also discussed.

Chapter 12 extends the discussion of computer organization to large systems that use many processors operating in parallel. Interconnection networks for multiprocessors are described, and an introduction to cache coherence control is presented. Shared-memory and message-passing schemes are discussed.

CHANGES IN THE FIFTH EDITION

Major changes in content and organization have been made in preparing the fifth edition of this book. They include the following:

• Chapter 2 of the fourth edition has been split into two chapters, Chapters 2 and 3, in the fifth edition. An expanded treatment of basic issues, explained using generic instructions, is presented in Chapter 2. More programming examples for typical tasks, both numeric and non-numeric, are provided.
Chapter 3 uses the instruction sets of ARM, 68000, and Pentium processors to show how the basic concepts of instruction set design have been implemented in both the RISC and CISC design styles.
• The discussion of the role of pipelining and multiple functional units in processor design has been extended significantly. The UltraSPARC architecture is used to provide specific examples of performance-enhancing design features.
• A new chapter on embedded-processor systems has been added. A generic design of a typical system is used as the basis for detailed discussion of example applications.

In addition to these main changes, many recent technology and design advances have been added to a number of chapters.

WHAT CAN BE COVERED IN A ONE-SEMESTER COURSE

This book is suitable for use at the university or college level as a text for a one-semester course in computer organization. It is intended for use in the first course on computer organization that the students will take.

There is more than enough material in the book for a one-semester course. The core material is given in Chapters 1 through 8. For students who have not had a course in logic circuits, the basic material in Appendix A should be studied at the beginning of the course and certainly prior to covering Chapter 4.

Chapters 9 through 12 contain a variety of useful material that the instructor may choose from if time permits. Particularly suitable are the discussion of embedded systems in Chapter 9 and the description of hardware found in most personal computers given in Chapter 10.

ACKNOWLEDGMENTS

We wish to express our thanks to many people who have helped during the preparation of this fifth edition. Gail Burgess and Kelly Chan helped with the technical preparation of the manuscript. Alex Gri, Frank Hsa, and Robert provided valuable help with a number of programming examples. Our colleagues Tarek Abdelrahman, Stephen Brown, Paul Chow, Glenn Gulak, and Jonathan Rose offered constructive comments.
We are particularly grateful to Stephen and Tarek for their help with important details. The reviewers, Gojko Babic of The Ohio State University, Nathaniel Davis of Virginia Polytechnic Institute and State University, Jose Fortes of Purdue University, John Greiner of Rice University, Sung Hu of San Francisco State University, Ali Hurson of Pennsylvania State University, Lizy Kurian John of University of Texas at Austin, Stefan Leue of Albert Ludwigs Universität in Freiburg, Fabrizio Lombardi of Northeastern University, Wayne Loucks of University of Waterloo, Prasant Mohapatra of Iowa State University, Daniel Tabak of George Mason University, and John Valois of Rensselaer Polytechnic Institute, gave us many excellent suggestions and provided constructive criticism. We want to thank Eli Vranesic for permission to use his painting "Fall in High Park" on the front cover; he created it using the computer as a paint brush. Finally, we truly appreciate the support of our editor, Catherine Fields Shultz, and her McGraw-Hill associates: Kelley Butcher, Michelle Flomenhoft, Kalah Graham, Betsy Jones, Rick Noel, Heather Sabo, and Christine Walker.

Carl Hamacher
Zvonko Vranesic
Safwat Zaky

CHAPTER 1
BASIC STRUCTURE OF COMPUTERS

CHAPTER OBJECTIVES

In this chapter you will be introduced to:
• The basic structure of a computer
• Machine instructions and their execution
• System software that enables the preparation and execution of programs
• Performance issues in computer systems
• The history of computer development

This book is about computer organization. It describes the function and design of the various units of digital computers that store and process information. It also deals with the units of the computer that receive information from external sources and send computed results to external destinations.
Most of the material in this book is devoted to computer hardware and computer architecture. Computer hardware consists of electronic circuits, displays, magnetic and optical storage media, electromechanical equipment, and communication facilities. Computer architecture encompasses the specification of an instruction set and the hardware units that implement the instructions.

Many aspects of programming and software components in computer systems are also discussed in this book. It is important to consider both hardware and software aspects of the design of various computer components in order to achieve a good understanding of computer systems.

This chapter introduces a number of hardware and software concepts, presents some common terminology, and gives a broad overview of the fundamental aspects of the subject. More detailed discussions follow in subsequent chapters.

1.1 COMPUTER TYPES

Let us first define the term digital computer, or simply computer. In the simplest terms, a contemporary computer is a fast electronic calculating machine that accepts digitized input information, processes it according to a list of internally stored instructions, and produces the resulting output information. The list of instructions is called a computer program, and the internal storage is called computer memory.

Many types of computers exist that differ widely in size, cost, computational power, and intended use. The most common computer is the personal computer, which has found wide use in homes, schools, and business offices. It is the most common form of desktop computers. Desktop computers have processing and storage units, visual display and audio output units, and a keyboard that can all be located easily on a home or office desk. The storage media include hard disks, CD-ROMs, and diskettes. Portable notebook computers are a compact version of the personal computer with all of these components packaged into a single unit the size of a thin briefcase. Workstations with high-resolution graphics input/output capability, although still retaining the dimensions of desktop computers, have significantly more computational power than personal computers. Workstations are often used in engineering applications, especially for interactive design work.
Beyond workstations, a range of large and very powerful computer systems exist that are called enterprise systems and servers at the low end of the range, and supercomputers at the high end. Enterprise systems, or mainframes, are used for business data processing in medium to large corporations that require much more computing power and storage capacity than workstations can provide. Servers contain sizable database storage units and are capable of handling large volumes of requests to access the data. In many cases, servers are widely accessible to the education, business, and personal user communities. The requests and responses are usually transported over Internet communication facilities. Indeed, the Internet and its associated servers have become a dominant worldwide source of all types of information. The Internet communication facilities consist of a complex structure of high-speed fiber-optic backbone links interconnected with broadcast cable and telephone connections to schools, businesses, and homes.

Supercomputers are used for the large-scale numerical calculations required in applications such as weather forecasting and aircraft design and simulation. In enterprise systems, servers, and supercomputers, the functional units, including multiple processors, may consist of a number of separate and often large units.

1.2 FUNCTIONAL UNITS

A computer consists of five functionally independent main parts: input, memory, arithmetic and logic, output, and control units, as shown in Figure 1.1. The input unit accepts coded information from human operators, from electromechanical devices such as keyboards, or from other computers over digital communication lines. The information received is either stored in the computer's memory for later reference or immediately used by the arithmetic and logic circuitry to perform the desired operations. The processing steps are determined by a program stored in the memory. Finally, the results are sent back to the outside world through the output unit. All of these actions are coordinated by the control unit. Figure 1.1 does not show the connections among the functional units.
These connections, which can be made in several ways, are discussed throughout this book. We refer to the arithmetic and logic circuits, in conjunction with the main control circuits, as the processor, and input and output equipment is often collectively referred to as the input-output (I/O) unit.

We now take a closer look at the information handled by a computer. It is convenient to categorize this information as either instructions or data. Instructions, or machine instructions, are explicit commands that

• Govern the transfer of information within a computer as well as between the computer and its I/O devices
• Specify the arithmetic and logic operations to be performed

Figure 1.1 Basic functional units of a computer.

A list of instructions that performs a task is called a program. Usually the program is stored in the memory. The processor then fetches the instructions that make up the program from the memory, one after another, and performs the desired operations. The computer is completely controlled by the stored program, except for possible external interruption by an operator or by I/O devices connected to the machine.

Data are numbers and encoded characters that are used as operands by the instructions. The term data, however, is often used to mean any digital information. Within this definition of data, an entire program (that is, a list of instructions) may be considered as data if it is to be processed by another program. An example of this is the task of compiling a high-level language source program into a list of machine instructions constituting a machine language program, called the object program. The source program is the input data to the compiler program, which translates the source program into a machine language program.

Information handled by a computer must be encoded in a suitable format. Most present-day hardware employs digital circuits that have only two stable states, ON and OFF (see Appendix A). Each number, character, or instruction is encoded as a string of binary digits called bits, each having one of two possible values, 0 or 1.
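As a small illustration of encoding information as bits, the Python sketch below prints the bit strings for one number and one character. It is only a sketch: the 8-bit width, the sample values, and the helper name `to_bits` are assumptions made here, not conventions taken from the text.

```python
def to_bits(value: int, width: int) -> str:
    """Return `value` as a string of `width` binary digits (bits)."""
    if value < 0 or value >= 1 << width:
        raise ValueError("value does not fit in the given width")
    return format(value, f"0{width}b")


number_bits = to_bits(13, 8)        # the number thirteen
char_bits = to_bits(ord("A"), 8)    # the numeric code Python assigns to 'A'

print(number_bits)
print(char_bits)
```

Both results are strings made up only of the two symbols 0 and 1, which is all that a two-state circuit needs to hold.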
Numbers are usually represented in positional binary notation, as discussed in detail in Chapters 2 and 6. Occasionally, the binary-coded decimal (BCD) format is employed, in which each decimal digit is encoded by four bits.

Alphanumeric characters are also expressed in terms of binary codes. Several coding schemes have been developed. Two of the most widely used schemes are ASCII (American Standard Code for Information Interchange), in which each character is represented as a 7-bit code, and EBCDIC (Extended Binary Coded Decimal Interchange Code), in which eight bits are used to denote a character. A more detailed description of binary notation and coding schemes is given in Appendix E.

1.2.1 INPUT UNIT

Computers accept coded information through input units, which read the data. The most well-known input device is the keyboard. Whenever a key is pressed, the corresponding letter or digit is automatically translated into its corresponding binary code and transmitted over a cable to either the memory or the processor.

Many other kinds of input devices are available, including joysticks, trackballs, and mice. These are often used as graphic input devices in conjunction with displays. Microphones can be used to capture audio input, which is then sampled and converted into digital codes for storage and processing. Detailed discussion of input devices and their operation is found in Chapter 10.

1.2.2 MEMORY UNIT

The function of the memory unit is to store programs and data. There are two classes of storage, called primary and secondary.

Primary storage is a fast memory that operates at electronic speeds. Programs must be stored in the memory while they are being executed. The memory contains a large number of semiconductor storage cells, each capable of storing one bit of information. These cells are rarely read or written as individual cells but instead are processed in groups of fixed size called words.
The memory is organized so that the contents of one word, containing n bits, can be stored or retrieved in one basic operation.

To provide easy access to any word in the memory, a distinct address is associated with each word location. Addresses are numbers that identify successive locations. A given word is accessed by specifying its address and issuing a control command that starts the storage or retrieval process.

The number of bits in each word is often referred to as the word length of the computer. Typical word lengths range from 16 to 64 bits. The capacity of the memory is one factor that characterizes the size of a computer. Small machines typically have only a few tens of millions of words, whereas medium and large machines normally have many tens or hundreds of millions of words. Data are usually processed within a machine in units of words, multiples of words, or parts of words. When the memory is accessed, usually only one word of data is read or written.

Programs must reside in the memory during execution. Instructions and data can be written into the memory or read out under the control of the processor. It is essential to be able to access any word location in the memory as quickly as possible. Memory in which any location can be reached in a short and fixed amount of time after specifying its address is called random-access memory (RAM). The time required to access one word is called the memory access time. This time is fixed, independent of the location of the word being accessed. It typically ranges from a few nanoseconds (ns) to about 100 ns for modern RAM units. The memory of a computer is normally implemented as a memory hierarchy of three or four levels of semiconductor RAM units with different speeds and sizes. The small, fast RAM units are called caches. They are tightly coupled with the processor and are often contained on the same integrated circuit chip to achieve high performance. The largest and slowest unit is referred to as the main memory. We will give a brief description of how information is accessed in the memory hierarchy later in the chapter.
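The organization of the memory into addressed words, described above, can be sketched in a few lines of Python. This is only an illustrative model: the class name `WordMemory`, the 16-bit word length, the 8-word capacity, and the choice to keep only the low-order bits of an oversized value are all assumptions made for the example.

```python
class WordMemory:
    """Toy model of a memory made of fixed-size words, one address per word."""

    def __init__(self, num_words: int, word_length: int) -> None:
        self.word_length = word_length          # n bits per word
        self.mask = (1 << word_length) - 1      # keeps stored values to n bits
        self.cells = [0] * num_words            # one entry per word location

    def write(self, address: int, value: int) -> None:
        """Store one word at `address`, replacing what was there."""
        self.cells[address] = value & self.mask

    def read(self, address: int) -> int:
        """Retrieve one word from `address`; the stored word is unchanged."""
        return self.cells[address]


memory = WordMemory(num_words=8, word_length=16)
memory.write(3, 0x1234)
memory.write(4, 0x1FFFF)    # wider than 16 bits, so only the low 16 bits are kept
print(hex(memory.read(3)), hex(memory.read(4)))
```

Each access names one address and moves one whole word, which mirrors the statement that a word is stored or retrieved in one basic operation.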
Chapter 5 discusses the operational and performance aspects of the computer memory in detail.

Although primary storage is essential, it tends to be expensive. Thus additional, cheaper, secondary storage is used when large amounts of data and many programs have to be stored, particularly for information that is accessed infrequently. A wide selection of secondary storage devices is available, including magnetic disks and tapes and optical disks (CD-ROMs). These devices are also described in Chapter 5.

1.2.3 ARITHMETIC AND LOGIC UNIT

Most computer operations are executed in the arithmetic and logic unit (ALU) of the processor. Consider a typical example: Suppose two numbers located in the memory are to be added. They are brought into the processor, and the actual addition is carried out by the ALU. The sum may then be stored in the memory or retained in the processor for immediate use.

Any other arithmetic or logic operation, for example, multiplication, division, or comparison of numbers, is initiated by bringing the required operands into the processor, where the operation is performed by the ALU. When operands are brought into the processor, they are stored in high-speed storage elements called registers. Each register can store one word of data. Access times to registers are somewhat faster than access times to the fastest cache unit in the memory hierarchy.

The control and the arithmetic and logic units are many times faster than other devices connected to a computer system. This enables a single processor to control a number of external devices such as keyboards, displays, magnetic and optical disks, sensors, and mechanical controllers.

1.2.4 OUTPUT UNIT

The output unit is the counterpart of the input unit. Its function is to send processed results to the outside world. The most familiar example of such a device is a printer.
Printers employ mechanical impact heads, ink jet streams, or photocopying techniques, as in laser printers, to perform the printing. It is possible to produce printers capable of printing as many as 1000 lines per minute. This is a tremendous speed for a mechanical device but is still very slow compared to the electronic speed of a processor unit.

Some units, such as graphic displays, provide both an output function and an input function. The dual role of such units is the reason for using the single name I/O unit in many cases.

1.2.5 CONTROL UNIT

The memory, arithmetic and logic, and input and output units store and process information and perform input and output operations. The operation of these units must be coordinated in some way. This is the task of the control unit. The control unit is effectively the nerve center that sends control signals to other units and senses their states.

I/O transfers, consisting of input and output operations, are controlled by the instructions of I/O programs that identify the devices involved and the information to be transferred. However, the actual timing signals that govern the transfers are generated by the control circuits. Timing signals are signals that determine when a given action is to take place. Data transfers between the processor and the memory are also controlled by the control unit through timing signals. It is reasonable to think of a control unit as a well-defined, physically separate unit that interacts with other parts of the machine. In practice, however, this is seldom the case. Much of the control circuitry is physically distributed throughout the machine. A large set of control lines (wires) carries the signals used for timing and synchronization of events in all units.

The operation of a computer can be summarized as follows:

• The computer accepts information in the form of programs and data through an input unit and stores it in the memory.
• Information stored in the memory is fetched, under program control, into an arithmetic and logic unit, where it is processed.
• Processed information leaves the computer through an output unit.
• All activities inside the machine are directed by the control unit.

1.3 BASIC OPERATIONAL CONCEPTS

In Section 1.2, we stated that the activity in a computer is governed by instructions. To perform a given task, an appropriate program consisting of a list of instructions is stored in the memory. Individual instructions are brought from the memory into the processor, which executes the specified operations. Data to be used as operands are also stored in the memory.

A typical instruction may be

    Add LOCA,R0

This instruction adds the operand at memory location LOCA to the operand in a register in the processor, R0, and places the sum into register R0. The original contents of location LOCA are preserved, whereas those of R0 are overwritten. This instruction requires the performance of several steps. First, the instruction is fetched from the memory into the processor. Next, the operand at LOCA is fetched and added to the contents of R0. Finally, the resulting sum is stored in register R0.

The preceding Add instruction combines a memory access operation with an ALU operation. In many modern computers, these two types of operations are performed by separate instructions for performance reasons that are explained in Chapter 8. The effect of the above instruction can be realized by the two-instruction sequence

    Load LOCA,R1
    Add R1,R0

The first of these instructions transfers the contents of memory location LOCA into processor register R1, and the second instruction adds the contents of registers R1 and R0 and places the sum into R0. Note that this destroys the former contents of register R1 as well as those of R0, whereas the original contents of memory location LOCA are preserved.

Transfers between the memory and the processor are started by sending the address of the memory location to be accessed to the memory unit and issuing the appropriate control signals. The data are then transferred to or from the memory.
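The effect of these instructions on the memory and the registers can be traced with a short Python sketch. The dictionaries, the function names, and the starting values (LOCA holding 25, R0 holding 10) are assumptions made for illustration; the sketch models only the data movement, not instruction fetching or timing.

```python
def add_mem_reg(memory, registers, loc, reg):
    """Add loc,reg : reg <- [loc] + [reg]; the memory location is not changed."""
    registers[reg] = memory[loc] + registers[reg]


def load(memory, registers, loc, reg):
    """Load loc,reg : reg <- [loc]."""
    registers[reg] = memory[loc]


def add_reg_reg(registers, src, dst):
    """Add src,dst : dst <- [src] + [dst]."""
    registers[dst] = registers[src] + registers[dst]


# Single instruction: Add LOCA,R0
memory = {"LOCA": 25}
registers = {"R0": 10, "R1": 0}
add_mem_reg(memory, registers, "LOCA", "R0")
single_result = dict(registers)

# Two-instruction sequence: Load LOCA,R1 followed by Add R1,R0
memory = {"LOCA": 25}
registers = {"R0": 10, "R1": 0}
load(memory, registers, "LOCA", "R1")
add_reg_reg(registers, "R1", "R0")
sequence_result = dict(registers)

print(single_result, sequence_result, memory)
```

Both versions leave the same sum in R0 and leave LOCA untouched; only the two-instruction version also overwrites R1.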
Figure 1.2 shows how the memory and the processor can be connected. It also shows a few essential operational details of the processor that have not been discussed yet. The interconnection pattern for these components is not shown explicitly since here we discuss only their functional characteristics. A later chapter describes the details of the interconnection as part of processor design.

In addition to the ALU and the control circuitry, the processor contains a number of registers used for several different purposes. The instruction register (IR) holds the instruction that is currently being executed. Its output is available to the control circuits, which generate the timing signals that control the various processing elements involved in executing the instruction. The program counter (PC) is another specialized register. It keeps track of the execution of a program. It contains the memory address of the next instruction to be fetched and executed. During the execution of an instruction, the contents of the PC are updated to correspond to the address of the next instruction to be executed. It is customary to say that the PC points to the next instruction that is to be fetched from the memory. Besides the IR and PC, Figure 1.2 shows n general-purpose registers, R0 through Rn-1. Their roles are explained in Chapter 2.

Figure 1.2 Connections between the processor and the memory.

Finally, two registers facilitate communication with the memory. These are the memory address register (MAR) and the memory data register (MDR). The MAR holds the address of the location to be accessed. The MDR contains the data to be written into or read out of the addressed location.

Let us now consider some typical operating steps. Programs reside in the memory and usually get there through the input unit. Execution of the program starts when the PC is set to point to the first instruction of the program. The contents of the PC are transferred to the MAR and a Read control signal is sent to the memory.
After the time required to access the memory elapses, the addressed word (in this case, the first instruction of the program) is read out of the memory and loaded into the MDR. Next, the contents of the MDR are transferred to the IR. At this point, the instruction is ready to be decoded and executed.

If the instruction involves an operation to be performed by the ALU, it is necessary to obtain the required operands. If an operand resides in the memory (it could also be in a general-purpose register in the processor), it has to be fetched by sending its address to the MAR and initiating a Read cycle. When the operand has been read from the memory into the MDR, it is transferred from the MDR to the ALU. After one or more operands are fetched in this way, the ALU can perform the desired operation. If the result of this operation is to be stored in the memory, then the result is sent to the MDR. The address of the location where the result is to be stored is sent to the MAR, and a Write cycle is initiated. At some point during the execution of the current instruction, the contents of the PC are incremented so that the PC points to the next instruction to be executed. Thus, as soon as the execution of the current instruction is completed, a new instruction fetch may be started.

In addition to transferring data between the memory and the processor, the computer accepts data from input devices and sends data to output devices. Thus, some machine instructions with the ability to handle I/O transfers are provided.

Normal execution of programs may be preempted if some device requires urgent servicing. For example, a monitoring device in a computer-controlled industrial process may detect a dangerous condition. In order to deal with the situation immediately, the normal execution of the current program must be interrupted. To do this, the device raises an interrupt signal. An interrupt is a request from an I/O device for service by the processor. The processor provides the requested service by executing an appropriate interrupt-service routine.
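Looking back at the fetch and execute steps described earlier in this section, the register transfers involved can be sketched in Python as follows. The tuple encoding of instructions, the word-numbered addresses, the single register `r0`, and the two-instruction program are assumptions made for illustration; real hardware holds instructions as bit patterns and carries out these steps with control signals rather than function calls.

```python
class Processor:
    """Toy model of the PC, IR, MAR, and MDR interacting with a memory."""

    def __init__(self, memory):
        self.memory = memory
        self.pc = 0         # address of the next instruction
        self.ir = None      # instruction currently being executed
        self.mar = 0        # address sent to the memory
        self.mdr = None     # data read from or written to the memory
        self.r0 = 0         # one general-purpose register

    def read_cycle(self, address):
        self.mar = address
        self.mdr = self.memory[self.mar]

    def write_cycle(self, address, value):
        self.mar = address
        self.mdr = value
        self.memory[self.mar] = self.mdr

    def step(self):
        self.read_cycle(self.pc)    # PC -> MAR, Read, word arrives in MDR
        self.ir = self.mdr          # MDR -> IR
        self.pc += 1                # PC now points to the next instruction
        operation, address = self.ir
        if operation == "Add":      # R0 <- [address] + [R0]
            self.read_cycle(address)
            self.r0 = self.mdr + self.r0
        elif operation == "Store":  # address <- [R0]
            self.write_cycle(address, self.r0)


# Words 0 and 1 hold the program; words 2 and 3 hold data.
memory = [("Add", 2), ("Store", 3), 7, 0]
cpu = Processor(memory)
cpu.r0 = 5
cpu.step()
cpu.step()
print(cpu.r0, memory[3], cpu.pc)
```

Every access to the memory, whether for an instruction or an operand, goes through the MAR and MDR, which is the role the text assigns to those two registers.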
Because such diversions may alter the internal state of the processor, its state must be saved in memory locations before servicing the interrupt. Normally, the contents of the PC, the general registers, and some control information are stored in memory. When the interrupt-service routine is completed, the state of the processor is restored so that the interrupted program may continue.

The processor unit shown in Figure 1.2 is usually implemented on a single Very Large Scale Integrated (VLSI) chip, with at least one of the cache units of the memory hierarchy contained on the same chip.

1.4 BUS STRUCTURES

So far, we have discussed the functions of individual parts of a computer. To form an operational system, these parts must be connected in some organized way. There are many ways of doing this. We consider the simplest and most common of these here.

To achieve a reasonable speed of operation, a computer must be organized so that all its units can handle one full word of data at a given time. When a word of data is transferred between units, all its bits are transferred in parallel, that is, the bits are transferred simultaneously over many wires, or lines, one bit per line. A group of lines that serves as a connecting path for several devices is called a bus. In addition to the lines that carry the data, the bus must have lines for address and control purposes.

Figure 1.3 Single-bus structure.

The simplest way to interconnect functional units is to use a single bus, as shown in Figure 1.3. All units are connected to this bus. Because the bus can be used for only one transfer at a time, only two units can actively use the bus at any given time. Bus control lines are used to arbitrate multiple requests for use of the bus. The main virtue of the single-bus structure is its low cost and its flexibility for attaching peripheral devices. Systems that contain multiple buses achieve more concurrency in operations by allowing two or more transfers to be carried out at the same time.
This leads to better performance but at an increased cost.

The devices connected to a bus vary widely in their speed of operation. Some electromechanical devices, such as keyboards and printers, are relatively slow. Others, like magnetic or optical disks, are considerably faster. Memory and processor units operate at electronic speeds, making them the fastest parts of a computer. Because all these ...

Figure 2.4 2's-complement add and subtract operations.

... 2's-complemented. This operation is done in exactly the same manner for both positive and negative numbers.

We often need to represent a number in the 2's-complement system by using a number of bits that is larger than some given size. For a positive number, this is achieved by adding 0s to the left. For a negative number, the leftmost bit, which is the sign bit, is a 1, and a longer number with the same value is obtained by replicating the sign bit to the left as many times as desired. To see why this is correct, examine the mod 16 circle of Figure 2.3. Compare it to larger circles for the mod 32 or mod 64 cases. The representations for values -1, -2, etc., would be exactly the same, with 1s added to the left. In summary, to represent a signed number in 2's-complement form using a larger number of bits, repeat the sign bit as many times as needed to the left. This operation is called sign extension.

The simplicity of either adding or subtracting signed numbers in 2's-complement representation is the reason why this number representation is used in modern computers. It might seem that the 1's-complement representation would be just as good as the 2's-complement system. However, although complementation is easy, the result obtained after an addition operation is not always correct. The carry-out, c_n, cannot be ignored. If c_n = 0, the result obtained is correct. If c_n = 1, then a 1 must be added to the result to make it correct. The need for this correction cycle, which is conditional on the ...
The ned fr hs conection ecl, wich is conditional nth -—_—_—_—— 20 (pep TAT] LL sent by=0 for posve mbes y= fr megatve umber (@) A signed integer ———— ascit scit Ascit ASciL cance: crater charcty—charcter (b) Four charactors Figure 2.6 Examples of encode infomation n a 32bit word 2. MmioRY Loca TIONS AND ADDRESSES locations in the memory. This isthe assignment used in most modem computers, ands the one we will normally use in this book. The term bye-addressable memory is used for this assignment. Byte locations have addresses 0, 1,2,.... Tus ifthe word length ofthe machine i 32 bits, successive words are located at adresses 4,8, ..., with cach word consisting of four bytes. 2.2.2 BIG-ENDIAN AND LITTLE-ENDIAN ASSIGNMENTS, ‘There are two Ways tat byte addresses can be assigned across words, as shown in Figure 2.7. Thename big-endin is used when lowerbyte adresses ar used forthe more significant byes (the leftmost bytes) ofthe word. The name litle-endian is used forthe ‘pporite ordering, where te lower byte adresses are used for theless significant bytes (the rightmost bytes ofthe word. The words “more significant” and “Iss significant” sreusedin elation othe weigh (powers of) assigned bis whenthe word represents number, as described in Section 2.1.1. Both itle-endian an bigendian assignments sre used in commercial machines. In both cases, byte addresses 0,4, 8. are taken ss the adresses of successive word in the memory and are the adresses used when specifying memory read and write operations fr words In addition to specifying the adress ordering of bytes within a word, itis also noceasary to specify the labeling of bits within a byt or a word. The most common convention, and the one we will use in this book, is shown in Figure 26a. It isthe (@) Bigendan assignment Fgure 2.7 fe ond word addesing WAPTER 2+ Maca TaUCTONS AND PROGRAMS ‘most natural ordering forthe encoding of mumercal data. The same ordering is also ‘wed for labeling bits within a byte, tat is, by Bg... 
b0, from left to right. There are computers, however, that use the reverse ordering.

2.2.3 WORD ALIGNMENT

In the case of a 32-bit word length, natural word boundaries occur at addresses 0, 4, 8, ..., as shown in Figure 2.7. We say that the word locations have aligned addresses. In general, words are said to be aligned in memory if they begin at a byte address that is a multiple of the number of bytes in a word. For practical reasons associated with manipulating binary-coded addresses, the number of bytes in a word is a power of 2. Hence, if the word length is 16 (2 bytes), aligned words begin at byte addresses 0, 2, 4, ..., and for a word length of 64 (2^3 bytes), aligned words begin at byte addresses 0, 8, 16, ....

There is no fundamental reason why words cannot begin at an arbitrary byte address. In that case, words are said to have unaligned addresses. While the most common case is to use aligned addresses, some computers allow the use of unaligned word addresses.

2.2.4 ACCESSING NUMBERS, CHARACTERS, AND CHARACTER STRINGS

A number usually occupies one word. It can be accessed in the memory by specifying its word address. Similarly, individual characters can be accessed by their byte address.

In many applications, it is necessary to handle character strings of variable length. The beginning of the string is indicated by giving the address of the byte containing its first character. Successive byte locations contain successive characters of the string. There are two ways to indicate the length of the string. A special control character with the meaning "end of string" can be used as the last character in the string, or a separate memory word location or processor register can contain a number indicating the length of the string in bytes.

2.3 MEMORY OPERATIONS

Both program instructions and data operands are stored in the memory. To execute an instruction, the processor control circuits must cause the word (or words) containing the instruction to be transferred from the memory to the processor.
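Returning briefly to the byte-ordering and alignment rules of Sections 2.2.2 and 2.2.3, both can be checked with a short Python sketch. It relies on Python's standard `int.to_bytes` method; the sample value 0x12345678 and the helper name `is_aligned` are assumptions chosen for illustration.

```python
WORD_BYTES = 4                  # 32-bit word length
value = 0x12345678

# Each result lists the bytes in order of increasing byte address within the word.
big_endian = value.to_bytes(WORD_BYTES, byteorder="big")
little_endian = value.to_bytes(WORD_BYTES, byteorder="little")


def is_aligned(byte_address: int, word_bytes: int = WORD_BYTES) -> bool:
    """A word is aligned if its byte address is a multiple of the word size."""
    return byte_address % word_bytes == 0


print(big_endian.hex(), little_endian.hex())
print([address for address in range(12) if is_aligned(address)])
```

In the big-endian result the most significant byte (0x12) sits at the lowest address, and in the little-endian result the least significant byte (0x78) does.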
Operands and results must also be moved between the memory and the processor. Thus, two basic operations involving the memory are needed, namely, Load (or Read or Fetch) and Store (or Write).

The Load operation transfers a copy of the contents of a specific memory location to the processor. The memory contents remain unchanged. To start a Load operation, the processor sends the address of the desired location to the memory and requests that its contents be read. The memory reads the data stored at that address and sends them to the processor.

The Store operation transfers an item of information from the processor to a specific memory location, destroying the former contents of that location. The processor sends the address of the desired location to the memory, together with the data to be written into that location.

An information item of either one word or one byte can be transferred between the processor and the memory in a single operation. As described in Chapter 1, the processor contains a small number of registers, each capable of holding a word. These registers are either the source or the destination of a transfer to or from the memory. When a byte is transferred, it is usually located in the low-order (rightmost) byte position of the register.

The details of the hardware implementation of these operations are treated in Chapters 5 and 7. In this chapter, we are taking the ISA viewpoint, so we concentrate on the logical handling of instructions and operands. Specific hardware components, such as processor registers, are discussed only to the extent necessary to understand the execution of machine instructions and programs.

2.4 INSTRUCTIONS AND INSTRUCTION SEQUENCING

The tasks carried out by a computer program consist of a sequence of small steps, such as adding two numbers, testing for a particular condition, reading a character from the keyboard, or sending a character to be displayed on a display screen. A computer must
A computer must ‘hve instructions capable of performing fou types of operations: + Data transfers between the memory and the processor registers ‘+ Arithmetic and logic operations on data ‘+ Program sequencing and contol + VO twansfers ‘We begin by discussing the fist two types of instructions. To facilitate the discussion, ‘we need some notation which We preset fs. 24.1 REGISTER TRANSFER NOTATION ‘We ned to describe the anf of information from one location inthe compute to ssber. Possible locaton tht may be involved in sch transfers are memory locaton, processor registers, or regis inthe VO subsystem. Most of the ime, we identify a Jocaon by symbolic name standing for its ardware binary adres. Fr example, IR 2 + MACiNGE STRUCTION AND PROGRAMS ‘names forthe adresses of memory locations maybe LOC, PLACE, A, VAR2; processor register names may be RO, RS; and VO register names may be DATAIN, OUTSTATUS, and soon. The contents of a location ae denoted by placing square brackets around the name ofthe locaton. Thus, the expression RI © [LOC] means hat th cones of memory location LOC are tansfeed int proceso regis- terR, ‘As nother example, consider te operation hat ad th content of eit RI snd 2, and then places thi sum into register R3. Tis actions indicated 3 «RII + R2] ‘This typeof notation is known as Register Transfer Notation (RIN). Note thatthe right-hand side ofan RTN expression alvays denotes a value, and the left-hand sie is ‘the name ofa location where the value isto be placed, overwriting the old contents of that location, 24.2 ASSEMBLY LANGUAGE NOTATION ‘We need another type of souton to represent machine instructions and programs. For this, we use an assembly language format. 
For example, an instruction that causes the transfer described above, from memory location LOC to processor register R1, is specified by the statement

    Move LOC,R1

The contents of LOC are unchanged by the execution of this instruction, but the old contents of register R1 are overwritten.

The second example of adding two numbers contained in processor registers R1 and R2 and placing their sum in R3 can be specified by the assembly language statement

    Add R1,R2,R3

2.4.3 BASIC INSTRUCTION TYPES

The operation of adding two numbers is a fundamental capability in any computer. The statement

    C = A + B

in a high-level language program is a command to the computer to add the current values of the two variables called A and B, and to assign the sum to a third variable, C. When the program containing this statement is compiled, the three variables, A, B, and C, are assigned to distinct locations in the memory. We will use the variable names to refer to the corresponding memory location addresses. The contents of these locations represent the values of the three variables. Hence, the above high-level language statement requires the action

    C ← [A] + [B]

to take place in the computer. To carry out this action, the contents of memory locations A and B are fetched from the memory and transferred into the processor, where their sum is computed. This result is then sent back to the memory and stored in location C.

Let us first assume that this action is to be accomplished by a single machine instruction. Furthermore, assume that this instruction contains the memory addresses of the three operands: A, B, and C. This three-address instruction can be represented symbolically as

    Add A,B,C

Operands A and B are called the source operands, C is called the destination operand, and Add is the operation to be performed on the operands. A general instruction of this type has the format

    Operation Source1,Source2,Destination

If k bits are needed to specify the memory address of each operand, the encoded form of the above instruction must contain 3k bits for addressing purposes in addition to the bits needed to denote the Add operation. For a modern processor with a 32-bit address space, a 3-address instruction is too large to fit in one word for a reasonable word length. Thus, a format that allows multiple words to be used for a single instruction would be needed to represent an instruction of this type.

An alternative approach is to use a sequence of simpler instructions to perform the same task, with each instruction having only one or two operands. Suppose that two-address instructions of the form

    Operation Source,Destination

are available. An Add instruction of this type is

    Add A,B

which performs the operation B ← [A] + [B]. When the sum is calculated, the result is sent to the memory and stored in location B, replacing the original contents of this location. This means that operand B is both a source and a destination.

A single two-address instruction cannot be used to solve our original problem, which is to add the contents of locations A and B, without destroying either of them, and to place the sum in location C. The problem can be solved by using another two-address instruction that copies the contents of one memory location into another. Such an instruction is

    Move B,C

which performs the operation C ← [B], leaving the contents of location B unchanged. The word "Move" is a misnomer here; it should be "Copy." However, this instruction name is deeply entrenched in computer nomenclature. The operation C ← [A] + [B] can now be performed by the two-instruction sequence

    Move B,C
    Add A,C

In all the instructions given above, the source operands are specified first, followed by the destination. This order is used in the assembly language expressions for machine instructions in many computers.
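The three-address instruction and the two-instruction sequence given above can be compared with a small Python sketch. The dictionary standing in for the memory, the function names, and the sample values are assumptions made for illustration only.

```python
def add3(memory, source1, source2, destination):
    """Add source1,source2,destination : destination <- [source1] + [source2]."""
    memory[destination] = memory[source1] + memory[source2]


def add2(memory, source, destination):
    """Add source,destination : destination <- [source] + [destination]."""
    memory[destination] = memory[source] + memory[destination]


def move(memory, source, destination):
    """Move source,destination : destination <- [source]; the source is kept."""
    memory[destination] = memory[source]


three_address = {"A": 4, "B": 9, "C": 0}
add3(three_address, "A", "B", "C")      # Add A,B,C

two_address = {"A": 4, "B": 9, "C": 0}
move(two_address, "B", "C")             # Move B,C
add2(two_address, "A", "C")             # Add A,C

print(three_address, two_address)
```

Both approaches leave A and B unchanged and place the sum in C. The difference lies in the encoding: with k = 32, the three-address form needs 3k = 96 address bits, while each two-address instruction needs 2k = 64.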
There are also many computers in which the order of the source and destination operands is reversed. We will see examples of both orderings in Chapter 3. It is unfortunate that no single convention has been adopted by all manufacturers. In fact, even for a particular computer, its assembly language may use a different order for different instructions. In this chapter, we will continue to give the source operands first.

We have defined three- and two-address instructions. But even two-address instructions will not normally fit into one word for usual word lengths and address sizes. Another possibility is to have machine instructions that specify only one memory operand. When a second operand is needed, as in the case of an Add instruction, it is understood implicitly to be in a unique location. A processor register, usually called the accumulator, may be used for this purpose. Thus, the one-address instruction

    Add A

means the following: Add the contents of memory location A to the contents of the accumulator register and place the sum back into the accumulator. Let us also introduce the one-address instructions

    Load A

and

    Store A

The Load instruction copies the contents of memory location A into the accumulator, and the Store instruction copies the contents of the accumulator into memory location A. Using only one-address instructions, the operation C ← [A] + [B] can be performed by executing the sequence of instructions

    Load A
    Add B
    Store C

Note that the operand specified in the instruction may be a source or a destination, depending on the instruction. In the Load instruction, address A specifies the source operand, and the destination location, the accumulator, is implied. On the other hand, C denotes the destination location in the Store instruction, whereas the source, the accumulator, is implied.

Some early computers were designed around a single accumulator structure. Most modern computers have a number of general-purpose processor registers, typically 8 to 32, and even considerably more in some cases.
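The Load, Add, Store sequence for a single-accumulator machine can be sketched in the same spirit. The class name `AccumulatorMachine`, its method names, and the dictionary used for memory are hypothetical choices for this illustration only.

```python
# Sketch of a single-accumulator machine with one-address instructions.
# Class name, method names, and the dictionary memory are assumptions.

class AccumulatorMachine:
    def __init__(self, memory):
        self.memory = dict(memory)  # symbolic address -> value
        self.acc = 0                # the implied second operand

    def load(self, address):
        """Load A: accumulator <- [A]."""
        self.acc = self.memory[address]

    def add(self, address):
        """Add A: accumulator <- [A] + [accumulator]."""
        self.acc += self.memory[address]

    def store(self, address):
        """Store A: A <- [accumulator]."""
        self.memory[address] = self.acc


def run_example(a_value, b_value):
    """Perform C <- [A] + [B] with Load A ; Add B ; Store C."""
    machine = AccumulatorMachine({"A": a_value, "B": b_value, "C": 0})
    machine.load("A")
    machine.add("B")
    machine.store("C")
    return machine.memory
```

Each method names only one memory address; the accumulator never appears as an argument, which mirrors how it is implied rather than encoded in a one-address instruction.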
Access to data in processor registers is much faster than to data stored in memory locations because the registers are inside the processor. Because the number of registers is relatively small, only a few bits are needed to specify which register takes part in an operation. For example, for 32 registers, only 5 bits are needed. This is much less than the number of bits needed to give the address of a location in the memory. Because the use of registers allows faster processing and results in shorter instructions, registers are used to store data temporarily in the processor during processing.

Let Ri represent a general-purpose register. The instructions

    Load A,Ri
    Store Ri,A

and

    Add A,Ri

are generalizations of the Load, Store, and Add instructions for the single-accumulator case, in which register Ri performs the function of the accumulator. Even in these cases, when only one memory address is directly specified in an instruction, the instruction may not fit into one word.

When a processor has several general-purpose registers, many instructions involve only operands that are in the registers. In fact, in many modern processors, computations can be performed directly only on data held in processor registers. Instructions such as

    Add Ri,Rj

or

    Add Ri,Rj,Rk

are of this type. In both of these instructions, the source operands are the contents of registers Ri and Rj. In the first instruction, Rj also serves as the destination register, whereas in the second instruction, a third register, Rk, is used as the destination. Such instructions, where only register names are contained in the instruction, will normally fit into one word.

It is often necessary to transfer data between different locations. This is achieved with the instruction

    Move Source,Destination

which places a copy of the contents of Source into Destination. When data are moved to or from a processor register, the Move instruction can be used rather than the Load or Store instructions because the order of the source and destination operands determines which operation is intended.
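The field widths quoted in this section can be checked with a little arithmetic. The sketch below is an illustration under stated assumptions: the helper names `bits_needed` and `operand_field_bits` are invented here, and the calculation covers only the operand fields, not the bits that encode the operation itself.

```python
# Worked check of operand-field sizes. Helper names are hypothetical;
# opcode bits are deliberately left out of the totals.

def bits_needed(count):
    """Smallest number of bits that can distinguish `count` locations."""
    if count < 1:
        raise ValueError("count must be at least 1")
    return (count - 1).bit_length()


def operand_field_bits(operand_count, bits_per_operand):
    """Total bits spent on operand fields in one instruction."""
    return operand_count * bits_per_operand


register_bits = bits_needed(32)        # 32 registers
address_bits = bits_needed(2 ** 32)    # 32-bit address space

three_register_operands = operand_field_bits(3, register_bits)
three_memory_operands = operand_field_bits(3, address_bits)
```

With these values, `register_bits` is 5 and `address_bits` is 32, so three register operands take 15 bits while three memory addresses take 96 bits. That difference is why an instruction such as Add Ri,Rj,Rk fits in a 32-bit word and a three-address memory instruction does not.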
Thus,

    Move A,Ri

is the same as

    Load A,Ri

and

    Move Ri,A

is the same as

    Store Ri,A

In this chapter, we will use Move instead of Load or Store.

In processors where arithmetic operations are allowed only on operands that are in processor registers, the C = A + B task can be performed by the instruction sequence

    Move A,Ri
    Move B,Rj
    Add Ri,Rj
    Move Rj,C

In processors where one operand may be in the memory but the other must be in a register, an instruction sequence for the required task would be

    Move A,Ri
    Add B,Ri
    Move Ri,C

The speed with which a given task is carried out depends on the time it takes to transfer instructions from memory into the processor and to access the operands referenced by these instructions. Transfers that involve the memory are much slower than transfers within the processor. Hence, a substantial increase in speed is achieved when several operations are performed in succession on data in processor registers without the need to copy data to or from the memory. When machine language programs are generated by compilers from high-level languages, it is important to minimize the frequency with which data is moved back and forth between the memory and processor registers.

We have discussed three-, two-, and one-address instructions. It is also possible to use instructions in which the locations of all operands are defined implicitly. Such instructions are found in machines that store operands in a structure called a pushdown stack. In this case, the instructions are called zero-address instructions. The concept of a pushdown stack is introduced in Section 2.8, and a computer that uses this approach is discussed in Chapter 11.

2.4.4 INSTRUCTION EXECUTION AND STRAIGHT-LINE SEQUENCING

In the preceding discussion of instruction formats, we used the task C ← [A] + [B] for illustration. Figure 2.8 shows a possible program segment for this task as it appears in the memory of a computer. We have assumed that the computer allows one memory operand per instruction and has a number of processor registers.
We assume that the word length is 32 bits and the memory is byte addressable. The three instructions of the program are in successive word locations, starting at location i. Since each instruction is 4 bytes long, the second and third instructions start at addresses i + 4 and i + 8. For simplicity, we also assume that a full memory address can be directly specified in a single-word instruction, although this is not usually possible for address space sizes and word lengths of current processors.

    Address    Contents
    i          Move A,R0      (begin execution here; 3-instruction program segment)
    i + 4      Add B,R0
    i + 8      Move R0,C
    ...
    A                         (data for the program)
    B
    C

Figure 2.8 A program for C ← [A] + [B].

Let us consider how this program is executed. The processor contains a register called the program counter (PC), which holds the address of the instruction to be executed next. To begin executing a program, the address of its first instruction (i in our example) must be placed into the PC. Then, the processor control circuits use the information in the PC to fetch and execute instructions, one at a time, in the order of increasing addresses. This is called straight-line sequencing. During the execution of each instruction, the PC is incremented by 4 to point to the next instruction. Thus, after the Move instruction at location i + 8 is executed, the PC contains the value i + 12, which is the address of the first instruction of the next program segment.

Executing a given instruction is a two-phase procedure. In the first phase, called instruction fetch, the instruction is fetched from the memory location whose address is in the PC. This instruction is placed in the instruction register (IR) in the processor. At the start of the second phase, called instruction execute, the instruction in IR is examined to determine which operation is to be performed. The specified operation is then performed by the processor. This often involves fetching operands from the memory or from processor registers, performing an arithmetic or logic operation, and storing the result in the destination location.
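The fetch and execute phases for the three-instruction program of Figure 2.8 can be sketched as a short loop. This is an illustration only: the function names, the tuple encoding of instructions, the use of symbolic names for data locations, and the starting address 1000 standing in for i are all assumptions, not features of an actual processor.

```python
# Sketch of straight-line sequencing with a program counter (PC) and an
# instruction register (IR). Instruction encoding, names, and the start
# address are assumptions made for illustration.

def is_register(name):
    """Treat names such as R0 or R12 as processor registers."""
    return name.startswith("R") and name[1:].isdigit()


def run_straight_line(start, program, data):
    """Fetch and execute 4-byte instructions in order of increasing address."""
    memory = dict(data)
    registers = {}
    pc = start
    while pc in program:
        ir = program[pc]  # fetch phase: instruction goes into IR
        pc += 4           # PC now points to the next instruction
        operation, source, destination = ir  # execute phase begins
        value = registers[source] if is_register(source) else memory[source]
        if operation == "Add":
            if is_register(destination):
                value += registers[destination]
            else:
                value += memory[destination]
        elif operation != "Move":
            raise ValueError("unknown operation: " + operation)
        if is_register(destination):
            registers[destination] = value
        else:
            memory[destination] = value
    return pc, registers, memory


I = 1000  # assumed value for the symbolic address i
PROGRAM = {
    I: ("Move", "A", "R0"),
    I + 4: ("Add", "B", "R0"),
    I + 8: ("Move", "R0", "C"),
}
final_pc, final_registers, final_memory = run_straight_line(
    I, PROGRAM, {"A": 10, "B": 32, "C": 0}
)
```

After the run, `final_pc` is 1012, that is, i + 12, and location C holds 42. The sketch stops when the PC leaves the program, whereas a real processor would simply continue with whatever instruction follows.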
At some point during this two-phase procedure, the contents of the PC are advanced to point to the next instruction. When the execute phase of an instruction is completed, the PC contains the address of the next instruction, and a new instruction fetch phase can begin. In most processors, the execute phase itself is divided into a small number of distinct phases corresponding to fetching operands, performing the operation, and storing the result.

2.4.5 BRANCHING

Consider the task of adding a list of n numbers. The program outlined in Figure 2.9 is a generalization of the program in Figure 2.8. The addresses of the memory locations containing the n numbers are symbolically given as NUM1, NUM2, ..., NUMn, and a separate Add instruction is used to add each number to the contents of register R0. After all the numbers have been added, the result is placed in memory location SUM.

Instead of using a long list of Add instructions, it is possible to place a single Add instruction in a program loop, as shown in Figure 2.10. The loop is a straight-line sequence of instructions executed as many times as needed. It starts at location LOOP and ends at the instruction Branch>0. During each pass through this loop, the address of the next list entry is determined, and that entry is fetched and added to R0. The address of an operand can be specified in various ways, as will be described in Section 2.5. For now, we concentrate on how to create and control a program loop.

    i             Move NUM1,R0
    i + 4         Add NUM2,R0
    i + 8         Add NUM3,R0
    ...
    i + 4n - 4    Add NUMn,R0
    i + 4n        Move R0,SUM
    ...
    SUM
    NUM1
    NUM2
    ...
    NUMn

Figure 2.9 A straight-line program for adding n numbers.

                  Move N,R1
                  Clear R0
    LOOP          Determine address of "Next" number and add "Next" number to R0    (program loop)
                  Decrement R1
                  Branch>0 LOOP
                  Move R0,SUM
    ...
    SUM
    N             n
    NUM1
    NUM2
    ...
    NUMn

Figure 2.10 Using a loop to add n numbers.

Assume that the number of entries in the list, n, is stored in memory location N, as shown. Register R1 is used as a counter to determine the number of times the loop is executed.
Hence, the contents of location N are loaded into register R1 at the beginning of the program. Then, within the body of the loop, the instruction

    Decrement R1

reduces the contents of R1 by 1 each time through the loop. (A similar type of operation is performed by an Increment instruction, which adds 1 to its operand.) Execution of the loop is repeated as long as the result of the decrement operation is greater than zero.

We now introduce branch instructions. This type of instruction loads a new value into the program counter. As a result, the processor fetches and executes the instruction at this new address, called the branch target, instead of the instruction at the location that follows the branch instruction in sequential address order. A conditional branch instruction causes a branch only if a specified condition is satisfied. If the condition is not satisfied, the PC is incremented in the normal way, and the next instruction in sequential address order is fetched and executed.

In the program in Figure 2.10, the instruction

    Branch>0 LOOP

(branch if greater than 0) is a conditional branch instruction that causes a branch to location LOOP if the result of the immediately preceding instruction, which is the decremented value in register R1, is greater than zero. This means that the loop is repeated as long as there are entries in the list that are yet to be added to R0. At the end of the nth pass through the loop, the Decrement instruction produces a value of zero, and, hence, branching does not occur. Instead, the Move instruction is fetched and executed. It moves the final result from R0 into memory location SUM.

The capability to test conditions and subsequently choose one of a set of alternative ways to continue computation has many more applications than just loop control. Such a capability is found in the instruction sets of all computers and is fundamental to the programming of most nontrivial tasks.
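The loop of Figure 2.10 can be mirrored step by step in a short Python sketch. The function name `sum_list`, the `index` variable that stands in for "determine address of Next number", and the decision to reject an empty list are assumptions made for this illustration; the figure itself leaves the addressing step unspecified.

```python
# Sketch of the counted loop in Figure 2.10. Names and the empty-list
# check are assumptions; the operand-addressing step is simplified to
# a list index.

def sum_list(numbers):
    """Add n numbers using a counter in R1 and a Branch>0 test."""
    n = len(numbers)
    if n < 1:
        # The loop body always runs once before the test, so n = 0
        # would read a list entry that does not exist.
        raise ValueError("the list must contain at least one number")
    r1 = n      # Move N,R1
    r0 = 0      # Clear R0
    index = 0   # stands in for the address of the next number
    passes = 0
    while True:                 # LOOP
        r0 += numbers[index]    # add "Next" number to R0
        index += 1
        r1 -= 1                 # Decrement R1
        passes += 1
        if not r1 > 0:          # Branch>0 LOOP is not taken
            break
    return r0, passes           # Move R0,SUM
```

For the list `[3, 5, 7, 9]` the function returns the sum 24 after exactly 4 passes: the decrement on the fourth pass produces zero, so the branch falls through to the final Move.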
2.4.6 CONDITION CODES

The processor keeps track of information about the results of various operations for use by subsequent conditional branch instructions. This is accomplished by recording the required information in individual bits, often called condition code flags. These flags are usually grouped together in a special processor register called the condition code register or status register. Individual condition code flags are set to 1 or cleared to 0, depending on the outcome of the operation performed. Four commonly used flags are

    N (negative)    Set to 1 if the result is negative; otherwise, cleared to 0
    Z (zero)        Set to 1 if the result is 0; otherwise, cleared to 0
    V (overflow)    Set to 1 if arithmetic overflow occurs; otherwise, cleared to 0
    C (carry)       Set to 1 if a carry-out results from the operation; otherwise, cleared to 0

The N and Z flags indicate whether the result of an arithmetic or logic operation is negative or zero. The N and Z flags may also be affected by instructions that transfer data, such as Move, Load, or Store. This makes it possible for a later conditional branch instruction to cause a branch based on the sign and value of the operand that was moved. Some computers also provide a special Test instruction that examines
