Computer Organization and Design: The Hardware/Software Interface (English Edition, 5th Edition, Asian Edition)


By David A. Patterson (University of California, Berkeley, USA) and John L. Hennessy (Stanford University, USA)
ISBN 978-7-111-45316-1
139.00
704
January 21, 2014

Computers > Computer Organization and Architecture
Elsevier (Singapore) Pte Ltd
2289
English
16
Computer Organization and Design
Textbook
Classic Original Editions Series








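This book is a classic textbook on computer organization. It focuses on the most fundamental concepts in current computer design, shows the relationship between hardware and software, and gives a comprehensive introduction to the mainstream technologies and latest achievements in contemporary computer systems.
This best-selling classic on computer organization and design has been thoroughly revised to address the revolutionary change in computer architecture in the post-PC era (from uniprocessors to multicore microprocessors, from sequential to parallel), with emphasis on emerging mobile computing and cloud computing. To explore and highlight this major shift, much of the content has been updated to feature tablet computers, cloud architectures, and the ARM (mobile computing devices) and x86 (cloud computing) architectures.
Because a proper understanding of modern hardware is essential to achieving good performance and energy efficiency, this edition adds a new running example, "Going Faster," throughout the book to demonstrate highly effective optimization techniques. This edition also adds a new discussion of the "Eight Great Ideas" of computer architecture.
As in previous editions, the book uses the MIPS processor to present the fundamentals of computer hardware technology, assembly language, computer arithmetic, pipelining, memory hierarchies, and I/O.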
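Features
Includes new examples, exercises, and material focusing on emerging mobile computing and cloud computing.
Covers the revolutionary change from sequential to parallel computing, with one chapter devoted to parallel processors and sections in every chapter that highlight parallel hardware and software topics.
Uses the Intel Core i7, ARM Cortex-A8, and NVIDIA Fermi GPU as examples throughout the book.
Adds a new running example, "Going Faster," showing how a proper understanding of hardware can inspire software optimizations that improve performance by a factor of 200.
Discusses and highlights the "Eight Great Ideas" of computer architecture: Performance via Parallelism; Performance via Pipelining; Performance via Prediction; Design for Moore's Law; Hierarchy of Memories; Abstraction to Simplify Design; Make the Common Case Fast; Dependability via Redundancy.
Exercises have been comprehensively updated and improved.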
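About the Authors
David A. Patterson is a professor of computer science at the University of California, Berkeley, a member of the National Academy of Engineering and the National Academy of Sciences, and a Fellow of both the IEEE and the ACM. His teaching has been honored with the Distinguished Teaching Award from the University of California, the Karlstrom Award from the ACM, and the Mulligan Education Medal and Undergraduate Teaching Award from the IEEE. For his contributions to RISC he received the IEEE Technical Achievement Award and the ACM Eckert-Mauchly Award, and his contributions to RAID earned him the IEEE Johnson Information Storage Award. He also shared the IEEE John von Neumann Medal and the NEC C&C Prize with John L. Hennessy. Patterson is a Fellow of the American Academy of Arts and Sciences and of the Computer History Museum, and has been inducted into the Silicon Valley Engineering Hall of Fame. He served on the President's Information Technology Advisory Committee, as chair of the Computer Science Division in the Department of Electrical Engineering and Computer Science at Berkeley, as chair of the Computing Research Association (CRA), and as President of the ACM. This record led to Distinguished Service Awards from the ACM and the CRA.
John L. Hennessy is the tenth president of Stanford University, where he has taught in the departments of electrical engineering and computer science since 1977. Professor Hennessy is a Fellow of the IEEE and the ACM, a member of the National Academy of Engineering, the National Academy of Sciences, and the American Philosophical Society, and a Fellow of the American Academy of Arts and Sciences. His many awards include the 2001 Eckert-Mauchly Award for his contributions to RISC technology, the 2001 Seymour Cray Computer Engineering Award, and the 2000 IEEE John von Neumann Medal, which he shared with David Patterson. He has also received seven honorary doctorates.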
Contents
Preface v
About the Author xiii
CHAPTERS
1 Computer Abstractions and Technology 2
1.1 Introduction 3
1.2 Eight Great Ideas in Computer Architecture 11
1.3 Below Your Program 13
1.4 Under the Covers 16
1.5 Technologies for Building Processors and Memory 24
1.6 Performance 28
1.7 The Power Wall 40
1.8 The Sea Change: The Switch from Uniprocessors to Multiprocessors 43
1.9 Real Stuff: Benchmarking the Intel Core i7 46
1.10 Fallacies and Pitfalls 49
1.11 Concluding Remarks 52
1.12 Historical Perspective and Further Reading 54
1.13 Exercises 54
2 Instructions: Language of the Computer 60
2.1 Introduction 62
2.2 Operations of the Computer Hardware 63
2.3 Operands of the Computer Hardware 66
2.4 Signed and Unsigned Numbers 73
2.5 Representing Instructions in the Computer 80
2.6 Logical Operations 87
2.7 Instructions for Making Decisions 90
2.8 Supporting Procedures in Computer Hardware 96
2.9 MIPS Addressing for 32-Bit Immediates and Addresses 106
2.10 Parallelism and Instructions: Synchronization 116
2.11 Translating and Starting a Program 118
2.12 A C Sort Example to Put It All Together 126
2.13 Advanced Material: Compiling C 134
2.14 Real Stuff: ARMv7 (32-bit) Instructions 134
2.15 Real Stuff: x86 Instructions 138
2.16 Real Stuff: ARMv8 (64-bit) Instructions 147
2.17 Fallacies and Pitfalls 148
2.18 Concluding Remarks 150
2.19 Historical Perspective and Further Reading 152
2.20 Exercises 153
3 Arithmetic for Computers 164
3.1 Introduction 166
3.2 Addition and Subtraction 166
3.3 Multiplication 171
3.4 Division 177
3.5 Floating Point 184
3.6 Parallelism and Computer Arithmetic: Subword Parallelism 210
3.7 Real Stuff: Streaming SIMD Extensions and Advanced Vector Extensions in x86 212
3.8 Going Faster: Subword Parallelism and Matrix Multiply 213
3.9 Fallacies and Pitfalls 217
3.10 Concluding Remarks 220
3.11 Historical Perspective and Further Reading 224
3.12 Exercises 225
4 The Processor 230
4.1 Introduction 232
4.2 Logic Design Conventions 236
4.3 Building a Datapath 239
4.4 A Simple Implementation Scheme 247
4.5 An Overview of Pipelining 260
4.6 Pipelined Datapath and Control 274
4.7 Data Hazards: Forwarding versus Stalling 291
4.8 Control Hazards 304
4.9 Exceptions 313
4.10 Parallelism via Instructions 320
4.11 Real Stuff: The ARM Cortex-A8 and Intel Core i7 Pipelines 332
4.12 Going Faster: Instruction-Level Parallelism and Matrix Multiply 339
4.13 Advanced Topic: An Introduction to Digital Design Using a Hardware Design Language to Describe and Model a Pipeline and More Pipelining Illustrations 342
Section 2.13 in the Asian Edition corresponds to Section 2.15 online, and Section 2.19 can be found under Section 2.21 online. (Editor's note)
4.14 Fallacies and Pitfalls 343
4.15 Concluding Remarks 344
4.16 Historical Perspective and Further Reading 345
4.17 Exercises 345
5 Large and Fast: Exploiting Memory Hierarchy 360
5.1 Introduction 362
5.2 Memory Technologies 366
5.3 The Basics of Caches 371
5.4 Measuring and Improving Cache Performance 386
5.5 Dependable Memory Hierarchy 406
5.6 Virtual Machines 412
5.7 Virtual Memory 415
5.8 A Common Framework for Memory Hierarchy 442
5.9 Using a Finite-State Machine to Control a Simple Cache 449
5.10 Parallelism and Memory Hierarchies: Cache Coherence 454
5.11 Parallelism and Memory Hierarchy: Redundant Arrays of Inexpensive Disks 458
5.12 Advanced Material: Implementing Cache Controllers 458
5.13 Real Stuff: The ARM Cortex-A8 and Intel Core i7 Memory Hierarchies 459
5.14 Going Faster: Cache Blocking and Matrix Multiply 463
5.15 Fallacies and Pitfalls 466
5.16 Concluding Remarks 470
5.17 Historical Perspective and Further Reading 471
5.18 Exercises 471
6 Parallel Processors from Client to Cloud 488
6.1 Introduction 490
6.2 The Difficulty of Creating Parallel Processing Programs 492
6.3 SISD, MIMD, SIMD, SPMD, and Vector 497
6.4 Hardware Multithreading 504
6.5 Multicore and Other Shared Memory Multiprocessors 507
6.6 Introduction to Graphics Processing Units 512
6.7 Clusters, Warehouse Scale Computers, and Other Message-Passing Multiprocessors 519
6.8 Introduction to Multiprocessor Network Topologies 524
6.9 Communicating to the Outside World: Cluster Networking 527
6.10 Multiprocessor Benchmarks and Performance Models 528
6.11 Real Stuff: Benchmarking Intel Core i7 versus NVIDIA Tesla GPU 538
6.12 Going Faster: Multiple Processors and Matrix Multiply 543
6.13 Fallacies and Pitfalls 546
6.14 Concluding Remarks 548
6.15 Historical Perspective and Further Reading 551
6.16 Exercises 551
APPENDICES
A Assemblers, Linkers, and the SPIM Simulator A-2
A.1 Introduction A-3
A.2 Assemblers A-10
A.3 Linkers A-18
A.4 Loading A-19
A.5 Memory Usage A-20
A.6 Procedure Call Convention A-22
A.7 Exceptions and Interrupts A-33
A.8 Input and Output A-38
A.9 SPIM A-40
A.10 MIPS R2000 Assembly Language A-45
A.11 Concluding Remarks A-81
A.12 Exercises A-82
B TH-2 High Performance Computing System B-2
B.1 Introduction B-3
B.2 Compute Node B-3
B.3 The Frontend Processors B-5
B.4 The Interconnect B-6
B.5 The Software Stack B-7
B.6 LINPACK Benchmark Run (HPL) B-7
B.7 Concluding Remarks B-8
F Networks-on-Chip F-2
F.1 Introduction F-3
F.2 Communication Centric Design F-3
F.3 The Design Space Exploration of NoCs F-5
F.4 Router Micro-architecture F-8
F.5 Performance Metric F-9
F.6 Concluding Remarks F-9
Index I-1
Appendices A, B, and F appear in this printed edition. Appendices C, D, and E are online and can be downloaded from the publisher's web site.
ONLINE CONTENT
C Graphics and Computing GPUs C-2
C.1 Introduction C-3
C.2 GPU System Architectures C-7
C.3 Programming GPUs C-12
C.4 Multithreaded Multiprocessor Architecture C-25
C.5 Parallel Memory System C-36
C.6 Floating Point Arithmetic C-41
C.7 Real Stuff: The NVIDIA GeForce 8800 C-46
C.8 Real Stuff: Mapping Applications to GPUs C-55
C.9 Fallacies and Pitfalls C-72
C.10 Concluding Remarks C-76
C.11 Historical Perspective and Further Reading C-77
D Mapping Control to Hardware D-2
D.1 Introduction D-3
D.2 Implementing Combinational Control Units D-4
D.3 Implementing Finite-State Machine Control D-8
D.4 Implementing the Next-State Function with a Sequencer D-22
D.5 Translating a Microprogram to Hardware D-28
D.6 Concluding Remarks D-32
D.7 Exercises D-33
E A Survey of RISC Architectures for Desktop, Server, and Embedded Computers E-2
E.1 Introduction E-3
E.2 Addressing Modes and Instruction Formats E-5
E.3 Instructions: The MIPS Core Subset E-9
E.4 Instructions: Multimedia Extensions of the Desktop/Server RISCs E-16
E.5 Instructions: Digital Signal-Processing Extensions of the Embedded RISCs E-19
E.6 Instructions: Common Extensions to MIPS Core E-20
E.7 Instructions Unique to MIPS-64 E-25
E.8 Instructions Unique to Alpha E-27
E.9 Instructions Unique to SPARC v9 E-29
E.10 Instructions Unique to PowerPC E-32
E.11 Instructions Unique to PA-RISC 2.0 E-34
E.12 Instructions Unique to ARM E-36
E.13 Instructions Unique to Thumb E-38
E.14 Instructions Unique to SuperH E-39
E.15 Instructions Unique to M32R E-40
E.16 Instructions Unique to MIPS-16 E-40
E.17 Concluding Remarks E-43
Glossary G-1
Further Reading FR-1