CIS273
Course Goals
• Develop an understanding of principles and design tradeoffs in
modern computers
• Develop a basic understanding of computer performance
• Gain a detailed understanding of processor design for a given
instruction set architecture
• Gain an understanding of memory organization and memory
hierarchies
What to Expect from Midterm 2
• T-F / MCQ – 20 points
• Short Answer Questions – 30 points
• Pipelining – 35 points
– Performance
– Parallelism
– Hazards
• Scheduling – 15 points
Rules
• 1-page cheat sheet / front and back
• No electronic devices including calculators
• No slide printouts
• No collaboration – automatic "zero" if caught
• Be there on time; you won't be able to enter after 2 PM
• 50 minutes exactly, no extra time
Chapter 4 — The
Processor — 5
The Main Control Unit
• Control signals derived from instruction

Format      31:26     25:21  20:16  15:11  10:6   5:0
R-type      0         rs     rt     rd     shamt  funct
Load/Store  35 or 43  rs     rt     address (15:0)
Branch      4         rs     rt     address (15:0)

• How each field is used
– 31:26: opcode
– rs (25:21): always read
– rt (20:16): read, except for load
– rd (15:11) for R-type, rt (20:16) for load: write for R-type and load
– address (15:0): sign-extend and add
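The field layout above can be sketched as a small decoder. This is only an illustration, not part of the slides: the function name `decode` and the returned dictionary are made up here, and only the three formats shown on this slide (R-type, load/store, branch) are handled.

```python
def decode(word):
    """Split a 32-bit MIPS instruction word into the fields on the slide.

    Hypothetical helper for illustration; covers only R-type,
    load/store, and branch formats.
    """
    fields = {
        "opcode": (word >> 26) & 0x3F,  # bits 31:26
        "rs": (word >> 21) & 0x1F,      # bits 25:21, always read
        "rt": (word >> 16) & 0x1F,      # bits 20:16
    }
    if fields["opcode"] == 0:                    # R-type
        fields["rd"] = (word >> 11) & 0x1F       # bits 15:11
        fields["shamt"] = (word >> 6) & 0x1F     # bits 10:6
        fields["funct"] = word & 0x3F            # bits 5:0
    else:                                        # load/store/branch
        imm = word & 0xFFFF                      # bits 15:0
        # Sign-extend the 16-bit address field.
        fields["address"] = imm - 0x10000 if imm & 0x8000 else imm
    return fields


add_fields = decode(0x012A4020)  # add $t0, $t1, $t2
lw_fields = decode(0x8D280004)   # lw $t0, 4($t1)
```

For the `add` word the decoder returns opcode 0 with rs = 9, rt = 10, rd = 8; for the `lw` word it returns opcode 35 with rs = 9, rt = 8, address = 4.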
Chapter 2 —
Instructions: Language
of the Computer — 6
Translation and Startup
§2.12 Translating and Starting a Program
• Many compilers produce object modules directly
• Static linking
Producing an Object Module
• Assembler (or compiler) translates program into machine
instructions
• Provides information for building a complete program
from the pieces
– Header: describes contents of object module
– Text segment: translated instructions
– Static data segment: data allocated for the life of the program
– Relocation info: for contents that depend on absolute location
of loaded program
– Symbol table: global definitions and external refs
– Debug info: for associating with source code
Linking Object Modules
• Produces an executable image
1. Merge segments
2. Resolve labels (determine their addresses)
3. Patch location-dependent and external refs
• Could leave location dependencies for fixing by a relocating loader
– But with virtual memory, no need to do this
– Program can be loaded into absolute location in virtual memory space
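The three linking steps can be sketched with a toy model. Everything here is an assumption made for illustration: the module layout (`text`, `defs`, `relocs`), the `(op, operand)` instruction tuples, and the base address `0x00400000` are hypothetical, and real object formats carry far more detail.

```python
def link(modules, base=0x00400000, word=4):
    """Toy static linker over hypothetical module dictionaries.

    Each module has:
      text   - list of (op, operand) tuples, one per instruction word
      defs   - {label: word offset inside this module's text}
      relocs - list of (word offset, label) slots that need an address
    """
    image, symbols, pending = [], {}, []
    for m in modules:
        start = len(image)                   # where this module lands
        image.extend(m["text"])              # 1. merge segments
        for name, off in m["defs"].items():  # 2. resolve labels
            symbols[name] = base + (start + off) * word
        for off, name in m["relocs"]:        # remember slots to patch
            pending.append((start + off, name))
    for idx, name in pending:                # 3. patch references
        op, _ = image[idx]
        image[idx] = (op, symbols[name])     # KeyError if undefined
    return image, symbols


main_mod = {"text": [("jal", None), ("exit", 0)],
            "defs": {"main": 0}, "relocs": [(0, "square")]}
lib_mod = {"text": [("mul", 0), ("jr", 31)],
           "defs": {"square": 0}, "relocs": []}
image, symbols = link([main_mod, lib_mod])
```

Patching is deferred until every module has been merged, so a reference to a label defined in a later module still resolves.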
Loading a Program
• Load from image file on disk into memory
1. Read header to determine segment sizes
2. Create virtual address space
3. Copy text and initialized data into memory
• Or set page table entries so they can be faulted in
4. Set up arguments on stack
5. Initialize registers (including $sp, $fp, $gp)
6. Jump to startup routine
• Copies arguments to $a0, … and calls main
• When main returns, do exit syscall
Dynamic Linking
• Only link/load library procedure when it is called
– Requires procedure code to be relocatable
– Avoids image bloat caused by static linking of all (transitively) referenced
libraries
– Automatically picks up new library versions
Lazy Linkage
• Indirection table
• Stub: loads routine ID, jumps to linker/loader
• Linker/loader code
• Dynamically mapped code
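The interplay of stub, indirection table, and linker/loader can be sketched as follows. This is a loose analogy in Python, not how a real dynamic linker is written: the class `LazyLibrary`, its methods, and the `loads` counter are all hypothetical names introduced for illustration.

```python
class LazyLibrary:
    """Toy model of lazy linkage (hypothetical names throughout)."""

    def __init__(self, routines):
        self._routines = routines  # stands in for library code not yet mapped
        self.loads = 0             # how many times the linker/loader ran
        # Indirection table: every entry starts out pointing at a stub.
        self.table = {name: self._make_stub(name) for name in routines}

    def _make_stub(self, name):
        def stub(*args):
            # Stub: hand the routine ID to the linker/loader, then retry.
            self._link_and_load(name)
            return self.table[name](*args)
        return stub

    def _link_and_load(self, name):
        self.loads += 1
        # Patch the table so later calls skip the stub entirely.
        self.table[name] = self._routines[name]

    def call(self, name, *args):
        return self.table[name](*args)  # indirect jump through the table


lib = LazyLibrary({"square": lambda x: x * x})
first = lib.call("square", 3)   # goes through the stub, triggers a load
second = lib.call("square", 4)  # jumps straight to the mapped routine
```

Only the first call pays the linking cost; routines that are never called are never loaded.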
Starting Java Applications
• Java bytecode: simple portable instruction set for the JVM
• JVM: interprets bytecodes
• Just-in-time compiler: compiles bytecodes of "hot" methods into native code for host machine
Chapter 4 — The
Processor — 13
Pipelining Analogy
§4.6 An Overview of Pipelining
• Pipelined laundry: overlapping execution
– Parallelism improves performance
• Four loads:
– Speedup = 8/3.5 = 2.3
• Non-stop:
– Speedup = 2n/(0.5n + 1.5) ≈ 4 for large n (4 stages)
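The speedup formula can be checked numerically. The sketch below assumes 4 stages of 0.5 hours each, which is what the terms 2n and 0.5n + 1.5 imply; the function name `laundry_speedup` is made up for illustration.

```python
def laundry_speedup(n, stages=4, stage_hours=0.5):
    """Speedup of pipelined over sequential laundry for n loads.

    Assumes equal-length stages (0.5 h each, inferred from the formula).
    """
    sequential = n * stages * stage_hours        # 2n hours
    pipelined = (stages + n - 1) * stage_hours   # 0.5n + 1.5 hours
    return sequential / pipelined


four_loads = laundry_speedup(4)     # 8 / 3.5
many_loads = laundry_speedup(1000)  # approaches the stage count
```

With four loads the result is 8/3.5, about 2.3; as n grows the fill time of 1.5 hours matters less and the speedup approaches 4, the number of stages.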
MIPS Pipeline
• Five stages, one step per stage
1. IF: Instruction fetch from memory
2. ID: Instruction decode & register read
3. EX: Execute operation or calculate address
4. MEM: Access memory operand
5. WB: Write result back to register
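One way to picture the five stages is a cycle-by-cycle diagram in which each instruction starts one cycle after the previous one. The sketch below assumes an ideal pipeline with no hazards or stalls; `pipeline_diagram` is a hypothetical helper, not something from the slides.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def pipeline_diagram(instrs):
    """Return (name, cells) rows; cells[c] is the stage active in cycle c + 1.

    Assumes an ideal pipeline: one instruction issued per cycle, no stalls.
    """
    total_cycles = len(instrs) + len(STAGES) - 1
    rows = []
    for i, name in enumerate(instrs):
        cells = [""] * total_cycles
        for s, stage in enumerate(STAGES):
            cells[i + s] = stage  # instruction i enters IF in cycle i + 1
        rows.append((name, cells))
    return rows


rows = pipeline_diagram(["lw", "add", "sw"])
for name, cells in rows:
    print(f"{name:<5}" + "".join(f"{c:<4}" for c in cells))
```

Three instructions finish in 3 + 5 - 1 = 7 cycles, with each row shifted one column to the right of the row above it.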
Forwarding Paths
Most datapaths are capable of forwarding:
- EX to ID
- MEM to ID
Some datapaths are ALSO capable of forwarding:
- Output of EX to input of EX
Pipeline Performance
• Assume time for stages is
– 100ps for register read or write
– 200ps for other stages
• Compare pipelined datapath with single-cycle
datapath
Instr     Instr fetch  Register read  ALU op  Memory access  Register write  Total time
lw        200ps        100ps          200ps   200ps          100ps           800ps
sw        200ps        100ps          200ps   200ps          -               700ps
R-format  200ps        100ps          200ps   -              100ps           600ps
beq       200ps        100ps          200ps   -              -               500ps
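The comparison can be worked through from the stage times in the table. The sketch assumes the usual textbook model: the single-cycle clock must fit the slowest instruction, the pipelined clock must fit the slowest stage, and there are no stalls. The names `STAGE_PS`, `USES`, and `exec_time` are made up for illustration.

```python
# Stage latencies in picoseconds, taken from the table.
STAGE_PS = {"IF": 200, "ID": 100, "EX": 200, "MEM": 200, "WB": 100}

# Which stages each instruction class actually uses.
USES = {
    "lw": ["IF", "ID", "EX", "MEM", "WB"],
    "sw": ["IF", "ID", "EX", "MEM"],
    "R-format": ["IF", "ID", "EX", "WB"],
    "beq": ["IF", "ID", "EX"],
}

totals = {instr: sum(STAGE_PS[s] for s in stages)
          for instr, stages in USES.items()}

single_cycle_clock = max(totals.values())  # slowest instruction (lw)
pipelined_clock = max(STAGE_PS.values())   # slowest stage


def exec_time(n, pipelined):
    """Time in ps to run n instructions, assuming no hazards or stalls."""
    if pipelined:
        return (n + len(STAGE_PS) - 1) * pipelined_clock
    return n * single_cycle_clock
```

Under these assumptions the single-cycle clock is 800ps and the pipelined clock is 200ps. Three `lw` instructions take 3 × 800 = 2400ps single-cycle and (3 + 4) × 200 = 1400ps pipelined; the ratio approaches 800/200 = 4 only for long instruction sequences.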