Computer Organization Principles Review Summary (5) Central Processing Unit

Chapter 5 Central Processing Unit#

5.1 CPU Functions and Composition#

5.1.1 Functions of the CPU#

The central processing unit (CPU) is the component that controls the computer so that it fetches and executes instructions automatically; it is the core component of the computer. Its functions are as follows:

  • Instruction Control: A program is an ordered sequence of instructions; the CPU ensures that the machine executes them in the specified order.
  • Operation Control: For each instruction fetched from memory, the CPU generates the required operation signals and sends them to the corresponding components, so that those components act as the instruction requires.
  • Timing Control: The CPU controls when each operation takes place; every operation signal in the computer is issued under strict timing.
  • Data Processing: Performing arithmetic and logical operations on data. Completing data processing is the fundamental task of the CPU.

5.1.2 Basic Composition of the CPU (Key Point)#

(1) Central Processing Unit CPU = Arithmetic Logic Unit + Cache + Controller
(2) Arithmetic Logic Unit

  • Arithmetic Logic Unit (ALU)
  • General Registers: R0~R3
  • Data Buffer Register: DR
  • Status Word Register: PSW
  • Performs operations according to commands from the controller.
    • Arithmetic operations, logical operations

(3) Controller

  • Composition of the Controller
    • Program Counter (PC)
    • Instruction Register (IR)
    • Address Register (AR)
    • Instruction Decoder (ID)
    • Timing Generator
    • Operation Controller
  • Decision-Making Unit: Completes the coordination and command of the entire computer system's operations.
  • Main Functions:
    • Fetches an instruction from the instruction cache and indicates the position of the next instruction in the instruction cache.
    • Decodes or tests the instruction and generates corresponding operation control signals to initiate specified actions.
    • Directs and controls the flow of data between the CPU, data cache, and input/output devices.
      • Information flowing between memory and the controller — Instruction Flow
      • Information flowing between memory and the arithmetic unit — Data Flow

5.1.3 Major Registers in the CPU (Key Point)#

Data Buffer Register (DR)#

  • Used to temporarily store the results of ALU operations, or a data word read from the data memory, or a data word from an external interface.
  • Functions
    • Acts as a buffer in time for the transfer of information between ALU operation results and general registers.
    • Compensates for the speed difference between the CPU and memory or peripheral devices.

Instruction Register (IR)#

  • Used to store the currently executing instruction.
  • Instruction Decoder (ID)
    • The instruction temporarily stored in the instruction register can only be identified as a certain type of instruction after its opcode part is decoded.
    • The decoder analyzes and interprets the instruction, generating corresponding control signals.

Program Counter (PC)#

  • The Program Counter (PC) stores the address of the instruction currently being executed or the address of the next instruction to be executed.
  • If the program executes sequentially, the PC is incremented after each instruction fetch: PC ← PC + 1 (here 1 stands for one instruction; on a byte-addressed machine the increment is the instruction length).
  • If the program jumps, the PC is loaded with the target address, for example PC ← PC + offset for a relative jump.
  • It therefore acts as both a register and a counter.

Data Address Register (AR)#

  • Used to store the address of the memory unit currently accessed by the CPU. Due to the speed differences between memory and CPU, an address register must be used to hold address information until a read/write operation is completed.

General Registers (R0~R3)#

  • Used for transferring and temporarily storing data.
  • Can also participate in arithmetic logical operations and save the results.
  • Accumulator (AC)
    • A general-purpose register.
    • Provides a workspace for the ALU.
    • Temporarily stores the result information of ALU operations.

Status Word Register (PSW)#

  • Used to store various condition codes established by arithmetic and logical instruction operations or test results.
  • Examples: the Carry Flag (C) and the Zero Flag (Z). The PSW also holds interrupt and system working-state information.
  • It is a register composed of various status condition flags.
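As a rough illustration of how condition codes are established, the sketch below adds two 8-bit values and derives the Carry and Zero flags. The 8-bit width, the function name `add8_with_flags`, and the dictionary layout are assumptions made for this example, not part of any particular CPU.

```python
def add8_with_flags(a, b):
    """Add two 8-bit values; return (result, flags) like a tiny ALU would."""
    total = (a & 0xFF) + (b & 0xFF)   # full sum, may need a 9th bit
    result = total & 0xFF             # value that fits in the 8-bit register
    flags = {
        "C": 1 if total > 0xFF else 0,  # Carry: the sum overflowed 8 bits
        "Z": 1 if result == 0 else 0,   # Zero: the stored result is zero
    }
    return result, flags


result, psw = add8_with_flags(0xF0, 0x10)
print(hex(result), psw)  # 0x0 {'C': 1, 'Z': 1}
```

A conditional jump instruction would then test one of these flag bits instead of recomputing the comparison.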

5.1.4 Operation Controller and Timing Generator#

  • (1) Data Path: The path for transferring information between registers.
  • (2) Operation Controller: Generates the operation signals that establish the data path, so that the correct data path is selected and the relevant data is loaded into a register, thereby controlling the fetching and execution of instructions.
  • Depending on the design method, it can be divided into the sequential logic type and the storage logic type:
    • Hardwired Controller: Implemented using sequential logic.
    • Microprogram Controller: Implemented using storage logic.
  • (3) Timing Generator: Provides the timing signals that control when each operation signal is issued.
  • The CPU also contains other functional components, such as the interrupt system and the bus interface.

5.2 Instruction Cycle#

5.2.1 Basic Concept of Instruction Cycle (Key Point)#

Program Execution Process#

The sequence of program execution in a von Neumann architecture computer:

  1. Start from the program's starting address.
  2. Execute each instruction step by step and form the address of the next instruction to be executed.
  3. Automatically and continuously execute instructions until the last instruction of the program.

Instruction Execution Process#

  • Read the instruction.
    • The instruction address is sent to the main memory address register.
    • Read from main memory, and the content is sent to the specified register.
  • Analyze the instruction.
  • Execute the instruction according to the specified content of the instruction.
    • The number of operation steps and specific operation content for different instructions vary greatly.
  • Check for interrupt requests; if none, proceed to the execution of the next instruction.
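The fetch, analyze, execute steps above can be sketched as a small loop. The instruction set here (`LOAD`, `ADD`, `JMP`, `HALT`), the tuple encoding, and the variable names are hypothetical and chosen only to show how PC, AR, IR, and the accumulator interact; interrupt checking is not modeled.

```python
# Hypothetical toy instruction set: (opcode, operand) tuples stored in "main memory".
PROGRAM = [
    ("LOAD", 5),    # AC <- 5
    ("ADD", 7),     # AC <- AC + 7
    ("JMP", 1),     # PC <- PC + 1 (relative jump, skips the next instruction)
    ("ADD", 100),   # skipped by the jump
    ("HALT", 0),
]


def run(memory):
    pc = 0          # program counter
    ac = 0          # accumulator
    fetched = []    # addresses of the instructions actually fetched
    while True:
        ar = pc                  # instruction address goes to the address register
        ir = memory[ar]          # fetch: the memory word goes to the instruction register
        pc += 1                  # sequential case: PC <- PC + 1
        fetched.append(ar)
        opcode, operand = ir     # analyze (decode) the instruction
        if opcode == "LOAD":
            ac = operand
        elif opcode == "ADD":
            ac += operand
        elif opcode == "JMP":
            pc += operand        # jump case: PC <- PC + offset
        elif opcode == "HALT":
            return ac, fetched
        else:
            raise ValueError(f"unknown opcode: {opcode}")
        # A real CPU would check for interrupt requests at this point.


ac, fetched = run(PROGRAM)
print(ac, fetched)  # 12 [0, 1, 2, 4]
```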

Instruction Cycle#

Every time the CPU fetches and executes an instruction, it must complete a series of operations. The time required for this series of operations is usually called an instruction cycle.

Machine Cycle#

The machine cycle is also called the CPU cycle. It is usually defined by the shortest time to read an instruction word from memory. Instruction cycles are often expressed in terms of several CPU cycles.

Clock Cycle#

The time of one CPU cycle contains several clock cycles (commonly referred to as pulse cycles or T cycles, which are the most basic units of processing operations). The total of these clock cycles defines the time width of a CPU cycle.

  • Single Cycle, Multi-Cycle: A single-cycle instruction completes both fetch and execute within one CPU cycle. Most instructions require multiple CPU cycles to complete all operations of the instruction cycle.
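The relationship between the three time units can be made concrete with a little arithmetic. The sketch below assumes, for simplicity, that every CPU cycle contains the same number of clock cycles; the function name and the numbers are illustrative only.

```python
def instruction_time_ns(cpu_cycles, clocks_per_cpu_cycle, clock_period_ns):
    """Instruction cycle = CPU cycles x clock cycles per CPU cycle x clock period."""
    return cpu_cycles * clocks_per_cpu_cycle * clock_period_ns


# Assumed example: an instruction with 3 CPU cycles (1 fetch + 2 execute),
# 4 clock cycles (T1..T4) per CPU cycle, and a 10 ns clock period.
print(instruction_time_ns(3, 4, 10))  # 120
```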

5.2.2 Instruction Cycle of the MOV Instruction#

  • In computer design, a block diagram language can be used to represent the instruction cycle of an instruction.
  • Method:
    • Block: one CPU cycle.
    • Block content: data path operations or certain control operations.
    • Diamond symbol: a decision or test (in time, it belongs to the CPU cycle of the preceding block and does not occupy a CPU cycle of its own).
    • ~ symbol: the common operations the CPU performs after executing an instruction, mainly handling requests from peripheral devices, such as interrupts. If there is no such request, the CPU fetches the next instruction. Fetching is an operation common to every instruction.

Summary

  • An instruction cycle consists of a fetch cycle and one or more execution cycles.
  • In each CPU cycle, the data path in use is well defined.
  • The establishment and operation of the data path are controlled by the operation controller, which in turn depends on the instruction being executed.

5.3 Timing Generator and Control Methods#

5.3.1 Functions and Systems of the Timing Generator#

Functions#

  • The controller in the CPU uses timing signals to direct the operation of the machine.
  • The CPU can use timing signals/cycle information to distinguish whether what is fetched from memory is an instruction (fetching) or data (execution).
  • The clock pulse in a CPU cycle strictly constrains the CPU's actions.
  • Various signals issued by the operation controller are functions of time (timing signals) and space (component operation signals).

System#

  • The characteristics of the devices that make up the computer hardware determine that the most basic timing scheme is the potential-pulse system (potential here means signal level), illustrated with the D flip-flop.

  • D is the potential (level) input terminal, CP (Clock Pulse) is the pulse input terminal.

  • S is the set terminal, R is the reset terminal.

  • The behavior is as follows:

    • When D=0, the D flip-flop is set to 0 when the rising edge of CP arrives.
    • When D=1, the D flip-flop is set to 1 when the rising edge of CP arrives.
  • When transferring data between registers, the data is applied to the potential input terminal of the flip-flop, while the control signal for loading data is applied to the clock input terminal. A high or low potential indicates whether the data is 1 or 0, and the potential signal must already be stable before the control signal for loading data arrives.

  • Depending on the design method, the operation controller can be divided into the sequential logic type and the storage logic type:

    • Hardwired Controller: Implemented using sequential logic.
    • Microprogram Controller: Implemented using storage logic.
  • Hardwired Controller:

    • Timing signals adopt a three-level system: main state cycle, pulse potential, and pulse signal.
    • One pulse potential represents the time of one CPU cycle, a larger time unit.
    • Within one pulse potential there are several pulse signals, which represent smaller time units.
    • The main state cycle contains several pulse potentials and is the largest time unit.
  • Microprogram Controller:

    • Timing signals adopt a two-level system: pulse potential and pulse signal.
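As a small illustration of the potential-pulse idea in this section, the sketch below models a rising-edge-triggered D flip-flop: the level on D is captured only at the moment CP goes from 0 to 1. The class and method names are made up for this example, and the set/reset terminals are not modeled.

```python
class DFlipFlop:
    """Minimal rising-edge-triggered D flip-flop (S and R inputs omitted)."""

    def __init__(self):
        self.q = 0          # stored state
        self._prev_cp = 0   # previous clock level, used to detect edges

    def step(self, d, cp):
        """Apply level d and clock level cp; return the output Q."""
        if self._prev_cp == 0 and cp == 1:   # rising edge of CP
            self.q = d                       # load the level present on D
        self._prev_cp = cp
        return self.q


ff = DFlipFlop()
print(ff.step(1, 0))  # 0: D is high but no clock edge yet
print(ff.step(1, 1))  # 1: rising edge captures D=1
print(ff.step(0, 1))  # 1: CP stays high, no new edge, state is held
print(ff.step(0, 0))  # 1: falling edge does nothing
print(ff.step(0, 1))  # 0: next rising edge captures D=0
```

The third call shows why the data level has to be stable before the pulse: changes on D after the edge are ignored until the next edge.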

5.3.2 Timing Signal Generator#

  • Functions:
    • Generates timing signals; the timing circuits of different types of computers vary.
    • The timing circuits of large and medium-sized computers are complex, while those of microcomputers are simple.
  • Composition:
    • Clock source
    • Ring pulse generator
    • Pulse and read/write timing decoding logic
    • Start-stop control logic

5.3.3 Control Methods#

  • The number of CPU cycles contained in machine instructions reflects the complexity of the instructions, and the number and order of operation signals for different CPU cycles also vary.
  • Control Methods: The methods for forming timing signals that control different operation sequences. Three basic control methods:
    • Synchronous control method
    • Asynchronous control method
    • Combined control method

Synchronous Control Method (The number of machine cycles for instructions and clock cycles remains unchanged)#

  • Depending on the situation, synchronous control can be realized in one of three ways:
    • Use a completely uniform machine cycle to execute all the different instructions.
    • Use machine cycles of variable length.
    • Combine central control with local control.

Asynchronous Control Method#

  • Each instruction takes as long as it needs.
  • The "end" signal generated when the previous micro-operation is completed serves as the "start" signal for the next micro-operation.

Combined Control Method (Used by microprogram controllers)#

  • Most instructions are completed within a fixed cycle, while a few difficult-to-determine operations use asynchronous methods.
  • The pulse signal of the machine cycle is fixed, but the number of machine cycles for each instruction is not fixed.

5.4 Microprogram Controller#

  • Development
    • The concept and principle of microprogramming were first proposed by Professor M.V. Wilkes of the University of Cambridge in 1951 at the Manchester University Computer Conference, when there were no suitable components for storing microprograms in control memory.
    • By 1964, IBM successfully adopted microprogram design technology in the IBM 360 series machines.
    • Since the 1970s, the development of VLSI technology has promoted the development and application of microprogram design technology.
    • Currently, microprogram design technology is widely used in large, medium, and microcomputers.
  • Basic Idea:
    • By analogy with the way a problem is solved by programming, the operation control signals are encoded as microinstructions and stored in control memory. While the machine runs, microinstructions are fetched from control memory one by one to generate the operation control signals required for instruction execution, so that the corresponding components perform the specified operations.
    • From the above, it can be seen that microprogram design technology is a technique for designing hardware using software methods.

5.4.1 Principles of Microprogram Control#

1. Microcommands and Microoperations#

  • Microcommands: Various control commands sent from control components to execution components through control lines.
    • Microcommands are the smallest, most basic units of control signals.
  • Microoperations: The operations performed by the execution components after receiving microcommands.
  • Microcommands and microoperations correspond one-to-one. Microcommands are the control signals for microoperations, and microoperations are the operational processes of microcommands. Microoperations are the most basic operations in execution components.
  • Depending on the structure of the data path, microoperations are either compatible or mutually exclusive.
    • Exclusive Microoperations: Microoperations that cannot be executed simultaneously or in parallel within the same clock cycle.
    • Compatible Microoperations: Microoperations that can be executed simultaneously or in parallel within the same clock cycle.

2. Microinstructions and Microprograms#

  • Microinstructions: The set of microcommands executed in parallel within the same CPU cycle, stored as one word in control memory, is called a microinstruction.
  • Microprogram: A program composed of several microinstructions used to implement instruction functions.
  • Each machine instruction corresponds to a segment of microprogram; by interpreting and executing this segment of microprogram, the operations specified by the instruction are completed.

3. Relationship Between Machine Instructions and Microprograms (Key Point)#

  • One machine instruction corresponds to one microprogram, which is a sequence of several microinstructions.
  • Machine instructions are stored in main memory, while microinstructions are stored in control memory.

5.4.2 Microprogram Design Technology#

1. Microcommand Encoding#

Microcommand encoding is the representation method used for the operation control fields in microinstructions.
There are three encoding methods: direct representation method / encoding representation method / mixed representation method.

Direct Representation Method#
  • In the microinstruction, each bit in the operation control field represents a microcommand. Each bit can directly control the computer without the need for decoding.
  • For example, each independent binary bit in the operation control field represents one microcommand; a bit of "1" means the microcommand is active, while "0" means it is inactive.
    [Figure: example of microinstruction format (TEC-8 experimental platform format)]
  • Characteristics:
    • This method has a simple structure, strong parallelism, and fast operation speed, but the microinstruction word is too long. If there are N microcommands, the operation control field of the microinstruction word must have N bits.
    • Additionally, among the N microcommands, many are mutually exclusive and do not allow parallel operations. Arranging them in one microinstruction is meaningless and will only reduce the utilization of information.
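A minimal sketch of the direct representation method follows: each bit of the control field switches one microcommand on or off, so no decoder is needed. The microcommand names are hypothetical and do not correspond to the TEC-8 format or any real machine.

```python
# Hypothetical microcommands, one per bit of the operation control field.
MICROCOMMANDS = ["PC_OUT", "AR_IN", "MEM_READ", "IR_IN", "PC_INC"]


def decode_direct(control_field):
    """Return the microcommands whose bit is '1' in the control field string."""
    if len(control_field) != len(MICROCOMMANDS):
        raise ValueError("control field must have one bit per microcommand")
    return [name for bit, name in zip(control_field, MICROCOMMANDS) if bit == "1"]


print(decode_direct("11001"))  # ['PC_OUT', 'AR_IN', 'PC_INC']
```

With N microcommands the field is N bits wide, which is the word-length cost noted in the characteristics above.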
Encoding Representation Method#
  • A group of mutually exclusive microcommands is combined into one field and encoded; a field decoder decodes the field, and the decoded outputs serve as the operation control signals.
  • Characteristics of the encoding representation method:
    • Mutually exclusive microcommands share one field, which greatly shortens the microinstruction word.
    • However, it adds decoding circuitry, which slows down the execution of the microprogram.
Mixed Representation Method#
  • Combines the first two methods, taking advantage of both.
  • Some encodings in a field cannot independently define certain microcommands and need to be jointly defined with encodings from other fields.
  • Points to note for encoding: The division of the operation control field in the field encoding method is not arbitrary; it should follow these principles:
    • ① Place mutually exclusive microcommands in the same segment, and compatible microcommands in different segments. This improves information utilization, shortens the microinstruction word, and also makes full use of hardware parallelism to speed up execution.
    • ② The division should be compatible with the structure of the data path.
    • ③ Each segment should not contain too many bits; otherwise the decoding circuitry becomes more complex and decoding takes longer.
    • ④ Each segment should generally reserve one code meaning that no microcommand is issued from this field. Therefore a three-bit field can represent at most seven mutually exclusive microcommands, with 000 usually meaning no operation. (The original post illustrates this with a figure, not reproduced here.)
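The bit-count rule in the last principle can be checked with a short sketch. It assumes that every field reserves code 0 for "no operation"; the function names and the example grouping of microcommands are made up for illustration.

```python
def field_width(n_exclusive):
    """Bits needed for n mutually exclusive microcommands plus a reserved no-op code."""
    if n_exclusive < 1:
        raise ValueError("a field needs at least one microcommand")
    # n commands + 1 no-op code = n + 1 codes; n.bit_length() equals ceil(log2(n + 1)).
    return n_exclusive.bit_length()


def decode_field(code, commands):
    """Field decoder: code 0 means no microcommand, code i selects commands[i - 1]."""
    return None if code == 0 else commands[code - 1]


# Assumed grouping: 12 microcommands split into three mutually exclusive groups.
groups = [7, 3, 2]
direct_bits = sum(groups)                            # direct method: one bit each
encoded_bits = sum(field_width(n) for n in groups)   # field encoding: 3 + 2 + 2

print(field_width(7), direct_bits, encoded_bits)     # 3 12 7
print(decode_field(0b010, ["A", "B", "C"]))          # B
```

Adding an eighth microcommand to the first group would push that field to four bits, which is why field sizes are usually kept small.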

2. Formation Methods of Microaddresses#

  • The issue of controlling the execution order of microinstructions is essentially the problem of how to determine the address of the next microinstruction.
  • Entry Address: Each machine instruction corresponds to a segment of microprogram. After the common fetch microprogram fetches the machine instruction from main memory, it finds the entry address of the microprogram corresponding to the machine instruction based on the opcode field of the machine instruction.
  • There are mainly two ways to generate subsequent microaddresses:
    • Counter Method
    • Multipath Transfer Method
Counter Method#
  • Method:
    • In sequentially executing microinstructions, the subsequent microaddress is generated by adding an increment to the current microaddress;
    • When microinstructions execute out of sequence, a transfer is required: after the current microinstruction finishes, control transfers to the microinstruction at the specified subsequent microaddress. In this method, the microaddress register is usually implemented as a counter.
  • Advantages: The control field for the sequence of microinstructions is relatively short, and the microaddress generation mechanism is simple.
  • Disadvantages: The multipath parallel transfer function is relatively weak, speed is slow, and flexibility is poor.
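A minimal sketch of the counter method: the next microaddress is the current one plus an increment unless a transfer is requested. The function signature is an assumption for illustration, with an increment of 1.

```python
def next_microaddress_counter(current, is_transfer=False, target=None):
    """Counter method: sequential microaddress + 1, or an explicit transfer target."""
    if is_transfer:
        if target is None:
            raise ValueError("a transfer needs a target microaddress")
        return target
    return current + 1


print(next_microaddress_counter(0x10))                                  # 17
print(next_microaddress_counter(0x10, is_transfer=True, target=0x40))   # 64
```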
Multipath Transfer Method#
  • Multipath Transfer:
    • A microinstruction has the capability of multiple transfer branches.
    • When the microprogram does not produce branches, the subsequent microaddress is directly given by the sequential control field of the microinstruction.
    • When the microprogram has branches, there are multiple "subsequent" microaddresses to choose from.
    • The selection of one microaddress is based on the "discrimination test" flag and "status condition" information in the sequential control field.
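One way to picture the multipath transfer method is sketched below. It assumes a single discrimination test flag and a single status condition bit, and it assumes the branch is formed by replacing the lowest bit of the next-address field with the condition; real designs may test several conditions and modify other bits.

```python
def next_microaddress_multipath(next_addr_field, test_flag, condition_bit):
    """Pick the next microaddress from the sequential control field.

    test_flag == 0: no branch, the next-address field is used as is.
    test_flag == 1: two-way branch, the lowest address bit is replaced by
                    the status condition (an assumed modification scheme).
    """
    if test_flag == 0:
        return next_addr_field
    return (next_addr_field & ~1) | (condition_bit & 1)


print(next_microaddress_multipath(0b1000, 0, 1))  # 8: no test, condition ignored
print(next_microaddress_multipath(0b1000, 1, 0))  # 8: condition false
print(next_microaddress_multipath(0b1000, 1, 1))  # 9: condition true
```

With two independent condition bits the same idea gives a four-way branch from a single microinstruction.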

3. Microinstruction Format (Key Point)#

Divided into two categories: horizontal microinstructions and vertical microinstructions.
(1) Horizontal Microinstructions

  • Horizontal microinstructions refer to microinstructions that can define and execute multiple microcommands in parallel at once. The format is as follows:
    Control Field | Discrimination Test Field | Next Address Field

(2) Vertical Microinstructions

  • A microoperation code field is set in the microinstruction; the microoperation code, encoded in the same way as an opcode, specifies the function of the microinstruction. The structure is similar to that of a machine instruction.

Comparison of Horizontal and Vertical Microinstructions

  • Horizontal microinstructions have strong parallel operation capability, high efficiency, and flexibility, while vertical microinstructions are relatively poor.
  • A microprogram built from horizontal microinstructions takes less time to execute a machine instruction; one built from vertical microinstructions takes longer.
  • The microprogram interpreted by horizontal microinstructions has the characteristic of long microinstruction words and short microprograms. Vertical microinstructions are the opposite.
  • Horizontal microinstructions are difficult for users to master, while vertical microinstructions are relatively similar to instructions and are easier to master.

5.5 Hardwired Controller (Omitted)#

5.6 Pipelined CPU#

5.6.1 Parallel Processing Technology#

Concept of Parallelism#

  • Parallelism means that a problem contains calculations or operations that can be performed simultaneously.
  • Narrow definition, by example: with the same delay per operation, an n-bit processor computing n bits in parallel is almost n times faster than a one-bit processor computing the n bits serially.
  • Broad definition: as long as two or more tasks of the same or different nature are completed at the same time (simultaneity) or within the same time interval (concurrency), they overlap in time and therefore exhibit parallelism.
  • Three forms:
    • Temporal Parallelism (Overlapping): Allowing multiple processing processes to stagger in time, taking turns using various components of the same hardware device to speed up hardware turnover and gain speed, achieved by using pipelined processing components.
    • Spatial Parallelism (Resource Duplication): Winning by quantity.
      • It can truly reflect simultaneity.
      • LSI (Large Scale Integration) and VLSI (Very Large Scale Integration) provide technical guarantees for it.
    • Temporal + Spatial Parallelism: For example, the Pentium uses superscalar pipelining technology.

5.6.2 Structure of Pipelined CPU#

  • The system composition of pipelined computers:
    • Memory System: Main memory uses multi-bank interleaved storage, together with a cache.
    • Pipelined CPU: Instruction components, instruction queue, execution components.
      • Instruction pipeline.
      • Instruction queue: FIFO.
      • Execution components: Can consist of multiple arithmetic logic components constructed in a pipelined manner, separating fixed-point operation components and floating-point operation components.
  • To achieve pipelining, the input task (or process) is first divided into a series of subtasks, allowing each subtask to be executed concurrently in various stages of the pipeline.
  • When tasks are continuously input into the pipeline, the execution results are continuously output from the pipeline, thereby achieving task-level parallelism.
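The benefit of overlapping subtasks can be estimated with the standard ideal-pipeline formula: with k stages of equal duration τ and n tasks, the pipeline needs (k + n − 1)·τ instead of n·k·τ. The sketch below assumes equal stage times and no stalls; the function names are illustrative.

```python
def pipeline_time(n_tasks, k_stages, tau=1):
    """Ideal pipeline: k cycles to fill, then one result per cycle."""
    return (k_stages + n_tasks - 1) * tau


def sequential_time(n_tasks, k_stages, tau=1):
    """No overlap: every task passes through all k stages alone."""
    return n_tasks * k_stages * tau


def speedup(n_tasks, k_stages):
    return sequential_time(n_tasks, k_stages) / pipeline_time(n_tasks, k_stages)


print(pipeline_time(10, 4), sequential_time(10, 4))  # 13 40
print(round(speedup(10, 4), 2))                      # 3.08
```

As n grows, the speedup approaches k, the number of stages, which is the upper bound for an ideal pipeline.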

5.6.3 Major Issues in Pipelining (Key Point)#

Bottleneck Problem#

  • There are slow segments in the pipeline.
  • Solutions:
    • Further divide into several segments.
    • Use resource duplication methods.
  • Stall Problem: Pipeline stalls are caused by dependency conflicts between instructions.
  • There are three kinds of conflict: Resource Dependency, Data Dependency, and Control Dependency.
  • Resource Dependency: After multiple instructions enter the pipeline, they compete for the same functional component within the same clock cycle.
    • Solutions: Delay the later instruction by one or more cycles before it advances, or add another functional component.
  • Data Dependency: In a program, if the next instruction must wait for the previous instruction to complete before it can execute, then these two instructions are data dependent.
    • RAW (Read After Write)
      • The subsequent instruction uses data written by the previous instruction.
    • WAW (Write After Write)
      • Two instructions write to the same unit.
    • WAR (Write After Read)
      • The subsequent instruction overwrites the unit read by the previous instruction.
    • Solutions:
      • The read operation on the related unit by the subsequent instruction can be delayed.
      • Set up direct paths for related operations (Forwarding).
  • Control Dependency
    • Causes: When a transfer (branch) instruction executes, the next instruction may be the one that follows in sequence or the one at a new target address, depending on the outcome of the transfer condition. Until the outcome is known, the pipeline may stall.
    • Solutions:
      • Delayed Transfer Method: Do not flush the pipeline when the transfer is taken; instead, instructions are arranged so that the few instructions already in the pipeline behind the transfer instruction are allowed to complete.
      • Transfer Prediction Method: Use hardware to predict the outcome of the transfer, so that instructions from the predicted path can enter the pipeline in advance.
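The three data dependencies listed above (RAW, WAW, WAR) can be detected by comparing the registers two instructions read and write. In this sketch each instruction is reduced to a hypothetical `(destination, sources)` pair; real hazard logic also has to consider pipeline stage timing, which is ignored here.

```python
def classify_data_dependencies(first, second):
    """Return the data dependencies of `second` on the earlier instruction `first`.

    Each instruction is (dest_register, set_of_source_registers).
    """
    dest1, srcs1 = first
    dest2, srcs2 = second
    found = []
    if dest1 in srcs2:
        found.append("RAW")   # second reads what first writes
    if dest1 == dest2:
        found.append("WAW")   # both write the same register
    if dest2 in srcs1:
        found.append("WAR")   # second overwrites what first reads
    return found


add = ("R1", {"R2", "R3"})    # ADD R1, R2, R3
sub = ("R4", {"R1", "R5"})    # SUB R4, R1, R5
mov = ("R2", {"R6"})          # MOV R2, R6

print(classify_data_dependencies(add, sub))  # ['RAW']
print(classify_data_dependencies(add, mov))  # ['WAR']
print(classify_data_dependencies(add, ("R1", {"R7"})))  # ['WAW']
```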


Chapter Summary#

  • The CPU is the central processing component of the computer, with basic functions such as instruction control, operation control, timing control, and data processing. Early CPUs were composed of two main parts: the arithmetic unit and the controller. With the development of high-density integrated circuit technology, today's CPU chips have become three main parts: the arithmetic unit, cache, and controller, which also include floating-point units, storage management components, etc. The CPU must have at least six types of registers: Instruction Register, Program Counter, Address Register, Data Buffer Register, General Registers, and Status Condition Register. The time taken by the CPU to fetch an instruction from memory and execute it is called the instruction cycle. In CISC, due to the different operational functions of various instructions, the instruction cycles for different instructions vary. The division of instruction cycles is an important basis for designing operation controllers.
  • The timing signal generator provides the timing signals required for CPU cycles (also called machine cycles). The operation controller uses these timing signals to time the orderly fetching and executing of an instruction. Microprogram design technology is a technique for designing operation controllers using software methods, with advantages such as regularity, flexibility, and maintainability, thus widely applied in computer design. However, with the development of ULSI technology and the demand for machine speed, the idea of hardwired logic design has gained attention. The basic idea of the hardwired controller is that a certain microoperation control signal is a logical function of the instruction opcode decode output, timing signals, and status condition signals, i.e., writing out the logical expression using Boolean algebra, and then implementing it with gate circuits, flip-flops, and other devices.
  • Parallel processing technology can run through all steps and stages of information processing. In summary, there are three main forms: ① Temporal Parallelism; ② Spatial Parallelism; ③ Temporal Parallelism + Spatial Parallelism. The pipelined CPU is a processor constructed based on the principle of temporal parallelism, which is a very economical and practical parallel technology. Almost all current high-performance microprocessors use pipelining technology without exception. The main issues in pipelining are resource dependency, data dependency, and control dependency, and corresponding technical measures need to be taken to ensure the smooth flow of the pipeline.