Wednesday, July 9, 2014

Introduction to pipelining and CPU architecture

Architecture of the processor

 
Modern processors are rich in features that lead to high overall execution speed for a given application. An analysis of their architectures is generally complex; each feature's effect depends on the others, and it is difficult to quantify the benefit of a feature in isolation. Our simple processor follows the single-issue RISC strategy: instructions are fetched one by one from memory, and each specifies either an operation on data stored in the processor's register file or an I/O operation on external memory.

It is a single-stage processor: an instruction is fetched only when the previous one has completed. The architecture of such a machine is straightforward. The instruction is read (fetched), decoded, its operands are read, the operation is executed, and then the result is stored. Each step corresponds to a module that performs the corresponding action and exchanges information with the previous and following modules.
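The single-stage execution model can be sketched in a few lines: each instruction passes through every step before the next one is fetched. The instruction format and the `li`/`add` operations below are hypothetical, chosen only to make the loop runnable.

```python
# Minimal sketch of a single-stage processor: each instruction is
# fully processed before the next one is fetched.

def fetch(mem, pc):
    return mem[pc]

def decode(instr):
    op, *args = instr
    return op, args

def execute(op, args, regs):
    # Two illustrative operations only (hypothetical ISA).
    if op == "li":          # load immediate
        dst, value = args
        return dst, value
    if op == "add":
        dst, a, b = args
        return dst, regs[a] + regs[b]
    raise ValueError(f"unknown op: {op}")

def write_back(regs, dst, value):
    regs[dst] = value

def run(program):
    regs = {}
    for pc in range(len(program)):
        instr = fetch(program, pc)            # fetch
        op, args = decode(instr)              # decode (operand read folded in)
        dst, value = execute(op, args, regs)  # execute
        write_back(regs, dst, value)          # write back
    return regs

regs = run([("li", "r1", 2), ("li", "r2", 3), ("add", "r3", "r1", "r2")])
# regs["r3"] == 5
```

Note that nothing overlaps here: the loop body is one long chain of dependent calls, which is exactly the serial behaviour pipelining sets out to improve.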
  

Introducing pipelining

Our simple processor executes each instruction through five basic sequential actions: fetch, decode, operand read, execute, and write back. During execution, the actions required by consecutive instructions might be independent of each other. In that case, we could imagine fetching, decoding, and executing them concurrently. This observation suggests decomposing the processor into a pipeline, in which each stage performs one basic action. The overall organization resembles an automotive assembly line, obtained by inserting storage elements between the stages of our single-block processor. The described pipelined architecture exploits the vertical dimension of instruction parallelism (concurrent execution of different actions on different instructions), as opposed to the horizontal one (concurrent execution of the same action on different instructions).
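The overlap described above can be made concrete with an ideal pipeline timetable: instruction i enters the pipeline at cycle i and advances one stage per cycle, so several instructions are in flight at once. This is a sketch under ideal assumptions (no stalls, no hazards); the stage names follow the text.

```python
# Ideal 5-stage pipeline timetable: maps each cycle to the list of
# (instruction, stage) pairs active in that cycle.

STAGES = ["fetch", "decode", "operand read", "execute", "write back"]

def timetable(n_instructions):
    """Return {cycle: [(instruction, stage), ...]} for an ideal pipeline."""
    table = {}
    for i in range(n_instructions):
        for s, stage in enumerate(STAGES):
            # Instruction i occupies stage s during cycle i + s.
            table.setdefault(i + s, []).append((i, stage))
    return table

t = timetable(3)
# At cycle 2, three instructions occupy three different stages:
# t[2] == [(0, "operand read"), (1, "decode"), (2, "fetch")]
```

Three instructions thus finish in 7 cycles (the last cycle is n + depth - 2 = 6) instead of the 15 cycles the single-stage design would need.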
 
Pipelining has a twofold effect on performance. Since each stage has to locally store the data received from the previous stage, latency increases. On the other hand, throughput improves dramatically, because instruction execution is overlapped. In the ideal case, the improvement factor equals the number of pipeline stages (i.e. the pipeline depth), which corresponds to the number of basic actions.
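A back-of-the-envelope calculation shows why the ideal improvement factor equals the pipeline depth: a sequence of n instructions takes n × depth cycles on the single-stage design but only depth + n − 1 cycles on an ideal pipeline, and the ratio tends to the depth as n grows.

```python
# Ideal pipeline speedup over the single-stage design, assuming
# one cycle per stage and no stalls.

def ideal_speedup(n, depth=5):
    sequential = n * depth       # every instruction runs all stages alone
    pipelined = depth + n - 1    # stages overlap after the pipeline fills
    return sequential / pipelined

# ideal_speedup(1, 5) == 1.0   -> one instruction cannot overlap with anything
# ideal_speedup(1000, 5)       -> about 4.98, approaching the depth of 5
```

In practice hazards, stalls, and unbalanced stage delays keep real speedups below this bound, but the asymptotic behaviour is the reason pipeline depth is the headline figure.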
