Conventional uniprocessor systems follow the von Neumann model. The von Neumann architecture is characterized by a CPU and a central memory system, with both instructions and data being read from that memory.
After an instruction has been read, it is decoded and any relevant operands are fetched from memory; the instruction is then executed and the result stored back in memory. The single data path between the CPU and memory, over which both instructions and data must pass, and the sequential nature of instruction execution together limit the performance possible from the computer. This limit is sometimes known as the von Neumann bottleneck.
In uniprocessors, the bottleneck is eased by design enhancements such as pipelining and superscalar execution.
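The fetch, decode and execute cycle described above can be illustrated with a toy machine. This is a minimal sketch only: the four-instruction set (LOAD, ADD, STORE, HALT), the accumulator and the function name `run` are all hypothetical and chosen for illustration. The point it shows is that instructions and data live in the same memory and that every access passes over the same path.

```python
# Toy von Neumann machine (hypothetical instruction set, for illustration only).
# A single list serves as memory for both instructions and data.

def run(memory):
    """Execute the program in `memory` until HALT.

    Returns the number of memory accesses made, i.e. the traffic over the
    single CPU-memory path. `memory` is modified in place.
    """
    pc = 0        # program counter
    acc = 0       # accumulator register
    accesses = 0  # counts every instruction fetch and every data access
    while True:
        opcode, address = memory[pc]   # fetch the instruction, then decode it
        accesses += 1
        pc += 1
        if opcode == "HALT":
            return accesses
        if opcode == "LOAD":           # fetch an operand from memory
            acc = memory[address]
        elif opcode == "ADD":          # fetch an operand and execute
            acc = acc + memory[address]
        elif opcode == "STORE":        # store the result back in memory
            memory[address] = acc
        else:
            raise ValueError(f"unknown opcode: {opcode}")
        accesses += 1                  # the data access of this instruction


# Cells 0-3 hold the program, cells 4-6 hold the data.
program = [("LOAD", 4), ("ADD", 5), ("STORE", 6), ("HALT", 0), 2, 3, 0]
traffic = run(program)
print(program[6], traffic)
```

In this sketch a single addition costs seven trips to memory: four instruction fetches and three data accesses, all strictly one after another.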
One form of classification for von Neumann machines is based on the number of instructions that can be executed at any one time and on the number of pieces of data that can be operated on at a time. In 1972 Michael Flynn introduced such a classification of computer architectures, based on the notions of instruction streams and data streams. The number of instruction streams is given as either SI (single instruction) or MI (multiple instruction), and the number of data streams as either SD (single data) or MD (multiple data). Machines can thus be classified as SISD, MISD, SIMD or MIMD.
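The naming scheme can be written down directly. The sketch below is only an illustration of how the four labels are formed; the function name `flynn_class` and its parameters are assumptions, not part of Flynn's work.

```python
def flynn_class(instruction_streams, data_streams):
    """Return the Flynn label for the given numbers of streams."""
    if instruction_streams < 1 or data_streams < 1:
        raise ValueError("a machine needs at least one stream of each kind")
    instruction_part = "SI" if instruction_streams == 1 else "MI"
    data_part = "SD" if data_streams == 1 else "MD"
    return instruction_part + data_part


for streams in [(1, 1), (4, 1), (1, 4), (4, 4)]:
    print(streams, flynn_class(*streams))
```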
11.1 Single instruction single data (SISD)
The classical von Neumann machine can be regarded as a single-instruction-single-data machine in that at any one time only a single instruction is being executed, and only a single piece of data is being operated upon. This is where part of the problem arises: we often want to perform the same instruction on many different pieces of data, and the von Neumann machine requires us to fetch the same instruction many times, once for each piece of data. In fact the situation is worse, since a von Neumann machine will usually require a loop, so several instructions must be executed for each piece of data. As a result the machine can run many times slower than its arithmetic unit alone would allow.
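The loop overhead can be made concrete with a rough count. The sketch below adds two vectors element by element and tallies the instructions a simple SISD machine might execute. The figure of seven instructions per iteration (two loads, one add, one store, an index increment, a compare and a branch) is an assumption for illustration; real instruction sets differ.

```python
# Assumed cost of one loop iteration on a hypothetical SISD machine:
# load a[i], load b[i], add, store c[i], increment i, compare, branch.
INSTRUCTIONS_PER_ITERATION = 7


def scalar_vector_add(a, b):
    """Add two equal-length vectors one element at a time.

    Returns the result vector and the assumed number of instructions executed.
    """
    result = []
    executed = 0
    for x, y in zip(a, b):
        result.append(x + y)                    # the only useful arithmetic
        executed += INSTRUCTIONS_PER_ITERATION  # everything the loop costs
    return result, executed


total, executed = scalar_vector_add([1, 2, 3, 4], [10, 20, 30, 40])
print(total, executed, executed // len(total))
```

Under these assumptions only one instruction in seven is an addition, so the arithmetic unit is idle for most of the loop.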
11.2 Multiple instruction single data (MISD)
The multiple instruction single data (MISD) architecture is the least common of the four. In this architecture, the same data stream flows through a linear array of processors, each executing different instructions on the stream. Systolic arrays, used for the pipelined execution of specific algorithms, are sometimes described as architectures of this kind.
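A linear array of this kind can be modelled, very loosely, as a chain of stages through which one data stream passes. The sketch below is a sequential simulation using Python generators, so it shows only the flow of data; in hardware the stages would operate at the same time. The names `stage` and `linear_array` and the three example operations are hypothetical.

```python
def stage(operation, stream):
    """One processor in the array: applies its own operation to each item."""
    for item in stream:
        yield operation(item)


def linear_array(operations, stream):
    """Chain the stages so the output of one becomes the input of the next."""
    for operation in operations:
        stream = stage(operation, stream)
    return stream


# Three processors, each executing a different instruction on the same stream.
operations = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3]
print(list(linear_array(operations, [1, 2, 3])))
```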
11.3 Single instruction multiple data (SIMD)
For problems in which the same operation needs to be performed on many pieces of data, particularly those involving vectors and arrays, SIMD (single-instruction multiple-data) architectures are often capable of high speeds. A single CPU controls many arithmetic units, each of which operates on its own data. Each arithmetic unit executes the same instruction as determined by the CPU, but uses data found in its own memory. Thus all the elements of two vectors could be added together simultaneously, making the operation many times faster than on a SISD machine.
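The organisation just described can be sketched as a controller that broadcasts one instruction to a set of arithmetic units, each holding its own operands. This is a sequential simulation with hypothetical names (`ArithmeticUnit`, `simd_add`); the loop over the units stands in for what the hardware would do in a single step.

```python
import operator


class ArithmeticUnit:
    """One arithmetic unit with its own small local memory."""

    def __init__(self, a, b):
        self.local = {"a": a, "b": b, "c": None}

    def execute(self, instruction):
        # Every unit runs the instruction it is given on its own data.
        self.local["c"] = instruction(self.local["a"], self.local["b"])


def simd_add(vector_a, vector_b):
    """Add two vectors with one unit per element (assumes enough units)."""
    units = [ArithmeticUnit(x, y) for x, y in zip(vector_a, vector_b)]
    instruction = operator.add   # the single instruction chosen by the CPU
    for unit in units:           # conceptually simultaneous in hardware
        unit.execute(instruction)
    return [unit.local["c"] for unit in units]


print(simd_add([1, 2, 3], [10, 20, 30]))
```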
In practice, the provision of many arithmetic units is expensive, particularly since many of them will not be in use at any given time. Even if a large number of arithmetic units are provided, the size of vectors and arrays will rarely be a multiple of the number of arithmetic units and so some inefficiency in the use of the arithmetic units will arise.
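The inefficiency is easy to quantify. Assuming a vector is processed in passes of at most one element per arithmetic unit, the number of passes is the vector length divided by the number of units, rounded up, and the last pass may leave units idle. The function name and the example sizes below are illustrative assumptions.

```python
def simd_utilization(vector_length, units):
    """Return (passes needed, fraction of unit-slots doing useful work)."""
    if vector_length < 1 or units < 1:
        raise ValueError("vector_length and units must be positive")
    passes = -(-vector_length // units)   # integer ceiling division
    return passes, vector_length / (passes * units)


print(simd_utilization(100, 64))   # 100 elements on 64 units
print(simd_utilization(128, 64))   # an exact multiple of the unit count
```

With 100 elements and 64 units, two passes are needed and 28 of the 128 available unit-slots go unused, a utilization of about 78%.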
A more effective use of hardware can be obtained by pipelining the arithmetic unit. A hardware floating point accelerator will already contain dedicated hardware for each part of the calculation of a floating point operation. By pipelining the use of this hardware, significant improvements can be made in processor performance. This technique will not give as high a performance as a true SIMD machine, but the improvements can be significant.
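The gain from pipelining can be estimated with a simple cycle count. The sketch assumes an idealised pipeline: every stage takes one cycle, a new operation can enter on every cycle, and there are no stalls. Under those assumptions n operations on a k-stage unit take n * k cycles without pipelining and k + n - 1 cycles with it. The function name is hypothetical.

```python
def pipeline_cycles(operations, stages):
    """Return (cycles without pipelining, cycles with ideal pipelining)."""
    if operations < 0 or stages < 1:
        raise ValueError("operations must be >= 0 and stages >= 1")
    if operations == 0:
        return 0, 0
    unpipelined = operations * stages
    # The first result appears after `stages` cycles, then one per cycle.
    pipelined = stages + operations - 1
    return unpipelined, pipelined


unpipelined, pipelined = pipeline_cycles(100, 4)
print(unpipelined, pipelined, round(unpipelined / pipelined, 2))
```

For 100 operations on a four-stage unit this gives 400 cycles against 103, a speedup approaching but never reaching the number of stages.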
11.4 Multiple instruction multiple data (MIMD)
The most general form of von Neumann architecture is the multiple-instruction-multiple-data (MIMD) machine. A MIMD machine is usually a number of separate processors connected together through some interconnection network. The interconnection between the processors can take many forms, depending on the type of problem the machine is designed to solve. This is the most common architecture chosen for multiple-processor machines because modern processors have the control logic for parallel systems built in. It is also attractive because software, replacement parts and additions to the system are readily available.
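The MIMD idea, several instruction streams each working on its own data, can be imitated in software with independent workers. The sketch below uses threads from Python's standard library as stand-ins for processors; the three task functions and `run_mimd` are hypothetical, and in the standard CPython interpreter threads illustrate independent instruction streams more than true parallel speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def total(values):
    return sum(values)


def longest(words):
    return max(words, key=len)


def reverse(text):
    return text[::-1]


def run_mimd(tasks):
    """Run each (function, data) pair on its own worker; keep task order."""
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        futures = [pool.submit(function, data) for function, data in tasks]
        return [future.result() for future in futures]


# Three different programs, each with its own data.
tasks = [(total, [1, 2, 3]), (longest, ["a", "abc", "ab"]), (reverse, "mimd")]
print(run_mimd(tasks))
```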