In this multi-part series, we explore the evolution of the microprocessor and its astonishing growth in processing power over the decades. In Part 1, we learn about the first commercial CPU, the Intel 4004, and examine how it and similar early CPUs work at the fundamental level.
During the mid-1960s, a revolution in miniaturization was kick-started. The idea of packing dozens of semiconductor-based transistors onto a single silicon chip spawned the integrated circuit. It laid the groundwork for a complete paradigm shift in how modern society would evolve. In November of 1971, the commercial launch of a new semiconductor product set the stage for this new era. Composed of a then-incredible 2,300 transistors, the Intel 4004 central processing unit, or CPU, was released.
For comparison, ENIAC, the first general-purpose electronic computer, built just 25 years earlier, could only execute 5,000 instructions a second. But what made the 4004 so powerful wasn’t just its 1800% increase in processing power - it only consumed 1 watt of electricity, was about ¾” long, and cost $5 to produce in today’s money. This was miles ahead of ENIAC’s cost of $5.5 million in today’s money, 180 kW power consumption, and 27-ton weight.
To understand how a CPU derives its processing power, let's examine what a CPU actually does and how it interfaces with data. For all intents and purposes, we can think of a CPU as an instruction-processing machine. It operates by looping through three basic steps: fetch, decode, and execute. As CPU designs evolve, these three steps become dramatically more complicated, and technologies are implemented that extend this core model of operation.
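The three-step loop can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the opcode names (`LOAD`, `ADD`, `HALT`), the tuple-based instruction format, and the single accumulator are all made up for illustration and do not correspond to the 4004's real instruction set.

```python
# Toy machine: each instruction is an (opcode, operand) tuple.
# Opcode names and the single accumulator are illustrative assumptions.
def run(program):
    acc = 0   # one working register (an "accumulator")
    pc = 0    # program counter: position of the next instruction
    while pc < len(program):
        instruction = program[pc]       # FETCH the next instruction
        pc += 1
        opcode, operand = instruction   # DECODE it into its parts
        if opcode == "LOAD":            # EXECUTE the selected operation
            acc = operand
        elif opcode == "ADD":
            acc += operand
        elif opcode == "HALT":
            break
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return acc


result = run([("LOAD", 2), ("ADD", 3), ("HALT", 0)])
print(result)
```

Every pass through the `while` loop is one trip through fetch, decode, and execute. Real CPUs do the same thing in hardware rather than in an interpreter.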
FETCH
In the fetch phase, the CPU loads the instruction it will be executing into itself. A CPU can be thought of as existing in an information bubble. It pulls instructions and data from outside of itself, performs operations within its own internal environment, and then returns data back. This data is typically stored in memory external to the CPU called Random Access Memory (RAM). Software instructions and data are loaded into RAM from more permanent sources such as hard drives and flash memory. But at one point in history, magnetic tape, punch cards, and even flip switches were used.
BUS
The mechanism by which data moves back and forth to RAM is called a bus. A bus can be thought of as a multi-lane highway between the CPU and RAM in which each bit of data has its own lane. But we also need to transmit the location of the data we’re requesting, so a second highway must be added to accommodate both the size of the data word and the address word. These are called the data bus and address bus, respectively. In practice, these data and address lines are physical electrical connections between the CPU and RAM and often look exactly like a superhighway on a circuit board.
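To make the "one lane per bit" idea concrete, here is a minimal sketch of a data bus and an address bus. The widths (4 address lines, 8 data lines) and the function names `bus_write` and `bus_read` are assumptions chosen for the example, not figures from any particular CPU.

```python
ADDRESS_LINES = 4   # assumed width of the address bus, in bits
DATA_LINES = 8      # assumed width of the data bus, in bits

# Each line carries one bit, so n address lines can select 2**n locations,
# and n data lines can carry values from 0 to 2**n - 1.
addressable_words = 2 ** ADDRESS_LINES
max_data_value = 2 ** DATA_LINES - 1

ram = [0] * addressable_words   # toy RAM: one data word per address


def bus_write(address, value):
    """Send an address and a data word to RAM and store the word."""
    if not 0 <= address < addressable_words:
        raise ValueError("address does not fit on the address bus")
    if not 0 <= value <= max_data_value:
        raise ValueError("value does not fit on the data bus")
    ram[address] = value


def bus_read(address):
    """Send an address to RAM and return the word stored there."""
    if not 0 <= address < addressable_words:
        raise ValueError("address does not fit on the address bus")
    return ram[address]


bus_write(3, 200)
print(addressable_words, max_data_value, bus_read(3))
```

Adding one more address line doubles the amount of memory the CPU can reach, which is why bus width matters so much in CPU design.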
REGISTER
The address of the memory location to fetch is stored in the CPU, in a mechanism called a register. A register is a high-speed internal memory word that is used as a “notepad” by CPU operations. It’s typically used as a temporary data store for instructions but can also be assigned to vital CPU functions, such as keeping track of the current address being accessed in RAM. Because registers are built directly into the CPU’s hardware, most CPUs have only a handful of them. Their word size is generally coupled to the CPU’s native architecture.
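A register's fixed word size can be modeled as a value that is always masked to a set number of bits. The `Register` class below is a hypothetical sketch, and the 4-bit and 8-bit widths are chosen only for illustration.

```python
class Register:
    """A fixed-width storage word; the width is an illustrative choice."""

    def __init__(self, bits):
        self.bits = bits
        self.mask = (1 << bits) - 1   # e.g. 4 bits -> 0b1111
        self.value = 0

    def store(self, value):
        # Bits that do not fit in the register's width are discarded.
        self.value = value & self.mask

    def increment(self):
        self.store(self.value + 1)


# A 4-bit register holding an address: the highest value is 15.
address_register = Register(bits=4)
address_register.store(15)
address_register.increment()      # 16 does not fit in 4 bits
wrapped = address_register.value  # wraps around to 0
print(wrapped)
```

The wrap-around shows why a register's word size limits both the largest number it can hold and the highest address it can point to.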
DECODE
Once an instruction is fetched, the decode phase begins. In classic RISC architecture, one word of memory forms a complete instruction. This changes to a more elaborate method as CPUs evolve toward complex instruction set architectures, which will be introduced in Part 2 of this series.
BRANCHING
Branching occurs when an instruction causes a change in the program counter’s address. This causes the next fetch to occur at a new location in memory as opposed to the next sequential address.
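Branching can be illustrated with a tiny loop that counts down to zero. In this sketch the opcodes `DEC` (subtract) and `JNZ` (jump if not zero) are hypothetical names, and the program counter is simply an index into a Python list.

```python
def run(program, acc):
    pc = 0        # program counter
    visited = []  # addresses fetched, in order
    while pc < len(program):
        opcode, operand = program[pc]
        visited.append(pc)
        pc += 1                  # default: the next sequential address
        if opcode == "DEC":
            acc -= operand
        elif opcode == "JNZ":
            if acc != 0:
                pc = operand     # branch: overwrite the program counter
    return acc, visited


# Address 0: subtract 1. Address 1: jump back to 0 while acc is not zero.
program = [("DEC", 1), ("JNZ", 0)]
final_acc, visited = run(program, acc=2)
print(final_acc, visited)
```

Without the branch, the program would visit addresses 0 and 1 once and stop. With it, the same two instructions are fetched repeatedly until the condition fails.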
OPERAND
An opcode sometimes requires data on which to perform its operation. This part of an instruction is called an operand. Operands are bits piggybacked onto an instruction to be used as data. Let's say we wanted to add 5 to a register. The binary representation of the number 5 would be embedded in the instruction and extracted by the decoder for the addition operation.
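The "add 5 to a register" example can be sketched with simple bit operations. The layout assumed here (an 8-bit instruction with the opcode in the upper 4 bits and the operand in the lower 4 bits) and the opcode value `OPCODE_ADD` are invented for illustration, not taken from a real instruction set.

```python
OPCODE_ADD = 0b0001   # hypothetical opcode value for "add operand to register"


def encode(opcode, operand):
    """Pack a 4-bit opcode and a 4-bit operand into one 8-bit instruction."""
    return ((opcode & 0xF) << 4) | (operand & 0xF)


def decode(instruction):
    """Split an 8-bit instruction back into its opcode and operand."""
    opcode = (instruction >> 4) & 0xF
    operand = instruction & 0xF
    return opcode, operand


instruction = encode(OPCODE_ADD, 5)    # 0b0001_0101
opcode, operand = decode(instruction)

register = 10
if opcode == OPCODE_ADD:
    register += operand                # the embedded 5 is used as data

print(format(instruction, "08b"), opcode, operand, register)
```

The number 5 travels inside the instruction itself, so no extra trip to memory is needed to find the value being added.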
EXECUTION
In the execution phase, the now-configured CPU is triggered. This may occur in a single step or a series of steps depending on the opcode.
CLOCKS
In a CPU, these three phases of operation loop continuously, working their way through the instructions of the computer program loaded in memory. Gluing this looping machine together is a clock. A clock is a repeating pulse used to synchronize a CPU’s internal mechanics and its interface with external components. The CPU clock rate is measured by the number of pulses per second, or hertz (Hz).
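The link between clock rate and processing speed comes down to simple arithmetic. The figures below (a 1 MHz clock and 8 clock pulses per instruction) are assumed round numbers for illustration, not measurements of any specific CPU.

```python
def clock_period_seconds(clock_hz):
    """Time between two clock pulses."""
    return 1.0 / clock_hz


def instructions_per_second(clock_hz, cycles_per_instruction):
    """Instructions completed each second if every one takes the same number of pulses."""
    return clock_hz / cycles_per_instruction


clock_hz = 1_000_000          # assumed clock rate: 1 MHz
cycles_per_instruction = 8    # assumed pulses needed per instruction

period = clock_period_seconds(clock_hz)
rate = instructions_per_second(clock_hz, cycles_per_instruction)
print(period, rate)
```

A faster clock, or fewer pulses per instruction, both raise the number of instructions completed each second.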