Anna Wierciak

It is much easier to learn if you understand what’s happening under the hood.

At the heart of every computer there are two key components – a CPU (central processing unit)…

…and memory…

Memory stores both data and the program itself. The CPU can read and write both data and its own program (the instructions, or operations, that tell it what to do) in memory. A memory address is the sequential number of a byte, counted from the very beginning of memory. It identifies the place from which instructions or data can be read, or to which data can be written.
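
As a rough illustration, memory can be pictured as one long row of bytes in which the address is simply the position of a byte. The sketch below uses a Python bytearray as stand-in memory. The size of 256 bytes, the addresses 100 to 102, and the values 7 and 5 are arbitrary choices made for this example, not properties of any real machine.

```python
# A toy model of memory: 256 bytes, all starting at zero.
# The size is an arbitrary choice for this illustration.
memory = bytearray(256)

# Writing data: store the example values 7 and 5 at addresses 100 and 101.
memory[100] = 7
memory[101] = 5

# Reading data: the address is just the position of the byte,
# counted from the very beginning of memory (address 0).
first = memory[100]
second = memory[101]

# Each cell holds one byte, so a stored value must fit in the range 0..255.
memory[102] = (first + second) % 256
```

After these lines run, the byte at address 102 holds 12, the sum of the two bytes before it.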

The CPU can execute very simple operations very, very quickly. It can add, subtract, multiply, and divide numbers, and it can read numbers from memory and write them back.

It can do billions of these simple operations per second.

To know which operation to execute, the CPU reads the next instruction from memory. An instruction is just a number, but each number denotes a specific operation: read a byte from memory, add two bytes, or write a byte back to memory. A number ENCODES the instruction, which is why computer programs are called CODE.

After reading the instruction from memory, the CPU executes it, doing what the instruction tells it to do. It then proceeds to the next instruction in memory, reads it, executes it, and so on.
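
The read-then-execute cycle described above can be sketched as a small loop. Everything specific here is invented for illustration: the opcode numbers (0 to 3), the two-byte instruction layout, the single working value `acc` held inside the CPU, and the example data 7 and 5. A real CPU differs in many details, but the shape of the loop is the same.

```python
# Hypothetical opcodes, invented for this sketch.
HALT, LOAD, ADD, STORE = 0, 1, 2, 3

def run(memory):
    """Fetch and execute instructions starting at address 0 until HALT."""
    pc = 0    # program counter: address of the next instruction
    acc = 0   # a single working value held inside the CPU
    while True:
        opcode = memory[pc]           # fetch the instruction number
        if opcode == HALT:
            return memory
        operand = memory[pc + 1]      # the address the instruction refers to
        if opcode == LOAD:
            acc = memory[operand]
        elif opcode == ADD:
            acc = (acc + memory[operand]) % 256   # keep the result in one byte
        elif opcode == STORE:
            memory[operand] = acc
        else:
            raise ValueError(f"unknown opcode {opcode} at address {pc}")
        pc += 2                       # proceed to the next instruction

# Program and data share one memory: code at 0..6, data at 100..102.
memory = bytearray(256)
memory[0:7] = bytes([LOAD, 100, ADD, 101, STORE, 102, HALT])
memory[100], memory[101] = 7, 5
run(memory)
```

Note that the program itself sits in the same bytearray as the data it works on, which mirrors the point that memory stores both.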

When the CPU performs arithmetic operations, it does so on numbers that it stores locally, in special storage locations called registers. A CPU might have 8 or 16 of these registers, and its logic circuits can add, subtract, multiply, and divide their contents and store the results back in registers.

When data is read from memory, it is first put into registers. Some arithmetic operation is then performed on the registers, and the result is stored back into memory.

The human-readable form of the CPU’s language is called “Assembly”. Assembly assigns a human-readable mnemonic to each code, so that people who write in the CPU’s language do not have to work with raw numbers. A mnemonic is easier to read because it better indicates what the CPU will do.

Assembly mnemonics might look like this:

READ [100], r1

READ [101], r2

ADD r1, r2, r3

WRITE r3, [102]

The program above reads the bytes at memory locations 100 and 101 (into registers r1 and r2), adds these two values together (putting the result of the addition into r3), and then writes the result to memory location 102 (from r3).
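
To make the walkthrough concrete, the four instructions above can be run on a toy machine. This is only a sketch: the `execute` function, the dictionary of registers, and the example values 7 and 5 are assumptions made for illustration, and the READ/ADD/WRITE notation is the article's simplified one, not a real assembly dialect.

```python
PROGRAM = """
READ [100], r1
READ [101], r2
ADD r1, r2, r3
WRITE r3, [102]
"""

def execute(source, memory):
    """Run the simplified READ/ADD/WRITE notation on a toy machine."""
    registers = {}
    for line in source.strip().splitlines():
        if not line.strip():
            continue                      # skip blank lines between instructions
        mnemonic, rest = line.split(maxsplit=1)
        operands = [part.strip() for part in rest.split(",")]
        if mnemonic == "READ":            # READ [address], register
            address = int(operands[0].strip("[]"))
            registers[operands[1]] = memory[address]
        elif mnemonic == "ADD":           # ADD source1, source2, destination
            a, b, dest = operands
            registers[dest] = (registers[a] + registers[b]) % 256
        elif mnemonic == "WRITE":         # WRITE register, [address]
            address = int(operands[1].strip("[]"))
            memory[address] = registers[operands[0]]
        else:
            raise ValueError(f"unknown mnemonic: {mnemonic}")
    return registers

memory = bytearray(256)
memory[100], memory[101] = 7, 5           # example data, chosen arbitrarily
registers = execute(PROGRAM, memory)
```

With 7 and 5 as the example inputs, r1 ends up holding 7, r2 holds 5, r3 holds 12, and the byte at address 102 is 12.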

When programmers write code like the above, they put it into a text file. Then a program called a “compiler” (or an “assembler” if the language is Assembly) converts it into true machine code: the numbers the CPU understands. The compiled (or assembled) code is then put into another file. That file is the application that the computer’s operating system can load so that the program can run.

Inside the compiled (or assembled) application are numbers – code – that might look like this for the program above:

14 100 24 101 32 76 102
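
A minimal sketch of what an assembler does is shown below. The opcode table is purely hypothetical: it was chosen only so that the output matches the numbers above, and the article does not say what each number actually means. The sketch assumes that 14 and 24 mean "read into r1" and "read into r2", that 32 means "add r1 and r2 into r3", and that 76 means "write r3", with a memory address following where one is needed.

```python
# Hypothetical encoding, chosen only so the output matches the article's numbers.
OPCODES = {
    ("READ", "r1"): 14,              # read a byte into r1; the address follows
    ("READ", "r2"): 24,              # read a byte into r2; the address follows
    ("ADD", "r1", "r2", "r3"): 32,   # r3 = r1 + r2; no address needed
    ("WRITE", "r3"): 76,             # write r3 to memory; the address follows
}

def assemble(source):
    """Translate mnemonic text into a flat list of numbers."""
    code = []
    for line in source.strip().splitlines():
        if not line.strip():
            continue                 # skip blank lines between instructions
        mnemonic, rest = line.split(maxsplit=1)
        operands = [part.strip() for part in rest.split(",")]
        # Register names select the opcode; bracketed operands are addresses.
        registers = tuple(op for op in operands if not op.startswith("["))
        addresses = [int(op.strip("[]")) for op in operands if op.startswith("[")]
        code.append(OPCODES[(mnemonic, *registers)])
        code.extend(addresses)
    return code

machine_code = assemble("""
READ [100], r1
READ [101], r2
ADD r1, r2, r3
WRITE r3, [102]
""")
```

Under these assumptions, `machine_code` is the list 14, 100, 24, 101, 32, 76, 102. A real assembler handles far more instructions, but the core job is the same table lookup from mnemonic to number.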

Humans cannot easily read these numbers, which is why they need programming languages to make life easier. But computers can!

Assembly languages are easy to understand, but they are hard to use. For starters, every CPU family has its own dialect. Also, programs written in assembly are very verbose. For this reason, people invented “higher-level” languages such as Java: they are more expressive and easier for programmers to understand, and converting them to machine code is left to compilers.
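
For comparison, here is the same addition written in a higher-level language. Python is used as one example of such a language. The variable names and the values 7 and 5 are made up for illustration and stand in for the bytes at addresses 100 and 101.

```python
# The work of the four assembly instructions, expressed as one addition.
a = 7          # example value standing in for the byte at address 100
b = 5          # example value standing in for the byte at address 101
total = a + b  # no registers or memory addresses to manage by hand
```

The programmer names the values and states the intent. Choosing registers, addresses, and instruction numbers is left to the language's tooling.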