The number of instructions, type of architecture (stack-based, register-based, etc.), the number of addressable registers, the number of internal busses and the type, format and location of the operands will have an influence on the format i of an instruction. For access to primary memory, the memory organization, in particular the exchange format (byte or word), byte order (remember the Endian story! cf. § V1-2.2.1) and the alignment (cf. § 2.6.1 from Darche (2012)), will have some influence. The ISA can be evaluated by the number of instructions F, their complexity, their format i and the memory space they occupy. The designer's choice will depend on the function of the desired performances (execution time, memory requirement, etc.), of the usage domains and the manufacturing cost. Complexity, if it is not material, could affect the software, in particular the compiler as in the RISC approach and in the programmer. The appendix shows the instruction encoding table for MPU 6809E from Motorola. For information, the aspect of decoding an instruction has been discussed in the previous volume.
1.1.1. Code compression
In order to limit the programs' memory footprint for reasons of cost, memory size, performance or, in particular, power saving, one solution is to compress the machine code at compilation and its decompression at execution, for example, when it is loaded in the MPU cache memory (Wolfe and Chanin 1992). One benefit lies in the fact that the compiler has not been modified. For implementation, the Huffman (1952) (de)compression algorithm can be used, for example. Because of its objectives, it is intended especially for embedded systems with an MPU/MCU5 RISC. Two industrial examples are Thumb® and Thumb-2 for which the 16-bit instruction word is a compression of the classic version of Arm® processors, which have a 32-bit format. RISC-V (Waterman 2016) has a compressed version of its code suggested by (Waterman 2011). A comparison between MPUs can be made using a measurement of the code density.
The principle can quite clearly be applied to data and to buses (cf. V2) for the same aims.
1.2. Addressing modes
We recall that the address is a whole number that makes it possible to identify (we also say locate or spot) a place in the memory (cf. § V1-2.1). This, generated by an MPU, is termed “physical” (PA for Physical Address) since it is this that will be carried by the address bus. This physical address can be positive (i.e. natural integer) or also negative (i.e. relative integer) in the case of an address in Assembly Language (AL) or machine language, for example, for a displacement relative to the current value of the PC (Program Counter). Addressing is the mechanism for accessing information (data and instructions) stored in MPU registers or in other levels of the memory hierarchy (cf. § V1-2.3). The addressing or referencing mode specifies how to reach the instruction (code addressing mode) and its operands (operand addressing mode) during its execution. This distinction between the addressing code and its operands, which may moreover be an instruction classification (cf. § 2.1), may not exist (which is the most common scenario). One of the difficulties of using the concept is that its designation and its semantics vary depending on the architecture and on the designer of the CU (Control Unit). Thus, it involves sometimes only the memory address (memory address calculation mode) or it also covers the registers (operand addressing mode). The definition is taken in its widest sense. It does not therefore only involve access to the primary memory. The different addressing modes add to the wealth of a processor, and their number still varies depending on the architectures and designers. Addressing modes are one of the ISA specification points (cf. § V1-3.5). For example, the IBM System/360 mainframe computer only has three (immediate, register and memory), but the Pentium microprocessor has nine. The more possibilities there are, the less the assembly language programmer will have to write the lines of code to carry out the desired operation. The argument refers today to the compiler designer, as assembly language is used less and less, except for teaching purposes or to meet a specific need in the use domain (cf. § V2-1.3). The other side of the coin is a more complex control unit and a longer execution for the instruction using it. We will see what the consequences of this will be covered in a future book by the author on microprocessors, which studies, among others, the RISC approach. If necessary, it specifies the means used to calculate the effective address (EA), also called the target address. This address is the result of the evaluation of an address according to its addressing mode. It will be applied on the address bus to reference the memory location if there is no virtual address mechanism at work (a mechanism that will be covered in a future book by the author on microprocessors). A synonym for EA (Effective Address) is “physical address”. In the contrary scenario, the effective address is a logical address that should then be translated into a physical address in the case of the Virtual Memory (VM) mechanism. Depending on the manufacturers, the name may also be different or there may be other nuances. To finish, some microprocessors distinguish access to instructions and to their operands from access to Input–Output (I/O) registers with specialized instructions (I/O addressing mode), thus making it possible to address different Address Space (AS) (cf. § V3-2.1.1.1). One example is shown in § 2.8.2.
We define four modes of basic (i.e. simple) addressing, which are immediate addressing, implicit and explicit addressing and memory addressing. Memory addressing is broken down into direct, relative, indirect, indexed and based addressing. These modes indicate the way to fetch or store the operand. The storage of one value can only be done in a register or memory location. There can then exist combinations of these basic addressings, called complex addressings that can be replaced using a sequence of instructions with simple addressing. The other modes involve primary memory, the stack, the bit, the registers and those specific to a particular MPU family. To illustrate these, we have chosen some instructions that are representative of various MPUs. In these examples, all digital data will be expressed on base 10 (implicit base) with the exception of indications in the form of a character prefixing or post-fixing the value or of a number in subscript. To define the operand, the rules of syntax inspired by those of the MC6809 microprocessor will be the following:
#: immediate value
$: hexadecimal base
%: binary base
The registers will be the following:
PC: Program Counter
A: accumulator
The conventions for the pseudo-code will be the following:
← or =: assignment of the right-hand value (similar to an rvalue) in the left identifier (similar to an lvalue). The symbol used means “receives” or “takes the value”. This left–right positional information avoids using parentheses, but it makes use of them for the right-hand value; they mean “contained in”.
(): address access, of which the value is framed.
@: (calculation of) the two-point symbol address: concatenation
1.2.1. Immediate addressing
Immediate addressing mode, also called immediate data addressing mode, makes it possible to initialize a register or a memory location with a constant value d, which is specified after the instruction mnemonic (cf. § 2.1) (Figure 1.8), hence its other name “literal addressing mode”. There is no effective address here since the memory is not addressed, but (DEC 1983) called it “PC immediate mode with autoincrement” as the PC (Program Counter) is used to address the value memorized immediately after the instruction code. One example is LDA #%10101010
from MC6802 from Motorola, which means that the accumulator A receives the immediate binary value 10101010b (b for binary) in byte format.
Figure 1.8. Instruction with an operand field
It is one of the fastest addressing modes since the value is included in the instruction and there is therefore no additional access to the main memory to fetch the operand accessed by another