Machine code instructions

Here is a simplified account of machine code instructions for the 8088. The length of each machine code instruction is a multiple of 8 bits, since memory is addressed in bytes. This keeps the machine code compact: if instructions could be 7 bits or 11 bits long, the leftover bits in the last byte of each instruction would be wasted.

Machine code instructions are also chosen to be as short as possible. That is the reason the conventional machine language for the 8088 is so complicated. Shorter instructions take less memory and are fetched faster. To make the encoding more regular, the designers would have had to add 1 or 2 bits to the instruction. But if 1 bit is added, then 1 byte is added, because memory is accessed in bytes. So, space was saved at the expense of flexibility.

This is why only certain registers can be used for addressing: only three bits were allowed in the instruction, which gives eight combinations. If more registers were allowed for indirect memory references, then more bits would have to be added to the instruction. The code is found in the R/M field, described below. Here is a table that shows the relationship between bits in the machine code and addressing modes when making an indirect memory reference (disp is the optional displacement; as the examples below show, code 110 with Mod=00 is used for a direct memory address instead):

	Code        Indirect reference
	000         DS:[BX+SI+disp]
	001         DS:[BX+DI+disp]
	010         SS:[BP+SI+disp]
	011         SS:[BP+DI+disp]
	100         DS:[SI+disp]
	101         DS:[DI+disp]
	110         SS:[BP+disp]
	111         DS:[BX+disp]

Similarly, the segment registers are limited in their use because of the size of the field that specifies the register in the conventional machine language. The register is stored in the REG field, described below. There is also a bit, the W bit, which indicates whether the access is to a byte or a word. Here is a table of the registers that can be specified:

	Code        Register when W=1   Register when W=0
	000         AX                  AL
	001         CX                  CL
	010         DX                  DL
	011         BX                  BL
	100         SP                  AH
	101         BP                  CH
	110         SI                  DH
	111         DI                  BH
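
The register table can be read as a simple lookup keyed by the 3-bit code and the W bit. The following Python sketch illustrates this; the names WORD_REGISTERS, BYTE_REGISTERS and register_name are made up for this example and are not part of any assembler.

```python
# Register names indexed by the 3-bit code, copied from the table above.
WORD_REGISTERS = ["AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI"]  # W=1
BYTE_REGISTERS = ["AL", "CL", "DL", "BL", "AH", "CH", "DH", "BH"]  # W=0


def register_name(code, w):
    """Return the register selected by a 3-bit code and the W bit."""
    if not 0 <= code <= 7:
        raise ValueError("register code must fit in 3 bits")
    return WORD_REGISTERS[code] if w else BYTE_REGISTERS[code]


print(register_name(0b011, 1))  # word access, code 011
print(register_name(0b100, 0))  # byte access, code 100
```

Note that the same code selects a different register depending on W: code 100 is SP for a word access but AH for a byte access.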

The layout for a machine instruction

All machine instructions on the 8088 have a similar layout. There are instructions with no operands, 1 operand and 2 operands. They are similar, so only the 2 operand instructions will be considered here.

There is space for the OPCODE, which tells what the instruction does. There are two fields for the operands. One operand is always a register; the other can be a register or memory. The register field always comes first in the machine code instruction, so there is a bit to indicate whether the register field is the source or the destination of the operation. There is also a bit to indicate whether this is a byte or word operation. Optionally, extra bytes are added for immediate values, direct memory addresses, or indirect memory displacements. Here is a simplified layout:

OP        6 bits         Indicates the operation being performed
Dest      1 bit          A 1 indicates that the register operand is the destination
W         1 bit          A 1 indicates that the operation is on words
Mod       2 bits         Indicates what the second operand is:
                    00   direct or indirect memory reference, no displacement
                    01   indirect memory reference with a byte displacement
                    10   indirect memory reference with a word displacement
                    11   operand is a register
Reg       3 bits         Indicates which register is being used
R/M       3 bits         Used with Mod to indicate what the second operand is; R/M stands
                         for register or memory
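
As a rough illustration of the layout, the first two bytes of an instruction can be split into these six fields with shifts and masks. This is a minimal Python sketch; the function name split_fields and the dictionary keys are invented for the example.

```python
def split_fields(byte1, byte2):
    """Split the first two instruction bytes into the fields of the layout."""
    return {
        "op": byte1 >> 2,             # top 6 bits of the first byte
        "dest": (byte1 >> 1) & 1,     # 1 = register operand is the destination
        "w": byte1 & 1,               # 1 = word operation, 0 = byte operation
        "mod": byte2 >> 6,            # top 2 bits of the second byte
        "reg": (byte2 >> 3) & 0b111,  # middle 3 bits
        "rm": byte2 & 0b111,          # low 3 bits
    }


# 03 1E are the first two bytes of ADD BX,NUM in the examples that follow.
fields = split_fields(0x03, 0x1E)
for name, value in fields.items():
    print(name, format(value, "b"))
```

The first byte holds OP, Dest and W; the second byte holds Mod, Reg and R/M. Any displacement, address or immediate bytes come after these two.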

Some examples using the ADD instruction

03 1E 0000 R        ADD  BX,NUM
000000 11 00 011 110

OP   000000 Add
Dest 1      Reg operand is destination
Word 1      Working on words
Mod  00     Mod=00 and R/M=110 means memory direct
Reg  011    BX
R/M  110    Mod=00 and R/M=110 means memory direct
0000 R      Offset of NUM, relocatable

01 1E 0000 R             ADD  NUM,BX
000000 01 00 011 110

OP   000000 Add
Dest 0      Reg is source
Word 1      Working on words
Mod  00     Mod=00 and R/M=110 means memory direct
Reg  011    BX
R/M  110    Mod=00 and R/M=110 means memory direct
0000 R      Offset of NUM, relocatable
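
The two encodings above differ only in the Dest bit. The following Python sketch packs the six fields back into two bytes to make that visible; encode_two_bytes and the constants ADD, BX and DIRECT are names chosen for this illustration only.

```python
def encode_two_bytes(op, dest, w, mod, reg, rm):
    """Pack the six fields of the simplified layout into two bytes."""
    byte1 = (op << 2) | (dest << 1) | w
    byte2 = (mod << 6) | (reg << 3) | rm
    return bytes([byte1, byte2])


ADD = 0b000000   # op code for ADD register with register/memory
BX = 0b011       # register code from the register table
DIRECT = 0b110   # R/M code that means "direct address" when Mod=00

# ADD BX,NUM: the register operand is the destination (Dest=1).
print(encode_two_bytes(ADD, 1, 1, 0b00, BX, DIRECT).hex())
# ADD NUM,BX: the register operand is the source (Dest=0).
print(encode_two_bytes(ADD, 0, 1, 0b00, BX, DIRECT).hex())
```

Only the first byte changes between the two instructions (03 versus 01); the second byte and the address bytes that follow are identical.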

83 06 0000 R 07     ADD  NUM,7
100000 11 00 000 110
Immediate has a different op code. The Dest bit does not select a direction here (it is 1 because the immediate value is stored as a single byte), and the Reg field is 000 for ADD.

OP   100000 Add immediate
Dest 1      1 here because the immediate is a single byte
Word 1      Working on words
Mod  00     Mod=00 and R/M=110 means memory direct
Reg  000    000 selects ADD; no register is encoded
R/M  110    Mod=00 and R/M=110 means memory direct
0000 R      Offset of NUM, relocatable
07          Immediate value, not relocatable

83 C3 07    ADD     BX,7
100000 11 11 000 011

OP   100000 Add immediate
Dest 1      1 here because the immediate is a single byte
Word 1      Working on words
Mod  11     Storing in register
Reg  000    000 selects ADD; no register is encoded
R/M  011    BX
07          Immediate value

83 C0 07  ADD       AX,7
100000 11 11 000 000

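Immediate has a different op code. The Dest bit does not select a direction here (it is 1 because the immediate value is stored as a single byte), and the Reg field is 000 for ADD.

OP   100000 Add immediate
Dest 1      1 here because the immediate is a single byte
Word 1      Working on words
Mod  11     Storing in register
Reg  000    000 selects ADD; no register is encoded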
R/M  000    AX
07          Immediate value

03 D3               ADD  DX,BX
000000 11 11 010 011

OP   000000 Add
Dest 1      Reg is destination
Word 1      Working on words
Mod  11     Register to register
Reg  010    DX
R/M  011    BX

03 17               ADD  DX,[BX]
000000 11 00 010 111

OP   000000 Add
Dest 1      Reg is destination
Word 1      Working on words
Mod  00     Mod=00 and R/M<>110 means memory indirect, no displacement
Reg  010    DX
R/M  111    DS:[BX], no displacement here

03 57 02    ADD     DX,[BX+2]
000000 11 01 010 111

OP   000000 Add
Dest 1      Reg is destination
Word 1      Working on words
Mod  01     Mod=01 means memory indirect with an 8-bit displacement
Reg  010    DX
R/M  111    DS:[BX+disp], disp is 2 here
02          Displacement
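
The register and memory examples above can be checked with a small decoder. The Python sketch below handles only the ADD form with op code 000000 and is an illustration of the tables in this section, not a full disassembler. The names decode_add and RM_BASES are invented for it, segment defaults (DS or SS) are left out of the output, and a relocatable address is printed as whatever 16-bit value is stored in the bytes.

```python
WORD_REGISTERS = ["AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI"]  # W=1
BYTE_REGISTERS = ["AL", "CL", "DL", "BL", "AH", "CH", "DH", "BH"]  # W=0
# Base and index registers for each R/M code (segment defaults omitted).
RM_BASES = ["BX+SI", "BX+DI", "BP+SI", "BP+DI", "SI", "DI", "BP", "BX"]


def decode_add(code):
    """Decode one ADD register with register/memory instruction."""
    byte1, byte2 = code[0], code[1]
    if byte1 >> 2 != 0b000000:
        raise ValueError("only op code 000000 (ADD) is handled in this sketch")
    dest = (byte1 >> 1) & 1
    w = byte1 & 1
    mod, reg, rm = byte2 >> 6, (byte2 >> 3) & 0b111, byte2 & 0b111
    registers = WORD_REGISTERS if w else BYTE_REGISTERS

    if mod == 0b11:
        # Second operand is a register.
        other = registers[rm]
    elif mod == 0b00 and rm == 0b110:
        # Direct memory reference: a 16-bit address follows, low byte first.
        address = int.from_bytes(code[2:4], "little")
        other = f"[{address:04X}h]"
    else:
        # Indirect reference with a 0, 1 or 2 byte displacement.
        size = {0b00: 0, 0b01: 1, 0b10: 2}[mod]
        disp = int.from_bytes(code[2:2 + size], "little", signed=True)
        other = f"[{RM_BASES[rm]}]" if size == 0 else f"[{RM_BASES[rm]}{disp:+d}]"

    # The Dest bit decides which operand is written first.
    first, second = (registers[reg], other) if dest else (other, registers[reg])
    return f"ADD {first},{second}"


for text in ("03d3", "0317", "035702", "011e0000"):
    print(text, "->", decode_add(bytes.fromhex(text)))
```

The decoder checks for Mod=00 with R/M=110 before consulting the indirect table, because that one combination is reserved for direct addresses rather than meaning [BP] with no displacement.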