The RAMM Computer

RAMM Assembly Language

The fundamental language of any computer is its absolute machine language. Machine language is the only language to which the computer can respond directly; all other computer languages must therefore be translated into the computer's machine language before a program can be executed. There are many different machine languages, because each computer processor has its own specific set of instructions that it can execute. The machine language of a computer is normally represented in binary.

There are several difficulties inherent in programming in numeric machine code. Numeric operation codes (even when represented in decimal code or hexadecimal code corresponding to the binary representation) are hard to associate with any specific operation such as "add", "jump", and so on. The assignment of addresses to both instructions and data and the insertion of appropriate addresses as operands in instructions are tedious and conducive to error. Changing a few words of code in a program may make it necessary to alter many addresses scattered throughout the program.

A programmer very seldom writes a program in machine language. Instead, languages categorized as assembly languages are sometimes used. Assembly language is characterized by the fact that for each assembly language instruction there is usually one machine language instruction that will accomplish the desired operation. We say that there is a one-to-one correspondence between assembly language instructions and machine language instructions. The assembly language of a computer is termed machine dependent since the allowable instructions in an assembly language must usually have counterparts in the machine language of the computer involved. We can think of the assembly language for a computer as being merely a symbolic form of machine language coding with some assistance from a translator program (an assembler) in checking that the code is grammatically (or syntactically) correct.

Assembly language provides the programmer with greater freedom and ease in program coding than is available in machine language. Once the programmer has written the program in assembly language, the program is input to the computer and translated into machine language by a program called an assembler. The assembler translates the symbols and symbolic codes of assembly language, called the source code, into numeric codes and addresses in the machine language of the computer. The machine language program that results from this translation process is called the object code. The object code is then input to the computer, which executes its instructions. The diagram below (Figure 10) illustrates the processing involved, from the input of a source program coded in assembly language to the output produced when the object program is finally executed to solve the problem.

Figure 10

The components of an assembly language are similar to those of a written natural language. For example, there is an alphabet, there are rules of grammar (syntax), and there are statements. Some statements produce numerical information to be stored in memory and others simply give directions to the assembler. In the pages that follow, the assembly language for the RAMM computer will be presented.

There are differences, however, between computer languages and natural languages that may cause major difficulties to the programmer. First, unlike natural languages, the grammar of an assembly language is usually well defined, very precise, and somewhat restrictive. Second, a computer language is used for communication between a human being and a machine, while a natural language is used for communication between two human beings. Third, a computer (or computer program) will do exactly what it has been told to do. A person who receives an ungrammatical letter is likely to allow for the poorly formed sentence or the misused word and understand, at least to some extent, what was intended rather than what was said. The computer makes no such allowances.

For even greater freedom in writing programs, high-level languages (HLLs) may be used. Many of these languages are also termed procedure-oriented because the symbols and notations used in the language are related to the type of problem being solved. For example, much of the code in a FORTRAN program looks like the mathematical formulas that the program was written to solve. Some common high-level languages are FORTRAN, COBOL, Pascal, C, C++ and BASIC. An instruction written in a high-level language usually results in a number of machine language instructions. For this reason, we say that there is a one-to-many correspondence between instructions written in a high-level language and instructions in machine language. High-level languages are machine independent because a program written in a high-level language can be translated into the machine languages of many different computers. Just as assemblers translate assembly language programs, compilers or interpreters translate high-level language programs into machine language before the instructions can be executed.

The remainder of this paper will be concerned with a presentation and description of the RAMM assembly language. As with any written language, there must be an alphabet or character set so that information may be transcribed.

Characters

A character in RAMM assembly language is a printed alphabetic capital letter A through Z, a numeral 0 through 9, or one of the special characters +, -, and blank. The term alphanumeric character includes both alphabetic and numeric characters and no others. In this presentation of the RAMM assembly language, the lowercase b will be used to indicate a blank when it is desirable to call attention to it. The RAMM Assembler does not, however, recognize the lowercase b. Blanks will also be indicated by blanks where this is not likely to cause confusion. The term first character refers to the leftmost character of a string of characters.

Symbols

A RAMM symbol is composed of up to four characters left justified in a field. The first character must be alphabetic; the remaining, alphanumeric. A blank terminates a symbol. Left justified means that the leftmost character in the field appears in the leftmost column of the field. Any blank characters in the field will be at the right side of the field. The following are examples of RAMM symbols:

Valid

A2Z

DATA

345

END See note*

Invalid	Reason
XY-Z	illegal character, -
T 4	illegal character, blank
CA2B3	too many characters

*Note: It is recommended that the OP code mnemonics not be used as symbols.

Since a blank terminates a symbol, Tb1 and Tb2 would both be taken as the symbol T.
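The symbol rules above can be restated as a small check. The following Python sketch is illustrative only: the names `is_valid_symbol` and `scan_symbol` are assumptions of this example and are not part of the RAMM Assembler. The first function tests a string against the stated rules; the second shows how a blank cuts a field short, which is why Tb1 and Tb2 are both read as T.

```python
import string

ALPHABETIC = set(string.ascii_uppercase)          # A through Z
ALPHANUMERIC = ALPHABETIC | set(string.digits)    # A through Z, 0 through 9
MAX_SYMBOL_LENGTH = 4

def is_valid_symbol(text):
    """Return True if text obeys the symbol rules stated above."""
    if not 1 <= len(text) <= MAX_SYMBOL_LENGTH:
        return False
    if text[0] not in ALPHABETIC:
        return False
    return all(ch in ALPHANUMERIC for ch in text[1:])

def scan_symbol(field):
    """Return the characters of a field up to, not including, the first blank."""
    return field.split(" ", 1)[0]

print(is_valid_symbol("DATA"))    # True
print(is_valid_symbol("CA2B3"))   # False: too many characters
print(scan_symbol("T 1"))         # T
```

Explicit character sets are used instead of `str.isalpha`, which would also accept lowercase letters that are not in the RAMM character set.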

A symbol is a name for a location in memory. When a programmer wishes to refer to a location in memory, he names the location by a symbol and uses the symbol whenever he wishes to refer to that memory location. It is desirable to use symbols that have some association to the problem being solved. For example, a location containing the integer constant 2 might well be named with the symbol K2, or a location containing the result of an addition operation might be named SUM.

Constants

A constant in RAMM is one of the following:

A nonnegative decimal integer containing exactly four decimal digits. All leading zeros must appear. The largest nonnegative value is 9899. A nonnegative constant must be unsigned.

A negative decimal integer that begins with a minus sign followed by exactly three decimal digits. The smallest negative value is -999.

A blank terminates the constant. Examples of valid RAMM constants are:

0001
0123
-003
-123
8999

Note that, in addition to the range of constant values that can be entered using the DEC operation code, values up to a maximum of 9999 can be computed, stored and printed during execution of a program. For example, the constant 8999 could be added to the constant 1000 leaving a result of 9999 in the A-Register. This value could then be stored into a main memory location for further use during execution.
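As an illustration of the two constant forms and of the note above, here is a minimal Python sketch. The function name `parse_constant` and the choice to return `None` for an invalid constant are assumptions of this example; the real assembler's behavior on an invalid constant is not described here.

```python
import re

NONNEGATIVE = re.compile(r"[0-9]{4}")   # exactly four digits, unsigned
NEGATIVE = re.compile(r"-[0-9]{3}")     # minus sign, then exactly three digits
LARGEST_CONSTANT = 9899                 # largest value that may be written

def parse_constant(text):
    """Return the integer value of a RAMM constant, or None if text is not one."""
    if NEGATIVE.fullmatch(text):
        return int(text)
    if NONNEGATIVE.fullmatch(text):
        value = int(text)
        return value if value <= LARGEST_CONSTANT else None
    return None

print(parse_constant("0123"))   # 123
print(parse_constant("-003"))   # -3
print(parse_constant("123"))    # None: leading zero missing
print(parse_constant("9999"))   # None: too large to be written as a constant

# A value above 9899 can still be computed during execution:
print(parse_constant("8999") + parse_constant("1000"))   # 9999
```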

Delimiters

A blank is used in RAMM as a delimiter to terminate a symbol or a constant and to separate fields.

Instructions

An instruction in RAMM assembly language has the form:

OP R

where the OP is an operation code and R is a RAMM symbol or RAMM constant.

In an instruction, the operation code, OP, tells what action is to take place. Operation codes are defined by the designers of the language and are usually mnemonic (memory-aiding) abbreviations of the operations to be performed. R is the operand of the instruction and tells what information is to be used in performing the operation. In many instructions, R is a symbol that represents the address of a word to be used. A list of RAMM operation codes (op codes) is given elsewhere.

There are two types of instructions in an assembly language: computer instructions and pseudo-instructions (often called pseudo-operations). In a computer instruction, the operation code corresponds to a machine language operation code of the computer. For a computer instruction, the assembler translates a mnemonic OP into a machine language code. Similarly, a symbolic R is translated into the numeric equivalent (an address between 00 and 99, inclusive) that the assembler has assigned to it. The assembler then forms from these parts the numeric machine instruction that corresponds to the symbolic computer instruction written by the programmer.
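The translation of a computer instruction can be sketched as a table lookup followed by a symbol lookup. The RAMM op code list and the layout of a machine word are not given in this section, so everything specific below is an assumption made purely for illustration: the mnemonics LDA, ADD and STA, their numeric codes, and a four-digit word made of a two-digit op code followed by a two-digit address.

```python
# Hypothetical op code table; the real RAMM codes are listed elsewhere.
OP_CODES = {"LDA": 1, "ADD": 2, "STA": 3}

def encode(op, operand, symbol_table):
    """Combine a mnemonic and a symbolic operand into one numeric word.

    Assumes, for illustration only, a four-digit word whose first two
    digits are the op code and whose last two digits are the address.
    """
    address = symbol_table[operand]          # address assigned by the assembler
    if not 0 <= address <= 99:
        raise ValueError("address out of range: %d" % address)
    return "%02d%02d" % (OP_CODES[op], address)

# Suppose the assembler has assigned address 57 to the symbol K2.
print(encode("ADD", "K2", {"K2": 57}))   # 0257
```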

The second type of assembly language instruction, the pseudo-instruction, furnishes information to the assembler and does not correspond to a numeric machine instruction. From the pseudo-instruction, the assembler determines some kind of action that it is to take with respect to the program that it is translating. A pseudo-instruction may, for example, instruct the assembler that a certain number of consecutive memory locations are allocated with the first one named by a symbol, or that the assembler has reached the end of the program and hence is to stop translating.

Statements

Statements are the "sentences" of the assembly language. The most common form of statement contains an instruction consisting of an operation code and its operand (usually a symbol for an address). It also contains, at least implicitly, the address of the memory location in which the instruction is to be stored, and remarks by the programmer. All of this must be presented to the computer in a standard form, which will now be discussed.

Fields

The programmer enters the consecutive statements for his program, one per line, in a text file. In many assembly languages (including RAMM), certain parts of a statement must be placed in specific columns in the line. The consecutive columns reserved for some specific use are collectively called a field. A given field is usually identified according to the type of information to be placed in that field, for example, "the OP code field." If information must be located within a field so that the leftmost character occurs in the leftmost column of the field, it is said that the contents of the field must be left justified. Similarly, if the rightmost character must occur in the rightmost column, it is said that the contents of the field must be right justified. All symbols, op codes and constants in RAMM assembly language are left justified in the fields in which they appear.

The RAMM Assembly language statement is composed of six fields: location, operation code, address, sign, number, and remarks. Depending on the operation code, one or more of these fields may be blank.

Col.	1-4	5-6	7-9	10	11-14	15	16	17-19	20-40
Field	location		op code		address	sign	number		remarks

The location field (columns 1-4) may contain blanks or a valid RAMM symbol: up to four alphanumeric characters, left-justified, with no embedded blanks.

The operation code field (columns 7-9) may contain a valid three-character operation code.

The address field (columns 11-14) may contain:

blanks, or
a valid RAMM symbol: up to four alphanumeric characters, left-justified, or
a valid RAMM constant less than 9900.

The sign field (column 15) may contain a +, -, or blank. This field is used to allow addressing relative to a symbol in the address field, that is, a symbol that appears in the location field of some statement.

The number field (column 16) can be blank or contain an integer between 0 and 9, inclusive. Together with the sign field, it addresses a memory location that many locations after (+) or before (-) the one identified by that symbol.

The remarks field (columns 20 through 40) is completely ignored by the assembler (except in producing a listing). This field may contain any alphanumeric or special character.

Field separators

Columns 5-6, 10, and 17-19 must contain blanks. These columns serve as field separators.
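The column layout above lends itself to slicing a line by position. The Python sketch below is one possible reading of that layout, not the assembler's actual code; the function name `split_fields`, the dictionary keys, and the padding of short lines with blanks are assumptions of this example, and ADD is used only as a placeholder mnemonic.

```python
def split_fields(line):
    """Split one statement line into its six fields by column position."""
    line = line.rstrip("\n").ljust(40)        # pad short lines with blanks
    # Columns 5-6, 10 and 17-19 are the field separators.
    separators = line[4:6] + line[9:10] + line[16:19]
    if separators.strip(" "):
        raise ValueError("columns 5-6, 10 and 17-19 must be blank")
    return {
        "location": line[0:4].rstrip(" "),    # columns 1-4
        "op": line[6:9].rstrip(" "),          # columns 7-9
        "address": line[10:14].rstrip(" "),   # columns 11-14
        "sign": line[14:15].strip(" "),       # column 15
        "number": line[15:16].strip(" "),     # column 16
        "remarks": line[19:40].rstrip(" "),   # columns 20-40
    }

fields = split_fields("LOOP  ADD DATA+2   ADD THIRD WORD")
print(fields["location"], fields["op"], fields["address"],
      fields["sign"], fields["number"])       # LOOP ADD DATA + 2
print(fields["remarks"])                      # ADD THIRD WORD
```

Python slices count from zero and exclude the end index, so columns 1-4 become `line[0:4]`.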

Instruction statement

In an instruction statement, the OP code field and the address field must contain a legal RAMM assembly instruction. The location field may be blank, or may contain a RAMM symbol. The remarks field may contain anything desired. The sign and number fields may both be blank or contain a sign and number, respectively.
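The effect of the sign and number fields can be shown with a short calculation. This Python sketch assumes that the address field holds a symbol and that a symbol table mapping symbols to addresses already exists; the function name `effective_address` and the error handling are hypothetical.

```python
def effective_address(symbol_table, address, sign="", number=""):
    """Resolve an address field plus optional sign and number to a location."""
    base = symbol_table[address]
    if sign == "" and number == "":
        return base                           # no relative addressing
    if sign not in ("+", "-") or number not in tuple("0123456789"):
        raise ValueError("sign and number must be given together")
    offset = int(number)
    result = base + offset if sign == "+" else base - offset
    if not 0 <= result <= 99:
        raise ValueError("address outside 00-99")
    return result

table = {"DATA": 40}                          # DATA assumed to be at address 40
print(effective_address(table, "DATA"))            # 40
print(effective_address(table, "DATA", "+", "2"))  # 42
print(effective_address(table, "DATA", "-", "3"))  # 37
```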

Pseudo-instruction statement

In a pseudo-instruction statement the OP code field must contain a legal RAMM pseudo-instruction (BSS, DEC, or END). The address field must contain a legal RAMM constant or symbol. The location field may contain blanks or a legal RAMM symbol. The remarks field may contain anything desired.

End statement

The END statement is formed by using a blank (recommended) location field, the END pseudo-instruction as the OP code, and a remark, if desired, in the remarks field. This statement must be the final statement of the program, and no statement may occur after it. This statement stops the translation process. Note that the END pseudo-instruction corresponds to the 99 code that the RAMM computer recognizes as the marker for the end of the machine language code. If execution of the machine language program is not supposed to begin at 00, the address field of the END statement should contain a RAMM symbol that indicates the first instruction to be executed in the program.

Use of the Location field

In RAMM assembly language the location field is synonymous in all statements with an address in memory. The operation code and address fields of the statement determine what information the assembler puts into that location in memory. The programmer writes a symbol in the location field of a statement only if he must refer to that location elsewhere in his program. Since a symbol is synonymous with an address in memory, it is obvious that the same symbol cannot refer to more than one address. Therefore, a given symbol may be used only once per program in the location field of a statement. If the programmer inadvertently uses the symbol more than once in the location field, the assembler will use the address of the first occurrence.

A symbol in the address field of a statement (the R-term of an instruction) refers to the location that bears that symbol name. If the programmer must refer in some instruction to a location, that location must be named by a symbol. Hence every RAMM symbol occurring in the address field within a program must occur in the location field of some statement in the program. If the programmer inadvertently refers in the address field of a statement to a symbol that does not occur in the location field of any statement in the program, the assembler informs the programmer that he has used an undefined symbol.
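The two rules just described (the first occurrence of a repeated symbol wins, and an address-field symbol with no definition is reported) can be sketched in Python as follows. This is a simplified model, not the assembler's implementation: each statement is assumed to occupy exactly one memory word starting at address 00, which ignores any pseudo-instruction that reserves several locations, and the function names are hypothetical.

```python
def build_symbol_table(statements):
    """Assign addresses to location-field symbols; the first definition wins.

    statements is a list of (location, address_field) pairs. For
    simplicity each statement is assumed to occupy one memory word,
    starting at address 00.
    """
    table = {}
    for address, (location, _) in enumerate(statements):
        if location and location not in table:
            table[location] = address
    return table

def undefined_symbols(statements, table):
    """Return address-field symbols that no location field defines."""
    missing = []
    for _, operand in statements:
        is_symbol = operand[:1].isalpha()     # constants start with a digit or -
        if is_symbol and operand not in table and operand not in missing:
            missing.append(operand)
    return missing

program = [
    ("",   "K2"),      # refers to K2
    ("",   "SUM"),     # refers to SUM, which is never defined
    ("K2", "0002"),    # first definition of K2, at address 2
    ("K2", "0005"),    # repeated definition, ignored
]
table = build_symbol_table(program)
print(table)                               # {'K2': 2}
print(undefined_symbols(program, table))   # ['SUM']
```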