5.1.1 Description

An assembly language statement consists of a "symbol", a "mnemonic", "operands", and a "comment".

[symbol][:]       [mnemonic]     [operand], [operand]    ;[comment]

 

Separate the symbol from the mnemonic with a colon or with one or more whitespace characters. Which of the two is used depends on the instruction coded by the mnemonic.

 

It does not matter whether blanks are inserted at the following locations.

-

Between the symbol name and colon

-

Between the colon and mnemonic

-

Before the second and subsequent operands

-

Before semicolon that indicates the beginning of a comment

 

One or more blanks are required at the following location.

-

Between the mnemonic and the operand

Figure 5.1

Organization of Assembly Language Statement

 

One assembly language statement is written on one line. A line feed (return) ends the statement.
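The field layout described above can be modeled with a short Python sketch. It only illustrates the rules in this section and is not the assembler's parser: the function name split_statement is hypothetical, only the colon-terminated symbol form is recognized, and colons or semicolons inside character data are not handled.

```python
def split_statement(line):
    """Split one statement into symbol, mnemonic, operands, and comment.

    Simplified model: a symbol is recognized only when it is followed by a
    colon, and quoted character data is not treated specially.
    """
    # The comment runs from the first semicolon to the end of the line.
    code, semicolon, comment = line.partition(";")

    symbol = None
    head, colon, rest = code.partition(":")
    # Blanks between the symbol name and the colon are allowed.
    if colon and len(head.split()) == 1:
        symbol = head.strip()
        code = rest

    # One or more blanks separate the mnemonic from the operands.
    parts = code.split(None, 1)
    mnemonic = parts[0] if parts else None
    operands = [op.strip() for op in parts[1].split(",")] if len(parts) > 1 else []

    return {
        "symbol": symbol,
        "mnemonic": mnemonic,
        "operands": operands,
        "comment": comment.strip() if semicolon else None,
    }
```

Calling split_statement("mov     1, r10") yields the mnemonic "mov" and the operand list ["1", "r10"], with no symbol and no comment.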

(1)

Character set

The characters that can be used in a source program (assembly language) supported by the assembler are of the following three types.

 

(a)

Language characters

These characters are used to code instructions in the source.

Table 5.1

Language Characters and Usage of Characters

Character               Usage
Numerals                Constitutes an identifier and constant
Lowercase letter (a-z)  Constitutes a mnemonic, identifier, and constant
Uppercase letter (A-Z)  Constitutes a mnemonic, identifier, and constant
@                       Constitutes an identifier
_ (underscore)          Constitutes an identifier
. (period)              Constitutes an identifier and constant
, (comma)               Delimits an operand
: (colon)               Delimits a label
; (semicolon)           Beginning of comment
*                       Multiplication operator
/                       Division operator
+                       Positive sign and addition operator
- (hyphen)              Negative sign and subtraction operator
' (single quotation)    Character constant
<                       Relational operator
>                       Relational operator
( )                     Specifies an operation sequence
$                       Symbol indicating the start of a control instruction equivalent to an assembler option
                        Symbol specifying relative addressing
                        gp offset reference of label
=                       Relational operator
!                       Beginning immediate addressing and negation operator
(blank)                 Field delimiter
~                       Concatenation symbol (in macro body)
&                       Logical product operator
#                       References the absolute address of a label and begins a comment (when used at the beginning of a line)
[ ]                     Indirect indication symbol
" (double quotation)    Start and end of character string constant
%                       ep offset reference of a label and remainder operator
<<                      Left shift operator
>>                      Right shift operator
|                       Logical sum operator
^                       Exclusive OR operator

(b)

Character data

Character data refers to the characters used to write character string constants, character constants, and the quote-enclosed operands of some control instructions.

Caution

Character data can use all characters (including multibyte kanji, although the encoding depends on the OS).

(c)

Comment characters

Comment characters are used to write comments.

Caution

Comment characters and character data have the same character set.

(2)

Symbol

The symbol field is for symbols, which are names given to addresses and data objects. Symbols make programs easier to understand.

(a)

Symbol types

Symbols can be classified as shown below, depending on their purpose and how they are defined.

Symbol Type              Purpose                                                            Definition Method
Name                     Used as names for addresses and data objects in source programs.   Write in the symbol field of a symbol definition directive.
Label                    Used as labels for addresses and data objects in source programs.  Write a symbol followed by a colon ( : ).
External reference name  Used to reference symbols defined by other source modules.         Write in the operand field of an external reference directive.
Section name             Used at link time.                                                 Write in the symbol field of a section definition directive.
Macro name               Used to name macros in source programs.                            Write in the symbol field of a macro directive.

(b)

Conventions of symbol description

Observe the following conventions when writing symbols.

-

The characters which can be used in symbols are the alphanumeric characters and special characters (@, _, .).
The first character in a symbol cannot be a digit (0 to 9).

-

The maximum number of characters for a symbol is 4,294,967,294 (=0xFFFFFFFE) (theoretical value). The actual number that can be used depends on the amount of memory, however.

-

Reserved words cannot be used as symbols.
See "5.5 Reserved Words" for a list of reserved words.

-

The same symbol cannot be defined more than once.
However, a symbol defined with the .set directive can be redefined with the .set directive.

-

The assembler distinguishes between lowercase and uppercase characters.

-

When a label is written in a symbol field, the colon ( : ) must appear immediately after the label name.

 

Example

Correct symbols

CODE01  .cseg               ; "CODE01" is a section name.
VAR01   .set    0x10        ; "VAR01" is a name.
LAB01:  .dw     0           ; "LAB01" is a label.

 

Example

Incorrect symbols

1ABC    .set    0x3         ; The first character is a digit.
LAB     mov     1, r10      ; "LAB" is a label and must be separated from the mnemonic
                            ; field by a colon ( : ).
FLAG:   .set    0x10        ; The colon ( : ) is not needed for names.

 

Example

A statement composed of a symbol only

ABCD:                       ; ABCD is defined as a label.
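The character rules in the conventions above can be checked mechanically. The following Python sketch is an illustration only: the function name is_valid_symbol and the reserved_words parameter are hypothetical, "alphanumeric" is assumed to mean ASCII letters and digits, and the reserved word list itself (see "5.5 Reserved Words") is not reproduced here. The length limit and duplicate definitions are not checked.

```python
import re

# First character: a letter or one of @ _ . ; later characters may also be digits.
_SYMBOL_RE = re.compile(r"[A-Za-z@_.][A-Za-z0-9@_.]*")

def is_valid_symbol(name, reserved_words=frozenset()):
    """Return True if name satisfies the character rules for symbols.

    The comparison with reserved_words is case-sensitive, as is the
    assembler's treatment of symbols.
    """
    if name in reserved_words:
        return False
    return _SYMBOL_RE.fullmatch(name) is not None
```

With this sketch, is_valid_symbol("LAB01") is accepted, while is_valid_symbol("1ABC") is rejected because the first character is a digit.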

(c)

Points to note about symbols

The assembler generates a name automatically when a section definition directive does not specify a name. These section names are listed below.

Duplicate section name definitions are errors.

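Section Name  Directive        Relocation Attribute
.text         .cseg directive  TEXT
.const        .cseg directive  CONST
.zconst       .cseg directive  ZCONST
.zconst23     .cseg directive  ZCONST23
.bss          .dseg directive  BSS
.data         .dseg directive  DATA
.sbss         .dseg directive  SBSS
.sdata        .dseg directive  SDATA
.sbss23       .dseg directive  SBSS23
.sdata23      .dseg directive  SDATA23
.tdata        .dseg directive  TDATA
.tbss4        .dseg directive  TBSS4
.tdata4       .dseg directive  TDATA4
.tbss5        .dseg directive  TBSS5
.tdata5       .dseg directive  TDATA5
.tbss7        .dseg directive  TBSS7
.tdata7       .dseg directive  TDATA7
.tbss8        .dseg directive  TBSS8
.tdata8       .dseg directive  TDATA8
.ebss         .dseg directive  EBSS
.edata        .dseg directive  EDATA
.ebss23       .dseg directive  EBSS23
.edata23      .dseg directive  EDATA23
.zbss         .dseg directive  ZBSS
.zdata        .dseg directive  ZDATA
.zbss23       .dseg directive  ZBSS23
.zdata23      .dseg directive  ZDATA23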

(d)

Symbol attributes

Every symbol and label has both a value and an attribute.

The value is the value of the defined data object, for example a numerical value, or the value of the address itself.

Macro names do not have values.

The following table lists symbol attributes.

Attribute Type

Classification

Value

BIT

-

Symbols defined as bit values

-

Symbols defined with the EXTBIT directive

Decimal notation:

-2147483648 to 2147483647

Hexadecimal notation:
0x80000000 to 0x7FFFFFFF (signed)

MACRO

Macro names defined with the Macro directive

These attribute types have no values.

FNUMBER

Symbols defined with the FLOAT directive

(Single precision floating point)

1.40129846e-45 to 3.40282347e+38

DFNUMBER

Symbols defined with the DOUBLE directive

(Double-precision floating point)

4.9406564584124654e-324 to 1.7976931348623157e+308

(3)

Mnemonic field

Write instruction mnemonics, directives, and macro references in the mnemonic field.

If the instruction or directive or macro reference requires an operand or operands, the mnemonic field must be separated from the operand field with one or more blanks or tabs.

Example

Correct mnemonics

mov     1, r10

Example

Incorrect mnemonics

mov1, r10       ; There is no blank between the mnemonic and operand fields.
mo v    1, r10  ; The mnemonic field contains a blank.
MOVE            ; This is an instruction that cannot be coded in the mnemonic field.

(4)

Operand field

In the operand field, write operands (data) for the instructions, directives, or macro references that require them.

Some instructions and directives require no operands, while others require two or more.

When you provide two or more operands, delimit them with a comma ( , ).

The following types of data can appear in the operand field:

-

Constants (numeric constants, character constants, character string constants)

-

Register names

-

Symbols

-

Expressions

 

See the user's manual of the target device for the format and notational conventions of instruction set operands.

The following sections explain the types of data that can appear in the operand field.

(a)

Constants

A constant is a fixed value or data item and is also referred to as immediate data.

There are numeric constants, character constants and character string constants.

 

<1>

Numeric constants

Integer constants can be written in binary, octal, decimal, or hexadecimal notation.
Integer constants have a width of 32 bits. A negative value is expressed as a two's complement. If an integer value that exceeds the range expressible in 32 bits is specified, the assembler uses the lower 32 bits of that value and continues processing (it does not output any message).

Type         Notation                              Example
Binary       Prefix the number with "0b" or "0B".  0b1101
Octal        Prefix the number with "0".           074
Decimal      Simply write the number.              128
Hexadecimal  Prefix the number with "0x" or "0X".  0xA6
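The notations in the table and the 32-bit truncation rule can be sketched in Python as follows. This is a model of the documented behavior, not the assembler's implementation; the function names parse_integer_constant and as_signed32 are hypothetical, and a leading sign is assumed to be handled elsewhere as an operator.

```python
def parse_integer_constant(text):
    """Return the 32-bit value of an unsigned integer constant.

    Values that do not fit in 32 bits are silently reduced to their
    lower 32 bits, mirroring the rule described above.
    """
    lowered = text.lower()
    if lowered.startswith("0b"):
        value = int(lowered[2:], 2)       # binary
    elif lowered.startswith("0x"):
        value = int(lowered[2:], 16)      # hexadecimal
    elif len(lowered) > 1 and lowered.startswith("0"):
        value = int(lowered[1:], 8)       # octal
    else:
        value = int(lowered, 10)          # decimal
    return value & 0xFFFFFFFF

def as_signed32(value):
    """Interpret a 32-bit pattern as a two's complement signed integer."""
    return value - 0x100000000 if value & 0x80000000 else value
```

For example, parse_integer_constant("0x1FFFFFFFF") keeps only the lower 32 bits, 0xFFFFFFFF, which as_signed32 reads as -1.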

 

Floating constants consist of the following elements. Specify the exponent and mantissa as decimal constants. Omit (3), (4), and (5) when the constant is written without an exponent.

(1) sign of mantissa part ("+" is optional)

(2) mantissa part

(3) 'e' or 'E' indicating the exponent part

(4) sign of exponent part ("+" is optional)

(5) exponent part

Example

123.4
-100.
10e-2
-100.2E+5

 

You can indicate that the number is a floating constant by prefixing the mantissa with "0f" or "0F".

Example

0f10
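The element list for floating constants can be expressed as a pattern. The Python sketch below is an approximation under stated assumptions: the name parse_float_constant is hypothetical, the mantissa sign is assumed to come before the "0f" or "0F" prefix, and the mantissa is assumed to start with at least one digit. The assembler may accept or reject forms that this sketch treats differently.

```python
import re

# Groups: (1) mantissa sign, (2) mantissa, (3) signed exponent after 'e' or 'E'.
_FLOAT_RE = re.compile(r"([+-]?)(?:0[fF])?(\d+\.?\d*)(?:[eE]([+-]?\d+))?")

def parse_float_constant(text):
    """Convert a floating constant written as described above to a float."""
    match = _FLOAT_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a floating constant: {text!r}")
    sign, mantissa, exponent = match.groups()
    literal = sign + mantissa
    if exponent is not None:
        literal += "e" + exponent
    return float(literal)
```

All four examples above (123.4, -100., 10e-2, and -100.2E+5) match this pattern, as does the prefixed form 0f10.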

 

<2>

Character constants

A character constant consists of a single character enclosed by a pair of single quotation marks (' ') and indicates the value of the enclosed character (see Note).

If any of the escape sequences listed below is written between the single quotation marks, the assembler regards the sequence as a single character.

Example

'A'             ; 0x00000041
' '             ; 0x00000020 (1 blank)

Note

If a character constant is specified, the assembler assumes that an integer having the value of that character constant is specified.

Table 5.2

Value and Meaning of Escape Sequence

Escape Sequence  Value      Meaning
\0               0x00       Null character
\a               0x07       Alert
\b               0x08       Backspace
\f               0x0C       Form feed
\n               0x0A       Line feed
\r               0x0D       Carriage return
\t               0x09       Horizontal tab
\v               0x0B       Vertical tab
\\               0x5C       Backslash
\'               0x27       Single quotation mark
\"               0x22       Double quotation mark
\?               0x3F       Question mark
\ddd             0 to 0377  Octal number of up to 3 digits (0 <= d <= 7) (see Note)
\xhh             0 to 0xFF  Hexadecimal number of up to 2 digits (0 <= h <= 9, a <= h <= f, or A <= h <= F)

Note

If a value exceeding "\377" is specified, the value of the escape sequence becomes the lower 1 byte, so it can never exceed 0377. For example, the value of "\777" is 0377.
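The escape sequence table and the note on octal overflow can be modeled as below. This Python sketch is illustrative only: the name char_constant_value is hypothetical, the argument is the text between the single quotation marks, and plain characters are assumed to be single-byte (ASCII).

```python
# Single-letter escapes from the table; \0 is handled by the octal branch.
SIMPLE_ESCAPES = {
    "a": 0x07, "b": 0x08, "f": 0x0C, "n": 0x0A, "r": 0x0D, "t": 0x09,
    "v": 0x0B, "\\": 0x5C, "'": 0x27, '"': 0x22, "?": 0x3F,
}

def char_constant_value(body):
    """Return the integer value of a character constant's contents."""
    if not body.startswith("\\"):
        return ord(body)                  # ordinary single character
    rest = body[1:]
    if rest and all(c in "01234567" for c in rest):
        return int(rest, 8) & 0xFF        # \ddd: only the lower 1 byte is kept
    if rest.startswith("x"):
        return int(rest[1:], 16)          # \xhh
    return SIMPLE_ESCAPES[rest]           # KeyError for unknown sequences
```

Here char_constant_value(r"\777") evaluates to 0o377, in line with the note, because 0777 is reduced to its lower byte.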

 

<3>

Character string constants

A character string constant is expressed by enclosing a string of characters from those shown in "(1) Character set" in a pair of double quotation marks ( " ).

Example

"ab"            ; 0x6162
"A"             ; 0x41
" "             ; 0x20 (1 blank)

(b)

Register names

The following registers can be named in the operand field:

-

r0, zero, r1, r2, hp, r3, sp, r4, gp, r5, tp, r6, r7, r8, r9, r10, r11, r12, r13, r14, r15, r16, r17, r18, r19, r20, r21, r22, r23, r24, r25, r26, r27, r28, r29, r30, ep, r31, lp

 

r0 and zero (Zero register), r2 and hp (Handler stack pointer), r3 and sp (Stack pointer), r4 and gp (Global pointer), r5 and tp (Text pointer), r30 and ep (Element pointer), and r31 and lp (Link pointer) each denote the same register.
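The alias pairs listed above amount to a small lookup table. The following Python sketch shows one way to normalize a register name to its rN form; the names REGISTER_ALIASES and canonical_register are hypothetical, and register names are assumed to be written in lowercase exactly as listed.

```python
# Alias names and the numbered registers they stand for.
REGISTER_ALIASES = {
    "zero": "r0", "hp": "r2", "sp": "r3", "gp": "r4",
    "tp": "r5", "ep": "r30", "lp": "r31",
}

def canonical_register(name):
    """Map a register operand to its rN name, or raise ValueError."""
    if name in REGISTER_ALIASES:
        return REGISTER_ALIASES[name]
    digits = name[1:]
    if name.startswith("r") and digits.isdigit():
        number = int(digits)
        # Reject out-of-range numbers and forms such as "r01".
        if number <= 31 and str(number) == digits:
            return name
    raise ValueError(f"not a register name: {name!r}")
```

For instance, canonical_register("sp") and canonical_register("r3") both return "r3".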

Remark

For the ldsr and stsr instructions, the PSW and the other system registers are specified by number. In addition, PC cannot be specified as an operand in the assembler.

(c)

Symbols

The assembler supports the use of symbols as the constituents of the absolute expressions or relative expressions that can be used to specify the operands of instructions and directives.

(d)

Expressions

An expression is a combination of constants and symbols joined by operators.

Expressions can be specified as instruction operands wherever a numeric value can be specified.

See "5.1.2 Expressions and operators" for more information about expressions.

Example

TEN     .set    0x10
        mov     TEN - 0x05, r10

 

In this example, "TEN - 0x05" is an expression.

In this expression, a symbol and a numeric value are connected by the - (minus) operator. The value of the expression is 0x0B, so this expression could be rewritten as "mov 0x0B, r10".
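The arithmetic in this example can be reproduced with a small Python sketch. It is a deliberately limited model, not the assembler's expression evaluator: the name evaluate is hypothetical, tokens must be separated by blanks, only the + and - operators are handled (left to right), and constants are limited to decimal, "0x", and "0b" forms.

```python
def evaluate(expression, symbols):
    """Evaluate a blank-separated expression of symbols and constants."""
    tokens = expression.split()

    def operand(token):
        # A known symbol contributes its value; anything else is a constant.
        return symbols[token] if token in symbols else int(token, 0)

    value = operand(tokens[0])
    # Remaining tokens come in (operator, operand) pairs.
    for operator, token in zip(tokens[1::2], tokens[2::2]):
        if operator == "+":
            value += operand(token)
        elif operator == "-":
            value -= operand(token)
        else:
            raise ValueError(f"unsupported operator: {operator!r}")
    return value
```

With the symbol table {"TEN": 0x10}, evaluate("TEN - 0x05", ...) gives 0x0B, matching the rewritten instruction "mov 0x0B, r10".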

(5)

Comment

Write comments in the comment field, after a semicolon ( ; ).

The comment field continues from the semicolon to the newline code at the end of the line, or to the EOF code of the file.

Comments make it easier to understand and maintain programs.

Comments are not processed by the assembler, and are output verbatim to assembly lists.

Characters that can be written in the comment field are those shown in "(1) Character set".