diff --git a/README.md b/README.md
index da92a8e..a721360 100644
--- a/README.md
+++ b/README.md
@@ -1,2 +1,24 @@
# vic-sss
Commodore VIC20: Software Sprite Stack library using modern 6502 assembler
+
+I have found that programming for an 8-bit computer is a very challenging, yet
+personally rewarding, experience.
+
+VIC-SSS provides a programmer-friendly API to manage your game's playfield with
+software-rendered sprites and other animations for a flicker-free video
+experience. On-the-fly custom character manipulations with dual video buffers
+accomplish these goals, avoiding the alternative of dedicating all internal RAM
+to a smaller, but fully bit-mapped, screen. This API supports both NTSC and
+PAL VIC 20 computers, and allows for display modes that change VIC's 22x23
+screen layout.
+
+The software sprite stack promotes a flicker-free video experience, with the
+option by the game programmer to govern frame buffer flips with screen raster
+timing. While the VIC 20 computer and its graphics are primitive to begin with,
+this API was created to strike a balance between machine and programmer
+friendliness – which is what the VIC is all about. That friendliness comes at
+a cost: the code is around 2 kilobytes in size and requires nearly all of
+the internal 4 kilobytes of RAM for graphics display and management. Thus, your
+game program will require some form of memory expansion – all examples provided
+will run with an 8K expansion.
+
diff --git a/VIC-SSS-MMX.pdf b/VIC-SSS-MMX.pdf
new file mode 100755
index 0000000..c6aa0b1
Binary files /dev/null and b/VIC-SSS-MMX.pdf differ
diff --git a/ca65-primer/COMPILE.BAT b/ca65-primer/COMPILE.BAT
new file mode 100755
index 0000000..29aa2e1
--- /dev/null
+++ b/ca65-primer/COMPILE.BAT
@@ -0,0 +1,15 @@
+@echo on
+ca65.exe --cpu 6502 -t vic20 --listing --include-dir . vic-sss4.s
+ca65.exe --cpu 6502 -t vic20 --listing --include-dir . -o sample.o basic-8K.s
+ld65.exe -C basic-8k.cfg -Ln sample.sym -m sample.map -o ..\ca65-sprite.prg sample.o vic-sss4.o
+@echo off
+
+choice /C DMV /D M /T 30 /M "[D]ebug, [M]ESS, or [V]ICE? " /N
+set CHOICE=%ERRORLEVEL%
+cd ..
+
+if %CHOICE% EQU 1 mess -debug -window -natural -skip_gameinfo -skip_warnings vic20 -ramsize 16k -quik ca65-sprite.prg
+if %CHOICE% EQU 2 mess -skip_gameinfo -skip_warnings -newui vic20 -ramsize 16k -quik ca65-sprite.prg
+if %CHOICE% EQU 3 xvic -memory 8k -autostart ca65-sprite.prg
+
+exit
diff --git a/ca65-primer/VIC-SSS-MMX.h b/ca65-primer/VIC-SSS-MMX.h
new file mode 100755
index 0000000..079c0e1
--- /dev/null
+++ b/ca65-primer/VIC-SSS-MMX.h
@@ -0,0 +1,252 @@
+;*********************************************************************
+; Commodore VIC 20 Software Sprite Stack - MMX Edition
+; written by Robert Hurst
+
+1. Overview
+
+ca65 is a replacement for the ra65 assembler that was part of the cc65 C
+compiler, originally developed by John R. Dunning. I had some problems with
+ra65 and the copyright does not permit some things which I wanted to be
+possible, so I decided to write a completely new assembler/linker/archiver
+suite for the cc65 compiler. ca65 is part of this suite.
+
+Some parts of the assembler (code generation and some routines for symbol
+table handling) are taken from an older crossassembler named a816 written
+by me a long time ago.
+
+1.1 Design criteria
+
+Here's a list of the design criteria, that I considered important for the
+development:
+
+
+
+
+
+
+
+right.
+
+ .import S1, S2
+ .export Special
+ Special = 2*S1 + S2/7
+
+
+Pseudo functions expect their arguments in parentheses, and they have a result,
+either a string or an expression.
+
+.BANKBYTE
+The function returns the bank byte (that is, bits 16-23) of its argument.
+It works identically to the '^' operator.
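+
+A minimal sketch of .BANKBYTE in use, assuming a hypothetical 24 bit label
+FarTable (not part of this library). The function form and the '^'
+operator should produce the same operand:
+
+        lda     #.bankbyte(FarTable)    ; bits 16-23 of the address
+        lda     #^FarTable              ; same value, using the operator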
+
+.BLANK
+Builtin function. The function evaluates its argument in braces and yields
+"false" if the argument is non blank (there is an argument), and "true" if
+there is no argument. The token list that makes up the function argument
+may optionally be enclosed in curly braces. This allows the inclusion of
+tokens that would otherwise terminate the list (the closing right
+parenthesis). The curly braces are not considered part of the list, a list
+just consisting of curly braces is considered to be empty.
+As an example, the .IFBLANK statement may be replaced by
+
+
+ .if .blank({arg})
+
+
+
+
+
+
+
+.CONCAT
+Builtin string function. The function concatenates a list of string
+constants separated by commas. The result is a string constant that is the
+concatenation of all arguments. This function is most useful in macros and
+when used together with the .STRING builtin function. The function may
+be used in any case where a string constant is expected.
+Example:
+
+
+ .include .concat ("myheader", ".", "inc")
+
+
+
+
+This is the same as the command
++
+
+ .include "myheader.inc"
+
+
+
+
+
+
+.CONST
+Builtin function. The function evaluates its argument in braces and
+yields "true" if the argument is a constant expression (that is, an
+expression that yields a constant value at assembly time) and "false"
+otherwise. As an example, the .IFCONST statement may be replaced by
+
+
+ .if .const(a + 3)
+
+
+
+
+
+
+.HIBYTE
+The function returns the high byte (that is, bits 8-15) of its argument.
+It works identically to the '>' operator.
+
+.HIWORD
+The function returns the high word (that is, bits 16-31) of its argument.
+See: .LOWORD
+
+.IDENT
+The function expects a string as its argument, and converts this argument
+into an identifier. If the string starts with the current .LOCALCHAR, it
+will be converted into a cheap local identifier, otherwise it will be
+converted into a normal identifier.
+Example:
+
+
+ .macro makelabel arg1, arg2
+ .ident (.concat (arg1, arg2)):
+ .endmacro
+
+ makelabel "foo", "bar"
+
+ .word foobar ; Valid label
+
+
+
+
+
+
+.LEFT
+Builtin function. Extracts the left part of a given token list.
+Syntax:
++
+
+ .LEFT (<int expr>, <token list>)
+
+
+
+
+The first integer expression gives the number of tokens to extract from
+the token list. The second argument is the token list itself. The token
+list may optionally be enclosed in curly braces. This allows the
+inclusion of tokens that would otherwise terminate the list (the closing
+right paren in the given case).
+Example:
+To check in a macro if the given argument has a '#' as first token
+(immediate addressing mode), use something like this:
+
+
+ .macro ldax arg
+ ...
+ .if (.match (.left (1, {arg}), #))
+
+ ; ldax called with immediate operand
+ ...
+
+ .endif
+ ...
+ .endmacro
+
+
+
+
+See also the .MID and .RIGHT builtin functions.
+
+.LOBYTE
+The function returns the low byte (that is, bits 0-7) of its argument.
+It works identically to the '<' operator.
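+
+As a short sketch, the low and high byte functions are typically used in
+pairs to load a 16 bit address into two registers. The label Msg below is
+hypothetical:
+
+        lda     #.lobyte(Msg)   ; same as lda #<Msg
+        ldx     #.hibyte(Msg)   ; same as ldx #>Msg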
+
+.LOWORD
+The function returns the low word (that is, bits 0-15) of its argument.
+See: .HIWORD
+
+.MATCH
+Builtin function. Matches two token lists against each other. This is
+most useful within macros, since macros are not stored as strings, but
+as lists of tokens.
+The syntax is
+
+
+ .MATCH(<token list #1>, <token list #2>)
+
+
+
+
+Both token lists may contain arbitrary tokens with the exception of the
+terminator token (comma resp. right parenthesis), end-of-line and
+end-of-file.
+The token lists may optionally be enclosed in curly braces. This allows
+the inclusion of tokens that would otherwise terminate the list (the closing
+right paren in the given case). Often a macro parameter is used for any of
+the token lists.
+Please note that the function does only compare tokens, not token
+attributes. So any number is equal to any other number, regardless of the
+actual value. The same is true for strings. If you need to compare tokens
+and token attributes, use the .XMATCH function.
+Example:
+Assume the macro ASR, that will shift right the accumulator by one,
+while honoring the sign bit. The builtin processor instructions will allow
+an optional "A" for accu addressing for instructions like ROL and ROR.
+We will use the .MATCH function to check for this and print an error for
+invalid calls.
+
+
+ .macro asr arg
+
+ .if (.not .blank(arg)) .and (.not .match ({arg}, a))
+ .error "Syntax error"
+ .endif
+
+ cmp #$80 ; Bit 7 into carry
+ ror a ; Rotate carry into bit 7
+
+ .endmacro
+
+
+
+
+The macro will only accept no arguments, or one argument that must be the
+reserved keyword "A".
+See: .XMATCH
+
+.MID
+Builtin function. Takes a starting index, a count and a token list as
+arguments. Will return part of the token list.
+Syntax:
+
+
+ .MID (<int expr>, <int expr>, <token list>)
+
+
+
+
+The first integer expression gives the starting token in the list (the first
+token has index 0). The second integer expression gives the number of tokens
+to extract from the token list. The third argument is the token list itself.
+The token list may optionally be enclosed in curly braces. This allows the
+inclusion of tokens that would otherwise terminate the list (the closing
+right paren in the given case).
+Example:
+To check in a macro if the given argument has a '#' as first token
+(immediate addressing mode), use something like this:
+
+
+ .macro ldax arg
+ ...
+ .if (.match (.mid (0, 1, {arg}), #))
+
+ ; ldax called with immediate operand
+ ...
+
+ .endif
+ ...
+ .endmacro
+
+
+
+
+See also the .LEFT and .RIGHT builtin functions.
+
+.REF, .REFERENCED
+Builtin function. The function expects an identifier as argument in braces.
+The argument is evaluated, and the function yields "true" if the identifier
+is a symbol that has already been referenced somewhere in the source file up
+to the current position. Otherwise the function yields false. As an example,
+the .IFREF statement may be replaced by
+
+
+ .if .referenced(a)
+
+
+
+
+See: .DEFINED
+
+.RIGHT
+Builtin function. Extracts the right part of a given token list.
+Syntax:
+
+
+ .RIGHT (<int expr>, <token list>)
+
+
+
+
+The first integer expression gives the number of tokens to extract from the
+token list. The second argument is the token list itself. The token list
+may optionally be enclosed in curly braces. This allows the inclusion of
+tokens that would otherwise terminate the list (the closing right paren in
+the given case).
+See also the .LEFT and .MID builtin functions.
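+
+A hedged sketch of .RIGHT, mirroring the .LEFT example: it checks whether
+the last token of a macro argument is the X register. The macro name
+chkidx is made up for illustration:
+
+        .macro  chkidx  arg
+                .if (.match (.right (1, {arg}), x))
+                        ; argument ends with the token X
+                .endif
+        .endmacro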
+
+.SIZEOF
+.SIZEOF is a pseudo function that returns the size of its argument. The
+argument can be a struct/union, a struct member, a procedure, or a label. In
+case of a procedure or label, its size is defined by the amount of data
+placed in the segment where the label is relative to. If a line of code
+switches segments (for example in a macro) data placed in other segments
+does not count for the size.
+Please note that a symbol or scope must exist, before it is used together with
+.SIZEOF (this may get relaxed later, but will always be true for scopes).
+A scope has preference over a symbol with the same name, so if the last part
+of a name represents both, a scope and a symbol, the scope is chosen over the
+symbol.
+After the following code:
+
+
+ .struct Point ; Struct size = 4
+ xcoord .word
+ ycoord .word
+ .endstruct
+
+ P: .tag Point ; Declare a point
+ @P: .tag Point ; Declare another point
+
+ .code
+ .proc Code
+ nop
+ .proc Inner
+ nop
+ .endproc
+ nop
+ .endproc
+
+ .proc Data
+ .data ; Segment switch!!!
+ .res 4
+ .endproc
+
+
+
+
+
+.sizeof(Point)
+will have the value 4, because this is the size of struct Point.
+
+.sizeof(Point::xcoord)
+will have the value 2, because this is the size of the member xcoord
+in struct Point.
+
+.sizeof(P)
+will have the value 4, this is the size of the data declared on the same
+source line as the label P, which is in the same segment that P
+is relative to.
+
+.sizeof(@P)
+will have the value 4, see above. The example demonstrates that .SIZEOF
+does also work for cheap local symbols.
+
+.sizeof(Code)
+will have the value 3, since this is the amount of data emitted into the code
+segment, the segment that was active when Code was entered. Note that
+this value includes the amount of data emitted in child scopes (in this
+case Code::Inner).
+
+.sizeof(Code::Inner)
+will have the value 1 as expected.
+
+.sizeof(Data)
+will have the value 0. Data is emitted within the scope Data, but since
+the segment is switched after entry, this data is emitted into another
+segment.
+
+.STRAT
+Builtin function. The function accepts a string and an index as
+arguments and returns the value of the character at the given position
+as an integer value. The index is zero based.
+Example:
+
+
+ .macro M Arg
+ ; Check if the argument string starts with '#'
+ .if (.strat (Arg, 0) = '#')
+ ...
+ .endif
+ .endmacro
+
+
+
+
+
+
+.SPRINTF
+Builtin function. It expects a format string as first argument. The number
+and type of the following arguments depend on the format string. The format
+string is similar to the one of the C printf function. Missing things
+are: Length modifiers, variable width.
+The result of the function is a string.
+Example:
+
+
+ num = 3
+
+ ; Generate an identifier:
+ .ident (.sprintf ("%s%03d", "label", num)):
+
+
+
+
+
+
+.STRING
+Builtin function. The function accepts an argument in braces and converts
+this argument into a string constant. The argument may be an identifier, or
+a constant numeric value.
+Since you can use a string in the first place, the use of the function may
+not be obvious. However, it is useful in macros, or more complex setups.
+Example:
+
+
+ ; Emulate other assemblers:
+ .macro section name
+ .segment .string(name)
+ .endmacro
+
+
+
+
+
+
+.STRLEN
+Builtin function. The function accepts a string argument in braces and
+evaluates to the length of the string.
+Example:
+The following macro encodes a string as a pascal style string with
+a leading length byte.
+
+
+ .macro PString Arg
+ .byte .strlen(Arg), Arg
+ .endmacro
+
+
+
+
+
+
+.TCOUNT
+Builtin function. The function accepts a token list in braces. The function
+result is the number of tokens given as argument. The token list may
+optionally be enclosed in curly braces which are not considered part of
+the list and not counted. Enclosing the list in curly braces allows the
+inclusion of tokens that would otherwise terminate the list (the closing
+right paren in the given case).
+Example:
+The ldax macro accepts the '#' token to denote immediate addressing (as
+with the normal 6502 instructions). To translate it into two separate 8 bit
+load instructions, the '#' token has to get stripped from the argument:
+
+
+ .macro ldax arg
+ .if (.match (.mid (0, 1, {arg}), #))
+ ; ldax called with immediate operand
+ lda #<(.right (.tcount ({arg})-1, {arg}))
+ ldx #>(.right (.tcount ({arg})-1, {arg}))
+ .else
+ ...
+ .endif
+ .endmacro
+
+
+
+
+
+
+.XMATCH
+Builtin function. Matches two token lists against each other. This is
+most useful within macros, since macros are not stored as strings, but
+as lists of tokens.
+The syntax is
+
+
+ .XMATCH(<token list #1>, <token list #2>)
+
+
+
+
+Both token lists may contain arbitrary tokens with the exception of the
+terminator token (comma resp. right parenthesis), end-of-line and
+end-of-file.
+The token lists may optionally be enclosed in curly braces. This allows
+the inclusion of tokens that would otherwise terminate the list (the closing
+right paren in the given case). Often a macro parameter is used for any of
+the token lists.
+The function compares tokens and token values. If you need a function
+that just compares the type of tokens, have a look at the .MATCH function.
+See: .MATCH
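+
+The difference to .MATCH can be sketched with two numeric tokens. The
+messages are only illustrative:
+
+        .if (.match (1, 2))
+                .out "match: any number equals any other number"
+        .endif
+        .if (.not .xmatch (1, 2))
+                .out "xmatch: 1 and 2 differ in value"
+        .endif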
+
+Here's a list of all control commands and a description of what they do:
+
+.A16
+Valid only in 65816 mode. Switch the accumulator to 16 bit.
+Note: This command will not emit any code, it will tell the assembler to
+create 16 bit operands for immediate accumulator addressing mode.
+See also: .SMART
+
+.A8
+Valid only in 65816 mode. Switch the accumulator to 8 bit.
+Note: This command will not emit any code, it will tell the assembler to
+create 8 bit operands for immediate accu addressing mode.
+See also: .SMART
+
+.ADDR
+Define word sized data. In 6502 mode, this is an alias for .WORD and
+may be used for better readability if the data words are address values. In
+65816 mode, the address is forced to be 16 bit wide to fit into the current
+segment. See also .FARADDR. The command
+must be followed by a sequence of (not necessarily constant) expressions.
+Example:
+
+
+ .addr $0D00, $AF13, _Clear
+
+
+
+
+
+
+
+.ALIGN
+Align data to a given boundary. The command expects a constant integer
+argument that must be a power of two, plus an optional second argument
+in byte range. If there is a second argument, it is used as fill value,
+otherwise the value defined in the linker configuration file is used
+(the default for this value is zero).
+Since alignment depends on the base address of the module, you must
+give the same (or a greater) alignment for the segment when linking.
+The linker will give you a warning, if you don't do that.
+Example:
+
+
+ .align 256
+
+
+
+
+
+
+.ASCIIZ
+Define a string with a trailing zero.
+Example:
++
+
+ Msg: .asciiz "Hello world"
+
+
+
+
+This will put the string "Hello world" followed by a binary zero into
+the current segment. There may be more strings separated by commas, but
+the binary zero is only appended once (after the last one).
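+
+As a sketch of the multi-string form (the label Msg2 is hypothetical), the
+following should emit the same bytes as the example above, with a single
+trailing zero:
+
+  Msg2:   .asciiz "Hello ", "world"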
+
+.ASSERT
+Add an assertion. The command is followed by an expression, an action
+specifier, and an optional message that is output in case the assertion
+fails. If no message was given, the string "Assertion failed" is used. The
+action specifier may be one of warning or error. The assertion is
+evaluated by the assembler if possible, and also passed to the linker in the
+object file (if one is generated). The linker will then evaluate the
+expression when segment placement has been done.
+Example:
+
+
+ .assert * = $8000, error, "Code not at $8000"
+
+
+
+
+The example assertion will check that the current location is at $8000,
+when the output file is written, and abort with an error if this is not
+the case. More complex expressions are possible. The action specifier
+warning outputs a warning, while the error specifier outputs
+an error message. In the latter case, generation of the output file is
+suppressed in both the assembler and linker.
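+
+A sketch of the warning action, assuming a hypothetical upper limit of
+$2000 for the code. With warning instead of error, a failed assertion
+should be reported without suppressing the output file:
+
+        .assert * < $2000, warning, "Code grows beyond $1FFF"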
+
+.AUTOIMPORT
+Is followed by a plus or a minus character. When switched on (using a +),
+undefined symbols are automatically marked as import instead of giving
+errors. When switched off (which is the default so this does not make much
+sense), this does not happen and an error message is displayed. The state
+of the autoimport flag is evaluated when the complete source was
+translated, before outputting actual code, so it is not possible to switch
+this feature on or off for separate sections of code. The last setting is
+used for all symbols.
+You should probably not use this switch because it delays error
+messages about undefined symbols until the link stage. The cc65
+compiler (which is supposed to produce correct assembler code in all
+circumstances, something which is not true for most assembler
+programmers) will insert this command to avoid importing each and every
+routine from the runtime library.
+Example:
+
+
+ .autoimport + ; Switch on auto import
+
+
+
+
+
+.BANKBYTES
+Define byte sized data by extracting only the bank byte (that is, bits 16-23) from
+each expression. This is equivalent to .BYTE with
+the operator '^' prepended to each expression in its list.
+Example:
+
+
+ .define MyTable TableItem0, TableItem1, TableItem2, TableItem3
+
+ TableLookupLo: .lobytes MyTable
+ TableLookupHi: .hibytes MyTable
+ TableLookupBank: .bankbytes MyTable
+
+
+
+
+which is equivalent to
++
+
+ TableLookupLo: .byte <TableItem0, <TableItem1, <TableItem2, <TableItem3
+ TableLookupHi: .byte >TableItem0, >TableItem1, >TableItem2, >TableItem3
+ TableLookupBank: .byte ^TableItem0, ^TableItem1, ^TableItem2, ^TableItem3
+
+
+
+
+See also: .BYTE, .HIBYTES, .LOBYTES
+
+.BSS
+Switch to the BSS segment. The name of the BSS segment is always "BSS",
+so this is a shortcut for
+
+
+ .segment "BSS"
+
+
+
+
+See also the .SEGMENT command.
+
+.BYT, .BYTE
+Define byte sized data. Must be followed by a sequence of (byte ranged)
+expressions or strings.
+Example:
+
+
+ .byte "Hello "
+ .byt "world", $0D, $00
+
+
+
+
+
+
+.CASE
+Switch on or off case sensitivity on identifiers. The default is on
+(that is, identifiers are case sensitive), but may be changed by the
+-i switch on the command line.
+The command must be followed by a '+' or '-' character to switch the
+option on or off respectively.
+Example:
+
+
+ .case - ; Identifiers are not case sensitive
+
+
+
+
+
+
+.CHARMAP
+Apply a custom mapping for characters. The command is followed by two
+numbers in the range 1..255. The first one is the index of the source
+character, the second one is the mapping. The mapping applies to all
+character and string constants when they generate output, and overrides
+a mapping table specified with the -t command line switch.
+Example:
+
+
+ .charmap $41, $61 ; Map 'A' to 'a'
+
+
+
+
+
+
+.CODE
+Switch to the CODE segment. The name of the CODE segment is always
+"CODE", so this is a shortcut for
+
+
+ .segment "CODE"
+
+
+
+
+See also the .SEGMENT command.
+
+.CONDES
+Export a symbol and mark it in a special way. The linker is able to build
+tables of all such symbols. This may be used to automatically create a list
+of functions needed to initialize linked library modules.
+Note: The linker has a feature to build a table of marked routines, but it
+is your code that must call these routines, so just declaring a symbol with
+.CONDES does nothing by itself.
+All symbols are exported as an absolute (16 bit) symbol. You don't need to
+use an additional .EXPORT statement, this is implied by .CONDES.
+.CONDES is followed by the type, which may be constructor,
+destructor or a numeric value between 0 and 6 (where 0 is the same as
+specifying constructor and 1 is equal to specifying destructor).
+The .CONSTRUCTOR, .DESTRUCTOR and .INTERRUPTOR commands are actually
+shortcuts for .CONDES with a type of constructor resp. destructor or
+interruptor.
+After the type, an optional priority may be specified. Higher numeric values
+mean higher priority. If no priority is given, the default priority of 7 is
+used. Be careful when assigning priorities to your own module constructors
+so they won't interfere with the ones in the cc65 library.
+Example:
+
+
+ .condes ModuleInit, constructor
+ .condes ModInit, 0, 16
+
+
+
+
+See the .CONSTRUCTOR, .DESTRUCTOR and .INTERRUPTOR commands and the
+separate section Module constructors/destructors explaining the feature
+in more detail.
+
+.CONSTRUCTOR
+Export a symbol and mark it as a module constructor. This may be used
+together with the linker to build a table of constructor subroutines that
+are called by the startup code.
+Note: The linker has a feature to build a table of marked routines, but it
+is your code that must call these routines, so just declaring a symbol as
+constructor does nothing by itself.
+A constructor is always exported as an absolute (16 bit) symbol. You don't
+need to use an additional .export statement, this is implied by
+.constructor. It may have an optional priority that is separated by a
+comma. Higher numeric values mean a higher priority. If no priority is
+given, the default priority of 7 is used. Be careful when assigning
+priorities to your own module constructors so they won't interfere with the
+ones in the cc65 library.
+Example:
+
+
+ .constructor ModuleInit
+ .constructor ModInit, 16
+
+
+
+
+See the .CONDES and .DESTRUCTOR commands and the separate section
+Module constructors/destructors explaining the feature in more detail.
+
+.DATA
+Switch to the DATA segment. The name of the DATA segment is always
+"DATA", so this is a shortcut for
+
+
+ .segment "DATA"
+
+
+
+
+See also the .SEGMENT command.
+
+.DBYT
+Define word sized data with the hi and lo bytes swapped (use .WORD to
+create word sized data in native 65XX format). Must be followed by a
+sequence of (word ranged) expressions.
+Example:
+
+
+ .dbyt $1234, $4512
+
+
+
+
+This will emit the bytes
++
+
+ $12 $34 $45 $12
+
+
+
+
+into the current segment in that order.
+
+.DEBUGINFO
+Switch on or off debug info generation. The default is off (that is,
+the object file will not contain debug infos), but may be changed by the
+-g switch on the command line.
+The command must be followed by a '+' or '-' character to switch the
+option on or off respectively.
+Example:
+
+
+ .debuginfo + ; Generate debug info
+
+
+
+
+
+
+.DEFINE
+Start a define style macro definition. The command is followed by an
+identifier (the macro name) and optionally by a list of formal arguments
+in braces.
+See section Macros.
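+
+Two short sketches of define style macros, one without and one with a
+formal argument. The names VERSION and DEBUG are examples only:
+
+        .define VERSION "12.3a"
+        .asciiz VERSION
+
+        .define DEBUG(message)  .out    message
+        DEBUG   "Assembling include file #3"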
+
+.DEF, .DEFINED
+Builtin function. The function expects an identifier as argument in braces.
+The argument is evaluated, and the function yields "true" if the identifier
+is a symbol that is already defined somewhere in the source file up to the
+current position. Otherwise the function yields false. As an example, the
+.IFDEF statement may be replaced by
+
+
+ .if .defined(a)
+
+
+
+
+
+
+.DESTRUCTOR
+Export a symbol and mark it as a module destructor. This may be used
+together with the linker to build a table of destructor subroutines that
+are called by the startup code.
+Note: The linker has a feature to build a table of marked routines, but it
+is your code that must call these routines, so just declaring a symbol as
+destructor does nothing by itself.
+A destructor is always exported as an absolute (16 bit) symbol. You don't
+need to use an additional .export statement, this is implied by
+.destructor. It may have an optional priority that is separated by a
+comma. Higher numerical values mean a higher priority. If no priority is
+given, the default priority of 7 is used. Be careful when assigning
+priorities to your own module destructors so they won't interfere with the
+ones in the cc65 library.
+Example:
+
+
+ .destructor ModuleDone
+ .destructor ModDone, 16
+
+
+
+
+See the .CONDES and .CONSTRUCTOR commands and the separate
+section Module constructors/destructors explaining
+the feature in more detail.
+
+.DWORD
+Define dword sized data (4 bytes). Must be followed by a sequence of
+expressions.
+Example:
+
+
+ .dword $12344512, $12FA489
+
+
+
+
+
+
+.ELSE
+Conditional assembly: Reverse the current condition.
+
+.ELSEIF
+Conditional assembly: Reverse current condition and test a new one.
+
+.END
+Forced end of assembly. Assembly stops at this point, even if the command
+is read from an include file.
+
+.ENDENUM
+End a .ENUM declaration.
+
+.ENDIF
+Conditional assembly: Close a .IF... or .ELSE branch.
+
+.ENDMAC, .ENDMACRO
+End of macro definition (see section Macros).
+
+.ENDPROC
+End of local lexical level (see .PROC).
+
+.ENDREP, .ENDREPEAT
+End a .REPEAT block.
+
+.ENDSCOPE
+End of local lexical level (see .SCOPE).
+
+.ENDSTRUCT
+Ends a struct definition. See the .STRUCT
+command and the separate section named
+"Structs and unions".
+
+.ENUM
+Start an enumeration. This directive is very similar to the C enum
+keyword. If a name is given, a new scope is created for the enumeration,
+otherwise the enumeration members are placed in the enclosing scope.
+In the enumeration body, symbols are declared. The first symbol has a value
+of zero, and each following symbol will get the value of the preceding plus
+one. This behaviour may be overridden by an explicit assignment. Two symbols
+may have the same value.
+Example:
+
+
+ .enum errorcodes
+ no_error
+ file_error
+ parse_error
+ .endenum
+
+
+
+
+The above example will create a new scope named errorcodes with three
+symbols in it that get the values 0, 1 and 2 respectively. Another way
+to write this would have been:
+
+
+ .scope errorcodes
+ no_error = 0
+ file_error = 1
+ parse_error = 2
+ .endscope
+
+
+
+
+Please note that explicit scoping must be used to access the identifiers:
++
+
+ .word errorcodes::no_error
+
+
+
+
+A more complex example:
++
+
+ .enum
+ EUNKNOWN = -1
+ EOK
+ EFILE
+ EBUSY
+ EAGAIN
+ EWOULDBLOCK = EAGAIN
+ .endenum
+
+
+
+
+In this example, the enumeration does not have a name, which means that the
+members will be visible in the enclosing scope and can be used in this scope
+without explicit scoping. The first member (EUNKNOWN) has the value -1.
+The value for the following members is incremented by one, so EOK would
+be zero and so on. EWOULDBLOCK is an alias for EAGAIN, so it has an
+override for the value using an already defined symbol.
+
+.ERROR
+Force an assembly error. The assembler will output an error message
+preceded by "User error" and will not produce an object file.
+This command may be used to check for initial conditions that must be
+set before assembling a source file.
+Example:
+
+
+ .if foo = 1
+ ...
+ .elseif bar = 1
+ ...
+ .else
+ .error "Must define foo or bar!"
+ .endif
+
+
+
+
+See also the .WARNING and .OUT directives.
+
+.EXITMAC, .EXITMACRO
+Abort a macro expansion immediately. This command is often useful in
+recursive macros. See separate section Macros.
+
+.EXPORT
+Make symbols accessible from other modules. Must be followed by a comma
+separated list of symbols to export, with each one optionally followed by an
+address specification and (also optional) an assignment. Using an additional
+assignment in the export statement makes it possible to define and export a
+symbol in one statement. The default is to export the symbol with the
+address size it actually has. The assembler will issue a warning, if the
+symbol is exported with an address size smaller than the actual address size.
+Examples:
+
+
+ .export foo
+ .export bar: far
+ .export foobar: far = foo * bar
+ .export baz := foobar, zap: far = baz - bar
+
+
+
+
+As with constant definitions, using := instead of = marks the
+symbols as a label.
+See: .EXPORTZP
+
+.EXPORTZP
+Make symbols accessible from other modules. Must be followed by a comma
+separated list of symbols to export. The exported symbols are explicitly
+marked as zero page symbols. An assignment may be included in the
+.EXPORTZP statement. This makes it possible to define and export a symbol
+in one statement.
+Examples:
+
+
+ .exportzp foo, bar
+ .exportzp baz := $02
+
+
+
+
+See: .EXPORT
+
+.FARADDR
+Define far (24 bit) address data. The command must be followed by a
+sequence of (not necessarily constant) expressions.
+Example:
+
+
+ .faraddr DrawCircle, DrawRectangle, DrawHexagon
+
+
+
+
+See: .ADDR
+
+.FEATURE
+This directive may be used to enable one or more compatibility features
+of the assembler. While the use of .FEATURE should be avoided when
+possible, it may be useful when porting sources written for other
+assemblers. There is no way to switch a feature off, once you have
+enabled it, so using
+
+
+ .FEATURE xxx
+
+
+
+
+will enable the feature until end of assembly is reached.
+The following features are available:
+
+at_in_identifiers
+Accept the at character (`@') as a valid character in identifiers. The
+at character is not allowed to start an identifier, even with this
+feature enabled.
+
+c_comments
+Allow C like comments using /* and */ as left and right
+comment terminators. Note that C comments may not be nested. There's also a
+pitfall when using C like comments: All statements must be terminated by
+"end-of-line". Using C like comments, it is possible to hide the newline,
+which results in error messages. See the following non working example:
+
+
+ lda #$00 /* This comment hides the newline
+*/ sta $82
+
+
+
+
+
+dollar_in_identifiers
+Accept the dollar sign (`$') as a valid character in identifiers. The
+dollar character is not allowed to start an identifier, even with this
+feature enabled.
+
+dollar_is_pc
+The dollar sign may be used as an alias for the star (`*'), which
+gives the value of the current PC in expressions.
+Note: Assignment to the pseudo variable is not allowed.
+
+labels_without_colons
+Allow labels without a trailing colon. These labels are only accepted,
+if they start at the beginning of a line (no leading white space).
+
+leading_dot_in_identifiers
+Accept the dot (`.') as the first character of an identifier. This may be
+used for example to create macro names that start with a dot emulating
+control directives of other assemblers. Note however, that none of the
+reserved keywords built into the assembler, that starts with a dot, may be
+overridden. When using this feature, you may also get into trouble if
+later versions of the assembler define new keywords starting with a dot.
+
+loose_char_term
+Accept single quotes as well as double quotes as terminators for char
+constants.
+
+loose_string_term
+Accept single quotes as well as double quotes as terminators for string
+constants.
+
+missing_char_term
+Accept single quoted character constants where the terminating quote is
+missing.
+
+ lda #'a
+
+
+
+
+Note: This does not work in conjunction with .FEATURE loose_string_term,
+since in this case the input would be ambiguous.
+
+org_per_seg
+This feature makes relocatable/absolute mode local to the current segment.
+Using .ORG when org_per_seg is in effect will only enable absolute mode
+for the current segment. The same applies to .RELOC.
+
pc_assignment
+Allow assignments to the PC symbol (`*' or `$' if dollar_is_pc
+is enabled). Such an assignment is handled identically to the .ORG
+command (which is usually not needed, so just removing the lines with
+the assignments may also be an option when porting code written for
+older assemblers).
+
ubiquitous_idents
+Allow the use of instruction names as names for macros and symbols. This
+makes it possible to "overload" instructions by defining a macro with the
+same name. This also makes it possible to introduce hard to find errors
+in your code, so be careful!
+
+It is also possible to specify features on the command line using the
+--feature command line option.
+This is useful when translating sources written for older assemblers, when
+you don't want to change the source code.
+As an example, to translate sources written for Andre Fachat's xa65
+assembler, the features
+
+    labels_without_colons, pc_assignment, loose_char_term
+
+may be helpful. They do not make ca65 completely compatible, so you may not
+be able to translate the sources without changes, even when enabling these
+features. However, I have found several sources that translate without
+problems when enabling these features on the command line.
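+As a minimal sketch of how such features look inside a source file, the
+fragment below enables two of them and then uses the relaxed syntax. The
+label name and the address are made up for illustration.

```asm
        .feature labels_without_colons
        .feature pc_assignment

        * = $1000               ; accepted because of pc_assignment, acts like .org $1000
start   lda     #$00            ; label without a colon, must begin in column 1
        rts
```

+The same features can be switched on without touching the source by passing
+--feature labels_without_colons and --feature pc_assignment on the command
+line.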
+
+.FILEOPT, .FOPT
+Insert an option string into the object file. There are two forms of
+this command, one specifies the option by a keyword, the second
+specifies it as a number. Since usage of the second one needs knowledge
+of the internal encoding, its use is not recommended and I will only
+describe the first form here.
+The command is followed by one of the keywords
++
+
+ author
+ comment
+ compiler
+
+
+
+
+a comma and a string. The option is written into the object file
+together with the string value. This is currently unidirectional and
+there is no way to actually use these options once they are in the
+object file.
+Examples:
++
+
+ .fileopt comment, "Code stolen from my brother"
+ .fileopt compiler, "BASIC 2.0"
+ .fopt author, "J. R. User"
+
+
+
+
+
+
+.FORCEIMPORT
+Import an absolute symbol from another module. The command is followed by a
+comma separated list of symbols to import. The command is similar to
+.IMPORT
, but the import reference is always
+written to the generated object file, even if the symbol is never referenced
+(
+.IMPORT
will not generate import
+references for unused symbols).
Example:
++
+
+ .forceimport needthisone, needthistoo
+
+
+
+
+See:
+.IMPORT
.GLOBAL
+Declare symbols as global. Must be followed by a comma separated list of
+symbols to declare. Symbols from the list that are defined somewhere in the
+source are exported, all others are imported. Additional .IMPORT or
+.EXPORT commands for the same symbol are allowed.
Example:
++
+
+ .global foo, bar
+
+
+
+
+
+
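+The export/import rule can be sketched with two symbols, one defined in the
+file and one not. Both names are hypothetical and only serve the
+illustration.

```asm
        .global InitScreen, DrawSprite

InitScreen:                     ; defined in this file, so .global exports it
        jsr     DrawSprite      ; not defined here, so .global imports it
        rts
```

+This makes .global convenient in a shared include file: the module that
+defines a symbol exports it, every other module that includes the same file
+imports it.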
+.GLOBALZP
+Declare symbols as global. Must be followed by a comma separated list of
+symbols to declare. Symbols from the list that are defined somewhere in the
+source are exported, all others are imported. Additional .IMPORTZP or
+.EXPORTZP commands for the same symbol are allowed. The symbols
+in the list are explicitly marked as zero page symbols.
Example:
++
+
+ .globalzp foo, bar
+
+
+
+
+
+.HIBYTES
+Define byte sized data by extracting only the high byte (that is, bits 8-15) from
+each expression. This is equivalent to .BYTE with
+the operator '>' prepended to each expression in its list.
Example:
++
+
+ .lobytes $1234, $2345, $3456, $4567
+ .hibytes $fedc, $edcb, $dcba, $cba9
+
+
+
+
+which is equivalent to
++
+
+ .byte $34, $45, $56, $67
+ .byte $fe, $ed, $dc, $cb
+
+
+
+
+Example:
++
+
+ .define MyTable TableItem0, TableItem1, TableItem2, TableItem3
+
+ TableLookupLo: .lobytes MyTable
+ TableLookupHi: .hibytes MyTable
+
+
+
+
+which is equivalent to
++
+
+ TableLookupLo: .byte <TableItem0, <TableItem1, <TableItem2, <TableItem3
+ TableLookupHi: .byte >TableItem0, >TableItem1, >TableItem2, >TableItem3
+
+
+
+
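+A sketch of how such a split table is typically consumed, assuming the
+entries are addresses of routines. The names ptr, Dispatch and the two dummy
+routines are hypothetical; the table is shortened to two entries.

```asm
        .zeropage
ptr:    .res    2               ; hypothetical two byte pointer

        .code
        .define MyTable TableItem0, TableItem1

; Jump to the routine whose index (0 or 1) is in X
Dispatch:
        lda     TableLookupLo,x ; low byte of the selected address
        sta     ptr
        lda     TableLookupHi,x ; high byte of the selected address
        sta     ptr+1
        jmp     (ptr)           ; indirect jump through the rebuilt address

TableLookupLo:  .lobytes MyTable
TableLookupHi:  .hibytes MyTable

TableItem0:     rts             ; dummy routines for the illustration
TableItem1:     rts
```

+Keeping low and high bytes in separate tables lets a single 8 bit index
+select an entry without multiplying it by two.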
+See also: .BYTE, .LOBYTES, .BANKBYTES
.I16
+Valid only in 65816 mode. Switch the index registers to 16 bit.
+Note: This command will not emit any code; it only tells the assembler to
+create 16 bit operands for immediate operands.
+See also the .I8 and .SMART commands.
.I8
+Valid only in 65816 mode. Switch the index registers to 8 bit.
+Note: This command will not emit any code; it only tells the assembler to
+create 8 bit operands for immediate operands.
+See also the .I16 and .SMART commands.
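+Because .I16 and .I8 emit no code, they have to be paired with the
+instructions that actually change the processor state. A minimal sketch of
+that pairing for the index registers, with arbitrary operand values:

```asm
        .p816                   ; 65816 mode is required for .I16 and .I8

        rep     #$10            ; clear the X flag: index registers are now 16 bit
        .i16                    ; tell the assembler about it
        ldx     #$1234          ; assembled with a 16 bit immediate operand

        sep     #$10            ; set the X flag: index registers are now 8 bit
        .i8
        ldx     #$12            ; assembled with an 8 bit immediate operand
```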
.IF
+Conditional assembly: Evaluate an expression and switch assembler output
+on or off depending on the expression. The expression must be a constant
+expression, that is, all operands must be defined.
+An expression value of zero evaluates to FALSE, any other value evaluates
+to TRUE.
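+A short sketch of .IF with a build switch. DEBUG is a hypothetical constant
+defined only for this illustration.

```asm
DEBUG   = 1                     ; hypothetical build switch, must be a constant

        .if     DEBUG
        lda     #$01            ; assembled because DEBUG is non-zero
        .else
        lda     #$00            ; skipped in this build
        .endif
```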
+
+.IFBLANK
+Conditional assembly: Check if there are any remaining tokens in this line,
+and evaluate to FALSE if this is the case, and to TRUE otherwise. If the
+condition is not true, further lines are not assembled until an
+.ELSE, .ELSEIF or .ENDIF directive.
+This command is often used to check if a macro parameter was given. Since an
+empty macro parameter will evaluate to nothing, the condition will evaluate
+to TRUE if an empty parameter was given.
+Example:
++
+
+ .macro foo arg1, arg2
+ .ifblank arg2
+ lda #arg1
+ .else
+ lda #arg2
+ .endif
+ .endmacro
+
+
+
+
+See also:
+.BLANK
.IFCONST
+Conditional assembly: Evaluate an expression and switch assembler output
+on or off depending on the constness of the expression.
+A const expression evaluates to TRUE, a non const expression (one
+containing an imported or currently undefined symbol) evaluates to
+FALSE.
+See also:
+.CONST
.IFDEF
+Conditional assembly: Check if a symbol is defined. Must be followed by
+a symbol name. The condition is true if the given symbol is already
+defined, and false otherwise.
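+As a sketch, .IFDEF can select between two sets of constants depending on
+whether a symbol was defined, for example with -D PAL on the command line.
+The symbol names and the values are illustrative assumptions.

```asm
        .ifdef  PAL             ; hypothetical symbol, e.g. from -D PAL
RASTER_LINES = 312              ; illustrative value for a PAL build
        .else
RASTER_LINES = 261              ; illustrative value for an NTSC build
        .endif
```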
+See also:
+.DEFINED
.IFNBLANK
+Conditional assembly: Check if there are any remaining tokens in this line,
+and evaluate to TRUE if this is the case, and to FALSE otherwise. If the
+condition is not true, further lines are not assembled until an
+.ELSE, .ELSEIF or .ENDIF directive.
+This command is often used to check if a macro parameter was given.
+Since an empty macro parameter will evaluate to nothing, the condition
+will evaluate to FALSE if an empty parameter was given.
+Example:
++
+
+ .macro foo arg1, arg2
+ lda #arg1
+ .ifnblank arg2
+ lda #arg2
+ .endif
+ .endmacro
+
+
+
+
+See also:
+.BLANK
.IFNDEF
+Conditional assembly: Check if a symbol is defined. Must be followed by
+a symbol name. The condition is true if the given symbol is not
+defined, and false otherwise.
+See also:
+.DEFINED
.IFNREF
+Conditional assembly: Check if a symbol is referenced. Must be followed
+by a symbol name. The condition is true if the given symbol was
+not referenced before, and false otherwise.
+See also:
+.REFERENCED
.IFP02
+Conditional assembly: Check if the assembler is currently in 6502 mode
+(see
+.P02
command).
.IFP816
+Conditional assembly: Check if the assembler is currently in 65816 mode
+(see
+.P816
command).
.IFPC02
+Conditional assembly: Check if the assembler is currently in 65C02 mode
+(see
+.PC02
command).
.IFPSC02
+Conditional assembly: Check if the assembler is currently in 65SC02 mode
+(see
+.PSC02
command).
.IFREF
+Conditional assembly: Check if a symbol is referenced. Must be followed
+by a symbol name. The condition is true if the given symbol was
+referenced before, and false otherwise.
+This command may be used to build subroutine libraries in include files
+(you may use separate object modules for this purpose too).
+Example:
++
+
+ .ifref ToHex ; If someone used this subroutine
+ ToHex: tay ; Define subroutine
+ lda HexTab,y
+ rts
+ .endif
+
+
+
+
+See also:
+.REFERENCED
.IMPORT
+Import a symbol from another module. The command is followed by a comma +separated list of symbols to import, with each one optionally followed by +an address specification.
+Example:
++
+
+ .import foo
+ .import bar: zeropage
+
+
+
+
+See:
+.IMPORTZP
.IMPORTZP
+Import a symbol from another module. The command is followed by a comma +separated list of symbols to import. The symbols are explicitly imported +as zero page symbols (that is, symbols with values in byte range).
+Example:
++
+
+ .importzp foo, bar
+
+
+
+
+See:
+.IMPORT
.INCBIN
+Include a file as binary data. The command expects a string argument +that is the name of a file to include literally in the current segment. +In addition to that, a start offset and a size value may be specified, +separated by commas. If no size is specified, all of the file from the +start offset to end-of-file is used. If no start position is specified +either, zero is assumed (which means that the whole file is inserted).
+Example:
++
+
+ ; Include whole file
+ .incbin "sprites.dat"
+
+ ; Include file starting at offset 256
+ .incbin "music.dat", $100
+
+ ; Read 100 bytes starting at offset 200
+ .incbin "graphics.dat", 200, 100
+
+
+
+
+
+
+.INCLUDE
+Include another file. Include files may be nested up to a depth of 16.
+Example:
++
+
+ .include "subs.inc"
+
+
+
+
+
+
+.INTERRUPTOR
+Export a symbol and mark it as an interruptor. This may be used together +with the linker to build a table of interruptor subroutines that are called +in an interrupt.
+Note: The linker has a feature to build a table of marked routines, but it +is your code that must call these routines, so just declaring a symbol as +interruptor does nothing by itself.
+An interruptor is always exported as an absolute (16 bit) symbol. You don't
+need to use an additional .export
statement, this is implied by
+.interruptor
. It may have an optional priority that is separated by a
+comma. Higher numeric values mean a higher priority. If no priority is
+given, the default priority of 7 is used. Be careful when assigning
+priorities to your own module constructors so they won't interfere with the
+ones in the cc65 library.
Example:
++
+
+ .interruptor IrqHandler
+ .interruptor Handler, 16
+
+
+
+
+See the
+.CONDES
command and the separate
+section
+Module constructors/destructors explaining
+the feature in more detail.
.LINECONT
+Switch on or off line continuations using the backslash character +before a newline. The option is off by default. +Note: Line continuations do not work in a comment. A backslash at the +end of a comment is treated as part of the comment and does not trigger +line continuation. +The command must be followed by a '+' or '-' character to switch the +option on or off respectively.
+Example:
++
+
+ .linecont + ; Allow line continuations
+
+ lda \
+ #$20 ; This is legal now
+
+
+
+
+
+
+.LIST
+Enable output to the listing. The command must be followed by a boolean
+switch ("on", "off", "+" or "-") and will enable or disable listing
+output.
+The option has no effect if the listing is not enabled by the command line
+switch -l. If -l is used, an internal counter is set to 1. Lines are output
+to the listing file, if the counter is greater than zero, and suppressed if
+the counter is zero. Each use of .LIST
will increment or decrement the
+counter.
Example:
++
+
+ .list on ; Enable listing output
+
+
+
+
+
+
+.LISTBYTES
+Set how many bytes are shown in the listing for one source line. The
+default is 12, so the listing will show only the first 12 bytes for any
+source line that generates more than 12 bytes of code or data.
+The directive needs an argument, which is either "unlimited", or an
+integer constant in the range 4..255.
+Examples:
++
+
+ .listbytes unlimited ; List all bytes
+ .listbytes 12 ; List the first 12 bytes
+ .incbin "data.bin" ; Include large binary file
+
+
+
+
+
+
+.LOBYTES
+Define byte sized data by extracting only the low byte (that is, bits 0-7) from
+each expression. This is equivalent to
+.BYTE
with
+the operator '<' prepended to each expression in its list.
Example:
++
+
+ .lobytes $1234, $2345, $3456, $4567
+ .hibytes $fedc, $edcb, $dcba, $cba9
+
+
+
+
+which is equivalent to
++
+
+ .byte $34, $45, $56, $67
+ .byte $fe, $ed, $dc, $cb
+
+
+
+
+Example:
++
+
+ .define MyTable TableItem0, TableItem1, TableItem2, TableItem3
+
+ TableLookupLo: .lobytes MyTable
+ TableLookupHi: .hibytes MyTable
+
+
+
+
+which is equivalent to
++
+
+ TableLookupLo: .byte <TableItem0, <TableItem1, <TableItem2, <TableItem3
+ TableLookupHi: .byte >TableItem0, >TableItem1, >TableItem2, >TableItem3
+
+
+
+
+See also: .BYTE, .HIBYTES, .BANKBYTES
.LOCAL
+This command may only be used inside a macro definition. It declares a
+list of identifiers as local to the macro expansion.
+Labels are a problem when using macros: Since they don't change their name,
+you get a "duplicate symbol" error if the macro is expanded a second time.
+Labels declared with .LOCAL have their
+name mapped to an internal unique name (___ABCD__) with each macro
+invocation.
+Some other assemblers start a new lexical block inside a macro expansion.
+This has some drawbacks however, since that will not allow any symbol
+to be visible outside a macro, a feature that is sometimes useful. The
+.LOCAL command is in my eyes a better way
+to address the problem.
+You get an error when using .LOCAL outside
+a macro.
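+A minimal sketch of .LOCAL in use, built on a 16 bit increment macro. The
+zero page addresses in the two invocations are arbitrary.

```asm
        .macro  inc16 addr
        .local  Skip            ; Skip gets a unique internal name per expansion
        inc     addr
        bne     Skip
        inc     addr+1
Skip:
        .endmacro

        inc16   $20             ; first expansion
        inc16   $22             ; second expansion, no "duplicate symbol" error
```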
.LOCALCHAR
+Defines the character that starts "cheap" local labels. You may use one
+of '@' and '?' as start character. The default is '@'.
+Cheap local labels are labels that are visible only between two non
+cheap labels. This way you can reuse identifiers like "loop" without
+using explicit lexical nesting.
Example:
++
+
+ .localchar '?'
+
+ Clear: lda #$00 ; Global label
+ ?Loop: sta Mem,y ; Local label
+ dey
+ bne ?Loop ; Ok
+ rts
+ Sub: ... ; New global label
+ bne ?Loop ; ERROR: Unknown identifier!
+
+
+
+
+
+
+.MACPACK
+Insert a predefined macro package. The command is followed by an +identifier specifying the macro package to insert. Available macro +packages are:
++
+
+ atari Defines the scrcode macro.
+ cbm Defines the scrcode macro.
+ cpu Defines constants for the .CPU variable.
+ generic Defines generic macros like add and sub.
+ longbranch Defines conditional long jump macros.
+
+
+
+
+Including a macro package twice, or including a macro package that +redefines already existing macros will lead to an error.
+Example:
++
+
+ .macpack longbranch ; Include macro package
+
+ cmp #$20 ; Set condition codes
+ jne Label ; Jump long on condition
+
+
+
+
+Macro packages are explained in more detail in section
+Macro packages.
+
+.MAC, .MACRO
+Start a classic macro definition. The command is followed by an identifier
+(the macro name) and optionally by a comma separated list of identifiers
+that are macro parameters.
+See section
+Macros.
+
+.ORG
+Start a section of absolute code. The command is followed by a constant
+expression that gives the new PC counter location for which the code is
+assembled. Use
+.RELOC
to switch back to
+relocatable code.
By default, absolute/relocatable mode is global (valid even when switching
+segments). Using .FEATURE
+org_per_seg
+it can be made segment local.
Please note that you do not need .ORG
in most cases. Placing
+code at a specific address is the job of the linker, not the assembler, so
+there is usually no reason to assemble code to a specific address.
Example:
++
+
+ .org $7FF ; Emit code starting at $7FF
+
+
+
+
+
+
+.OUT
+Output a string to the console without producing an error. This command
+is similar to .ERROR
, however, it does not force an assembler error
+that prevents the creation of an object file.
Example:
++
+
+ .out "This code was written by the codebuster(tm)"
+
+
+
+
+See also the
+.WARNING
and
+.ERROR
directives.
.P02
+Enable the 6502 instruction set, disable 65SC02, 65C02 and 65816
+instructions. This is the default if not overridden by the
+
+--cpu
command line option.
See:
+.PC02
,
+.PSC02
and
+.P816
.P816
+Enable the 65816 instruction set. This is a superset of the 65SC02 and +6502 instruction sets.
+See:
+.P02
,
+.PSC02
and
+.PC02
.PAGELEN, .PAGELENGTH
+Set the page length for the listing. Must be followed by an integer
+constant. The value may be "unlimited", or in the range 32 to 127. The
+statement has no effect if no listing is generated. The default value is -1
+(unlimited) but may be overridden by the --pagelength command line
+option. Beware: Since ca65 is a one pass assembler and the listing is
+generated after assembly is complete, you cannot use multiple page lengths
+within one source. Instead, the value set with the last .PAGELENGTH is used.
Examples:
++
+
+ .pagelength 66 ; Use 66 lines per listing page
+
+ .pagelength unlimited ; Unlimited page length
+
+
+
+
+
+
+.PC02
+Enable the 65C02 instruction set. This instruction set includes all
+6502 and 65SC02 instructions.
+See:
+.P02
,
+.PSC02
and
+.P816
.POPSEG
+Pop the last pushed segment from the stack, and set it.
+This command will switch back to the segment that was last pushed onto the
+segment stack using the
+.PUSHSEG
+command, and remove this entry from the stack.
The assembler will print an error message if the segment stack is empty +when this command is issued.
+See:
+.PUSHSEG
.PROC
+Start a nested lexical level with the given name and adds a symbol with this
+name to the enclosing scope. All new symbols from now on are in the local
+lexical level and are accessible from outside only via
+explicit scope specification. Symbols defined outside this local
+level may be accessed as long as their names are not used for new symbols
+inside the level. Symbols names in other lexical levels do not clash, so you
+may use the same names for identifiers. The lexical level ends when the
+
+.ENDPROC
command is read. Lexical levels
+may be nested up to a depth of 16 (this is an artificial limit to protect
+against errors in the source).
Note: Macro names are always in the global level and in a separate name +space. There is no special reason for this, it's just that I've never +had any need for local macro definitions.
+Example:
++
+
+ .proc Clear ; Define Clear subroutine, start new level
+ lda #$00
+ L1: sta Mem,y ; L1 is local and does not cause a
+ ; duplicate symbol error if used in other
+ ; places
+ dey
+ bne L1 ; Reference local symbol
+ rts
+ .endproc ; Leave lexical level
+
+
+
+
+
+
+
+.PSC02
+Enable the 65SC02 instruction set. This instruction set includes all
+6502 instructions.
+
+.PUSHSEG
+Push the currently active segment onto a stack. The entries on the stack
+include the name of the segment and the segment type. The stack has a size
+of 16 entries.
+Together with .POPSEG, .PUSHSEG makes it possible
+to switch to another segment and to restore the old segment later, without
+even knowing the name and type of the current segment.
+The assembler will print an error message if the segment stack is already
+full when this command is issued.
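+A typical use is a macro that has to place data in another segment without
+disturbing its caller. The sketch below assumes a hypothetical macro
+DefMessage that stores a zero terminated string in RODATA.

```asm
        .macro  DefMessage name, text
        .pushseg                ; remember whatever segment is active
        .rodata
name:   .byte   text, 0
        .popseg                 ; return to the caller's segment
        .endmacro

        .code
        DefMessage Hello, "HELLO"
        lda     Hello           ; still assembling into CODE here
```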
+See:
+.POPSEG
.RELOC
+Switch back to relocatable mode. See the
+.ORG
command.
.REPEAT
+Repeat all commands between .REPEAT and
+.ENDREPEAT a constant number of times. The command is followed by
+a constant expression that tells how many times the commands in the body
+should get repeated. Optionally, a comma and an identifier may be specified.
+If this identifier is found in the body of the repeat statement, it is
+replaced by the current repeat count (starting with zero for the first time
+the body is repeated).
.REPEAT
statements may be nested. If you use the same repeat count
+identifier for a nested .REPEAT
statement, the one from the inner
+level will be used, not the one from the outer level.
Example:
+The following macro will emit a string that is "encrypted" in that all +characters of the string are XORed by the value $55.
++
+
+ .macro Crypt Arg
+ .repeat .strlen(Arg), I
+ .byte .strat(Arg, I) ^ $55
+ .endrep
+ .endmacro
+
+
+
+
+See:
+.ENDREPEAT
.RES
+Reserve storage. The command is followed by one or two constant
+expressions. The first one is mandatory and defines how many bytes of
+storage should be defined. The second, optional expression must be a
+constant byte value that will be used as value of the data. If there
+is no fill value given, the linker will use the value defined in the
+linker configuration file (default: zero).
+Example:
++
+
+ ; Reserve 12 bytes of memory with value $AA
+ .res 12, $AA
+
+
+
+
+
+
+.RODATA
+Switch to the RODATA segment. The name of the RODATA segment is always +"RODATA", so this is a shortcut for
++
+
+ .segment "RODATA"
+
+
+
+
+The RODATA segment is a segment that is used by the compiler for +readonly data like string constants.
+See also the
+.SEGMENT
command.
.SCOPE
+Start a nested lexical level with the given name. All new symbols from now
+on are in the local lexical level and are accessible from outside only via
+explicit scope specification. Symbols defined
+outside this local level may be accessed as long as their names are not used
+for new symbols inside the level. Symbols names in other lexical levels do
+not clash, so you may use the same names for identifiers. The lexical level
+ends when the
+.ENDSCOPE
command is
+read. Lexical levels may be nested up to a depth of 16 (this is an
+artificial limit to protect against errors in the source).
Note: Macro names are always in the global level and in a separate name +space. There is no special reason for this, it's just that I've never +had any need for local macro definitions.
+Example:
++
+
+ .scope Error ; Start new scope named Error
+ None = 0 ; No error
+ File = 1 ; File error
+ Parse = 2 ; Parse error
+ .endscope ; Close lexical level
+
+ ...
+ lda #Error::File ; Use symbol from scope Error
+
+
+
+
+
+
+
+.SEGMENT
+Switch to another segment. Code and data are always emitted into a
+segment, that is, a named section of data. The default segment is
+"CODE". There may be up to 254 different segments per object file
+(and up to 65534 per executable). There are shortcut commands for
+the most common segments ("CODE", "DATA" and "BSS").
+The command is followed by a string containing the segment name (there are
+some constraints for the name - as a rule of thumb use only those segment
+names that would also be valid identifiers). There may also be an optional
+address size separated by a colon. See the section covering
+address sizes
for more information.
+The default address size for a segment depends on the memory model specified
+on the command line. The default is "absolute", which means that you don't
+have to use an address size modifier in most cases.
+"absolute" means that this is a segment with 16 bit (absolute) addressing.
+That is, the segment will reside somewhere in core memory outside the zero
+page. "zeropage" (8 bit) means that the segment will be placed in the zero
+page and direct (short) addressing is possible for data in this segment.
+Beware: Only labels in a segment with the zeropage attribute are marked
+as reachable by short addressing. The `*' (PC counter) operator will
+work as in other segments and will create absolute variable values.
+Please note that a segment cannot have two different address sizes. A
+segment specified as zeropage cannot be declared as being absolute later.
+Examples:
++
+
+ .segment "ROM2" ; Switch to ROM2 segment
+ .segment "ZP2": zeropage ; New direct segment
+ .segment "ZP2" ; Ok, will use last attribute
+ .segment "ZP2": absolute ; Error, redecl mismatch
+
+
+
+
+See:
+.BSS
,
+.CODE
,
+.DATA
and
+.RODATA
.SETCPU
+Switch the CPU instruction set. The command is followed by a string that
+specifies the CPU. Possible values are those that can also be supplied to
+the
+--cpu
command line option,
+namely: 6502, 6502X, 65SC02, 65C02, 65816, sunplus and HuC6280. Please
+note that support for the sunplus CPU is not available in the freeware
+version, because the instruction set of the sunplus CPU is "proprietary
+and confidential".
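+A small sketch of switching the instruction set in the middle of a source
+file. The zero page address is arbitrary.

```asm
        .setcpu "65C02"
        stz     $10             ; STZ exists on the 65C02 but not on the 6502

        .setcpu "6502"          ; back to the plain 6502 instruction set
        lda     #$00
        sta     $10             ; the 6502 way of clearing the same location
```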
+See: .CPU, .IFP02, .IFP816, .IFPC02, .IFPSC02, .P02, .P816, .PC02, .PSC02
.SMART
+Switch on or off smart mode. The command must be followed by a '+' or '-'
+character to switch the option on or off respectively. The default is off
+(that is, the assembler doesn't try to be smart), but this default may be
+changed by the -s switch on the command line.
+In smart mode the assembler will do the following:
+
+  * Track usage of the REP and SEP instructions in 65816 mode
+    and update the operand sizes accordingly. If the operand of such an
+    instruction cannot be evaluated by the assembler (for example, because
+    the operand is an imported symbol), a warning is issued. Beware: Since
+    the assembler cannot trace the execution flow this may lead to false
+    results in some cases. If in doubt, use the .Inn and .Ann
+    instructions to tell the assembler about the current settings.
+  * Replace an RTS instruction by RTL if it is
+    used within a procedure declared as far, or if the procedure has
+    no explicit address specification, but it is far because of the
+    memory model used.
+
+Example:
++
+
+ .smart ; Be smart
+ .smart - ; Stop being smart
+
+
+
+
+See: .A16, .A8, .I16, .I8
.STRUCT
+Starts a struct definition. Structs are covered in a separate section named +"Structs and unions".
+See:
+.ENDSTRUCT
.SUNPLUS
+Enable the SunPlus instruction set. This command will not work in the
+freeware version of the assembler, because the instruction set is
+"proprietary and confidential".
+See: .P02, .PSC02, .PC02, and .P816
.TAG
+Allocate space for a struct or union.
+Example:
++
+
+ .struct Point
+ xcoord .word
+ ycoord .word
+ .endstruct
+
+ .bss
+ .tag Point ; Allocate 4 bytes
+
+
+
+
+
+
+.WARNING
+Force an assembly warning. The assembler will output a warning message
+preceded by "User warning". This warning will always be output, even if
+other warnings are disabled with the
+-W0
+command line option.
This command may be used to output possible problems when assembling +the source file.
+Example:
++
+
+ .macro jne target
+ .local L1
+ .ifndef target
+ .warning "Forward jump in jne, cannot optimize!"
+ beq L1
+ jmp target
+ L1:
+ .else
+ ...
+ .endif
+ .endmacro
+
+
+
+
+See also the
+.ERROR
and
+.OUT
directives.
.WORD
+Define word sized data. Must be followed by a sequence of (word ranged, +but not necessarily constant) expressions.
+Example:
++
+
+ .word $0D00, $AF13, _Clear
+
+
+
+
+
+
+.ZEROPAGE
+Switch to the ZEROPAGE segment and mark it as direct (zeropage) segment.
+The name of the ZEROPAGE segment is always "ZEROPAGE", so this is a
+shortcut for
++
+
+ .segment "ZEROPAGE": zeropage
+
+
+
+
+Because of the "zeropage" attribute, labels declared in this segment are
+addressed using direct addressing mode if possible. You must instruct
+the linker to place this segment somewhere in the address range 0..$FF,
+otherwise you will get errors.
+See:
+.SEGMENT
+
+Macros may be thought of as "parametrized super instructions". Macros are
+sequences of tokens that have a name. If that name is used in the source
+file, the macro is "expanded", that is, it is replaced by the tokens that
+were specified when the macro was defined.
+
+In its simplest form, a macro does not have parameters. Here's an
+example:
++
+
+ .macro asr ; Arithmetic shift right
+ cmp #$80 ; Put bit 7 into carry
+ ror ; Rotate right with carry
+ .endmacro
+
+
+
+The macro above consists of two real instructions that are inserted into
+the code whenever the macro is expanded. Macro expansion is simply done
+by using the name, like this:
++
+
+ lda $2010
+ asr
+ sta $2010
+
+
+
+
+
+When using macro parameters, macros can be even more useful:
++
+
+ .macro inc16 addr
+ clc
+ lda addr
+ adc #$01
+ sta addr
+ lda addr+1
+ adc #$00
+ sta addr+1
+ .endmacro
+
+
+
+When calling the macro, you may give a parameter, and each occurrence of +the name "addr" in the macro definition will be replaced by the given +parameter. So
++
+
+ inc16 $1000
+
+
+
+will be expanded to
++
+
+ clc
+ lda $1000
+ adc #$01
+ sta $1000
+ lda $1000+1
+ adc #$00
+ sta $1000+1
+
+
+
+A macro may have more than one parameter, in this case, the parameters
+are separated by commas. You are free to give fewer parameters than the
+macro actually takes in the definition. You may also leave intermediate
+parameters empty. Empty parameters are replaced by empty space (that is,
+they are removed when the macro is expanded). If you have a look at our
+macro definition above, you will see that replacing the "addr" parameter
+by nothing will lead to wrong code in most lines. To help you write
+macros with a variable parameter list, there are some control commands:
+
+.IFBLANK tests the rest of the line and
+returns true if there are no tokens on the remainder of the line. Since
+empty parameters are replaced by nothing, this may be used to test if a given
+parameter is empty. .IFNBLANK tests the
+opposite.
Look at this example:
++
+
+ .macro ldaxy a, x, y
+ .ifnblank a
+ lda #a
+ .endif
+ .ifnblank x
+ ldx #x
+ .endif
+ .ifnblank y
+ ldy #y
+ .endif
+ .endmacro
+
+
+
+This macro may be called as follows:
++
+
+ ldaxy 1, 2, 3 ; Load all three registers
+
+ ldaxy 1, , 3 ; Load only a and y
+
+ ldaxy , , 3 ; Load y only
+
+
+
+There's another helper command for determining which macro parameters are
+valid: .PARAMCOUNT. This command is
+replaced by the parameter count given, including intermediate empty macro
+parameters:
+
+
+ ldaxy 1 ; .PARAMCOUNT = 1
+ ldaxy 1,,3 ; .PARAMCOUNT = 3
+ ldaxy 1,2 ; .PARAMCOUNT = 2
+ ldaxy 1, ; .PARAMCOUNT = 2
+ ldaxy 1,2,3 ; .PARAMCOUNT = 3
+
+
+
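+Inside a macro body, .PARAMCOUNT can be combined with .IF to reject calls
+with too few parameters. The macro below is a hypothetical example, not part
+of any macro package: it stores one immediate value to one or two addresses.

```asm
        .macro  store val, addr1, addr2
        .if     .paramcount < 2
        .error  "store: need a value and at least one address"
        .endif
        lda     #val
        sta     addr1
        .ifnblank addr2
        sta     addr2           ; only assembled if a second address was given
        .endif
        .endmacro

        store   $00, $20        ; one destination
        store   $00, $20, $21   ; two destinations
```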
+Macro parameters may optionally be enclosed into curly braces. This allows the +inclusion of tokens that would otherwise terminate the parameter (the comma in +case of a macro parameter).
++
+
+ .macro foo arg1, arg2
+ ...
+ .endmacro
+
+ foo ($00,x) ; Two parameters passed
+ foo {($00,x)} ; One parameter passed
+
+
+
+In the first case, the macro is called with two parameters: '($00'
+and 'x)'. The comma is not passed to the macro, since it is part of the
+calling sequence, not the parameters.
+In the second case, '($00,x)' is passed to the macro, this time
+including the comma.
+
+Sometimes it is nice to write a macro that acts differently depending on the
+type of the argument supplied. An example would be a macro that loads a 16 bit
+value from either an immediate operand, or from memory. The .MATCH and
+.XMATCH
+functions will allow you to do exactly this:
+
+
+ .macro ldax arg
+ .if (.match (.left (1, {arg}), #))
+ ; immediate mode
+ lda #<(.right (.tcount ({arg})-1, {arg}))
+ ldx #>(.right (.tcount ({arg})-1, {arg}))
+ .else
+ ; assume absolute or zero page
+ lda arg
+ ldx 1+(arg)
+ .endif
+ .endmacro
+
+
+
+Using the .MATCH function, the macro is able to
+check if its argument begins with a hash mark. If so, two immediate loads are
+emitted. Otherwise a load from an absolute or zero page memory location is
+assumed. Please note how the curly braces are used to enclose parameters to
+pseudo functions handling token lists. This is necessary, because the token
+lists may include commas or parens, which would be treated by the assembler
+as end-of-list.
The macro can be used as
++
+
+ foo: .word $5678
+ ...
+ ldax #$1234 ; X=$12, A=$34
+ ...
+ ldax foo ; X=$56, A=$78
+
+
+
+
+
+Macros may be used recursively:
+
+
+ .macro push r1, r2, r3
+ lda r1
+ pha
+ .if .paramcount > 1
+ push r2, r3
+ .endif
+ .endmacro
+
+
+
+There's also a special command to help with writing recursive macros:
+.EXITMACRO. This command will stop macro expansion immediately:
+
+
+ .macro push r1, r2, r3, r4, r5, r6, r7
+ .ifblank r1
+ ; First parameter is empty
+ .exitmacro
+ .else
+ lda r1
+ pha
+ .endif
+ push r2, r3, r4, r5, r6, r7
+ .endmacro
+
+
+
+When expanding this macro, the expansion will push all given parameters
+until an empty one is encountered. The macro may be called like this:
+
+
+ push $20, $21, $32 ; Push 3 ZP locations
+ push $21 ; Push one ZP location
+
+
+
+
+
+Now, with recursive macros, .IFBLANK and .PARAMCOUNT, what else do you need?
+Have a look at the inc16 macro above. Here it is again:
+
+
+ .macro inc16 addr
+ clc
+ lda addr
+ adc #$01
+ sta addr
+ lda addr+1
+ adc #$00
+ sta addr+1
+ .endmacro
+
+
+
+If you have a closer look at the code, you will notice that it could be
+written more efficiently, like this:
+
+
+ .macro inc16 addr
+ inc addr
+ bne Skip
+ inc addr+1
+ Skip:
+ .endmacro
+
+
+
+But imagine what happens if you use this macro twice. Since the label
+"Skip" has the same name both times, you get a "duplicate symbol" error.
+Without a way to circumvent this problem, macros are not as useful as
+they could be. One solution is to start a new lexical block inside the
+macro:
+
+
+ .macro inc16 addr
+ .proc
+ inc addr
+ bne Skip
+ inc addr+1
+ Skip:
+ .endproc
+ .endmacro
+
+
+
+Now the label is local to the block and not visible outside. However,
+sometimes you want a label inside the macro to be visible outside. To make
+that possible, there's a command that's only usable inside a macro
+definition: .LOCAL. It declares one or more symbols as local to the macro
+expansion, while all other labels in the macro stay visible. The names of
+local symbols are replaced by a unique name in each separate macro expansion.
+So we could also solve the problem above by using .LOCAL:
+
+
+ .macro inc16 addr
+ .local Skip ; Make Skip a local symbol
+ clc
+ lda addr
+ adc #$01
+ sta addr
+ bcc Skip
+ inc addr+1
+ Skip: ; Not visible outside
+ .endmacro
+
+
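+
+As a hedged usage sketch (the "counter" label is an assumption made for this
+example), the .LOCAL version of inc16 can be expanded several times in a row
+without a "duplicate symbol" error, because each expansion gets its own Skip:
+
+        counter: .res   2       ; 16 bit counter, low byte first
+
+                 inc16  counter ; First expansion
+                 inc16  counter ; Second expansion, separate Skip label
+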
+
+
+
+Starting with version 2.5 of the assembler, there is a second macro type
+available: C style macros using the .DEFINE directive. These macros are
+similar to the classic macro type described above, but their behaviour
+differs in some respects:
+
+Macros defined with .DEFINE may not span more than one line. You may use
+line continuation (see .LINECONT) to spread the definition over more than
+one line for increased readability, but the macro itself may not contain an
+end-of-line token.
+
+Macros defined with .DEFINE share the name space with classic macros, but
+they are detected and replaced at the scanner level. While classic macros
+may be used in every place where a mnemonic or other directive is allowed,
+.DEFINE style macros are allowed anywhere in a line. So they are more
+versatile in some situations.
+
+.DEFINE style macros may take parameters. While classic macros may have
+empty parameters, this is not true for .DEFINE style macros. For this macro
+type, the number of actual parameters must match exactly the number of
+formal parameters.
+
+To make this possible, formal parameters are enclosed in parentheses when
+defining the macro. If there are no parameters, the empty parentheses may
+be omitted.
+
+Since .DEFINE style macros may not contain end-of-line tokens, there are
+things that cannot be done. They may not contain several processor
+instructions, for example. So, while some things may be done with both
+macro types, each type has special usages. The types complement each other.
+
+Let's look at a few examples to make the advantages and disadvantages
+clear.
+
+To emulate assemblers that use "EQU" instead of "=" you may use the
+following .DEFINE:
+
+
+ .define EQU =
+
+ foo EQU $1234 ; This is accepted now
+
+
+
+You may use the directive to define string constants used elsewhere:
+
+
+ ; Define the version number
+ .define VERSION "12.3a"
+
+ ; ... and use it
+ .asciiz VERSION
+
+
+
+Macros with parameters may also be useful:
+
+
+ .define DEBUG(message) .out message
+
+ DEBUG "Assembling include file #3"
+
+
+
+Note that, while formal parameters have to be placed in parentheses, this is
+not true for the actual parameters. Beware: Since the assembler cannot
+detect the end of one parameter, only the first token is used. If you
+don't like that, use classic macros instead:
+
+
+ .macro DEBUG message
+ .out message
+ .endmacro
+
+
+
+(This is an example where a problem can be solved with both macro types).
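+
+A small sketch of parameterless .DEFINE macros used as named constants. The
+names SCREEN_COLS and SCREEN_ROWS are assumptions for this example; the
+values follow the VIC's default 22x23 screen layout:
+
+        .define SCREEN_COLS 22
+        .define SCREEN_ROWS 23
+
+        ldx     #SCREEN_COLS-1  ; X = index of the last column
+        ldy     #SCREEN_ROWS-1  ; Y = index of the last row
+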
+
+When using the -t option, characters are translated into the target character
+set of the specific machine. However, this happens as late as possible. This
+means that strings are translated if they are part of a .BYTE or .ASCIIZ
+command. Characters are translated as soon as they are used as part of an
+expression.
+
+This behaviour is very intuitive outside of macros but may be confusing when
+doing more complex macros. If you compare characters against numeric values,
+be sure to take the translation into account.
+
+Using the .MACPACK directive, predefined macro packages may be included with
+just one command. Available macro packages are:
+
+.MACPACK generic
+
+This macro package defines macros that are useful in almost any program.
+Currently, two macros are defined:
+
+
+ .macro add Arg
+ clc
+ adc Arg
+ .endmacro
+
+ .macro sub Arg
+ sec
+ sbc Arg
+ .endmacro
+
+
+
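+
+A short usage sketch, assuming a 16 bit value stored at the hypothetical
+label "score": once the package is loaded, add and sub can be used like
+ordinary mnemonics.
+
+        .macpack generic
+
+        lda     score           ; Low byte
+        add     #10             ; Expands to clc / adc #10
+        sta     score
+        lda     score+1         ; High byte
+        adc     #0              ; Propagate the carry
+        sta     score+1
+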
+
+
+.MACPACK longbranch
+
+This macro package defines long conditional jumps. They are named like the
+short counterpart but with the 'b' replaced by a 'j'. Here is a sample
+definition for the "jeq" macro; the other macros are built using the same
+scheme:
+
+
+ .macro jeq Target
+ .if .def(Target) .and ((*+2)-(Target) <= 127)
+ beq Target
+ .else
+ bne *+5
+ jmp Target
+ .endif
+ .endmacro
+
+
+
+All macros expand to a short branch if the label is already defined (back
+jump) and is reachable with a short jump. Otherwise the macro expands to a
+conditional branch with the branch condition inverted, followed by an absolute
+jump to the actual branch target.
+
+The package defines the following macros:
+
+
+ jeq, jne, jmi, jpl, jcs, jcc, jvs, jvc
+
+
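+
+A usage sketch under the assumption that "Done" is a hypothetical label more
+than 127 bytes ahead, which a plain beq could not reach:
+
+        .macpack longbranch
+
+        lda     $00
+        jeq     Done            ; Forward reference: bne *+5 / jmp Done
+        ...
+Done:   rts
+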
+
+
+
+
+.MACPACK cbm
+
+The cbm macro package will define a macro named scrcode. It takes a string
+as argument and places this string into memory translated into screen codes.
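+
+For illustration only, a sketch of how scrcode might be used to place a
+title string into memory (the label "Title" is an assumption):
+
+        .macpack cbm
+
+Title:  scrcode "vic-sss"       ; Stored as screen codes
+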
+
+.MACPACK cpu
+
+This macro package does not define any macros but constants used to examine
+the value read from the .CPU pseudo variable. For each supported CPU a
+constant similar to
+
+
+ CPU_6502
+ CPU_65SC02
+ CPU_65C02
+ CPU_65816
+ CPU_SUNPLUS
+ CPU_SWEET16
+ CPU_HUC6280
+
+
+
+is defined. These constants may be used to determine the exact type of the
+currently enabled CPU. In addition to that, for each CPU instruction set,
+another constant is defined:
+
+
+ CPU_ISET_6502
+ CPU_ISET_65SC02
+ CPU_ISET_65C02
+ CPU_ISET_65816
+ CPU_ISET_SUNPLUS
+ CPU_ISET_SWEET16
+ CPU_ISET_HUC6280
+
+
+
+The value read from the .CPU pseudo variable may be checked with .BITAND to
+determine if the currently enabled CPU supports a specific instruction set.
+For example the 65C02 supports all instructions of the 65SC02 CPU, so it has
+the CPU_ISET_65SC02 bit set in addition to its native CPU_ISET_65C02 bit.
+Using
+
+
+ .if (.cpu .bitand CPU_ISET_65SC02)
+ lda (sp)
+ .else
+ ldy #$00
+ lda (sp),y
+ .endif
+
+
+
+it is possible to determine if the
+
+
+ lda (sp)
+
+
+
+instruction is supported, which is the case for the 65SC02, 65C02 and 65816
+CPUs (the latter two are upwards compatible with the 65SC02).
+
+For better orthogonality, the assembler defines symbols similar to those of
+the compiler, depending on the target system selected:
+
+ __APPLE2__       - Target system is apple2
+ __APPLE2ENH__    - Target system is apple2enh
+ __ATARI__        - Target system is atari
+ __ATMOS__        - Target system is atmos
+ __BBC__          - Target system is bbc
+ __C128__         - Target system is c128
+ __C16__          - Target system is c16
+ __C64__          - Target system is c64
+ __CBM__          - Target is a Commodore system
+ __CBM510__       - Target system is cbm510
+ __CBM610__       - Target system is cbm610
+ __GEOS__         - Target system is geos
+ __LUNIX__        - Target system is lunix
+ __NES__          - Target system is nes
+ __PET__          - Target system is pet
+ __PLUS4__        - Target system is plus4
+ __SUPERVISION__  - Target system is supervision
+ __VIC20__        - Target system is vic20
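+
+A hedged sketch of how such a symbol could be tested, so that a source file
+written for the VIC 20 refuses to assemble for another target. The messages
+are made up for this example:
+
+        .ifdef  __VIC20__
+        .out    "Assembling for the VIC 20"
+        .else
+        .error  "This source expects -t vic20"
+        .endif
+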
+
+Structs and unions are special forms of scopes. They are to some degree
+comparable to their C counterparts. Both have a list of members. Each member
+allocates storage and may optionally have a name, whose value is, in the case
+of a struct, the offset from the beginning and, in the case of a union,
+always zero.
+
+Here is an example for a very simple struct with two members and a total size
+of 4 bytes:
+
+
+ .struct Point
+ xcoord .word
+ ycoord .word
+ .endstruct
+
+
+
+A union shares the total space between all its members; its size is the same
+as that of the largest member.
+
+A struct or union does not need to have a name. If it is anonymous, no
+local scope is opened; the identifiers used to name the members are placed
+into the current scope instead.
+
+A struct may contain unnamed members and definitions of local structs. The
+storage allocators may contain a multiplier, as in the example below:
+
+
+ .struct Circle
+ .struct Point
+ .word 2 ; Allocate two words
+ .endstruct
+ Radius .word
+ .endstruct
+
+
+
+
+
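+
+Since the text above shows no union, here is a minimal sketch. The names are
+hypothetical; both members start at offset zero and the union occupies two
+bytes, the size of its largest member:
+
+        .union  Value
+                AsByte  .byte   ; Value::AsByte = 0
+                AsWord  .word   ; Value::AsWord = 0
+        .endunion
+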
+The .TAG keyword
+
+Using the .TAG keyword, it is possible to reserve space for an already
+defined struct or union within another struct:
+
+
+ .struct Point
+ xcoord .word
+ ycoord .word
+ .endstruct
+
+ .struct Circle
+ Origin .tag Point
+ Radius .byte
+ .endstruct
+
+
+
+Space for a struct or union may be allocated using the .TAG directive.
+
+
+ C: .tag Circle
+
+
+
+Currently, members are just offsets from the start of the struct or union. To
+access a field of a struct, the member offset has to be added to the address
+of the struct itself:
+
+
+ lda C+Circle::Radius ; Load circle radius into A
+
+
+
+This may change in a future version of the assembler.
+
+Structs and unions are currently implemented as nested symbol tables (in fact,
+they were a by-product of the improved scoping rules). Currently, the
+assembler has no idea of types. This means that the .TAG keyword will only
+allocate space. You won't be able to initialize variables declared with .TAG,
+and adding an embedded structure to another structure with .TAG will not make
+this structure accessible by using the '::' operator.
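+
+In practice this means the offsets have to be chained by hand. A sketch,
+reusing the Point and Circle structs and the variable C from above:
+
+        lda     C+Circle::Origin+Point::ycoord  ; Low byte of the y coordinate
+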
+
+Note: This section applies mostly to C programs, so the explanation
+below uses examples from the C libraries. However, the feature may also be
+useful for assembler programs.
+
+Using the .CONSTRUCTOR, .DESTRUCTOR and .INTERRUPTOR keywords it is possible
+to export functions in a special way. The linker is able to generate tables
+with all functions of a specific type. Such a table will only include symbols
+from object files that are linked into a specific executable. This may be used
+to add initialization and cleanup code for library modules, or a table of
+interrupt handler functions.
+
+The C heap functions are an example where module initialization code is used.
+All heap functions (malloc, free, ...) work with a few variables that contain
+the start and the end of the heap, pointers to the free list and so on. Since
+the end of the heap depends on the size and start of the stack, it must be
+initialized at runtime. However, initializing these variables for programs
+that do not use the heap is a waste of time and memory.
+
+So the central module defines a function that contains initialization code and
+exports this function using the .CONSTRUCTOR statement. If (and only if)
+this module is added to an executable by the linker, the initialization
+function will be placed into the table of constructors by the linker. The C
+startup code will call all constructors before main and all destructors
+after main, so without any further work, the heap initialization code is
+called once the module is linked in.
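+
+A sketch of what such an export might look like in assembler source. The
+routine name "initmodule" and the priority value are assumptions made for
+this example:
+
+        .constructor    initmodule, 10  ; Export as constructor, priority 10
+
+        .proc   initmodule
+        ; ... set up the module's variables here ...
+        rts
+        .endproc
+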
+
+While it would be possible to add explicit calls to initialization functions
+in the startup code, the new approach has several advantages:
+
+The symbols are sorted in increasing priority order by the linker when using
+one of the builtin linker configurations, so the functions with lower
+priorities come first and are followed by those with higher priorities. The C
+library runtime subroutine that walks over the function tables calls the
+functions starting from the top of the table - which means that functions with
+a high priority are called first.
+
+So when using the C runtime, functions are called with high priority functions
+first, followed by low priority functions.
+
+When using these special symbols, please take care of the following:
+
+See the condes and callirq modules in the C runtime for an example on how to
+do this.
+
+The linker generates a table only when it is requested with a FEATURE CONDES
+statement in the linker config file. Each table has to be requested
+separately.
+
+Besides the .CONSTRUCTOR, .DESTRUCTOR and .INTERRUPTOR statements, there is
+also a more generic command: .CONDES. This allows you to specify an
+additional type. Predefined types are 0 (constructor), 1 (destructor) and 2
+(interruptor). The linker generates a separate table for each type on request.
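+
+A minimal sketch of the generic form; "ModuleInit" is a hypothetical routine
+and the priority is an arbitrary choice for this example:
+
+        .condes ModuleInit, 0, 16       ; Type 0 = constructor, priority 16
+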
+
+Sometimes it is necessary to port code written for older assemblers to ca65.
+In some cases, this can be done without any changes to the source code by
+using the emulation features of ca65 (see .FEATURE). In other cases, it is
+necessary to make changes to the source code.
+
+Probably the biggest difference is the handling of the .ORG directive. ca65
+generates relocatable code, and placement is done by the linker. Most other
+assemblers generate absolute code; placement is done within the assembler and
+there is no external linker.
+
+In general it is not a good idea to write new code using the emulation
+features of the assembler, but there may be situations where even this rule is
+not valid.
+
+You need to use some of the ca65 emulation features to simulate the behaviour
+of such simple assemblers.
+
+
+ ; if you want TASS style labels without colons
+ .feature labels_without_colons
+
+ ; if you want TASS style character constants
+ ; ("a" instead of the default 'a')
+ .feature loose_char_term
+
+ .word *+2 ; the cbm load address
+
+ [yourcode here]
+
+
+
+
+Note that the two emulation features are mostly useful for porting
+sources originally written in/for TASS. They are not needed for the
+actual "simple assembler operation" and are not recommended if you are
+writing new code from scratch.
+
+Program counter assignments can be replaced by another way to skip to a
+memory location, for example the .RES directive.
+
+
+
+ ; *=$2000
+ .res $2000-* ; reserve memory up to $2000
+
+
+
+
+Please note that, unlike the original TASS, ca65 can never move the program
+counter backwards - think of it as if you are assembling to disk with TASS.
+
+Conditional assembly directives (.ifeq/.endif/.goto etc.) must be rewritten
+to match ca65 syntax. Most importantly, notice that due to the lack of .goto,
+everything involving loops must be replaced by .REPEAT.
+
+Use the .ORG directive instead of .offs-constructs.
+
+
+
+ .org $1800
+
+ [floppy code here]
+
+ .reloc ; back to normal
+
+
+
+
+
+ cl65 --start-addr 0x0ffe -t none myprog.s -o myprog.prg
+
+
+
+
+Note that you need to use the actual start address minus two, since two bytes
+are used for the cbm load address.
+
+If you have problems using the assembler, if you find any bugs, or if
+you're doing something interesting with the assembler, I would be glad to
+hear from you. Feel free to contact me by email (uz@cc65.org).
+
+ca65 (and all cc65 binutils) are (C) Copyright 1998-2003 Ullrich von
+Bassewitz. For usage of the binaries and/or sources the following
+conditions do apply:
+
+This software is provided 'as-is', without any expressed or implied
+warranty. In no event will the authors be held liable for any damages
+arising from the use of this software.
+
+Permission is granted to anyone to use this software for any purpose,
+including commercial applications, and to alter it and redistribute it
+freely, subject to the following restrictions:
+
+The assembler accepts the following options:
+
+
+---------------------------------------------------------------------------
+Usage: ca65 [options] file
+Short options:
+ -D name[=value] Define a symbol
+ -I dir Set an include directory search path
+ -U Mark unresolved symbols as import
+ -V Print the assembler version
+ -W n Set warning level n
+ -g Add debug info to object file
+ -h Help (this text)
+ -i Ignore case of symbols
+ -l Create a listing if assembly was ok
+ -mm model Set the memory model
+ -o name Name the output file
+ -s Enable smart mode
+ -t sys Set the target system
+ -v Increase verbosity
+
+Long options:
+ --auto-import Mark unresolved symbols as import
+ --cpu type Set cpu type
+ --debug-info Add debug info to object file
+ --feature name Set an emulation feature
+ --forget-inc-paths Forget include search paths
+ --help Help (this text)
+ --ignore-case Ignore case of symbols
+ --include-dir dir Set an include directory search path
+ --listing Create a listing if assembly was ok
+ --list-bytes n Maximum number of bytes per listing line
+ --macpack-dir dir Set a macro package directory
+ --memory-model model Set the memory model
+ --pagelength n Set the page length for the listing
+ --smart Enable smart mode
+ --target sys Set the target system
+ --verbose Increase verbosity
+ --version Print the assembler version
+---------------------------------------------------------------------------
+
+
+
+
+
+Here is a description of all the command line options:
+
+--cpu type
+
+Set the default for the CPU type. The option takes a parameter, which
+may be one of
+
+6502, 65SC02, 65C02, 65816, sunplus, sweet16, HuC6280
+
+The sunplus cpu is not available in the freeware version, because the
+instruction set is "proprietary and confidential".
+
+--feature name
+
+Enable an emulation feature. This is identical to using .FEATURE in the
+source with two exceptions: Feature names must be lower case, and each
+feature must be specified by using an extra --feature option; comma separated
+lists are not allowed.
+
+See the discussion of the .FEATURE command for a list of emulation features.
+
+--forget-inc-paths
+
+Forget the builtin include paths. This is most useful when building
+customized assembler modules, in which case the standard header files should
+be ignored.
+
+-g, --debug-info
+
+When this option (or the equivalent control command .DEBUGINFO) is used, the
+assembler will add a section to the object file that contains all symbols
+(including local ones) together with the symbol values and source file
+positions. The linker will put these additional symbols into the VICE label
+file, so even local symbols can be seen in the VICE monitor.
+
+-h, --help
+
+Print the short option summary shown above.
+
+-i, --ignore-case
+
+This option makes the assembler case insensitive on identifiers and labels.
+This option will override the default, but may itself be overridden by the
+.CASE control command.
+
+-l, --listing
+
+Generate an assembler listing. The listing file will always have the
+name of the main input file with the extension replaced by ".lst". This
+may change in future versions.
+
+--list-bytes n
+
+Set the maximum number of bytes printed in the listing for one line of
+input. See the .LISTBYTES directive for more information. The value zero can
+be used to encode an unlimited number of printed bytes.
+
+--macpack-dir dir
+
+This option allows you to specify a directory containing macro files that are
+used instead of the builtin images when a .MACPACK directive is encountered.
+If --macpack-dir was specified, a .mac extension is added to the package name
+and the resulting file is loaded from the given directory. This is most useful
+when debugging the builtin macro packages.
+
+-mm model, --memory-model model
+
+Define the default memory model. Possible model specifiers are near, far and
+huge.
+
+-o name
+
+The default output name is the name of the input file with the extension
+replaced by ".o". If you don't like that, you may give another name with
+the -o option. The output file will be placed in the same directory as
+the source file, or, if -o is given, the full path in this name is used.
+
+--pagelength n
+
+Sets the length of a listing page in lines. See the .PAGELENGTH directive for
+more information.
+
+-s, --smart
+
+In smart mode (enabled by -s or the .SMART pseudo instruction) the assembler
+will track usage of the REP and SEP instructions in 65816 mode and update the
+operand sizes accordingly. If the operand of such an instruction cannot be
+evaluated by the assembler (for example, because the operand is an imported
+symbol), a warning is issued.
+
+Beware: Since the assembler cannot trace the execution flow this may
+lead to false results in some cases. If in doubt, use the .ixx and .axx
+instructions to tell the assembler about the current settings. Smart
+mode is off by default.
+
+-t sys, --target sys
+
+Set the target system. This will enable translation of character strings
+and character constants into the character set of the target platform.
+The default for the target system is "none", which means that no translation
+will take place. The assembler supports the same target systems as the
+compiler, see there for a list.
+
+-v, --verbose
+
+Increase the assembler verbosity. Usually only needed for debugging
+purposes. You may use this option more than one time for even more
+verbose output.
+
+-D
+
+This option allows you to define symbols on the command line. Without a
+value, the symbol is defined with the value zero. When giving a value,
+you may use the '$' prefix for hexadecimal symbols. Please note
+that for some operating systems, '$' has a special meaning, so
+you may have to quote the expression.
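+
+As an illustration (the symbol name DEBUG and the file name game.s are
+assumptions), a symbol defined on the command line can be tested in the
+source:
+
+        ; Assembled with: ca65 -D DEBUG=1 -t vic20 game.s
+        .ifdef  DEBUG
+        .out    "Debug build"   ; Message printed during assembly
+        .endif
+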
+
+-I dir, --include-dir dir
+
+Name a directory which is searched for include files. The option may be
+used more than once to specify more than one directory to search. The
+current directory is always searched first before considering any
+additional directories. See also the section about search paths.
+
+-U, --auto-import
+
+Mark symbols that are not defined in the sources as imported symbols. This
+should be used with care since it delays error messages about typos and such
+until the linker is run. The compiler uses the equivalent of this switch
+(.AUTOIMPORT) to enable auto imported symbols for the runtime library.
+However, the compiler is supposed to generate code that runs through the
+assembler without problems, something which is not always true for assembler
+programmers.
+
+-V, --version
+
+Print the version number of the assembler. If you send any suggestions
+or bugfixes, please include the version number.
+
+-Wn
+
+Set the warning level for the assembler. Using -W2 the assembler will
+even warn about such things like unused imported symbols. The default
+warning level is 1, and it would probably be silly to set it to
+something lower.
+
+Include files are searched in the following places:
+
+The current directory.
+A compiled-in directory, which is often /usr/lib/cc65/asminc on Linux systems.
+The value of the environment variable CA65_INC if it is defined.
+A subdirectory named asminc of the directory defined in the environment
+variable CC65_HOME, if it is defined.
+Any directory added with the -I option on the command line.
+
+The assembler accepts the standard 6502/65816 assembler syntax. One line may
+contain a label (which is identified by a colon), and, in addition to the
+label, an assembler mnemonic, a macro, or a control command (see section
+Control Commands for supported control commands). Alternatively, the line may
+contain a symbol definition using the '=' token. Everything after a semicolon
+is handled as a comment (that is, it is ignored).
+
+Here are some examples for valid input lines:
+
+
+ Label: ; A label and a comment
+ lda #$20 ; A 6502 instruction plus comment
+ L1: ldx #$20 ; Same with label
+ L2: .byte "Hello world" ; Label plus control command
+ mymac $20 ; Macro expansion
+ MySym = 3*L1 ; Symbol definition
+ MaSym = Label ; Another symbol
+
+
+
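+The assembler accepts
+
+6502 mnemonics in 6502 mode (the default, or after the .P02 command was given).
+65SC02 mnemonics in 65SC02 mode (after the .PSC02 command was given).
+65C02 mnemonics in 65C02 mode (after the .PC02 command was given).
+65816 mnemonics in 65816 mode (after the .P816 command was given).
+SunPlus mnemonics in SunPlus mode (after the .SUNPLUS command was given).
+
+In 65816 mode several aliases are accepted in addition to the official
+mnemonics:
+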
+
+ BGE is an alias for BCS
+ BLT is an alias for BCC
+ CPA is an alias for CMP
+ DEA is an alias for DEC A
+ INA is an alias for INC A
+ SWA is an alias for XBA
+ TAD is an alias for TCD
+ TAS is an alias for TCS
+ TDA is an alias for TDC
+ TSA is an alias for TSC
+
+
+
+
+
+
+6502X mode is an extension to the normal 6502 mode. In this mode, several
+mnemonics for illegal instructions of the NMOS 6502 CPUs are accepted. Since
+these instructions are illegal, there are no official mnemonics for them. The
+unofficial ones are taken from http://oxyron.net/graham/opcodes02.html.
+Please note that only the ones marked as "stable" are supported. The
+following table uses information from the mentioned web page; for more
+information, see there.
+
+ ALR: A:=(A and #{imm})/2;
+ ANC: A:=A and #{imm}; Generates opcode $0B.
+ ARR: A:=(A and #{imm})/2;
+ AXS: X:=A and X-#{imm};
+ DCP: {adr}:={adr}-1; A-{adr};
+ ISC: {adr}:={adr}+1; A:=A-{adr};
+ LAS: A,X,S:={adr} and S;
+ LAX: A,X:={adr};
+ RLA: {adr}:={adr}rol; A:=A and {adr};
+ RRA: {adr}:={adr}ror; A:=A adc {adr};
+ SAX: {adr}:=A and X;
+ SLO: {adr}:={adr}*2; A:=A or {adr};
+ SRE: {adr}:={adr}/2; A:=A xor {adr};
+
+SWEET 16 is an interpreter for a pseudo 16 bit CPU written by Steve Wozniak
+for the Apple ][ machines. It is available in the Apple ][ ROM. ca65 can
+generate code for this pseudo CPU when switched into sweet16 mode. The
+following is special in sweet16 mode:
+
+The '@' character denotes indirect addressing and is no longer available for
+cheap local labels. If you need cheap local labels, you will have to switch
+to another lead character using the .LOCALCHAR command.
+
+Registers are specified using R0 .. R15. In sweet16 mode, these identifiers
+are reserved words.
+
+Please note that the assembler supplies neither the interpreter needed for
+SWEET 16 code, nor the zero page locations needed for the SWEET 16 registers,
+nor does it call the interpreter. All this must be done by your program. Apple
+][ programmers probably know how to use sweet16 mode.
+
+For more information about SWEET 16, see
+http://www.6502.org/source/interpreters/sweet16.htm.
+
+For literal values, the assembler accepts the widely used number formats: A
+preceding '$' or a trailing 'h' denotes a hex value, a preceding '%'
+denotes a binary value, and a bare number is interpreted as a decimal. There
+are currently no octal values and no floats.
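+
+To illustrate (a sketch, not taken from the original text), the following
+lines all load the same value into the accumulator:
+
+        lda     #$1F            ; Hex with leading '$'
+        lda     #1Fh            ; Hex with trailing 'h'
+        lda     #%00011111      ; Binary
+        lda     #31             ; Decimal
+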
+
+Please note that when using the conditional directives (.IF and friends),
+the input must consist of valid assembler tokens, even in .IF branches
+that are not assembled. The reason for this behaviour is that the assembler
+must still be able to detect the ending tokens (like .ENDIF), so
+conversion of the input stream into tokens still takes place. As a
+consequence, conditional assembly directives may not be used to prevent normal
+text (used as a comment or similar) from being assembled.
+
+All expressions are evaluated with (at least) 32 bit precision. An
+expression may contain constant values and any combination of internal and
+external symbols. Expressions that cannot be evaluated at assembly time
+are stored inside the object file for evaluation by the linker.
+Expressions referencing imported symbols must always be evaluated by the
+linker.
+
+Sometimes, the assembler must know about the size of the value that is the
+result of an expression. This is usually the case if a decision has to be
+made to generate a zero page or an absolute memory reference. In this
+case, the assembler has to make some assumptions about the result of an
+expression:
+
+Note: If the assembler is not able to evaluate the expression at assembly
+time, the linker will evaluate it and check for range errors as soon as
+the result is known.
+
+In the context of a boolean expression, any non zero value is evaluated as
+true, any other value as false. The result of a boolean expression is 1 if
+it's true, and zero if it's false. There are boolean operators with extremely
+low precedence with version 2.x (where x > 0). The .AND and .OR operators are
+shortcut operators. That is, if the result of the expression is already known
+after evaluating the left hand side, the right hand side is not evaluated.
+
+Sometimes an expression must evaluate to a constant without looking at any
+further input. One such example is the .IF command that decides if parts of
+the code are assembled or not. An expression used in the .IF command cannot
+reference a symbol defined later, because the decision about the .IF must be
+made at the point when it is read. If the expression used in such a context
+contains only constant numerical values, there is no problem. When
+unresolvable symbols are involved it may get harder for the assembler to
+determine if the expression is actually constant, and it is even possible to
+create expressions that aren't recognized as constant. Simplifying the
+expressions will often help.
+
+In cases where the result of the expression is not needed immediately, the
+assembler will delay evaluation until all input is read, at which point all
+symbols are known. So using arbitrarily complex constant expressions is no
+problem in most cases.
+
+ Operator        Description                             Precedence
+ ------------------------------------------------------------------
+                 Built-in string functions               0
+
+                 Built-in pseudo-variables               1
+                 Built-in pseudo-functions               1
+ +               Unary positive                          1
+ -               Unary negative                          1
+ ~  .BITNOT      Unary bitwise not                       1
+ <  .LOBYTE      Unary low-byte operator                 1
+ >  .HIBYTE      Unary high-byte operator                1
+ ^  .BANKBYTE    Unary bank-byte operator                1
+
+ *               Multiplication                          2
+ /               Division                                2
+ .MOD            Modulo operator                         2
+ &  .BITAND      Bitwise and                             2
+ ^  .BITXOR      Binary bitwise xor                      2
+ << .SHL         Shift-left operator                     2
+ >> .SHR         Shift-right operator                    2
+
+ +               Binary addition                         3
+ -               Binary subtraction                      3
+ |  .BITOR       Bitwise or                              3
+
+ =               Compare operator (equal)                4
+ <>              Compare operator (not equal)            4
+ <               Compare operator (less)                 4
+ >               Compare operator (greater)              4
+ <=              Compare operator (less or equal)        4
+ >=              Compare operator (greater or equal)     4
+
+ && .AND         Boolean and                             5
+ .XOR            Boolean xor                             5
+
+ || .OR          Boolean or                              6
+
+ !  .NOT         Boolean not                             7
+
+To force a specific order of evaluation, parentheses may be used, as usual.
+
+A symbol or label is an identifier that starts with a letter and is followed
+by letters and digits. Depending on some features enabled (see
+at_in_identifiers, dollar_in_identifiers and leading_dot_in_identifiers)
+other characters may be present. Use of identifiers consisting of a single
+character will not work in all cases, because some of these identifiers are
+reserved keywords (for example "A" is not a valid identifier for a label,
+because it is the keyword for the accumulator).
+
+The assembler allows you to use symbols instead of naked values to make
+the source more readable. There are a lot of different ways to define and
+use symbols and labels, giving a lot of flexibility.
+
+Numeric constants are defined using the equal sign or the label assignment
+operator. After doing
+
+
+ two = 2
+
+
+
+you may use the symbol "two" in every place where a number is expected, and it
+is evaluated to the value 2 in this context. The label assignment operator
+does the same, but causes the symbol to be marked as a label, which may cause
+a different handling in the debugger:
+
+
+ io := $d000
+
+
+
+The right side can of course be an expression:
+
+
+ four = two * two
+
+
+
+
+
+A label is defined by writing the name of the label at the start of the line +(before any instruction mnemonic, macro or pseudo directive), followed by a +colon. This will declare a symbol with the given name and the value of the +current program counter.
+ + +Using the
+.PROC
directive, it is possible to
+create regions of code where the names of labels and symbols are local to this
+region. They are not known outside of this region and cannot be accessed from
+there. Such regions may be nested like PROCEDUREs in Pascal.
See the description of the
+.PROC
+directive for more information.
Cheap local labels are defined like standard labels, but the name of the
+label must begin with a special symbol (usually '@', but this can be
+changed by the
+.LOCALCHAR
+directive).
Cheap local labels are visible only between two non cheap labels. As soon as a
+standard symbol is encountered (this may also be a local symbol if inside a
+region defined with the
+.PROC
directive), the
+cheap local symbol goes out of scope.
You may use cheap local labels as an easy way to reuse common label names like "Loop". Here is an example:
++
+
+ Clear: lda #$00 ; Global label
+ ldy #$20
+ @Loop: sta Mem,y ; Local label
+ dey
+ bne @Loop ; Ok
+ rts
+ Sub: ... ; New global label
+ bne @Loop ; ERROR: Unknown identifier!
+
+
+
+
+If you really want to write messy code, there are also unnamed labels. These labels do not have a name (you guessed that already, didn't you?). A colon is used to mark the absence of the name.
+Unnamed labels may be accessed by using the colon plus several minus or plus characters as a label designator. Using '-' characters creates a back reference (the n'th label backwards), using '+' creates a forward reference (the n'th label in forward direction). An example will help to understand this:
++
+
+ : lda (ptr1),y ; #1
+ cmp (ptr2),y
+ bne :+ ; -> #2
+ tax
+ beq :+++ ; -> #4
+ iny
+ bne :- ; -> #1
+ inc ptr1+1
+ inc ptr2+1
+ bne :- ; -> #1
+
+ : bcs :+ ; #2 -> #3
+ ldx #$FF
+ rts
+
+ : ldx #$01 ; #3
+ : rts ; #4
+
+
+
+As you can see from the example, unnamed labels will make even short sections of code hard to understand, because you have to count labels to find branch targets (this is the reason why I for my part prefer the "cheap" local labels). Nevertheless, unnamed labels are convenient in some situations, so it's your decision.
+ + +While there are drawbacks with this approach, it may be handy in some
+situations. Using
+.DEFINE
, it is
+possible to define symbols or constants that may be used elsewhere. Since
+the macro facility works on a very low level, there is no scoping. On the
+other side, you may also define string constants this way (this is not
+possible with the other symbol types).
Example:
++
+
+ .DEFINE two 2
+ .DEFINE version "SOS V2.3"
+
+ four = two * two ; Ok
+ .byte version ; Ok
+
+ .SCOPE ; Start local scope (unnamed; .PROC would require a name)
+ two = 3 ; Will give "2 = 3" - invalid!
+ .ENDSCOPE
+
+
+
+
+
+.DEBUGINFO
+If
+.DEBUGINFO
is enabled (or
+-g is given on the command line), global, local and
+cheap local labels are written to the object file and will be available in the
+symbol file via the linker. Unnamed labels are not written to the object file,
+because they have no name by which they could be accessed.
ca65 implements several sorts of scopes for symbols.
+ +All (non cheap local) symbols that are declared outside of any nested scopes +are in global scope.
+ + +A special scope is the scope for cheap local symbols. It lasts from one non +local symbol to the next one, without any provisions made by the programmer. +All other scopes differ in usage but use the same concept internally.
+ + +A nested scope for generic use is started with
+.SCOPE
and closed with
+.ENDSCOPE
.
+The scope can have a name, in which case it is accessible from the outside by
+using
+explicit scopes. If the scope does not
+have a name, all symbols created within the scope are local to the scope, and
+aren't accessible from the outside.
A nested scope can access symbols from the local or from enclosing scopes by name without using explicit scope names. In some cases there may be ambiguities, for example if there is a reference to a local symbol that is not yet defined, but a symbol with the same name exists in outer scopes:
++
+
+ .scope outer
+ foo = 2
+ .scope inner
+ lda #foo
+ foo = 3
+ .endscope
+ .endscope
+
+
+
+In the example above, the lda
instruction will load the value 3 into the
+accumulator, because foo
is redefined in the scope. However:
+
+
+ .scope outer
+ foo = $1234
+ .scope inner
+ lda foo,x
+ foo = $12
+ .endscope
+ .endscope
+
+
+
+Here, lda
will still load from $12,x
, but since it is unknown to the
+assembler that foo
is a zeropage symbol when translating the instruction,
+absolute mode is used instead. In fact, the assembler will not use absolute
+mode by default, but it will search through the enclosing scopes for a symbol
+with the given name. If one is found, the address size of this symbol is used.
+This may lead to errors:
+
+
+ .scope outer
+ foo = $12
+ .scope inner
+ lda foo,x
+ foo = $1234
+ .endscope
+ .endscope
+
+
+
+In this case, when the assembler sees the symbol foo
in the lda
+instruction, it will search for an already defined symbol foo
. It will
+find foo
in scope outer
, and a close look reveals that it is a
+zeropage symbol. So the assembler will use zeropage addressing mode. If
+foo
is redefined later in scope inner
, the assembler tries to change
+the address in the lda
instruction already translated, but since the new
+value needs absolute addressing mode, this fails, and an error message "Range
+error" is output.
Of course the most simple solution for the problem is to move the definition
+of foo
in scope inner
upwards, so it precedes its use. There may be
+rare cases when this cannot be done. In these cases, you can use one of the
+address size override operators:
+
+
+ .scope outer
+ foo = $12
+ .scope inner
+ lda a:foo,x
+ foo = $1234
+ .endscope
+ .endscope
+
+
+
+This will cause the lda
instruction to be translated using absolute
+addressing mode, which means changing the symbol reference later does not
+cause any errors.
A nested procedure is created by use of
+.PROC
. It
+differs from a
+.SCOPE
in that it must have a
+name, and it will introduce a symbol with this name in the enclosing scope.
+So
+
+
+ .proc foo
+ ...
+ .endproc
+
+
+
+is actually the same as
++
+
+ foo:
+ .scope foo
+ ...
+ .endscope
+
+
+
+This is the reason why a procedure must have a name. If you want a scope
+without a name, use
+.SCOPE
.
Note: As you can see from the example above, scopes and symbols live in
+different namespaces. There can be a symbol named foo
and a scope named
+foo
without any conflicts (but see the section titled
+"Scope search order").
Structs, unions and enums are explained in a
+separate section, I do only cover them here, because if they are declared with a
+name, they open a nested scope, similar to
+.SCOPE
. However, when no name is specified, the behaviour is
+different: In this case, no new scope will be opened, symbols declared within
+a struct, union, or enum declaration will then be added to the enclosing scope
+instead.
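The difference between named and unnamed declarations can be sketched with two enums. All identifiers below are hypothetical, and the sketch relies on the namespace token (`::`) for reaching into the named scope.

```asm
        .enum   color           ; Named enum: opens a scope called "color"
                red             ; color::red   = 0
                green           ; color::green = 1
        .endenum

        .enum                   ; Unnamed enum: no new scope is opened
                first           ; first  = 0, added to the enclosing scope
                second          ; second = 1
        .endenum

        lda     #color::green   ; Members of the named enum need the scope name
        ldx     #second         ; Members of the unnamed enum are used directly
```

Naming the declaration keeps short member names from colliding with other symbols in the enclosing scope.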
Accessing symbols from other scopes is possible by using an explicit scope
+specification, provided that the scope where the symbol lives in has a name.
+The namespace token (::
) is used to access other scopes:
+
+
+ .scope foo
+ bar: .word 0
+ .endscope
+
+ ...
+ lda foo::bar ; Access bar in scope foo
+
+
+
+The only way to deny access to a scope from the outside is to declare a scope
+without a name (using the
+.SCOPE
command).
A special syntax is used to specify the global scope: If a symbol or scope is +preceded by the namespace token, the global scope is searched:
++
+
+ bar = 3
+
+ .scope foo
+ bar = 2
+ lda #::bar ; Access the global bar (which is 3)
+ .endscope
+
+
+
+
+
+The assembler searches for a scope in a similar way as for a symbol. First, it looks in the current scope, and then it walks up the enclosing scopes until the scope is found.
+However, one important thing to note when using explicit scope syntax is that a symbol may be accessed before it is defined, but a scope may not be used without a preceding definition. This means that in the following example:
++
+
+ .scope foo
+ bar = 3
+ .endscope
+
+ .scope outer
+ lda #foo::bar ; Will load 3, not 2!
+ .scope foo
+ bar = 2
+ .endscope
+ .endscope
+
+
+
+the reference to the scope foo
will use the global scope, and not the
+local one, because the local one is not visible at the point where it is
+referenced.
Things get more complex if a complete chain of scopes is specified:
++
+
+ .scope foo
+ .scope outer
+ .scope inner
+ bar = 1
+ .endscope
+ .endscope
+ .scope another
+ .scope nested
+ lda #outer::inner::bar ; 1
+ .endscope
+ .endscope
+ .endscope
+
+ .scope outer
+ .scope inner
+ bar = 2
+ .endscope
+ .endscope
+
+
+
+When outer::inner::bar
is referenced in the lda
instruction, the
+assembler will first search in the local scope for a scope named outer
.
+Since none is found, the enclosing scope (another
) is checked. There is
+still no scope named outer
, so scope foo
is checked, and finally
+scope outer
is found. Within this scope, inner
is searched, and in
+this scope, the assembler looks for a symbol named bar
.
Please note that once the anchor scope is found, all following scopes
+(inner
in this case) are expected to be found exactly in this scope. The
+assembler will search the scope tree only for the first scope (if it is not
+anchored in the root scope). Starting from there on, there is no flexibility,
+so if the scope named outer
found by the assembler does not contain a
+scope named inner
, this would be an error, even if such a pair does exist
+(one level up in global scope).
Ambiguities that may be introduced by this search algorithm may be removed by
+anchoring the scope specification in the global scope. In the example above,
+if you want to access the "other" symbol bar
, you would have to write:
+
+
+ .scope foo
+ .scope outer
+ .scope inner
+ bar = 1
+ .endscope
+ .endscope
+ .scope another
+ .scope nested
+ lda #::outer::inner::bar ; 2
+ .endscope
+ .endscope
+ .endscope
+
+ .scope outer
+ .scope inner
+ bar = 2
+ .endscope
+ .endscope
+
+
+
+
+
+ca65 assigns each segment and each symbol an address size. This is true even if the symbol is not used as an address. You may also think of a value range of the symbol instead of an address size.
+Possible address sizes are:
++
Since the assembler uses default address sizes for the segments and symbols, +it is usually not necessary to override the default behaviour. In cases, where +it is necessary, the following keywords may be used to specify address sizes:
++
The assembler assigns an address size to each segment. Since the +representation of a label within this segment is "segment start + offset", +labels will inherit the address size of the segment they are declared in.
+The address size of a segment may be changed, by using an optional address
+size modifier. See the
+segment directive
for
+an explanation on how this is done.
The default address size of a segment depends on the memory model used. Since +labels inherit the address size from the segment they are declared in, +changing the memory model is an easy way to change the address size of many +symbols at once.
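As a hedged illustration of how labels inherit the address size of their segment: the segment name `FASTVARS` and the label `counter` are made up for this sketch, and the segment would also have to be placed in zero page by the linker configuration.

```asm
        .segment "FASTVARS": zeropage   ; Hypothetical segment with address size modifier
counter:
        .res    1                       ; counter inherits the zeropage address size

        .code
        inc     counter                 ; Translated with zero page addressing
        inc     a:counter               ; Override operator forces absolute addressing
```

The override in the last line is the same `a:` operator shown earlier for resolving scope ambiguities.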
+ + + + +Pseudo variables are readable in all cases, and in some special cases also +writable.
+ +*
+Reading this pseudo variable will return the program counter at the start +of the current input line.
+Assignment to this variable is possible when
+.FEATURE pc_assignment
is used. Note: You should not use
+assignments to *
, use
+.ORG
instead.
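Reading `*` is mostly useful for computing sizes and short relative branch targets. The following is a sketch only; `Table`, `TableSize`, `Check` and `Error` are hypothetical names.

```asm
Table:  .byte   $01, $02, $04, $08
TableSize = * - Table           ; Four bytes have been emitted since Table

Check:  cmp     #TableSize
        bcc     * + 5           ; Skip the jmp: bcc (2 bytes) + jmp (3 bytes)
        jmp     Error
        rts
Error:  rts
```

Because `*` is the program counter at the start of the current line, `* + 5` in the `bcc` line refers to the first `rts`.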
.CPU
+Reading this pseudo variable will give a constant integer value that
+tells which CPU is currently enabled. It can also tell which instruction
+set the CPU is able to translate. The value read from the pseudo variable
+should be further examined by using one of the constants defined by the
+"cpu" macro package (see
+.MACPACK
).
It may be used to replace the .IFPxx pseudo instructions or to construct +even more complex expressions.
+Example:
++
+
+ .macpack cpu
+ .if (.cpu .bitand CPU_ISET_65816)
+ phx
+ phy
+ .else
+ txa
+ pha
+ tya
+ pha
+ .endif
+
+
+
+
+
+
+.PARAMCOUNT
+This builtin pseudo variable is only available in macros. It is replaced by +the actual number of parameters that were given in the macro invocation.
+Example:
++
+
+ .macro foo arg1, arg2, arg3
+ .if .paramcount <> 3
+ .error "Too few parameters for macro foo"
+ .endif
+ ...
+ .endmacro
+
+
+
+
+See section +Macros.
+ + +.TIME
+Reading this pseudo variable will give a constant integer value that +represents the current time in POSIX standard (as seconds since the +Epoch).
+It may be used to encode the time of translation somewhere in the created +code.
+Example:
++
+
+ .dword .time ; Place time here
+
+
+
+
+
+
+.VERSION
+Reading this pseudo variable will give the assembler version according to +the following formula:
+VER_MAJOR*$100 + VER_MINOR*$10 + VER_PATCH
+It may be used to encode the assembler version or check the assembler for +special features not available with older versions.
+Example:
+Version 2.11.1 of the assembler will return $2B1 as numerical constant when
+reading the pseudo variable .VERSION
.
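A source file could use this value to refuse assembly with an older assembler. This is a minimal sketch; the error text is arbitrary.

```asm
        .if     .version < $2B1         ; $2B1 = 2*$100 + 11*$10 + 1
        .error  "ca65 2.11.1 or newer is required"
        .endif
```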
+
+
+
+
+
+
.DEBUGINFO
++
+
+
+
.BANKBYTE
+.BLANK
+.CONCAT
+.CONST
+.HIBYTE
+.HIWORD
+.IDENT
+.LEFT
+.LOBYTE
+.LOWORD
+.MATCH
+.MID
+.REF, .REFERENCED
+.RIGHT
+.SIZEOF
+.STRAT
+.SPRINTF
+.STRING
+.STRLEN
+.TCOUNT
+.XMATCH
++
.A16
+.A8
+.ADDR
+.ALIGN
+.ASCIIZ
+.ASSERT
+.AUTOIMPORT
+.BANKBYTES
+.BSS
+.BYT, .BYTE
+.CASE
+.CHARMAP
+.CODE
+.CONDES
+.CONSTRUCTOR
+.DATA
+.DBYT
+.DEBUGINFO
+.DEFINE
+.DEF, .DEFINED
+.DESTRUCTOR
+.DWORD
+.ELSE
+.ELSEIF
+.END
+.ENDENUM
+.ENDIF
+.ENDMAC, .ENDMACRO
+.ENDPROC
+.ENDREP, .ENDREPEAT
+.ENDSCOPE
+.ENDSTRUCT
+.ENUM
+.ERROR
+.EXITMAC, .EXITMACRO
+.EXPORT
+.EXPORTZP
+.FARADDR
+.FEATURE
+.FILEOPT, .FOPT
+.FORCEIMPORT
+.GLOBAL
+.GLOBALZP
+.HIBYTES
+.I16
+.I8
+.IF
+.IFBLANK
+.IFCONST
+.IFDEF
+.IFNBLANK
+.IFNDEF
+.IFNREF
+.IFP02
+.IFP816
+.IFPC02
+.IFPSC02
+.IFREF
+.IMPORT
+.IMPORTZP
+.INCBIN
+.INCLUDE
+.INTERRUPTOR
+.LINECONT
+.LIST
+.LISTBYTES
+.LOBYTES
+.LOCAL
+.LOCALCHAR
+.MACPACK
+.MAC, .MACRO
+.ORG
+.OUT
+.P02
+.P816
+.PAGELEN, .PAGELENGTH
+.PC02
+.POPSEG
+.PROC
+.PSC02
+.PUSHSEG
+.RELOC
+.REPEAT
+.RES
+.RODATA
+.SCOPE
+.SEGMENT
+.SETCPU
+.SMART
+.STRUCT
+.SUNPLUS
+.TAG
+.WARNING
+.WORD
+.ZEROPAGE
++
+
+
+
+
+
+
+
ca65html converts assembly source files written for use with the
+ca65
crossassembler into HTML. It is a standalone
+tool written in PERL; and as such, it does not understand the structure of
+assembler sources in the same depth as ca65 does, so it may fail in very rare
+cases. In all other cases, it generates very nice output.
The HTML converter accepts the following options:
++
+
+---------------------------------------------------------------------------
+Usage: ca65html [options] file ...
+Options:
+ --bgcolor c Use background color c instead of #FFFFFF
+ --colorize Add color highlights to the output
+ --commentcolor c Use color c for comments instead of #B22222
+ --crefs Generate references to the C source file(s)
+ --ctrlcolor c Use color c for directives instead of #228B22
+ --cvttabs Convert tabs to spaces in the output
+ --help This text
+ --htmldir dir Specify directory for HTML files
+ --indexcols n Use n columns on index page (default 6)
+ --indexname file Use file for the index file instead of index.html
+ --indexpage Create an index page
+ --indextitle title Use title as the index title instead of Index
+ --keywordcolor c Use color c for keywords instead of #A020F0
+ --linelabels Generate a linexxx HTML label for each line
+ --linenumbers Add line numbers to the output
+ --linkstyle style Use the given link style
+ --replaceext Replace source extension instead of appending .html
+ --textcolor c Use text color c instead of #000000
+ --verbose Be more verbose
+---------------------------------------------------------------------------
+
+
+
+
+
+Here is a description of all the command line options:
++
--bgcolor c
Set the background color. The argument c must be a valid HTML color, usually
+given as RGB triplet in the form #rrggbb
, where r, g, and b are the
+respective red, green, and blue parts as two-digit hex values. The default is
+#FFFFFF
(white). That color is used in the <body>
of the
+generated HTML output.
--colorize
Colorize the output. The converter outputs processor instructions, assembler +control commands, and comments in different colors.
+ + +--commentcolor c
Set the color used for comments. The argument c must be a valid HTML color,
+usually given as RGB triplet in the form #rrggbb
, where r, g, and b are
+the respective red, green, and blue parts as two-digit hex values. The
+default is #B22222
(red).
Note that this option has no effect if --colorize
is not also given.
--crefs
Generate references to the C file, when a .dbg
command is found with a
+file name. The converter assumes that the C source was also converted into
+HTML (for example by use of c2html
), has the name file.c.html
, and
+lives in the same directory as the assembler file. If the .dbg
+directive specifies a line, a link to the correct line in the C file is
+generated, using a label in the form linexxx
, as it is created by
+c2html
by use of the -n
option.
--ctrlcolor c
Set the color used for assembler control commands. The argument c must be a
+valid HTML color, usually given as RGB triplet in the form #rrggbb
,
+where r, g, and b are the respective red, green, and blue parts as two-digit
+hex values. The default is #228B22
(green).
Note that this option has no effect if --colorize
is not also given.
--cvttabs
Convert tabs in the input into spaces in the output, assuming the standard
+tab width of 8. This is useful if the --linenumbers
option is used to
+retain the indentation.
--help
Print the command line option summary shown above.
+ + +--htmldir dir
Specify an output directory for the generated HTML files.
+ + +--indexcols n
Use n columns on the index page. This option has no effect if used without
+--indexpage
.
--indexname name
Use another index file name instead of index.html
. This option has no
+effect if used without --indexpage
.
--indexpage
Causes the converter to generate an index page listing file names, and all +exports found in the converted files.
+ + +--indextitle title
Use "title" as the title of the index page. This option has no effect if
+used without --indexpage
.
--keywordcolor c
Set the color used for processor instructions. The argument c must be a
+valid HTML color, usually given as RGB triplet in the form #rrggbb
,
+where r, g, and b are the respective red, green, and blue parts as two-digit
+hex values. The default is #A020F0
(purple).
Note that this option has no effect if --colorize
is not also given.
--linelabels
Generate a label for each line using the name linexxx
where xxx is the
+number of the line.
Note: The converter will not make use of this label. Use this option if you +have other HTML pages referencing the converted assembler file.
+ + +--linenumbers
Generate line numbers on the left side of the output.
+ + +--linkstyle n
Influences the style used when generating links for imports. If n is zero
+(the default), the converter creates a link to the actual symbol if it is
+defined somewhere in the input files. If not, it creates a link to the
+.import
statement. If n is one, the converter will always generate an
+HTML link to the .import
statement.
--replaceext
Replace the file extension of the input file instead of appending .html
+when generating the output file name.
--textcolor c
Set the color for normal text. The argument c must be a valid HTML color,
+usually given as RGB triplet in the form #rrggbb
, where r, g, and b are
+the respective red, green, and blue parts as two-digit hex values. The
+default is #000000
(black). This color is used in the <body>
+of the generated HTML output.
--verbose
Increase the converter verbosity. Without this option, ca65html is quiet +when working. If you have a slow machine and lots of files to convert, you +might like a little bit more progress information.
+ +Since ca65html is able to generate links between modules, the best way to use +it is to supply all modules to it in one run, instead of running each file +separately through it.
+ + +For now, ca65html will not read files included with .include
. Specifying
+the include files as normal input files on the command line works in many
+cases.
Since ca65html does not really parse the input, but does most of its work applying text patterns, it doesn't know anything about scoping and advanced features of the assembler. This means that it might miss a label, and in rare cases it might choose the wrong color for an item. Because it's just a tool for displaying sources in a nice form, I think that's OK. Anyway, if you find a conversion problem, you can send me a short piece of example input code. If possible, I will fix it.
+ + +While having colors in the output looks really nice, it has one drawback:
++
Because lots of <span>
tags are created in the output,
+the size of the output file literally will explode. It seems to be the price
+that you have to pay for color.
+If you have problems using the converter, if you find any bugs, or if you're +doing something interesting with the assembler, I would be glad to hear from +you. Feel free to contact me by email ( +uz@cc65.org).
+ + + +ca65html is (c) Copyright 2000-2007 Ullrich von Bassewitz. For its use, the +following conditions apply:
+This software is provided 'as-is', without any expressed or implied +warranty. In no event will the authors be held liable for any damages +arising from the use of this software.
+Permission is granted to anyone to use this software for any purpose, +including commercial applications, and to alter it and redistribute it +freely, subject to the following restrictions:
++
+
+
+
+
+
The ld65 linker combines several object modules created by the ca65 +assembler, producing an executable file. The object modules may be read +from a library created by the ar65 archiver (this is somewhat faster and +more convenient). The linker was designed to be as flexible as possible. +It complements the features that are built into the ca65 macroassembler:
++
The linker is called as follows:
++
+
+---------------------------------------------------------------------------
+Usage: ld65 [options] module ...
+Short options:
+ -( Start a library group
+ -) End a library group
+ -C name Use linker config file
+ -D sym=val Define a symbol
+ -L path Specify a library search path
+ -Ln name Create a VICE label file
+ -S addr Set the default start address
+ -V Print the linker version
+ -h Help (this text)
+ -m name Create a map file
+ -o name Name the default output file
+ -t sys Set the target system
+ -u sym Force an import of symbol `sym'
+ -v Verbose mode
+ -vm Verbose map file
+
+Long options:
+ --cfg-path path Specify a config file search path
+ --config name Use linker config file
+ --dbgfile name Generate debug information
+ --define sym=val Define a symbol
+ --dump-config name Dump a builtin configuration
+ --end-group End a library group
+ --force-import sym Force an import of symbol `sym'
+ --help Help (this text)
+ --lib file Link this library
+ --lib-path path Specify a library search path
+ --mapfile name Create a map file
+ --module-id id Specify a module id
+ --obj file Link this object file
+ --obj-path path Specify an object file search path
+ --start-addr addr Set the default start address
+ --start-group Start a library group
+ --target sys Set the target system
+ --version Print the linker version
+---------------------------------------------------------------------------
+
+
+
+
+
+Here is a description of all the command line options:
++
-(, --start-group
Start a library group. The libraries specified within a group are searched multiple times to resolve crossreferences between them. Normally, crossreferences are only resolved within a single library (that is, only that library is searched multiple times). Libraries specified later on the command line cannot reference otherwise unreferenced symbols in libraries specified earlier, because the linker has already handled them. Library groups solve this problem, because the linker will search repeatedly through all libraries specified in the group until all possible open symbol references have been satisfied.
+ + +-), --end-group
End a library group. See the explanation of the
+--start-group
option.
-h, --help
Print the short option summary shown above.
+ + +-m name, --mapfile name
This option (which needs an argument that will be used as a filename for the generated map file) will cause the linker to generate a map file. The map file contains a detailed overview of the modules used, the sizes of the different segments, and a table containing exported symbols.
+ + +-o name
The -o switch is used to give the name of the default output file. +Depending on your output configuration, this name may NOT be used as +name for the output file. However, for the builtin configurations, this +name is used for the output file name.
+ + +-t sys, --target sys
The argument for the -t switch is the name of the target system. Since this
+switch will activate a builtin configuration, it may not be used together
+with the
+-C
option. The following target
+systems are currently supported:
+
There are a few more targets defined, but none of them is actually supported.
+ + +-u sym[:addrsize], --force-import sym[:addrsize]
Force an import of a symbol. While object files are always linked to the
+output file, regardless if there are any references, object modules from
+libraries get only linked in if an import can be satisfied by this module.
+The --force-import
option may be used to add a reference to a symbol and
+as a result force linkage of the module that exports the identifier.
The name of the symbol may optionally be followed by a colon and an address +size specifier. If no address size is specified, the default address size +for the target machine is used.
+Please note that the symbol name needs to have the internal representation, +meaning you have to prepend an underline for C identifiers.
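The situation can be sketched from the assembler side. The module and the symbol name below are hypothetical; the point is only that the name given to the linker is the internal one, with the leading underline.

```asm
; Hypothetical library module. No other module imports _debug_hook, so it
; is only linked in when an import is forced on the linker command line:
;       ld65 --force-import _debug_hook ...
        .export _debug_hook

_debug_hook:                    ; The C-level name would be debug_hook
        rts
```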
+ + +-v, --verbose
Using the -v option, you may enable more output that may help you to +locate problems. If an undefined symbol is encountered, -v causes the +linker to print a detailed list of the references (that is, source file +and line) for this symbol.
+ + +-vm
Must be used in conjunction with
+-m
+(generate map file). Normally the map file will not include empty segments
+and sections, or unreferenced symbols. Using this option, you can force the
+linker to include all this information into the map file.
-C
This gives the name of an output config file to use. See section 4 for more
+information about config files. -C may not be used together with
+-t
.
-D sym=value, --define sym=value
This option allows you to define an external symbol on the command line. Value
+may start with a '$' sign or with 0x
for hexadecimal values,
+otherwise a leading zero denotes octal values. See also the
+SYMBOLS section in the configuration file.
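A module can then import such a symbol like any other external. In this sketch the symbol name `__SCREEN__` and its value are assumptions made for illustration.

```asm
; Assumes the linker is invoked with something like
;       ld65 -D __SCREEN__=0x1E00 ...
; __SCREEN__ is a hypothetical symbol name.
        .import __SCREEN__

        lda     #$20
        sta     __SCREEN__      ; The address is filled in by the linker
```

Using the `0x` prefix on the command line avoids the shell quoting that a leading `$` would usually need.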
-L path, --lib-path path
Specify a library search path. This option may be used more than once. It
+adds a directory to the search path for library files. Libraries specified
+without a path are searched in current directory, in the directory given in
+the LD65_LIB
environment variable, and in the list of directories
+specified using --lib-path
.
-Ln
This option allows you to create a file that contains all global labels and
+may be loaded into VICE emulator using the ll
(load label) command. You
+may use this to debug your code with VICE. Note: Older versions had some
+bugs in the label code. If you have problems, please get the latest VICE
+version.
-S addr, --start-addr addr
Using -S you may define the default starting address. If and how this address is used depends on the config file in use. For the builtin configurations, only the "none", "apple2" and "apple2enh" systems honor an explicit start address; all other builtin configs provide their own.
+ + +-V, --version
This option prints the version number of the linker. If you send any suggestions or bugfixes, please include this number.
+ + +--cfg-path path
Specify a config file search path. This option may be used more than once.
+It adds a directory to the search path for config files. A config file given
+with the
+-C
option that has no path in
+its name is searched in the current directory, in the directory given in the
+LD65_CFG
environment variable, and in the list of directories specified
+using --cfg-path
.
--dbgfile name
Specify an output file for debug information. Available information will be
+written to this file. Using the -g
option for the compiler and assembler
+will increase the amount of information available. Please note that debug
+information generation is currently being developed, so the format of the
+file and its contents are subject to change without further notice.
--lib file
Links a library to the output. Use this command line option instead of just +naming the library file, if the linker is not able to determine the file +type because of an unusual extension.
+ + +--obj file
Links an object file to the output. Use this command line option instead +of just naming the object file, if the linker is not able to determine the +file type because of an unusual extension.
+ + +--obj-path path
Specify an object file search path. This option may be used more than once.
+It adds a directory to the search path for object files. An object file
+passed to the linker that has no path in its name is searched in current
+directory, in the directory given in the LD65_OBJ
environment variable,
+and in the list of directories specified using --obj-path
.
Starting with version 2.10 there are now several search paths for files needed +by the linker: One for libraries, one for object files and one for config +files.
+ + +The library search path contains in this order:
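+
+1. The current directory.
+2. A compiled-in library path, which is often /usr/lib/cc65/lib on Linux systems.
+3. The value of the environment variable LD65_LIB if it is defined.
+4. A subdirectory named lib of the directory defined in the environment variable CC65_HOME, if it is defined.
+5. Any directory added with the --lib-path option on the command line.
+
+The object file search path contains in this order:
+
+1. The current directory.
+2. A compiled-in directory, which is often /usr/lib/cc65/obj on Linux systems.
+3. The value of the environment variable LD65_OBJ if it is defined.
+4. A subdirectory named obj of the directory defined in the environment variable CC65_HOME, if it is defined.
+5. Any directory added with the --obj-path option on the command line.
+
+The config file search path contains in this order:
+
+1. The current directory.
+2. A compiled-in directory, which is often /usr/lib/cc65/cfg on Linux systems.
+3. The value of the environment variable LD65_CFG if it is defined.
+4. A subdirectory named cfg of the directory defined in the environment variable CC65_HOME, if it is defined.
+5. Any directory added with the --cfg-path option on the command line.
+
+The linker does several things when combining object modules: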
+First, the command line is parsed from left to right. For each object file +encountered (object files are recognized by a magic word in the header, so +the linker does not care about the name), imported and exported +identifiers are read from the file and inserted in a table. If a library +name is given (libraries are also recognized by a magic word, there are no +special naming conventions), all modules in the library are checked if an +export from this module would satisfy an import from other modules. All +modules where this is the case are marked. If duplicate identifiers are +found, the linker issues a warning.
+This procedure (parsing and reading from left to right) means that a library may only satisfy references for object modules (given directly or from a library) named before that library. With the command line
++
+
+ ld65 crt0.o clib.lib test.o
+
+
+
+the module test.o may not contain references to modules in the library +clib.lib. If this is the case, you have to change the order of the modules +on the command line:
++
+
+ ld65 crt0.o test.o clib.lib
+
+
+
+Step two is to read the configuration file, assign start addresses for the segments, and define any linker symbols (see Configuration files).
+After that, the linker is ready to produce an output file. Before doing that, it checks its data for consistency. That is, it checks for unresolved externals (if the output format is not relocatable) and for symbol type mismatches (for example, a zero page symbol that is imported by a module as an absolute symbol).
+Step four is to write the actual target files. In this step, the linker will resolve any expressions contained in the segment data. Circular references are also detected in this step (a symbol may have a circular reference that goes unnoticed if the symbol is not used).
+Step five is to output a map file with a detailed list of all modules, segments and symbols encountered.
+And, last step, if you give the
+-v
switch
+twice, you get a dump of the segment data. However, this may be quite
+unreadable if you're not a developer:-)
+
+Configuration files are used to describe the layout of the output file(s). Two
+major topics are covered in a config file: The memory layout of the target
+architecture, and the assignment of segments to memory areas. In addition,
+several other attributes may be specified.
+
+Case is ignored for keywords, that is, section or attribute names, but it is
+not ignored for names and strings.
+
+Memory areas are specified in a MEMORY section. Let's have a look at an
+example (this one describes the usable memory layout of the C64):
+
+
+ MEMORY {
+ RAM1: start = $0800, size = $9800;
+ ROM1: start = $A000, size = $2000;
+ RAM2: start = $C000, size = $1000;
+ ROM2: start = $E000, size = $2000;
+ }
+
+
+
+As you can see, there are two RAM areas and two ROM areas. The names
+(before the colon) are arbitrary names that must start with a letter, with
+the remaining characters being letters or digits. The names of the memory
+areas are used when assigning segments. As mentioned above, case is
+significant for these names.
+
+The syntax above is used in all sections of the config file. The name
+(ROM1 etc.) is said to be an identifier, the remaining tokens up to the
+semicolon specify attributes for this identifier. You may use the equal sign
+to assign values to attributes, and you may use a comma to separate
+attributes; you may also leave both out. But you must use a semicolon to
+mark the end of the attributes for one identifier. The section above may also
+have looked like this:
+
+
+ # Start of memory section
+ MEMORY
+ {
+ RAM1:
+ start $0800
+ size $9800;
+ ROM1:
+ start $A000
+ size $2000;
+ RAM2:
+ start $C000
+ size $1000;
+ ROM2:
+ start $E000
+ size $2000;
+ }
+
+
+
+There are of course more attributes for a memory section than just start and
+size. Start and size are mandatory attributes, that means, each memory area
+defined must have these attributes given (the linker will check that). I
+will cover other attributes later. As you may have noticed, I've used a
+comment in the example above. Comments start with a hash mark (`#'), the
+remainder of the line is ignored if this character is found.
+
+Let's assume you have written a program for your trusty old C64, and you would
+like to run it. For testing purposes, it should run in the RAM area. So
+we will start to assign segments to memory sections in the SEGMENTS
+section:
+
+
+ SEGMENTS {
+ CODE: load = RAM1, type = ro;
+ RODATA: load = RAM1, type = ro;
+ DATA: load = RAM1, type = rw;
+ BSS: load = RAM1, type = bss, define = yes;
+ }
+
+
+
+What we are doing here is telling the linker that all segments go into the
+RAM1 memory area in the order specified in the SEGMENTS section. So
+the linker will first write the CODE segment, then the RODATA
+segment, then the DATA segment - but it will not write the BSS
+segment. Why? Enter the segment type: For each segment specified, you may also
+specify a segment type. There are four possible segment types:
+
+
+ ro means readonly
+ rw means read/write
+ bss means that this is an uninitialized segment
+ zp a zeropage segment
+
+
+
+So, because we specified that the segment with the name BSS is of type bss,
+the linker knows that this is uninitialized data, and will not write it to an
+output file. This is an important point: For the assembler, the BSS
+segment has no special meaning. You specify which segments have the bss
+attribute when linking. This approach is much more flexible than having one
+fixed bss segment, and is a result of the design decision to support an
+arbitrary segment count.
+
+If you specify "type = bss" for a segment, the linker will make sure that
+this segment contains only uninitialized data (that is, zeroes), and issue
+a warning if this is not the case.
+
+For a bss type segment to be useful, it must be cleared somehow by your
+program (this usually happens in the startup code - for example the startup
+code for cc65 generated programs takes care of that). But how does your
+code know where the segment starts, and how big it is? The linker is able to
+give that information, but you must request it. This is what we're doing with
+the "define = yes" attribute in the BSS definition. For each
+segment where this attribute is true, the linker will export three symbols.
+
+
+ __NAME_LOAD__ This is set to the address where the
+ segment is loaded.
+ __NAME_RUN__ This is set to the run address of the
+ segment. We will cover run addresses
+ later.
+ __NAME_SIZE__ This is set to the segment size.
+
+
+
+Replace NAME by the name of the segment; in the example above, this would
+be BSS. These symbols may be accessed by your code.
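
As a rough illustration of how startup code might use these symbols, here is a minimal ca65 sketch that zeroes the BSS segment. It is not taken from the cc65 runtime. The routine name clear_bss and the scratch pointer ptr are hypothetical, and the sketch assumes the linker config also provides a ZEROPAGE segment of type zp, which the example config above does not define.

```asm
        .import __BSS_RUN__, __BSS_SIZE__   ; exported by the linker (define = yes)

        .zeropage
ptr:    .res 2                  ; hypothetical scratch pointer (assumes a ZEROPAGE segment)

        .code
.proc   clear_bss
        lda #<__BSS_RUN__       ; point ptr at the start of BSS
        sta ptr
        lda #>__BSS_RUN__
        sta ptr+1
        lda #0                  ; fill value
        tay                     ; Y = 0
        ldx #>__BSS_SIZE__      ; number of full 256-byte pages
        beq tail                ; less than one page: only the remainder
page:   sta (ptr),y             ; clear one full page
        iny
        bne page
        inc ptr+1               ; advance to the next page
        dex
        bne page
tail:   cpy #<__BSS_SIZE__      ; clear the remaining bytes
        beq done
        sta (ptr),y
        iny
        bne tail
done:   rts
.endproc
```

The full pages are handled first and the remainder last, so the loop needs only one 8-bit index register.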
+
+Now, as we've configured the linker to write the first three segments and
+create symbols for the last one, there's only one question left: Where does
+the linker put the data? It would be very convenient to have the data in a
+file, wouldn't it?
+
+We don't have any files specified above, and indeed, this is not needed in a
+simple configuration like the one above. There is an additional attribute
+"file" that may be specified for a memory area, that gives a file name to
+write the area data into. If there is no file name given, the linker will
+assign the default file name. This is "a.out" or the one given with the
+-o option on the command line. Since the
+default behaviour is ok for our purposes, I did not use the attribute in the
+example above. Let's have a look at it now.
+
+The "file" attribute (the keyword may also be written as "FILE" if you like
+that better) takes a string enclosed in double quotes (`"') that specifies the
+file, where the data is written. You may specify the same file several times,
+in that case the data for all memory areas having this file name is written
+into this file, in the order of the memory areas defined in the MEMORY
+section. Let's specify some file names in the MEMORY section used above:
+
+
+ MEMORY {
+ RAM1: start = $0800, size = $9800, file = %O;
+ ROM1: start = $A000, size = $2000, file = "rom1.bin";
+ RAM2: start = $C000, size = $1000, file = %O;
+ ROM2: start = $E000, size = $2000, file = "rom2.bin";
+ }
+
+
+
+The %O used here is a way to specify the default behaviour explicitly:
+%O is replaced by a string (including the quotes) that contains the
+default output name, that is, "a.out" or the name specified with the
+-o option on the command line. Into this file, the
+linker will first write any segments that go into RAM1, and will then
+append the segments for RAM2, because the memory areas are given in this
+order. So, for the RAM areas, nothing has really changed.
+
+We've not used the ROM areas, but we will do that below, so we give the file
+names here. Segments that go into ROM1 will be written to a file named
+"rom1.bin", and segments that go into ROM2 will be written to a file
+named "rom2.bin". The name given on the command line is ignored in both cases.
+
+Assigning an empty file name for a memory area will discard the data written
+to it. This is useful if a memory area has segments assigned that are
+empty (for example because they are of type bss). In that case, the linker
+will create an empty output file. This may be suppressed by assigning an empty
+file name to that memory area.
+
+Let us look now at a more complex example. Say, you've successfully tested
+your new "Super Operating System" (SOS for short) for the C64, and you
+will now go and replace the ROMs by your own code. When doing that, you
+face a new problem: If the code runs in RAM, we need not care about
+read/write data. But now, if the code is in ROM, we must care about it.
+Remember the default segments (you may of course specify your own):
++
+
+ CODE read only code
+ RODATA read only data
+ DATA read/write data
+ BSS uninitialized data, read/write
+
+
+
+Since BSS is not initialized, we need not care about it now, but what
+about DATA? DATA contains initialized data, that is, data that was
+explicitly assigned a value. And your program will rely on these values on
+startup. Since there's no other way to remember the contents of the data
+segment than storing it into one of the ROMs, we have to put it there. But
+unfortunately, ROM is not writable, so we have to copy it into RAM before
+running the actual code.
+
+The linker cannot help you copying the data from ROM into RAM (this must be
+done by the startup code of your program), but it has some features that will
+help you in this process.
+
+First, you may not only specify a "load" attribute for a segment, but
+also a "run" attribute. The "load" attribute is mandatory, and, if
+you don't specify a "run" attribute, the linker assumes that load area
+and run area are the same. We will use this feature for our data area:
+
+
+ SEGMENTS {
+ CODE: load = ROM1, type = ro;
+ RODATA: load = ROM2, type = ro;
+ DATA: load = ROM2, run = RAM2, type = rw, define = yes;
+ BSS: load = RAM2, type = bss, define = yes;
+ }
+
+
+
+Let's have a closer look at this SEGMENTS section. We specify that the
+CODE segment goes into ROM1 (the one at $A000). The readonly data
+goes into ROM2. Read/write data will be loaded into ROM2 but is run
+in RAM2. That means that all references to labels in the DATA
+segment are relocated to be in RAM2, but the segment is written to
+ROM2. All your startup code has to do is to copy the data from its
+location in ROM2 to the final location in RAM2.
+
+So, how do you know where the data is located? This is the second point
+where you get help from the linker. Remember the "define" attribute?
+Since we have set this attribute to true, the linker will define three
+external symbols for the data segment that may be accessed from your code:
+
+
+ __DATA_LOAD__ This is set to the address where the segment
+ is loaded, in this case, it is an address in
+ ROM2.
+ __DATA_RUN__ This is set to the run address of the segment,
+ in this case, it is an address in RAM2.
+ __DATA_SIZE__ This is set to the segment size.
+
+
+
+So, what your startup code must do is copy __DATA_SIZE__ bytes from
+__DATA_LOAD__ to __DATA_RUN__ before any other routines are called.
+All references to labels in the DATA segment are relocated to RAM2
+by the linker, so things will work properly.
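
The copy step described here could look roughly like the following ca65 sketch. It is only an illustration under stated assumptions, not the actual cc65 startup code. The names copy_data, src and dst are hypothetical, and a ZEROPAGE segment of type zp is assumed to exist in the linker config.

```asm
        .import __DATA_LOAD__, __DATA_RUN__, __DATA_SIZE__

        .zeropage
src:    .res 2                  ; hypothetical source pointer (address in ROM2)
dst:    .res 2                  ; hypothetical destination pointer (address in RAM2)

        .code
.proc   copy_data
        lda #<__DATA_LOAD__     ; src = load address of DATA
        sta src
        lda #>__DATA_LOAD__
        sta src+1
        lda #<__DATA_RUN__      ; dst = run address of DATA
        sta dst
        lda #>__DATA_RUN__
        sta dst+1
        ldy #0
        ldx #>__DATA_SIZE__     ; number of full 256-byte pages
        beq tail                ; less than one page: only the remainder
page:   lda (src),y             ; copy one full page
        sta (dst),y
        iny
        bne page
        inc src+1               ; advance both pointers by one page
        inc dst+1
        dex
        bne page
tail:   cpy #<__DATA_SIZE__     ; copy the remaining bytes
        beq done
        lda (src),y
        sta (dst),y
        iny
        bne tail
done:   rts
.endproc
```

The routine has to run before any code that reads or writes variables in the DATA segment, since those references already point into RAM2.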
+
+There are some other attributes not covered above. Before starting the
+reference section, I will discuss the remaining things here.
+
+You may also request symbol definitions for memory areas. This may be
+useful for things like a software stack, or an i/o area.
++
+
+ MEMORY {
+ STACK: start = $C000, size = $1000, define = yes;
+ }
+
+
+
+This will define three external symbols that may be used in your code:
++
+
+ __STACK_START__ This is set to the start of the memory
+ area, $C000 in this example.
+ __STACK_SIZE__ The size of the area, here $1000.
+ __STACK_LAST__ This is NOT the same as START+SIZE.
+ Instead, it is defined as the first
+ address that is not used by data. If we
+ don't define any segments for this area,
+ the value will be the same as START.
+
+
+
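
To show how these memory area symbols might be consumed, the following ca65 sketch initializes a downward-growing software stack pointer to the end of the STACK area. The names init_sp and sp are hypothetical, a ZEROPAGE segment of type zp is assumed to exist in the linker config, and the choice of a descending stack is an assumption made only for this example.

```asm
        .import __STACK_START__, __STACK_SIZE__

        .zeropage
sp:     .res 2                  ; hypothetical software stack pointer

        .code
.proc   init_sp
        ; START + SIZE is the first address past the area,
        ; $C000 + $1000 = $D000 for the STACK area defined above
        lda #<(__STACK_START__ + __STACK_SIZE__)
        sta sp
        lda #>(__STACK_START__ + __STACK_SIZE__)
        sta sp+1
        rts
.endproc
```

Because the addresses come from the linker, the routine does not need to change when the STACK area is moved or resized in the config file.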
+A memory section may also have a type. Valid types are
++
+
+ ro for readonly memory
+ rw for read/write memory.
+
+
+
+The linker will ensure that no segment marked as read/write or bss is put
+into a memory area that is marked as readonly.
+
+Unused memory in a memory area may be filled. Use the "fill = yes"
+attribute to request this. The default value to fill unused space is zero. If
+you don't like this, you may specify a byte value that is used to fill these
+areas with the "fillval" attribute. This value is also used to fill unfilled
+areas generated by the assembler's .ALIGN and .RES directives.
+
+The symbol %S may be used to access the default start address (that is,
+the one defined in the FEATURES section, or the
+value given on the command line with the -S option).
+
+Segments may be aligned to some memory boundary. Specify "align = num" to
+request this feature. Num must be a power of two. To align all segments on a
+page boundary, use
+
+
+ SEGMENTS {
+ CODE: load = ROM1, type = ro, align = $100;
+ RODATA: load = ROM2, type = ro, align = $100;
+ DATA: load = ROM2, run = RAM2, type = rw, define = yes,
+ align = $100;
+ BSS: load = RAM2, type = bss, define = yes, align = $100;
+ }
+
+
+
+If an alignment is requested, the linker will add enough space to the output
+file, so that the new segment starts at an address that is divisible by the
+given number without a remainder. All addresses are adjusted accordingly. To
+fill the unused space, bytes of zero are used, or, if the memory area has a
+"fillval" attribute, that value. Alignment is always needed if you have
+used the .ALIGN command in the assembler. The alignment of a segment
+must be equal to or greater than the alignment used in the .ALIGN command.
+The linker will check that, and issue a warning if the alignment of a segment
+is lower than the alignment requested in an .ALIGN command of one of the
+modules making up this segment.
+
+For a given segment you may also specify a fixed offset into a memory area or
+a fixed start address. Use this if you want the code to run at a specific
+address (a prominent case is the interrupt vector table which must go at
+address $FFFA). Only one of ALIGN or OFFSET or START may be
+specified. If the directive creates empty space, it will be filled with zero,
+or with the value specified with the "fillval" attribute if one is given.
+The linker will warn you if it is not possible to put the code at the
+specified offset (this may happen if other segments in this area are too
+large). Here's an example:
+
+
+ SEGMENTS {
+ VECTORS: load = ROM2, type = ro, start = $FFFA;
+ }
+
+
+
+or (for the segment definitions from above)
++
+
+ SEGMENTS {
+ VECTORS: load = ROM2, type = ro, offset = $1FFA;
+ }
+
+
+
+The "align", "start" and "offset" attributes change placement
+of the segment in the run memory area, because this is what is usually
+desired. If load and run memory areas are equal (which is the case if only the
+load memory area has been specified), the attributes will also work. There is
+also an "align_load" attribute that may be used to align the start of the
+segment in the load memory area, in case different load and run areas have
+been specified. There are no special attributes to set start or offset for
+just the load memory area.
+
+To suppress the warning the linker issues if it encounters a segment that is
+not found in any of the input files, use "optional=yes" as an additional
+segment attribute. Be careful when using this attribute, because a missing
+segment may be a sign of a problem, and if you're suppressing the warning,
+there is no one left to tell you about it.
+
+The FILES section is used to support other formats than straight binary
+(which is the default, so binary output files do not need an explicit entry
+in the FILES section).
+
+The FILES section lists output files and as only attribute the format of
+each output file. Assigning binary format to the default output file would
+look like this:
+
+
+ FILES {
+ %O: format = bin;
+ }
+
+
+
+The only other available output format is the o65 format specified by Andre
+Fachat (see the 6502 binary relocation format specification). It is defined
+like this:
++
+
+ FILES {
+ %O: format = o65;
+ }
+
+
+
+The necessary o65 attributes are defined in a special section labeled
+FORMATS.
+
+The FORMATS section is used to describe file formats. The default (binary)
+format currently has no attributes, so, while it may be listed in this
+section, the attribute list is empty. The second supported format, o65, has
+several attributes that may be defined here.
+
+
+ FORMATS {
+ o65: os = lunix, version = 0, type = small,
+ import = LUNIXKERNEL,
+ export = _main;
+ }
+
+
+
+
+
+
+In addition to the MEMORY and SEGMENTS sections described above, the
+linker has features that may be enabled by an additional section labeled
+FEATURES.
+
+CONDES is used to tell the linker to emit module constructor/destructor
+tables.
+
+
+ FEATURES {
+ CONDES: segment = RODATA,
+ type = constructor,
+ label = __CONSTRUCTOR_TABLE__,
+ count = __CONSTRUCTOR_COUNT__;
+ }
+
+
+
+The CONDES feature has several attributes:
+
+segment
+    This attribute tells the linker into which segment the table should be
+    placed. If the segment does not exist, it is created.
+
+type
+    Describes the type of the routines to place in the table. Type may be one of
+    the predefined types constructor, destructor, interruptor, or
+    a numeric value between 0 and 6.
+
+label
+    This specifies the label to use for the table. The label points to the start
+    of the table in memory and may be used from within user written code.
+
+count
+    This is an optional attribute. If specified, an additional symbol is defined
+    by the linker using the given name. The value of this symbol is the number
+    of entries (not bytes) in the table. While this attribute is optional,
+    it is often useful to define it.
+
+order
+    Optional attribute that takes one of the keywords increasing or
+    decreasing as an argument. Specifies the sorting order of the entries
+    within the table. The default is increasing, which means that the
+    entries are sorted with increasing priority (the first entry has the lowest
+    priority). "Priority" is the priority specified when declaring a symbol as
+    .CONDES with the assembler, higher values mean higher priority. You may
+    change this behaviour by specifying decreasing as the argument, the
+    order of entries is reversed in this case.
+
+Please note that the order of entries with equal priority is undefined.
+
+Without specifying the CONDES feature, the linker will not create any
+tables, even if there are condes entries in the object files.
+
+For more information see the .CONDES command in the ca65 manual.
+
+STARTADDRESS is used to set the default value for the start address,
+which can be referenced by the %S symbol. The builtin default for the
+linker is $200.
+
+
+ FEATURES {
+ # Default start address is $1000
+ STARTADDRESS: default = $1000;
+ }
+
+
+
+Please note that order is important: The default start address must be defined
+before the %S symbol is used in the config file. This usually
+means that the FEATURES section has to go to the top of the config file.
+
+The configuration file may also be used to define symbols used in the link
+stage. The mandatory attribute for a symbol is its value. A second, boolean
+attribute named weak is available. If a symbol is marked as weak, it may
+be overridden by defining a symbol of the same name from the command line. The
+default for symbols is that they're strong, which means that an attempt to
+define a symbol with the same name from the command line will lead to an
+error.
+
+The following example defines the stack size for an application, but allows
+the programmer to override the value by specifying
+--define __STACKSIZE__=xxx on the command line.
+
+
+ SYMBOLS {
+ # Define the stack size for the application
+ __STACKSIZE__: value = $800, weak = yes;
+ }
+
+
+
+
+
+
+The builtin configurations are part of the linker source. They can be retrieved
+with --dump-config and don't have a special format. So if you need a
+special configuration, it's a good idea to start with the builtin configuration
+for your system. In a first step, just replace -t target by
+-C configfile. Then go on and modify the config file to suit your needs.
+
+Several machine specific binary packages are distributed together with secondary
+configurations (in the cfg directory). These configurations can be used with
+-C configfile too.
+
+The builtin config files do contain segments that have a special meaning for
+the compiler and the libraries that come with it. If you replace the builtin
+config files, you will need the following information.
+
+The INIT segment is used for initialization code whose memory may be reused once
+execution reaches main() - provided that the program runs in RAM. You
+may for example add the INIT segment to the heap in really memory
+constrained systems.
+
+For the LOWCODE segment, it is guaranteed that it won't be banked out, so it
+is reachable at any time by interrupt handlers or similar.
+
+The STARTUP segment contains the startup code which initializes the C software stack
+and the libraries. It is placed in its own segment because it needs to be
+loaded at the lowest possible program address on several platforms.
+
+The ZPSAVE segment contains the original values of the zeropage locations used
+by the ZEROPAGE segment. It is placed in its own segment because it must not be
+initialized.
+
+If you have problems using the linker, if you find any bugs, or if you're
+doing something interesting with it, I would be glad to hear from you. Feel
+free to contact me by email (uz@cc65.org).
+
+ld65 (and all cc65 binutils) are (C) Copyright 1998-2005 Ullrich von
+Bassewitz. For usage of the binaries and/or sources the following
+conditions do apply:
+
+This software is provided 'as-is', without any expressed or implied
+warranty. In no event will the authors be held liable for any damages
+arising from the use of this software.
+
+Permission is granted to anyone to use this software for any purpose,
+including commercial applications, and to alter it and redistribute it
+freely, subject to the following restrictions:
++
+
+
+
+
+
+
+
+