craigthomas / cocoassembler

A Tandy Color Computer 1, 2, and 3 assembler written in Python

License: MIT License

Python 100.00%

6809 6809-assembly assembler assembly python tandy-color-computer trs-80

cocoassembler's Introduction

CoCo Assembler and File Utility

What is it?
Requirements
License
Installing
The Assembler
File Utility
Common Examples

What is it?

This project is an assembler for the Tandy Color Computer 1, 2 and 3 written in Python 3.6+. More broadly speaking, it is an assembler that targets the Motorola 6809 processor, meaning it targets any computer that used the 6809 as its main CPU (e.g. the Dragon 32 and 64, Vectrex, Thomson TO7, etc). It is intended to be statement compatible with any code written for the EDTASM+ assembler. The assembler is capable of taking EDTASM+ assembly language code and translating it into 6809 machine code. Current support is for 6809 CPU instructions, but future enhancements will add 6309 instructions.

This project also includes a general purpose file utility, used mainly for manipulating CAS, DSK, and WAV files. The file utility specifically targets the disk file formats and cassette formats used by the Color Computer line of personal computers.

License

This project makes use of an MIT style license. Generally speaking, the license is extremely permissive, allowing you to copy, modify, distribute, or sell it for personal or commercial purposes. Please see the file called LICENSE for more information.

Requirements

The assembler can be run on any OS platform, including but not limited to:

Windows (XP, Vista, 7, 8, 10, 11, etc)
Linux (Ubuntu, Debian, Arch, Raspbian, etc)
Mac (Mojave, Catalina, Big Sur, Monterey, Ventura, etc)

The only requirement is that Python 3.6 or greater be installed and available on the search path, along with the Package Installer for Python (pip). To download Python, visit the Downloads section on the Python website. See the Python installation documentation for more information on ensuring the Python interpreter is installed on the search path, and that pip is installed along with it.

Installing

There is no specific installer that needs to be run in order to install the assembler. Simply copy the source files to a directory of your choice. A zipfile containing the latest release of the source files can be downloaded from the Releases section of the code repository. Unzip the contents to a directory of your choice.

Next, you will need to install the packages required by the project:

pip install -r requirements.txt

The Assembler

The assembler is contained in a file called assembler.py and can be invoked with:

python3 assembler.py

In general, the assembler recognizes EDTASM+ mnemonics, along with a few other somewhat standardized mnemonics to make program assembly easier. By default, the assembler assumes it is assembling statements into 6809 machine code. Future releases will include a 6309 extension.

Assembler Usage

To run the assembler:

python3 assembler.py input_file

This will assemble the instructions found in file input_file and will generate the associated Color Computer machine instructions in binary format. To make use of the assembled output, you will need to save it to a file. There are several switches that are available:

--print - prints out the assembled statements
--symbols - prints out the symbol table
--to_bin - save assembled contents to a binary file
--to_cas - save assembled contents to a cassette file
--to_dsk - save assembled contents to a virtual disk file
--name - saves the program with the specified name on a cassette or virtual disk file

Input File Format

The input file needs to follow the format below:

LABEL    MNEMONIC    OPERANDS    COMMENT

Where:

LABEL is a 10 character label for the statement
MNEMONIC is a 6809 operation mnemonic from the Mnemonic Table below
OPERANDS are registers, values, expressions, or labels
COMMENT is a 40 character comment describing the statement (must have a ; preceding it)
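
To make the four-field layout concrete, here is a minimal Python sketch that splits a source line into label, mnemonic, operands, and comment. The `Statement` type, the `STATEMENT_RE` pattern, and the `parse_line` helper are hypothetical names used only for illustration; the assembler's real parser may work differently, and the sketch does not enforce the field lengths listed above.

```python
import re
from typing import NamedTuple, Optional


class Statement(NamedTuple):
    label: Optional[str]
    mnemonic: Optional[str]
    operands: Optional[str]
    comment: Optional[str]


# Hypothetical pattern for illustration only; not taken from the assembler source.
STATEMENT_RE = re.compile(
    r'^(?P<label>[A-Za-z][A-Za-z0-9_]*)?'      # optional label starting in column 1
    r'\s+(?P<mnemonic>[A-Za-z][A-Za-z0-9]*)'   # mnemonic, preceded by whitespace
    r'(?:\s+(?P<operands>"[^"]*"|[^\s;]+))?'   # optional operands; quoted strings may hold spaces
    r'\s*(?:;\s*(?P<comment>.*))?$'            # optional comment introduced by ;
)


def parse_line(line: str) -> Optional[Statement]:
    """Split one source line into its four fields, or return None if it does not fit."""
    line = line.rstrip()
    if line.lstrip().startswith(";"):
        # Comment-only line: everything after the ; is the comment.
        return Statement(None, None, None, line.lstrip()[1:].strip())
    match = STATEMENT_RE.match(line)
    if match is None:
        return None
    return Statement(match["label"], match["mnemonic"],
                     match["operands"], match["comment"])


print(parse_line("START       JSR     $A928           ; Clear the screen"))
```

Lines without a label must start with whitespace in this sketch, which mirrors the column layout shown above.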

An example file:

; Print HELLO WORLD on the screen
            NAM     HELLO           ; Name of the program
CHROUT      EQU     $A30A           ; Location of CHROUT routine
POLCAT      EQU     $A000           ; Location of POLCAT routine
            ORG     $0E00           ; Originate at $0E00
START       JSR     $A928           ; Clear the screen
            LDX     #MESSAGE        ; Load X index with start of message
PRINT       LDA     ,X+             ; Load next character of message
            CMPA    #0              ; Check for null terminator
            BEQ     FINISH          ; Done printing, wait for keypress
            JSR     CHROUT          ; Print out the character
            BRA     PRINT           ; Print next char
MESSAGE     FCC     "HELLO WORLD"
            FDB     $0              ; Null terminator
FINISH      JSR     [POLCAT]        ; Read keyboard
            BEQ     FINISH          ; No key pressed, wait for keypress
            JMP     $A027           ; Restart BASIC
            END     START

Print Symbol Table

To print the symbol table that is generated during assembly, use the --symbols switch:

python3 assembler.py test.asm --symbols

Which will have the following output:

-- Symbol Table --
$A30A CHROUT
$A000 POLCAT
$0E00 START
$0E06 PRINT
$0E11 MESSAGE
$0E1E FINISH

The first column of output is the hex value of the symbol. This may be the address in memory where the symbol exists if it labels a mnemonic, or it may be the value that the symbol is defined as being if it references an EQU statement. The second column is the symbol name itself.

Print Assembled Statements

To print out the assembled version of the program, use the --print switch:

python3 assembler.py test.asm --print

Which will have the following output:

-- Assembled Statements --
$0000                         NAM HELLO         ; Name of the program
$0000                CHROUT   EQU $A30A         ; Location of CHROUT routine
$0000                POLCAT   EQU $A000         ; Location of POLCAT routine
$0E00                         ORG $0E00         ; Originate at $0E00
$0E00 BDA928          START   JSR $A928         ; Clear the screen
$0E03 8E0E11                  LDX #MESSAGE      ; Load X index with start of message
$0E06 A680            PRINT   LDA ,X+           ; Load next character of message
$0E08 8100                   CMPA #0            ; Check for null terminator
$0E0A 2712                    BEQ FINISH        ; Done printing, wait for keypress
$0E0C BDA30A                  JSR CHROUT        ; Print out the character
$0E0F 20F5                    BRA PRINT         ; Print next char
$0E11 48454C4C4F    MESSAGE   FCC "HELLO WORLD" ;
$0E1C 0000                    FDB $0            ; Null terminator
$0E1E AD9FA000       FINISH   JSR [POLCAT]      ; Read keyboard
$0E22 27FA                    BEQ FINISH        ; No key pressed, wait for keypress
$0E24 7EA027                  JMP $A027         ; Restart BASIC
$0E27                         END START         ;

A single line of output is composed of 6 columns:

Line:   $0E00 BDA928          START   JSR $A928         ; Clear the screen
        ----- ------          -----   --- -----         ------------------
Column:   1     2               3      4    5                   6

The columns are as follows:

The offset in hex where the statement occurs ($0E00).
The machine code generated and truncated to 10 hex characters (BDA928).
The user-supplied label for the statement (START).
The instruction mnemonic (JSR).
Operands that the instruction processes ($A928).
The comment string (Clear the screen).
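
The machine code column can be checked by hand for the short branches in the listing. The sketch below is only an illustration (the function name `short_branch_target` is made up here): it decodes the second byte of a 2-byte branch as a signed 8-bit offset and adds it to the address of the following instruction, which is how the 6809 resolves short branches.

```python
def short_branch_target(address: int, machine_code: str) -> int:
    """Return the destination of a 2-byte short branch located at `address`."""
    raw = bytes.fromhex(machine_code)
    offset = raw[1]
    if offset >= 0x80:
        # Offsets are 8-bit two's complement values.
        offset -= 0x100
    # The offset is relative to the address of the next instruction.
    return (address + 2 + offset) & 0xFFFF


# Branches taken from the listing above: BEQ FINISH, BRA PRINT, BEQ FINISH.
for address, code in [(0x0E0A, "2712"), (0x0E0F, "20F5"), (0x0E22, "27FA")]:
    print(f"${address:04X} {code} -> ${short_branch_target(address, code):04X}")
```

The results match the symbol table: FINISH is at $0E1E and PRINT is at $0E06.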

Save to Binary File

To save the assembled contents to a binary file, use the --to_bin switch:

python3 assembler.py test.asm --to_bin test.bin

The assembled program will be saved to the file test.bin. Note that this file may not be useful on its own, as it does not have any meta information about where the file should be loaded in memory (pseudo operations ORG and NAM will not have any effect on the assembled file).

NOTE: If the file test.bin exists, it will be erased and overwritten.

Save to Cassette File

To save the assembled contents to a cassette file, use the --to_cas switch:

python3 assembler.py test.asm --to_cas test.cas

This will assemble the program and save it to a cassette file called test.cas. The source code must include the NAM mnemonic to name the program (e.g. NAM myprog), or the --name switch must be used on the command line (e.g. --name myprog). The program name on the cassette file will be MYPROG.

NOTE: if the file test.cas exists, assembly will stop and the file will not be overwritten. If you wish to add the program to test.cas, you must specify the --append flag during assembly:

python3 assembler.py test.asm --to_cas test.cas --append

To load from the cassette file, you must use BASIC's CLOADM command as follows:

CLOADM"MYPROG"

Save to Disk File

To save the assembled contents to a disk file, use the --to_dsk switch:

python3 assembler.py test.asm --to_dsk test.dsk

This will assemble the program and save it to a disk file called test.dsk. The source code must include the NAM mnemonic to name the program on disk (e.g. NAM myprog), or the --name switch must be used on the command line (e.g. --name myprog). The program name on the disk file will be MYPROG.BIN.

NOTE: if the file test.dsk exists, assembly will stop and the file will not be updated. If you wish to add the program to test.dsk, you must specify the --append flag during assembly:

python3 assembler.py test.asm --to_dsk test.dsk --append

To load from the disk file, you must use Disk Basic's LOADM command as follows:

LOADM"MYPROG.BIN"
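
If you drive the assembler from a build script, the switches described above can be assembled into a command line programmatically. This is a minimal sketch: `assembler_command` is a hypothetical helper, not part of the project, and it only uses the switches documented in this section.

```python
from typing import List, Optional


def assembler_command(source: str, output: str, kind: str = "dsk",
                      name: Optional[str] = None, append: bool = False) -> List[str]:
    """Build an argument list for assembler.py (hypothetical helper)."""
    if kind not in ("bin", "cas", "dsk"):
        raise ValueError(f"unsupported output kind: {kind}")
    command = ["python3", "assembler.py", source, f"--to_{kind}", output]
    if name is not None:
        # Overrides or replaces a NAM statement in the source.
        command += ["--name", name]
    if append:
        # Add to an existing cassette or disk image instead of stopping.
        command.append("--append")
    return command


print(" ".join(assembler_command("test.asm", "test.dsk", name="myprog", append=True)))
```

The resulting list can be passed to `subprocess.run(command, check=True)` from the directory that contains assembler.py.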

Mnemonic Table

Below are the mnemonics that are accepted by the assembler (these mnemonics are compatible with EDTASM+ assembler mnemonics). For the mnemonics below, special symbols are:

A - 8-bit accumulator register.
B - 8-bit accumulator register.
CC - 8-bit condition code register.
D - 16-bit accumulator register comprised of A and B, with A being high byte, and B being low byte.
M - a memory location with a value between 0 and 65535.
S - 16-bit system stack register.
U - 16-bit user stack register.
X - 16-bit index register.
Y - 16-bit index register.
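
As a quick illustration of how D relates to A and B, the following Python sketch combines and splits the two halves. The helper names are made up for this example.

```python
from typing import Tuple


def combine_d(a: int, b: int) -> int:
    """Form the 16-bit D value from A (high byte) and B (low byte)."""
    return ((a & 0xFF) << 8) | (b & 0xFF)


def split_d(d: int) -> Tuple[int, int]:
    """Return (A, B) for a 16-bit D value."""
    return (d >> 8) & 0xFF, d & 0xFF


print(hex(combine_d(0x12, 0x34)))
print([hex(part) for part in split_d(0xFEEE)])
```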

Note that the operations described below may use different addressing modes for operands (e.g. LDA may use immediate, direct, indexed or extended addressing to load values). See the section below on addressing modes, and consult the MC6809 datasheet for more information on what addressing modes are applicable, as well as the number of cycles used to execute each operation.

Mnemonics

Mnemonic	Description	Example
`ABX`	Adds contents of `B` to value in register `X` and stores in `X`.	`ABX`
`ADCA`	Adds contents of `A`, memory value `M`, and carry bit and stores in `A`.	`ADCA $FFEE`
`ADCB`	Adds contents of `B`, memory value `M`, and carry bit and stores in `B`.	`ADCB $FFEF`
`ADDA`	Adds contents of `A` with memory value `M` and stores in `A`.	`ADDA #$03`
`ADDB`	Adds contents of `B` with memory value `M` and stores in `B`.	`ADDB #$90`
`ADDD`	Adds contents of `D` (`A:B`) with memory value `M:M+1` and stores in `D`.	`ADDD #$2000`
`ANDA`	Performs a logical AND on `A` with memory value `M` and stores in `A`.	`ANDA #$05`
`ANDB`	Performs a logical AND on `B` with memory value `M` and stores in `B`.	`ANDB $FFEE`
`ANDCC`	Performs a logical AND on `CC` with immediate value `M` and stores in `CC`.	`ANDCC #$01`
`ASLA`	Shifts bits in `A` left (0 placed in bit 0, bit 7 goes to carry in `CC`).	`ASLA`
`ASLB`	Shifts bits in `B` left (0 placed in bit 0, bit 7 goes to carry in `CC`).	`ASLB`
`ASL`	Shifts bits in `M` left (0 placed in bit 0, bit 7 goes to carry in `CC`).	`ASL $0E00`
`ASRA`	Shifts bits in `A` right (bit 0 goes to carry in `CC`, bit 7 remains same).	`ASRA`
`ASRB`	Shifts bits in `B` right (bit 0 goes to carry in `CC`, bit 7 remains same).	`ASRB`
`ASR`	Shifts bits in `M` right (bit 0 goes to carry in `CC`, bit 7 remains same).	`ASR $0E00`
`BCC`	Branches if carry bit in `CC` is 0.	`BCC CLS`
`BCS`	Branches if carry bit in `CC` is 1.	`BCS CLS`
`BEQ`	Branches if zero bit in `CC` is 1.	`BEQ CLS`
`BGE`	Branches if negative and overflow bits in `CC` are equal.	`BGE CLS`
`BGT`	Branches if negative and overflow bits are equal, and zero bit is zero in `CC`.	`BGT CLS`
`BHI`	Branches if carry and zero bit in `CC` are 0.	`BHI CLS`
`BHS`	Same as `BCC`.	`BHS CLS`
`BITA`	Logically ANDs `A` with memory contents `M` and sets bits in `CC`.	`BITA #$1E`
`BITB`	Logically ANDs `B` with memory contents `M` and sets bits in `CC`.	`BITB #$1E`
`BLE`	Branches if negative and overflow bits are not equal, or zero bit is 1 in `CC`.	`BLE CLS`
`BLO`	Same as `BCS`.	`BLO CLS`
`BLS`	Branches if zero bit is 1, or carry bit is 1 in `CC`.	`BLS CLS`
`BLT`	Branches if negative bit is not equal overflow bit in `CC`.	`BLT CLS`
`BMI`	Branches if negative bit is 1 in `CC`.	`BMI CLS`
`BNE`	Branches if zero bit is 0 in `CC`.	`BNE CLS`
`BPL`	Branches if negative bit is 0 in `CC`.	`BPL CLS`
`BRA`	Branch always.	`BRA CLS`
`BRN`	Branch never - essentially 2-byte `NOP`.	`BRN CLS`
`BSR`	Saves the value of `PC` on the `S` stack and branches to subroutine.	`BSR PRINT`
`BVC`	Branches if overflow bit is 0 in `CC`.	`BVC CLS`
`BVS`	Branches if overflow bit is 1 in `CC`.	`BVS CLS`
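`CLRA`	Zeroes out the `A` register, sets the zero bit and clears the negative, overflow, and carry bits in `CC`.	`CLRA`
`CLRB`	Zeroes out the `B` register, sets the zero bit and clears the negative, overflow, and carry bits in `CC`.	`CLRB`
`CLR`	Zeroes out the memory contents `M`, sets the zero bit and clears the negative, overflow, and carry bits in `CC`.	`CLR $01E0`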
`CMPA`	Subtract value from `A` register, and sets bits in `CC`.	`CMPA #$1E`
`CMPB`	Subtract value from `B` register, and sets bits in `CC`.	`CMPB #$1E`
`CMPD`	Subtract value from `D` register, and sets bits in `CC`.	`CMPD #$1E1F`
`CMPS`	Subtract value from `S` register, and sets bits in `CC`.	`CMPS #$1E1F`
`CMPU`	Subtract value from `U` register, and sets bits in `CC`.	`CMPU #$1E1F`
`CMPX`	Subtract value from `X` register, and sets bits in `CC`.	`CMPX #$1E1F`
`CMPY`	Subtract value from `Y` register, and sets bits in `CC`.	`CMPY #$1E1F`
`COMA`	Perform logical complement of value in `A` and store in `A`.	`COMA`
`COMB`	Perform logical complement of value in `B` and store in `B`.	`COMB`
`COM`	Perform logical complement of value in memory location `M` and store in `M`.	`COM $FFEE`
`CWAI`	Logically ANDs `CC` with the immediate value, pushes state onto stack, and waits for interrupt.	`CWAI #$EF`
`DAA`	Perform decimal addition adjust in `A`. Converts to Binary Coded Decimal.	`DAA`
`DECA`	Decrement the value in `A` by 1.	`DECA`
`DECB`	Decrement the value in `B` by 1.	`DECB`
`DEC`	Decrement the value in memory location `M` by 1.	`DEC $FFEE`
`EORA`	Perform exclusive OR with value in `A`, and store in `A`.	`EORA #$1F`
`EORB`	Perform exclusive OR with value in `B`, and store in `B`.	`EORB #$1F`
`EXG`	Swap values in registers.	`EXG A,B`
`INCA`	Increment value in `A` by 1.	`INCA`
`INCB`	Increment value in `B` by 1.	`INCB`
`INC`	Increment the value in memory location `M` by 1.	`INC $FFEE`
`JMP`	Unconditional jump to location.	`JMP $C000`
`JSR`	Jump to subroutine at the specified location.	`JSR PRINT`
`LBCC`	Same as `BCC`, but can reach targets outside the -126 to +129 byte range of a short branch.	`LBCC CLS`
`LBCS`	Same as `BCS`, but can reach targets outside the -126 to +129 byte range of a short branch.	`LBCS CLS`
`LBEQ`	Same as `BEQ`, but can reach targets outside the -126 to +129 byte range of a short branch.	`LBEQ CLS`
`LBGE`	Same as `BGE`, but can reach targets outside the -126 to +129 byte range of a short branch.	`LBGE CLS`
`LBGT`	Same as `BGT`, but can reach targets outside the -126 to +129 byte range of a short branch.	`LBGT CLS`
`LBHI`	Same as `BHI`, but can reach targets outside the -126 to +129 byte range of a short branch.	`LBHI CLS`
`LBHS`	Same as `BHS`, but can reach targets outside the -126 to +129 byte range of a short branch.	`LBHS CLS`
`LBLE`	Same as `BLE`, but can reach targets outside the -126 to +129 byte range of a short branch.	`LBLE CLS`
`LBLO`	Same as `BLO`, but can reach targets outside the -126 to +129 byte range of a short branch.	`LBLO CLS`
`LBLS`	Same as `BLS`, but can reach targets outside the -126 to +129 byte range of a short branch.	`LBLS CLS`
`LBLT`	Same as `BLT`, but can reach targets outside the -126 to +129 byte range of a short branch.	`LBLT CLS`
`LBMI`	Same as `BMI`, but can reach targets outside the -126 to +129 byte range of a short branch.	`LBMI CLS`
`LBNE`	Same as `BNE`, but can reach targets outside the -126 to +129 byte range of a short branch.	`LBNE CLS`
`LBPL`	Same as `BPL`, but can reach targets outside the -126 to +129 byte range of a short branch.	`LBPL CLS`
`LBRA`	Same as `BRA`, but can reach targets outside the -126 to +129 byte range of a short branch.	`LBRA CLS`
`LBRN`	Same as `BRN`, but can reach targets outside the -126 to +129 byte range of a short branch.	`LBRN CLS`
`LBSR`	Same as `BSR`, but can reach targets outside the -126 to +129 byte range of a short branch.	`LBSR CLS`
`LBVC`	Same as `BVC`, but can reach targets outside the -126 to +129 byte range of a short branch.	`LBVC CLS`
`LBVS`	Same as `BVS`, but can reach targets outside the -126 to +129 byte range of a short branch.	`LBVS CLS`
`LDA`	Loads `A` with the specified value.	`LDA #$FE`
`LDB`	Loads `B` with the specified value.	`LDB #$FE`
`LDD`	Loads `D` with the specified value.	`LDD #$FEFE`
`LDS`	Loads `S` with the specified value.	`LDS #$FEFE`
`LDU`	Loads `U` with the specified value.	`LDU #$FEEE`
`LDX`	Loads `X` with the specified value.	`LDX #$FEEE`
`LDY`	Loads `Y` with the specified value.	`LDY #$FEEE`
`LEAS`	Loads `S` with the address computed from an indexed addressing mode operand.	`LEAS A,X`
`LEAU`	Loads `U` with the address computed from an indexed addressing mode operand.	`LEAU A,X`
`LEAX`	Loads `X` with the address computed from an indexed addressing mode operand.	`LEAX A,X`
`LEAY`	Loads `Y` with the address computed from an indexed addressing mode operand.	`LEAY A,X`
`LSLA`	Logically shift bits left in `A`, bit 7 stored in carry of `CC`, bit 0 gets 0.	`LSLA`
`LSLB`	Logically shift bits left in `B`, bit 7 stored in carry of `CC`, bit 0 gets 0.	`LSLB`
`LSL`	Logically shift bits left in `M`, bit 7 stored in carry of `CC`, bit 0 gets 0.	`LSL $FFEE`
`LSRA`	Logically shift bits right in `A`, bit 0 stored in carry of `CC`, bit 7 gets 0.	`LSRA`
`LSRB`	Logically shift bits right in `B`, bit 0 stored in carry of `CC`, bit 7 gets 0.	`LSRB`
`LSR`	Logically shift bits right in `M`, bit 0 stored in carry of `CC`, bit 7 gets 0.	`LSR $FFEE`
`MUL`	Unsigned multiply of `A` and `B`, with the result stored in `D`.	`MUL`
`NEGA`	Perform twos complement of `A` and store in `A`.	`NEGA`
`NEGB`	Perform twos complement of `B` and store in `B`.	`NEGB`
`NEG`	Perform twos complement of `M` and store in `M`.	`NEG $FFEE`
`NOP`	Do nothing but advance program counter to next memory location.	`NOP`
`ORA`	Perform logical OR of `A` and value, and store in `A`.	`ORA #$A1`
`ORB`	Perform logical OR of `B` and value, and store in `B`.	`ORB #$CD`
`ORCC`	Perform logical OR of `CC` and value, and store in `CC`.	`ORCC #$C0`
`PSHS`	Pushes specified registers onto the `S` stack.	`PSHS A,B,X`
`PSHU`	Pushes specified registers onto the `U` stack.	`PSHU A,B,X`
`PULS`	Pulls values from the `S` stack back into their registers.	`PULS A,B,X`
`PULU`	Pulls values from the `U` stack back into their registers.	`PULU A,B,X`
`ROLA`	Rotates bits left in `A`, bit 7 stored in carry, and carry stored in bit 0.	`ROLA`
`ROLB`	Rotates bits left in `B`, bit 7 stored in carry, and carry stored in bit 0.	`ROLB`
`ROL`	Rotates bits left in `M`, bit 7 stored in carry, and carry stored in bit 0.	`ROL $FFEE`
`RORA`	Rotates bits right in `A`, carry stored in bit 7, and bit 0 stored in carry.	`RORA`
`RORB`	Rotates bits right in `B`, carry stored in bit 7, and bit 0 stored in carry.	`RORB`
`ROR`	Rotates bits right in `M`, carry stored in bit 7, and bit 0 stored in carry.	`ROR $FFEE`
`RTI`	Return from interrupt, pops `CC`, then `A`, `B`, `DP`, `X`, `Y`, `U` if `E` set.	`RTI`
`RTS`	Return from subroutine, pops `PC` from `S` stack.	`RTS`
`SBCA`	Subtract byte and carry from `A` and store in `A`.	`SBCA #$C0`
`SBCB`	Subtract byte and carry from `B` and store in `B`.	`SBCB #$C0`
`SEX`	Sign extend bit 7 from `B` into all of `A` and into negative of `CC`.	`SEX`
`STA`	Stores value in `A` at the specified memory location `M`.	`STA $FFEE`
`STB`	Stores value in `B` at the specified memory location `M`.	`STB $1EEE`
`STD`	Stores value in `D` at the specified memory location `M`.	`STD $1EEE`
`STS`	Stores value in `S` at the specified memory location `M`.	`STS $1EEE`
`STU`	Stores value in `U` at the specified memory location `M`.	`STU $1EEE`
`STX`	Stores value in `X` at the specified memory location `M`.	`STX $1EEE`
`STY`	Stores value in `Y` at the specified memory location `M`.	`STY $1EEE`
`SUBA`	Subtracts the 8-bit value from `A` and stores in `A`.	`SUBA #$1E`
`SUBB`	Subtracts the 8-bit value from `B` and stores in `B`.	`SUBB #$1E`
`SUBD`	Subtracts the 16-bit value from `D` and stores in `D`.	`SUBD #$1E1F`
`SWI`	Push all registers on the stack and branch to subroutine at `$FFFA`.	`SWI`
`SWI2`	Push all registers on the stack and branch to subroutine at `$FFF4`.	`SWI2`
`SWI3`	Push all registers on the stack and branch to subroutine at `$FFF2`.	`SWI3`
`SYNC`	Halt and wait for interrupt.	`SYNC`
`TFR`	Transfer source register value to destination register.	`TFR A,B`
`TSTA`	Test value in `A`, and set negative in `CC` and zero in `CC` as required.	`TSTA`
`TSTB`	Test value in `B`, and set negative in `CC` and zero in `CC` as required.	`TSTB`
`TST`	Test value in `M`, and set negative in `CC` and zero in `CC` as required.	`TST $FFFE`
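
The shift and rotate descriptions in the table can be modelled in a few lines of Python. The functions below are an illustrative sketch, not code from the assembler: each takes an 8-bit value (and the incoming carry, for rotates) and returns the new value together with the resulting carry bit. Other condition code bits are not modelled.

```python
def asl(value):
    """ASL/LSL: bit 7 goes to carry, 0 is placed in bit 0."""
    return (value << 1) & 0xFF, (value >> 7) & 1


def asr(value):
    """ASR: bit 0 goes to carry, bit 7 remains the same."""
    return (value >> 1) | (value & 0x80), value & 1


def lsr(value):
    """LSR: bit 0 goes to carry, bit 7 gets 0."""
    return value >> 1, value & 1


def rol(value, carry):
    """ROL: bit 7 goes to carry, old carry goes to bit 0."""
    return ((value << 1) & 0xFF) | carry, (value >> 7) & 1


def ror(value, carry):
    """ROR: bit 0 goes to carry, old carry goes to bit 7."""
    return (value >> 1) | (carry << 7), value & 1


value = 0x81  # binary 10000001
for name, result in [("ASL", asl(value)), ("ASR", asr(value)), ("LSR", lsr(value)),
                     ("ROL", rol(value, 0)), ("ROR", ror(value, 0))]:
    print(f"{name} ${value:02X} -> ${result[0]:02X}, carry={result[1]}")
```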

Pseudo Operations

Mnemonic	Description	Example
`FCB`	Defines a byte constant value. Separate multiple bytes with `,`.	`FCB $1C,$AA`
`FCC`	Defines a string constant value enclosed in a matching pair of delimiters.	`FCC "hello"`
`FDB`	Defines a word constant value. Separate multiple words with `,`.	`FDB $2000,$CAFE`
`END`	Defines the end of the program.	`END`
`EQU`	Defines a symbol with a set value.	`SCRMEM EQU $1000`
`INCLUDE`	Includes another assembly source file at this location.	`INCLUDE globals.asm`
`NAM`	Sets the name for the program when assembled to disk or cassette.	`NAM myprog`
`ORG`	Defines where in memory the program should originate at.	`ORG $0E00`
`RMB`	Defines a block of n bytes initialized to a zero value.	`RMB $8`
`SETDP`	Sets the direct page value for the assembler (see notes below).	`SETDP $0E00`

Notes

SETDP - this mnemonic is used for memory and instruction optimization by the assembler. For example, if SETDP $0E00 is set, any machine instructions that use $0EXX as a memory location will be assembled using direct page addressing. Instructions such as JMP $0E8F will become JMP <$8F. The programmer is responsible for loading the direct page register manually - this mnemonic does not output opcodes that change the direct page register.

Addressing Modes

The assembler understands several types of addressing modes for each operation. Please note that you will need to consult the MC6809 Data Sheet for information regarding which operations support which addressing modes. The different modes are explained below. Again, the discussion below is a simplified explanation of the material that exists in the MC6809 Data Sheet.

Inherent

The operation takes no additional information to execute. For example, the decimal addition adjust operation requires no additional inputs and is specified with:

DAA

Immediate

The operation requires a value that is specified directly following the mnemonic. All immediate values must be prefixed with a # symbol. For example, to load the value of $FE into register A:

LDA #$FE

Extended

The operation requires a 16-bit address that is used for execution of the instruction. The address may be specified as a hard-coded value, or as a symbol. For example:

JMP $FFEE
JSR POLCAT

Extended Indirect

The operation requires a 16-bit address where the memory contents specify another 16-bit address. Extended indirect addressing mode is denoted by using square brackets to enclose the operand. For example:

JMP [$010E]
JSR [PRINT]

Direct

Works similarly to extended addressing, however, instead of using a 16-bit value to specify the address, uses the contents of the direct page DP register and an 8-bit value to form the full 16-bit address. This addressing mode saves space. Direct addressing also has a lower clock-cycle count for execution. For example, assuming that the DP register is set to $FF, then the following statement:

JMP $EE

Is equivalent to the extended addressing mode variant of:

JMP $FFEE

Additionally, the direct addressing mode can be explicitly invoked on the operand by prefixing the operand with a < symbol. For example:

JMP <PRINT

In general, the assembler will attempt to optimize addressing modes by converting any extended addressing operands to direct operands wherever possible. By default, the assembler assumes that the DP register will be $00. The SETDP pseudo operation is used to inform the assembler that the direct page content has changed. For all lines following the SETDP invocation, the new high 8-bit value will be used for optimization. For example:

SETDP $FF
JMP $FFEE

The assembler will convert the operand specified in the JMP instruction to a direct operand, since the upper 8 bits of the operand are $FF and the direct page register is set to $FF. Thus, the resultant output will be equivalent to:

JMP $EE

Note that the SETDP pseudo operation does not actually modify the contents of DP - it is up to the programmer to manipulate the direct page register prior to using SETDP.
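
The optimization described above comes down to comparing the high byte of the operand address with the direct page value. The following Python sketch illustrates that decision; `choose_addressing` is a hypothetical name, and it assumes the direct page is tracked as a single 8-bit value as in the `SETDP $FF` example.

```python
from typing import Tuple


def choose_addressing(address: int, direct_page: int = 0x00) -> Tuple[str, int]:
    """Return ("direct", low_byte) when the high byte matches DP, else ("extended", address)."""
    high_byte = (address >> 8) & 0xFF
    if high_byte == (direct_page & 0xFF):
        # Only the low 8 bits need to be stored in the instruction.
        return "direct", address & 0xFF
    return "extended", address & 0xFFFF


print(choose_addressing(0xFFEE, direct_page=0xFF))  # after SETDP $FF
print(choose_addressing(0xFFEE))                    # default direct page of $00
```

With the direct page set to $FF the operand $FFEE is reduced to $EE, while with the default of $00 it stays a full 16-bit extended operand.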

Indexed

Works with one of the pointer registers X, Y, U, or S. The address required by the operand is calculated as an offset from the value held in the pointer register. In the simplest case, the offset is zero, so the address is simply the value in the pointer register. For example:

LDB ,X

The above statement would load the value from memory pointed to by the X pointer register. If, for example, X held $FFEE, then the load operation would load B with the contents in $FFEE. When a numeric offset is given, the address is calculated by adding the offset to the value in the pointer register. For example:

LDB -2,X

In this example, if X held $FF02, then the load operation would load B with the contents in $FF00. Note that symbols can also appear:

LDB ADJ,X

Pointer registers can also be auto incremented or decremented in indexed mode. The register values are either pre-decremented, or post-incremented. Values can be incremented or decremented by 1 or 2 steps. For example:

LDB ,X+
LDB ,X++
LDB ,-X
LDB ,--X

The first statement above will load B with the value pointed to by X, and then increment the value of the X register. The second statement will do the same, but increment the value of X twice. The third statement will decrement X prior to the load, and the fourth statement will decrement the X register value twice before the load.
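
The ordering of the memory read and the register update is the key difference between the four forms. The sketch below models it in Python against a flat 64 KB memory; the function names and the sample memory contents are invented for illustration.

```python
def load_post_increment(memory, x, step=1):
    """Model LDB ,X+ (step=1) or LDB ,X++ (step=2): read first, then advance X."""
    value = memory[x]
    return value, (x + step) & 0xFFFF


def load_pre_decrement(memory, x, step=1):
    """Model LDB ,-X (step=1) or LDB ,--X (step=2): move X back first, then read."""
    x = (x - step) & 0xFFFF
    return memory[x], x


memory = bytearray(0x10000)
# Sample data at $0FFE..$1001 (arbitrary values for the demonstration).
memory[0x0FFE:0x1002] = bytes([0x0A, 0x0B, 0x0C, 0x0D])

x = 0x1000
print(load_post_increment(memory, x, 1))  # LDB ,X+
print(load_post_increment(memory, x, 2))  # LDB ,X++
print(load_pre_decrement(memory, x, 1))   # LDB ,-X
print(load_pre_decrement(memory, x, 2))   # LDB ,--X
```

Each call returns the value loaded into B and the new value of X.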

Indexed Indirect

Similar to extended indirect addressing, indexed statements may also be indirect as well. For example:

LDB [,X]
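
In the indirect form, the memory at the indexed address holds a 16-bit pointer (stored high byte first on the 6809), and the value is loaded from wherever that pointer leads. A minimal Python sketch of `LDB [,X]`, using made-up helper names and sample memory contents:

```python
def read16(memory, address):
    """Read a big-endian 16-bit value, as the 6809 stores addresses."""
    high = memory[address & 0xFFFF]
    low = memory[(address + 1) & 0xFFFF]
    return (high << 8) | low


def load_indexed_indirect(memory, x):
    """Model LDB [,X]: X points at a pointer, and the pointer locates the data."""
    pointer = read16(memory, x)
    return memory[pointer]


memory = bytearray(0x10000)
memory[0x2000] = 0x30  # pointer high byte
memory[0x2001] = 0x00  # pointer low byte, so the pointer is $3000
memory[0x3000] = 0x42  # the data that ends up in B

print(hex(load_indexed_indirect(memory, 0x2000)))
```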

Program Counter Relative

In order to support position independent code, a final form of addressing is supported relative to the program counter. In this case, instead of the offset being relative to a pointer register, the offset is specified relative to the program counter. For example:

LDA $10,PCR

This will load the value of A with the contents of the memory position $10 bytes above where the PC is currently pointing. Again, since this is considered to be a form of indexed expression, indirect addressing based on program counter is also possible:

LDA [$10,PCR]

File Utility

The file utility included with the assembler package provides a method for manipulating and extracting information from image files (e.g. CAS or DSK files).

Listing Files

To list the files contained within a cassette or disk image, use the --list switch:

python file_util.py --list test.cas

The utility will print a list of all the files contained within the image, along with associated meta-data:

-- File #1 --
Filename:   HELLO
File Type:  Object
Data Type:  Binary
Gap Status: No Gaps
Load Addr:  $0E00
Exec Addr:  $0E00
Data Len:   39 bytes

-- File #2 --
Filename:   WORLD
File Type:  Object
Data Type:  Binary
Gap Status: No Gaps
Load Addr:  $0F00
Exec Addr:  $0F00
Data Len:   73 bytes

-- File #3 --
Filename:   ANOTHER
File Type:  Object
Data Type:  Binary
Gap Status: No Gaps
Load Addr:  $0C00
Exec Addr:  $0C00
Data Len:   24 bytes
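
If you need the listing in a script, the output format above is simple to parse. The following Python sketch is an illustration only (the `parse_listing` helper is not part of the file utility): it turns the text into one dictionary per file, keyed by the field names shown in the output.

```python
def parse_listing(text):
    """Convert --list output into a list of dictionaries, one per file."""
    files = []
    current = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("-- File #"):
            # A new file block begins.
            current = {}
            files.append(current)
        elif ":" in line and current is not None:
            key, value = line.split(":", 1)
            current[key.strip()] = value.strip()
    return files


sample = """\
-- File #1 --
Filename:   HELLO
Load Addr:  $0E00
Data Len:   39 bytes

-- File #2 --
Filename:   WORLD
Load Addr:  $0F00
Data Len:   73 bytes
"""

for entry in parse_listing(sample):
    print(entry["Filename"], entry["Load Addr"], entry["Data Len"])
```

The text could be captured by running the utility with `subprocess.run(..., capture_output=True, text=True)` on Python 3.7 or later.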

Extracting to Binary File

To extract the files from a disk or cassette image, and save each one as a binary, use the --to_bin switch:

python file_util.py --to_bin test.cas

The command will list the files being extracted and their status:

-- File #1 [HELLO] --
Saved as HELLO.bin
-- File #2 [WORLD] --
Saved as WORLD.bin
-- File #3 [ANOTHER] --
Saved as ANOTHER.bin

Note that no meta-data is saved with the extraction (meaning that load addresses, and execute addresses are not saved in the resulting .bin files). If you only wish to extract a specific subset of files, you can provide a space-separated, case-insensitive list of filenames to extract with the --files switch:

python file_util.py --to_bin test.cas --files hello another

Which will result in:

-- File #1 [HELLO] --
Saved as HELLO.bin
-- File #3 [ANOTHER] --
Saved as ANOTHER.bin

Extracting to Cassette File

To extract the files from a disk image, and save each one as a cassette file, use the --to_cas switch:

python file_util.py --to_cas test.dsk

The command will list the files being extracted and their status:

-- File #1 [HELLO] --
Saved as HELLO.cas
-- File #2 [WORLD] --
Saved as WORLD.cas
-- File #3 [ANOTHER] --
Saved as ANOTHER.cas

If you only wish to extract a specific subset of files, you can provide a space-separated, case-insensitive list of filenames to extract with the --files switch:

python file_util.py --to_cas test.dsk --files hello another

Which will result in:

-- File #1 [HELLO] --
Saved as HELLO.cas
-- File #3 [ANOTHER] --
Saved as ANOTHER.cas

Common Examples

Below are a collection of common examples with their command-line usage.

Appending to a Cassette

To assemble a program called myprog.asm and add it to an existing cassette file:

python assembler.py myprog.asm --to_cas my_cassette.cas --append

Listing Files in an Image

To list the files contained within a container file (such as a CAS or DSK file):

python file_util.py --list my_cassette.cas

This will print out a list of all the file contents on the cassette image my_cassette.cas. Note that BIN or binary only file contents only have a single program with no meta-information stored in them. As such, no information will be available for binary file types.

Extracting Binary Files from Cassette Images

To extract all the binary files from a cassette image:

python file_util.py --to_bin my_cassette.cas

This will extract all of the files in the image file my_cassette.cas to separate files ending in a .bin extension. No meta-information about the files will be saved in .bin format.

Extracting Binary Files from Disk Images

To extract all the binary files from a disk image:

python file_util.py --to_bin my_disk.dsk

This will extract all of the files in the image file my_disk.dsk to separate files ending in a .bin extension. No meta-information about the files will be saved in .bin format.

cocoassembler's Issues

Implement `ASRD` instruction

Implement the ASRD 6309 instruction mnemonic in the assembler. Performs an arithmetic shift right of double-byte register D, storing in D.

Inherent - $1047, 2 bytes

Example:

ASRD

Implement `ANDD` instruction

Implement the ANDD 6309 instruction mnemonic in the assembler. Performs a logical AND with a double-byte and register D, storing in D.

Immediate - $1084, 4 bytes
Direct - $1094, 3 bytes
Indexed - $10A4, 3+ bytes
Extended - $10B4, 4 bytes

Example:

ANDD #$1010

Allow multiple single byte definitions per `FCB` pseudo operation

Currently, FCB only allows one byte to be defined per line, so if multiple bytes need to be specified, multiple lines need to be inserted, one FCB definition per line:

VAR     FCB    $48    ; HELLO
        FCB    $45
        FCB    $4C
        FCB    $4C
        FCB    $4F

In some assembly language listings, the FCB pseudo operation allows for multiple single bytes to be defined:

VAR     FCB    $48,$45,$4C,$4C,$4F    ; HELLO

It would be more convenient to allow for this condensed version.
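A minimal Python sketch of how the condensed operand could be split into bytes. The function name parse_fcb_operand is hypothetical and is not part of the existing codebase; it only handles hex ($) and decimal values.

```python
def parse_fcb_operand(operand: str) -> bytes:
    """Convert a comma-separated FCB operand such as '$48,$45' into bytes."""
    values = []
    for token in operand.split(","):
        token = token.strip()
        # '$' marks a hex literal; anything else is treated as decimal.
        value = int(token[1:], 16) if token.startswith("$") else int(token)
        if not 0 <= value <= 0xFF:
            raise ValueError(f"[{token}] does not fit in a single byte")
        values.append(value)
    return bytes(values)


print(parse_fcb_operand("$48,$45,$4C,$4C,$4F"))  # the HELLO example above
```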

Implement `ANDR` instruction

Implement the ANDR 6309 instruction mnemonic in the assembler. Performs a logical AND with a source and destination register, storing the result in the destination register.

Immediate - $1034, 3 bytes

D - 0000
X - 0001
Y - 0010
U - 0011
S - 0100
PC - 0101
W - 0110
V - 0111
A - 1000
B - 1001
CC - 1010
DP - 1011
E - 1110
F - 1111

Example:

ANDR A,B

Implement `SETDP` pseudo operation

Describe the enhancement
The SETDP pseudo-operation is used to tell the assembler to optimize code. Specifically, the SETDP mnemonic allows the programmer to specify what the contents of the direct page register are during compilation. The assembler can then look for any addresses during assembly that have a most significant byte the same as what is stored in the direct page register. It will then transform those statements from their extended addressing mode equivalents into direct addressing mode statements instead.

For example, normally:

LDA $0F01

Would generate the following machine code:

B6 0F 01

However, if the direct page were loaded with $0F, and the SETDP pseudo-operation were implemented, then:

SETDP $0F
LDA $0F01

Would generate the following machine code:

96 01
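The selection between the two encodings can be sketched in Python as follows. This is an illustration for LDA only; assemble_lda and its direct_page parameter are hypothetical names, not the assembler's actual API.

```python
LDA_DIRECT = 0x96    # LDA, direct addressing
LDA_EXTENDED = 0xB6  # LDA, extended addressing


def assemble_lda(address, direct_page=None):
    """Return machine code for LDA <address>, honoring a SETDP value if given."""
    high, low = address >> 8, address & 0xFF
    # If the high byte matches the declared direct page, the short form is safe.
    if direct_page is not None and high == direct_page:
        return bytes([LDA_DIRECT, low])
    return bytes([LDA_EXTENDED, high, low])


print(assemble_lda(0x0F01).hex())        # without SETDP
print(assemble_lda(0x0F01, 0x0F).hex())  # with SETDP $0F
```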

Direct Addressing Mode does not seem to work

Describe the bug
When using direct addressing mode, no bytes seem to be created. I am using ADCA as an example, but the issue seems to affect all direct addressing mode mnemonics.

To Reproduce
Steps to reproduce the behavior:

Create a file and call it test.asm:

        ORG    $1000
        ADCA    <$9F

Compile using: python assembler.py --print test.asm

Expected behavior
With the print parameter in place, it should produce this output:

-- Assembled Statements --
$1000                         ORG $1000                          ;                                         
$1000      999F               ADCA <$9F

Actual behavior
I am getting no byte codes at all:

-- Assembled Statements --
$1000                         ORG $1000                          ;                                         
$1000                         ADCA <$9F

Check branch instructions for automated optimizations

In certain circumstances, a user may issue an instruction such as:

LBCC FUNCTION

This instructs the CPU to perform a long branch to where FUNCTION is located in memory. Long branches are useful since they can be used to branch to anywhere in the 64K memory space. However, if FUNCTION is within +129 bytes or -126 bytes of the current instruction, then we can transform the LBCC to a BCC instead, saving several op-cycles, as well as two bytes of machine code (LBCC occupies 4 bytes, BCC occupies 2).
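The range check can be sketched in Python as below. The function name fits_short_branch is hypothetical. It assumes the short branch occupies 2 bytes and that its 8-bit offset is measured from the address of the following instruction, which is where the +129 and -126 limits come from.

```python
SHORT_BRANCH_LENGTH = 2  # opcode byte plus 8-bit relative offset


def fits_short_branch(instruction_address, target_address):
    """Return True if a short branch at instruction_address can reach target_address."""
    # The offset is relative to the program counter after the branch is fetched.
    offset = target_address - (instruction_address + SHORT_BRANCH_LENGTH)
    return -128 <= offset <= 127


print(fits_short_branch(0x1000, 0x1000 + 129))
print(fits_short_branch(0x1000, 0x1000 + 130))
```

Note that shortening a branch moves every later statement, so a real optimization pass would likely need to re-check addresses after each change.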

PSHU and PULU throw errors

Describe the bug
When a PSHU or PULU instruction is entered, varying error messages are thrown regarding their operands. For example, a PSHU A throws an error that [A] not in symbol table. Similarly, adding additional registers throws other errors. For example, when PSHU A,B is entered, it throws an Instruction [PSHU] does not support indexed addressing error.

To Reproduce
Steps to reproduce the behavior:

Create a new file called pshu_test.asm and add the following instruction to it:

        PSHU A,B

Run the assembler with:

python3 ./assembler.py pshu_test.asm

Expected behavior
The file should assemble correctly.

Desktop (please complete the following information):

OS: Ubuntu 20.04
Python Version: 3.8.10

Additional context
This is probably because the two instructions (PSHU and PULU) are not tagged as is_special in the instruction table and, accordingly, are not handled alongside PSHS and PULS by the SpecialOperand class in the operands.py file.

Extend `ADD` to include `E` and `F` registers

Extend the ADD mnemonic to allow for E and F variants. Adds the byte of the specified memory location to the register, and stores in the register.

ADDE:
Immediate - $118B, 3 bytes
Direct - $119B, 3 bytes
Indexed - $11AB, 3+ bytes
Extended - $11BB, 4 bytes

ADDF:
Immediate - $11CB, 3 bytes
Direct - $11DB, 3 bytes
Indexed - $11EB, 3+ bytes
Extended - $11FB, 4 bytes

The command works with the following syntax:

ADDE #$7A

Example:

$11 8B 7A     ADDE #$7A

Implement `ADDR` instruction

Implement the ADDR 6309 instruction mnemonic in the assembler. Adds contents of a source register to the contents of the destination register. All registers except for Q and MD are allowed.

Immediate - $1030, 3 bytes

The command works with the following syntax:

ADDR r0,r1

Where r0 is the source and r1 is the destination. The source is stored as the high nibble of the operand byte, and the destination is stored as the low nibble of the operand byte. Valid source and destination values are:

0000 - D
0001 - X
0010 - Y
0011 - U
0100 - S
0101 - PC
0110 - W
0111 - V
1000 - A
1001 - B
1010 - CC
1011 - DP
1110 - E
1111 - F

Example:

$10 30 8E     ADDR A,E
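The operand byte described above can be built with a small Python sketch. REGISTER_CODES copies the table from this issue; the function name register_postbyte is hypothetical.

```python
# 4-bit register codes taken from the table above.
REGISTER_CODES = {
    "D": 0x0, "X": 0x1, "Y": 0x2, "U": 0x3, "S": 0x4, "PC": 0x5, "W": 0x6,
    "V": 0x7, "A": 0x8, "B": 0x9, "CC": 0xA, "DP": 0xB, "E": 0xE, "F": 0xF,
}


def register_postbyte(source, destination):
    """Pack the source into the high nibble and the destination into the low nibble."""
    return (REGISTER_CODES[source] << 4) | REGISTER_CODES[destination]


print(f"{register_postbyte('A', 'E'):02X}")  # operand byte for ADDR A,E
```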

Add ability to extract a file from a DSK image

Functionality should be added to allow a user to extract a file from a DSK image. The DSK image will be scanned for the filename in question, and its contents will be extracted and saved to the host computer. This issue requires issue #8 to be completed first.

Add ability to extract a file from a CAS image

Add functionality that will allow a user to extract a file from a CAS image file. The CAS image will be scanned, and the file extracted and saved to the host computer. This issue requires issue #8 to be completed first.

Expressions in left hand side of indexed program counter relative expressions not resolved

Describe the bug
When an expression shows up in the left hand side of an indexed program counter relative expression, it is not resolved correctly.

To Reproduce

Create a file called test.asm and add the following contents to it:

                ORG     $3F00
TEMP            EQU     $0001
START           STX     1+TEMP,PCR
                END     START

Assemble the file with the following command:

python3 assembler.py test.asm --print

The following output will appear:

-- Assembled Statements --
$3F00                         ORG $3F00                          ;       
$3F00                  TEMP   EQU $0001                          ;   
$3F00 AF8C00          START   STX 1+TEMP,PCR                     ;                     
$3F02                         END START                          ;

Expected behavior
Correct output should be:

-- Assembled Statements --
$3F00                         ORG $3F00                          ;             
$3F00                  TEMP   EQU $0001                          ;               
$3F00 AF8D0002        START   STX 1+TEMP,PCR                     ;                      
$3F04                         END START                          ;

Desktop (please complete the following information):

OS: Ubuntu 20.04
Python Version: 3.8.10

Additional Context
Currently the output of AF8C00 exhibits the problem described in issue #51. The correct output of AF8D as the instruction and addressing mode should be fixed by that issue first. Then the issue of the expression resolution can be fixed.

Increase test coverage for Version 1.0.0 release

There are currently several pathways through the codebase that do not have adequate test coverage. Prior to Release 1.0.0, tests should be implemented to cover missing LOC to ensure that future additions do not cause regressions or inadvertent bugs.

Add ability to assemble and append to existing DSK image

The purpose of this issue is to add the ability to append an assembled program to an existing DSK image. Only Disk Basic filesystems need to be supported. The FileUtility class will need to check to make sure the disk is properly formatted, and that enough free granules exist to add the file to the disk. The directory structure will also need to be updated to contain the file. This Issue requires issue #6 to be completed before being implemented.

Support binary numbers in assembly code

The purpose of this feature request is to allow for binary numbers to be represented as numeric literals. Right now, both decimal and hex values are allowed:

LDA    #$80    ; Loads 128 into A
LDA    #128    ; Loads 128 into A

Binary values should be allowable with a prefix of %:

LDA    #%10000000    ; Loads 128 into A
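A minimal Python sketch of prefix-based literal parsing, covering the three forms shown above. The function name parse_literal is hypothetical, and the leading # of immediate addressing is assumed to have been stripped already.

```python
def parse_literal(text):
    """Parse a numeric literal: '$' for hex, '%' for binary, otherwise decimal."""
    if text.startswith("$"):
        return int(text[1:], 16)
    if text.startswith("%"):
        return int(text[1:], 2)
    return int(text)


for literal in ("$80", "128", "%10000000"):
    print(literal, parse_literal(literal))
```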

16-bit instructions using program counter relative addressing should use 16-bit offsets

Describe the bug
16-bit instructions using program counter relative addressing should use 16-bit offsets by default. Currently only LEAX, LEAY, LEAS and LEAU use the correct behavior. Program counter indexed relative addressing modes for instructions STX, STU, STD, ADDD, CMPX, LDD, LDX, LDU, SUBD, CMPD, CMPS, CMPU, LDS, CMPY, LDY, STS, and STY need to be fixed.

To Reproduce

Create a file called test.asm and add the following contents:

                ORG     $3F00
START           STX     1,PCR
                LDA     #$0A
NEXT            RTS
                END     START

Assemble the file with:

python3 assembler.py test.asm --print

Output will be:

-- Assembled Statements --
$3F00                         ORG $3F00                          ;            
$3F00 AF8C01          START   STX 1,PCR                          ;          
$3F03 860A                    LDA #$0A                           ;   
$3F05 39               NEXT   RTS                                ;       
$3F06                         END START                          ;

Expected behavior
Correct output should be:

-- Assembled Statements --
$3F00                         ORG $3F00                          ;      
$3F00 AF8D0001        START   STX 1,PCR                          ;                  
$3F04 860A                    LDA #$0A                           ;              
$3F06 39               NEXT   RTS                                ;                
$3F07                         END START                          ;

Desktop (please complete the following information):

OS: Ubuntu 20.04
Python Version: 3.8.10

Add `fileutil` class for saving binary files

A new fileutil package should be introduced that will handle saving the assembled contents of a program to and from various disk and tape file formats. The cocoasm package will call the fileutil package to perform any I/O routines.

Add ability to extract from DSK and save to CAS files

Implement functionality that will let a user extract a file from a DSK file and save it to a CAS file. The CAS file can be a new, blank cassette file, or it can be an existing CAS image file, in which case the extracted file will be appended to the CAS file.

Implement `ADDW` instruction

Implement the ADDW 6309 instruction mnemonic in the assembler. Adds contents of double byte to the W register.

Immediate - $108B, 4 bytes
Direct - $109B, 3 bytes
Indexed - $10AB, 3+ bytes
Extended - $10BB, 4 bytes

Example:

ADDW #$1010

Program counter relative indexing only allows 16-bit offsets

Describe the bug
When program counter relative indexing is used within a program, the assembler will by default use a 16-bit relative offset instead of checking to see if an 8-bit offset would be appropriate.

To Reproduce

Create the following file and call it pcrtest.asm:

            NAM     PCRTEST
            ORG     $0600
VAR         FCB     0
BEGIN       LDA     $FF
            STY     VAR,PCR
            END     BEGIN

Attempt to assemble with the following command:

python3 ./assembler.py --print pcrtest.asm

Output is:

-- Assembled Statements --
$0000                         NAM PCRTEST                       
$0600                         ORG $0600                          
$0600 00                VAR   FCB 0                              
$0601 96FF            BEGIN   LDA $FF                            
$0603 10AF8DFFFE              STY VAR,PCR                       
$0608                         END BEGIN

Expected behavior
With the print statement in place, should provide the following output:

-- Assembled Statements --
$0000                         NAM PCRTEST                       
$0600                         ORG $0600                          
$0600 00                VAR   FCB 0                              
$0601 96FF            BEGIN   LDA $FF                            
$0603 10AF8CF9                STY VAR,PCR                       
$0607                         END BEGIN
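The offset selection can be sketched in Python as below. The function name pcr_offset_bytes is hypothetical. It assumes the operand is a label, that the offset is measured from the address of the next instruction, and that postbytes $8C and $8D select the 8-bit and 16-bit forms as in the listings above.

```python
def pcr_offset_bytes(statement_address, opcode_length, target_address):
    """Return postbyte plus offset bytes, preferring the 8-bit form when it fits."""
    # 8-bit form: opcode, postbyte, one offset byte.
    next_pc = statement_address + opcode_length + 2
    offset = target_address - next_pc
    if -128 <= offset <= 127:
        return bytes([0x8C, offset & 0xFF])
    # 16-bit form: opcode, postbyte, two offset bytes.
    next_pc = statement_address + opcode_length + 3
    offset = target_address - next_pc
    return bytes([0x8D]) + (offset & 0xFFFF).to_bytes(2, "big")


# STY VAR,PCR at $0603 with VAR at $0600; STY has a two-byte opcode (10 AF).
print(pcr_offset_bytes(0x0603, 2, 0x0600).hex())
```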

Desktop (please complete the following information):

OS: Ubuntu 20.04
Python Version: 3.8.10

Implement conditional assembly pseudo-operation

Many assemblers allow for conditional assembly of statement blocks. This issue will implement conditional assembly by adding a COND pseudo operation that defines the start of a conditional block, and ENDC which defines the end of a conditional block. Conditionals come in the form of:

COND <expression>
<statements>
ENDC

Where:

<expression> is a traditional expression that contains two values separated by an operation.
<statements> are the assembly language statements to be included in the conditional block.
ENDC ends the conditional block.

The conditional block is only assembled if the value of the <expression> results in a non-zero value. Conditional blocks cannot be nested.
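A Python sketch of how a pre-pass could drop statements inside a false conditional block. The names filter_conditionals and evaluate are hypothetical; evaluate stands in for whatever expression evaluator the assembler uses, and COND and ENDC lines are assumed to carry no label.

```python
def filter_conditionals(lines, evaluate):
    """Return the lines that should be assembled, honoring COND/ENDC blocks."""
    output = []
    in_block = False
    skipping = False
    for line in lines:
        fields = line.split()
        mnemonic = fields[0].upper() if fields else ""
        if mnemonic == "COND":
            if in_block:
                raise ValueError("conditional blocks cannot be nested")
            in_block = True
            # The block is assembled only when the expression is non-zero.
            skipping = evaluate(" ".join(fields[1:])) == 0
        elif mnemonic == "ENDC":
            in_block = False
            skipping = False
        elif not skipping:
            output.append(line)
    return output


source = ["COND DEBUG", "        LDA #$01", "ENDC", "        RTS"]
print(filter_conditionals(source, lambda expression: 0))
```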

Implement `BAND` instruction

Implement the BAND 6309 instruction mnemonic in the assembler. Logically ANDs the specified bit in the A, B, or CC with a bit in memory. Result is stored in the source register. Direct addressing mode only. The first two bytes of the instruction are the instruction code, the next byte is a postbyte, and the last byte is the address least significant byte.

Direct - $1130, 4 bytes

Example:

BAND B,2,4,$40

The above would AND bit 4 of B with bit 2 of DP:40, storing the result in B. Note the strange order here - following B you specify the bit in the memory location, followed by the bit in the register. The resulting machine code would be:

11 30 A2 40

The postbyte is composed of the following sections:

Bits 7-6: register code where 00 = CC, 01 = A, 10 = B, 11 = invalid
Bits 5-3: the bit number (0-7) in memory
Bits 2-0: the bit number (0-7) in the register

Allow user-specified output widths

Currently, the assembler prints out lines that are greater than 120 characters wide when using the --print or --symbols switches. Historically, console windows have been limited to 80 or 100 characters wide when initialized, unless users specify an override. In order for the output of the assembler to be more readable, the output width should be truncated to a reasonable size - approximately 100 or 120 characters. Additionally, a switch should allow for users to specify how wide they want the output to be when printing to the screen.

Add ability to assemble and append to existing cassette file

Add the ability to detect whether a cassette file already exists, and allow for appending to the file instead of blindly overwriting it.

Allow for negative values in operands

Describe the bug
Negative values within the operands cause an assembly error.

To Reproduce

Create the following file and call it negtest.asm:

                NAM     NEGTEST
                ORG     $0600
BEGIN           STD     -1,X
                END     BEGIN

Attempt to assemble with the following command:

python3 ./assembler.py --print negtest.asm

Will result in:

[-1] is an invalid value
$                 BEGIN   STD -1,X

Expected behavior
With the print statement in place, should provide the following output:

-- Assembled Statements --
$0000                         NAM NEGTEST                        ;                                         
$0600                         ORG $0600                          ;                                         
$0600 ED1F            BEGIN   STD -1,X                           ;                                         
$0602                         END BEGIN                          ;

Desktop (please complete the following information):

OS: Ubuntu 20.04
Python Version: 3.8.10

Add ability to assemble and save as WAV file

Similar in nature to a CAS file, implement functionality that will save the file as a WAV file so that the user can play back the file and use cassette IO on a CoCo to load the file.

Add ability to extract CAS files and save to DSK

Implement functionality that will let a user extract a file from a CAS file, and save it to a virtual DSK file. The DSK file can be a new, blank disk, or it can be an existing disk, in which case the operation will append to the file allocation table, and save it to the disk (assuming there is enough space).

Implement `AIM` instruction

Add the AIM instruction. Performs a logical AND of an 8-bit immediate value with the contents of a memory byte and stores it in the designated memory byte. Meant to collapse the LDA, ANDA, and STA 6809 operations to perform the same task.

AIM:
Direct - $02, 3 bytes
Indexed - $62, 3+ bytes
Extended - $72, 4 bytes

The command works with the following syntax:

AIM #$0E;$FFFE

Note the semi-colon used to separate the immediate value and the memory address.

Implement `ASLD` instruction

Implement the ASLD 6309 instruction mnemonic in the assembler. Performs an arithmetic shift left of the double-byte register D, storing in D.

Inherent - $1048, 2 bytes

Example:

ASLD

Add ability to assemble and save to new DSK image

In order to be useful, assembled programs need to be stored in a format that contains the machine code offset where the program should be stored and executed from. One such format is the DSK (virtual disk) format. The purpose of this issue is to add a FileUtility package that the assembler can call on to save the assembled statements as a binary file on a DSK file. In this issue, only JV1 style virtual disk types need to be supported. Only Disk Basic filesystem formats need to be supported. The first pass of this issue is to create a new DSK file with a freshly initialized filesystem on it, and place the binary file in the DSK image. This issue requires issue #8 to be completed first.

Implement `ADCR` instruction

Implement the ADCR 6309 instruction mnemonic in the assembler. Adds contents of a source register, plus the carry flag, to the contents of the destination register. All registers except for Q and MD are allowed.

Immediate - $1031, 3 bytes

The command works with the following syntax:

ADCR r0,r1

0000 - D
0001 - X
0010 - Y
0011 - U
0100 - S
0101 - PC
0110 - W
0111 - V
1000 - A
1001 - B
1010 - CC
1011 - DP
1110 - E
1111 - F

Example:

$10 31 8E     ADCR A,E

Complete initial README documentation

Prior to Release 1.0.0, README documentation should be completed to ensure that users of the project know how to use the assembler, as well as understand the mnemonics used by the assembler.

Immediate negative values not translated correctly

Describe the bug
Immediate negative integer values not translated correctly.

To Reproduce
Steps to reproduce the behavior:

Create a file called test.asm.
Add the following lines to the test file:

            ORG  $0E00
    START   CMPB #-2
            END  START

Run the assembler on the test file as follows:

python3 assembler.py --print test.asm

Output will be as follows:

-- Assembled Statements --
$0E00                         ORG $0E00                          ;                                         
$0E00 C102            START  CMPB #-2                            ;                                         
$0E02                         END START                          ;

Expected behavior
The compiled statement for CMPB #-2 should not be C1 02. The operand is negative, so it should be encoded in two's complement as FE. The correct output is:

-- Assembled Statements --
$0E00                         ORG $0E00                          ;                                         
$0E00 C1FE            START  CMPB #-2                            ;                                         
$0E02                         END START                          ;
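The two's complement conversion can be sketched in Python as follows. The function name encode_immediate is hypothetical; size_bytes selects between 8-bit and 16-bit operands.

```python
def encode_immediate(value, size_bytes=1):
    """Encode a signed or unsigned integer as big-endian two's complement bytes."""
    bits = 8 * size_bytes
    if not -(1 << (bits - 1)) <= value < (1 << bits):
        raise ValueError(f"[{value}] does not fit in {bits} bits")
    # Masking maps negative values onto their two's complement bit pattern.
    return (value & ((1 << bits) - 1)).to_bytes(size_bytes, "big")


print(encode_immediate(-2).hex())  # operand byte for CMPB #-2
```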

Desktop (please complete the following information):

OS: Ubuntu
Version: 20.04.4
Python Version: 3.8.10

LDD immediate defaults to 8-bit value

Describe the bug
Loading D with an 8-bit value results in a printing error when the assembly source is printed during compilation. The actual compiled source is correct, but the print values are incorrect.

To Reproduce
Steps to reproduce the behavior:

Create the following file and call it lddbug.asm:

            NAM     LDDBUG
            ORG     $0600
START       LDD     #$1
            END     START

Assemble the file with the following command:

python3 assembler.py --print lddbug.asm --bin_file lddbug.bin

Output is:

-- Assembled Statements --
$0000                         NAM LDDBUG                         ;                                         
$0600                         ORG $0600                          ;                                         
$0600 CC01            START   LDD #$1                            ;                                         
$0603                         END START                          ;

Note that the LDD #$1 is compiled to CC01 when it should be CC0001. The size of the instruction is correct however, as the next statement begins at $0603, indicating that the instruction plus operands consumed 3 bytes. However, when you view the file with a hex editor, you will see that it is two bytes long containing the sequence CC 01. This means that the 16-bit value is being ignored in favor of 8-bits instead.

Expected behavior
Expected that the resultant output would be:

-- Assembled Statements --
$0000                         NAM LDDBUG                         ;                                         
$0600                         ORG $0600                          ;                                         
$0600 CC0001          START   LDD #$1                            ;                                         
$0603                         END START                          ;

The resultant output file should contain the sequence:

CC 00 01

Desktop (please complete the following information):

OS: Ubuntu 20.04.3
Python Version: 3.8.10

Assembler fails to interpret RTS mnemonics

Describe the bug
RTS statements are skipped during assembly in some instances.

To Reproduce
Steps to reproduce the behavior:

Create the following file called rtsbug.asm:

                NAM     RTSBUG
                ORG     $0E00
START           LDA     #$01            ;
                RTS                     ;
                END     START

Assemble the listing with the following command:

python3 assembler.py --print rtsbug.asm

The RTS statement will not be compiled:

-- Assembled Statements --
$0000                         NAM RTSBUG                         ;                                         
$0E00                         ORG $0E00                          ;                                         
$0E00 8601            START   LDA #$01                           ;                                         
$0E02                         RTS ;                              ;                                         
$0E02                         END START                          ;

Expected behavior
Expected behavior results in the following output:

-- Assembled Statements --
$0000                         NAM RTSBUG                         ;                                         
$0E00                         ORG $0E00                          ;                                         
$0E00 8601            START   LDA #$01                           ;                                         
$0E02 39                      RTS ;                              ;                                         
$0E03                         END START                          ;

Desktop (please complete the following information):

OS: Ubuntu 20.04
Python Version: 3.8.10

Implement `ADCD` instruction

Implement the ADCD 6309 instruction mnemonic in the assembler. Adds the contents of a double-byte memory value, plus the carry flag, to register D, and stores the result in D.

Immediate - $1089, 4 bytes
Direct - $1099, 3 bytes
Indexed - $10A9, 3+ bytes
Extended - $10B9, 4 bytes

Example:

ADCD #$1010

Implement macro definitions

A macro is a block of assembly language statements that can be defined and then inserted into an assembly language program at any location. A macro is defined by a MACRO and then ENDM pair. A macro must also have a label. The macro is called within the assembly language routine by using the symbol name as if it were any other mnemonic. Values can be passed to macros within the operand field when specifying the macro. Passed values are specified within the macro definition with the \ character and the number for the value (the first passed value will be \0, the next \1, etc). For example:

TEST    MACRO
        LDA \0
        ENDM

        ORG $0E00
        TEST #$40
        END

Macros cannot be nested.
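A minimal Python sketch of the substitution step, assuming the macro body has already been collected as a list of lines and the call operand has been split on commas. The function name expand_macro is hypothetical.

```python
def expand_macro(body, arguments):
    """Replace \\0, \\1, ... in each body line with the matching passed value."""
    expanded = []
    for line in body:
        # Substitute higher indices first so "\10" is not clobbered by "\1".
        for index in reversed(range(len(arguments))):
            line = line.replace(f"\\{index}", arguments[index])
        expanded.append(line)
    return expanded


body = ["        LDA \\0"]           # body of the TEST macro above
arguments = "#$40".split(",")        # operand of the call: TEST #$40
print(expand_macro(body, arguments))
```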

Allow multiple double byte definitions per `FDB` pseudo operation

Currently, FDB only allows one double byte to be defined per line, so if multiple double bytes need to be specified, multiple lines need to be inserted - one FDB definition per line:

VAR     FDB    $4832
        FDB    $4532
        FDB    $4C32
        FDB    $4C32
        FDB    $4F32

In some assembly language listings, the FDB pseudo operation allows for multiple double bytes to be defined:

VAR     FDB    $4832,$4532,$4C32,$4C32,$4F32

It would be more convenient to allow for this condensed version.

Implement 5-bit constant offset indexed optimization

Describe the enhancement
The assembler currently handles indexed addressing modes for 8-bit and 16-bit constant offsets correctly. However, it does not implement any optimizations for 5-bit offsets. According to the Motorola data sheet for the 6809, constant 5-bit offset values in the range of -16 to +15 can be stored directly in the post-byte. Currently, the assembler treats any 5-bit offsets the same way it treats 8-bit offsets, thus adding an additional byte to the operation. As an example:

LDA 5,Y

Generates the following code:

A6 A8 05

If implemented, then the 5-bit constant offset indexed optimization would instead produce:

A6 25

This saves one byte each time a 5-bit constant offset is used.
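A Python sketch of choosing the smallest constant-offset encoding. The function name constant_offset_postbytes is hypothetical; it returns only the post-byte and any offset bytes (not the opcode), and it does not cover indirect addressing or the separate no-offset form.

```python
# Register select bits (bits 6-5 of the post-byte).
INDEX_REGISTER_BITS = {"X": 0b00, "Y": 0b01, "U": 0b10, "S": 0b11}


def constant_offset_postbytes(offset, register):
    """Return post-byte (and offset bytes) for a constant offset from an index register."""
    reg = INDEX_REGISTER_BITS[register] << 5
    if -16 <= offset <= 15:
        # 5-bit form: bit 7 clear, offset stored in the low five bits.
        return bytes([reg | (offset & 0x1F)])
    if -128 <= offset <= 127:
        return bytes([0x88 | reg, offset & 0xFF])
    return bytes([0x89 | reg]) + (offset & 0xFFFF).to_bytes(2, "big")


print(constant_offset_postbytes(5, "Y").hex())    # 5,Y
print(constant_offset_postbytes(100, "Y").hex())  # 100,Y needs the 8-bit form
```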

Add NAM mnemonic to set program name

Add the NAM mnemonic to the list of pseudo operations. This will allow the programmer to set the name of the program within the assembly listing so that the --name switch does not need to be passed when saving to disk or cassette images.

Use granules closer to the middle of the disk

A feature of Disk Extended Color Basic (DECB) is that it prefers to use granules that are closer to the file allocation table and directory entries (track 17) before using granules that are closer to the outer edges of the disk. This makes sense, since it means stepping the read/write head less frequently after reading the file system data. Currently, the assembler will do a search for free granules starting at granule 0 (the first track). While this isn't an issue when using virtual files, on real media, there may be a noticeable delay when loading data from disk, since the read/write head needs to step a lot in order to read the granule data. The purpose of this feature request is to enable the same behavior in the assembler for populating granules that is seen when using DECB.
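One way to express the preferred search order is sketched below in Python. It is an approximation that simply sorts granules by their distance from track 17 and may not match DECB's exact order. The constants assume a 35-track disk with two granules per track and 68 granules in total; all names are hypothetical.

```python
DIRECTORY_TRACK = 17     # holds the file allocation table and directory
GRANULES_PER_TRACK = 2   # assumption: standard 35-track Disk Basic layout
TOTAL_GRANULES = 68


def granule_track(granule):
    """Map a granule number to the track it lives on, skipping the directory track."""
    track = granule // GRANULES_PER_TRACK
    return track if track < DIRECTORY_TRACK else track + 1


def granule_search_order():
    """Return granule numbers ordered from nearest to farthest from track 17."""
    return sorted(
        range(TOTAL_GRANULES),
        key=lambda g: (abs(granule_track(g) - DIRECTORY_TRACK), g),
    )


print(granule_search_order()[:8])
```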

Add ability to extract all files from a CAS image

Add functionality that will allow a user to extract all files from a CAS image file. The CAS image will be scanned, and the files extracted and saved to the host computer. This issue requires issue #8 to be completed first.

Disk files that are non-object types cannot be read

Describe the bug
The file utility is unable to read files that do not have preamble and postamble data associated with their file content. Preamble and postamble content is meant to describe binary files. It contains information relating to the length of the binary file, the load address, and the executable address of the binary after it has been loaded into memory. Currently, the file_util.py script assumes that all files have preamble and postamble content, thus, when a BASIC file or other data file are read, they are dismissed as being malformed since the script cannot determine how many bytes are in the file (plus it sees the preamble header as being incorrect).

The fix needs to be made in the list_files function of the DiskFile class in cocoasm/virtualfiles/disk.py around line 140 (look for a TODO block). The function needs to check the file type to see if it is an object file. If not, then it needs an alternate way to calculate file length. Most notably, the FAT entry for the final granule records how many sectors are in use. The directory entry records how many bytes are used in the last sector of the last granule in the file. For non-object files, these two pieces of information can be used to obtain the total length of the file (number of data bytes to read). This should let the file utility read the files on disk and properly extract them to cassette files or binary files.
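The length calculation described above could look roughly like this in Python. The function and parameter names are hypothetical, and the sketch assumes 256-byte sectors and 9 sectors per granule, with the caller having already followed the FAT chain to count the granules.

```python
SECTOR_SIZE = 256          # assumption: bytes per sector
SECTORS_PER_GRANULE = 9    # assumption: sectors per granule


def data_file_length(granule_count, sectors_in_last_granule, bytes_in_last_sector):
    """Compute a file's length from its FAT chain and directory entry values."""
    # Every granule before the last one is completely full.
    full_granules = (granule_count - 1) * SECTORS_PER_GRANULE * SECTOR_SIZE
    # Every used sector in the last granule, except the final one, is full.
    full_sectors = (sectors_in_last_granule - 1) * SECTOR_SIZE
    return full_granules + full_sectors + bytes_in_last_sector


print(data_file_length(2, 3, 100))
```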

To Reproduce

Download any DSK file image that contains a BASIC file or non-machine language file.
Run the file utility to list the files on the disk:

python3 file_util.py --list diskimage.dsk

No contents will be displayed.

Expected behavior
The BASIC file should be listed in the disk contents.

Desktop (please complete the following information):

OS: Ubuntu 20.04.5 LTS
Python Version: 3.8.10

Add ability to save any file to a DSK image

Functionality should be added that will allow a user to save any arbitrary file to a DSK image. This issue requires issue #8 to be completed first.

Implement RMB pseudo-operation

The RMB operation will set aside the specified number of bytes at the location where the RMB is defined. This means that if the user specifies:

RMB $8

then 8 bytes will be inserted at the specified location, all containing a zero value.

Add ability to assemble and save to a CAS file

Add unit tests for Operands, Statement, Program, and Instruction

Additional unit tests are needed to cover functionality in classes Operands, Statement, Program, and Instruction. Additional integration tests are also necessary to ensure top to bottom validity of the assembler. See CodeCov report for information on where unit tests are necessary.

Implement `BEOR` instruction

Implement the BEOR 6309 instruction mnemonic in the assembler. Logically XORs the specified bit in the A, B, or CC with a bit in memory. Result is stored in the source register. Direct addressing mode only. The first two bytes of the instruction are the instruction code, the next byte is a postbyte, and the last byte is the address least significant byte.

Direct - $1134, 4 bytes

Example:

BEOR B,2,4,$40

The above would XOR bit 4 of B with bit 2 of DP:40, storing the result in B. Note the strange order here - following B you specify the bit in the memory location, followed by the bit in the register. The resulting machine code would be:

11 34 A2 40

The postbyte is composed of the following sections:

Bits 7-6: register code where 00 = CC, 01 = A, 10 = B, 11 = invalid
Bits 5-3: the bit number (0-7) in memory
Bits 2-0: the bit number (0-7) in the register

Character literals not accepted as operands

Describe the bug
Character literals are not accepted as operands for immediate addressed instructions. Character literals are usually denoted with 'C for example, representing the letter C. This saves having to convert the character literal to a numeric value manually.

To Reproduce

Create a file called literal.asm with the following code:

        NAM LITERAL
        ORG $0600
START   LDA #'C
        END START

Assemble the file with the following command:

python3 assembler.py --print literal.asm

The following error will appear at the end of the stack trace:

Invalid operand value
line: START   LDA #'C

Expected behavior
The following output should occur from the assembler:

$0600                ORG $0600
$0600 8643   START   LDA #'C
$0602                END START
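
The expected behavior above could be sketched in Python along the following lines. The function names `parse_immediate_value` and `assemble_lda_immediate` are hypothetical and do not come from the project's code. The sketch assumes an immediate operand is a `#` followed by a character literal, a `$`-prefixed hexadecimal value, or a decimal value, and it reuses the error text shown in this issue.

```python
def parse_immediate_value(operand: str) -> int:
    """Parse an immediate operand such as #'C, #$43, or #67 into an integer."""
    if not operand.startswith("#"):
        raise ValueError("Invalid operand value")
    body = operand[1:]
    if body.startswith("'"):
        # Character literal: a single quote followed by exactly one character
        if len(body) != 2:
            raise ValueError("Invalid operand value")
        return ord(body[1])
    if body.startswith("$"):
        return int(body[1:], 16)
    return int(body, 10)


def assemble_lda_immediate(operand: str) -> bytes:
    """Assemble LDA with an immediate operand: opcode $86 followed by the value."""
    value = parse_immediate_value(operand)
    if not 0 <= value <= 0xFF:
        raise ValueError("Invalid operand value")
    return bytes([0x86, value])


if __name__ == "__main__":
    print(assemble_lda_immediate("#'C").hex().upper())
```

For `LDA #'C` the sketch prints `8643`, the same bytes shown in the expected output.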

Desktop:

OS: Ubuntu 20.04.1
Python Version: 3.8.10
