Instructions
- Select your preferred color scheme by clicking "Dark" or "Light" at the top right.
- Paste, drag and drop, or upload (using the file dialog button) the object code into the code field. This may be a series of hex values or a binary file. With the option "strip addresses" on, any leading 4-hex-digit code addresses are discarded automatically.
- Optionally edit, paste, drag and drop, or upload (using the file dialog button) any symbols and labels to be used by the disassembly into the symbol table field. (There are also a few ready-to-use presets.)
- Enter the start address of the code.
- Optionally specify a range (from – to) for the disassembly.
- Click the button "Disassemble".
- Optionally select a display format at the bottom of the disassembler output.
Object code may be a series of byte values in hex format (in pairs of two digits, or separated by white space and/or commas). Any leading line numbers or addresses preceded or followed by a colon (":") are ignored, as are any comments (starting with a semicolon or a backslash). When using the file upload button or drag and drop, files may be text files containing a similar hex dump, or binary files.
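The input rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the tool's actual code: the function name `parse_hex_dump` and its `strip_addresses` parameter are hypothetical, and rejecting odd-length tokens is a choice made for this sketch.

```python
import re

def parse_hex_dump(text, strip_addresses=False):
    """Turn a textual hex dump into bytes (hypothetical helper, not the tool's code)."""
    data = bytearray()
    for line in text.splitlines():
        # Comments start with a semicolon or a backslash.
        line = re.split(r"[;\\]", line, maxsplit=1)[0]
        # A leading line number or address preceded or followed by a colon is ignored.
        line = re.sub(r"^\s*(?::[0-9A-Fa-f]+|[0-9A-Fa-f]+:)", "", line)
        if strip_addresses:
            # Optionally discard a leading 4-hex-digit address.
            line = re.sub(r"^\s*[0-9A-Fa-f]{4}(?=\s)", "", line)
        # Bytes come in pairs of hex digits, separated by white space and/or commas.
        for token in re.split(r"[\s,]+", line.strip()):
            if not token:
                continue
            # Assumption of this sketch: odd-length or non-hex tokens are an error.
            if len(token) % 2 or not re.fullmatch(r"[0-9A-Fa-f]+", token):
                raise ValueError(f"not a hex byte sequence: {token!r}")
            data.extend(bytes.fromhex(token))
    return bytes(data)
```

For instance, `parse_hex_dump("0800: A9 01 85 B8 ; init")` yields the four bytes A9 01 85 B8.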
Description
This is a simple disassembler for the MOS 6502 MPU, meant to work as a versatile stand-alone application or together with its companion applications (compare the links at the top of the page).
Assembler Syntax (Output Format)
Instructions are transformed to 3-letter mnemonics followed by an operand or address. The disassembler uses the following mnemonic format, where "HHLL" stands for a 16-bit address, "LL" for an 8-bit zero-page address, and "BB" for a byte value:
- CLC: implied
- ROL A: accumulator
- LDA #BB: immediate
- LDA HHLL: absolute
- LDA HHLL,X: absolute, X-indexed
- LDA HHLL,Y: absolute, Y-indexed
- LDA LL: zeropage
- LDA LL,X: zeropage, X-indexed
- LDA LL,Y: zeropage, Y-indexed
- LDA (BB,X): X-indexed, indirect
- LDA (LL),Y: indirect, Y-indexed
- JMP (HHLL): indirect
- BEQ HHLL: relative addresses are transformed to target locations
- ???: undefined opcode
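The transformation of relative addresses into target locations follows from how the 6502 encodes branches: the operand is a signed 8-bit offset counted from the address of the next instruction. A minimal Python sketch (the function name `branch_target` is chosen for illustration only):

```python
def branch_target(instruction_address, offset_byte):
    """Resolve a 6502 relative branch to its absolute target address."""
    # The operand byte is a two's-complement offset in the range -128..+127.
    offset = offset_byte - 0x100 if offset_byte & 0x80 else offset_byte
    # Branch instructions are 2 bytes long; the offset is relative to the
    # address of the following instruction. Addresses wrap at 16 bits.
    return (instruction_address + 2 + offset) & 0xFFFF
```

A BEQ at $0810 with operand byte $FB (that is, -5) therefore disassembles as a branch to $080D.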
The disassembler may also generate a few pragmas in order to produce ready-to-use code:
- * = $xxxx: set program counter (= start address)
- .END: end of assembler code
- .BYTE $xx: used to embed undefined opcodes as data in the "assembler code only" representation
- .OPT ILLEGALS: inserted at the top of the "code only" representation, if "illegal opcodes" are activated and such instructions were actually found.
The disassembler will also auto-generate labels for any jump locations and branch targets. You may override these with a symbol table (see below), or switch to a pure format without any labels using the drop-down menu below the output.
BBC Micro Mode
The checkbox "BBC Micro mode" activates a format compatible with the syntax used by the BBC BASIC embedded assembler. In this mode, hexadecimal numbers are formatted with the prefix "&", labels are preceded by a dot ("."), and the generated assembler code is wrapped in square brackets ("[…]").
Moreover, the disassembler will auto-generate the following directives:
- P% = &xxxx: set the location counter (= start address)
- END: end of assembler code
- EQUB &xx: used to embed undefined opcodes as data in the "assembler code only" representation
- OPT ILLEGALS: inserted at the top of the "code only" representation, if "illegal opcodes" are activated and such instructions were actually found. (This is not a BBC BASIC standard directive.)
You may also choose to embed any literal data (undefined instructions) using the BBC BASIC indirection operator "?P%" and updating "P%" accordingly. This will generate the following directives:
- ?P% = &yy: indirection used for undefined opcodes in the "assembler code only" representation
- P%?n = &yy: indirection for consecutive data insertions
- P% = P%+n: location counter update after a data insertion
Illegal Opcodes
The disassembler optionally supports undocumented instructions, the so-called "illegal opcodes", see the checkbox below the input area. However, it may be advisable to keep them switched off in order to identify any data sections.
The following mnemonics are used in the disassembly (common synonyms in parentheses):
- ALR (ASR)
- ANC
- ANC (ANC2): immediate, opcode 0x2B, operationally the same as "ANC"
- ANE (XAA)
- ARR
- DCP (DCM)
- ISC (ISB, INS)
- LAS (LAR)
- LAX
- LXA (LAX immediate)
- RLA
- RRA
- SAX (AXS, AAX)
- SBX (AXS, SAX)
- SHA (AHX, AXA)
- SHX (A11, SXA, XAS)
- SHY (A11, SYA, SAY)
- SLO (ASO)
- SRE (LSE)
- TAS (XAS, SHS)
- SBC (USBC): opcode 0xEB, operationally the same as "SBC immediate" (0xE9)
- NOP (DOP, TOP): various address modes
- JAM (KIL, HLT): various, freezes the CPU
(These mnemonics are used universally across the entire "virtual 6502" suite of apps.)
Symbol Tables
You may enter an optional list of symbol definitions to be used by the disassembler.
A valid symbol table consists of a series of definitions, one per line, where a symbol name is separated from a value expression by white space (blanks), a colon (":"), or an assignment operator ("=").
Names begin with a letter or the underscore and may be followed by letters, digits, or the underscore. Names have a significant length of up to 8 characters.
Value expressions may be simple numbers (see below for number formats) or complex expressions. Complex expressions may contain symbols already defined and use basic arithmetic operators ("+", "-", "*", "/"), which are evaluated strictly from left to right, without precedence. You may group elements of an expression, either by square brackets ("[…]") or by normal parentheses ("(…)"). Moreover, any factor may be preceded by a unary minus ("-") and/or the low-byte operator ("<") or the high-byte operator (">"). Expressions are case-insensitive and the notation is generally compatible with the one used by the virtual 6502 assembler.
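To make the "strictly left to right" rule concrete, here is a small Python sketch of such an evaluator. It is an illustration, not the tool's implementation: the names `tokenize` and `evaluate` are hypothetical, only "$" hex and plain decimal numbers are recognized to keep it short, integer division for "/" is an assumption, and the 8-character significant name length is not modeled.

```python
import re

# Tokens: "$" hex number, decimal number, symbol name, or a single operator/bracket.
TOKEN = re.compile(r"\s*(\$[0-9A-Fa-f]+|[0-9]+|[A-Za-z_]\w*|[-+*/<>()\[\]])")

def tokenize(text):
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if not m:
            raise ValueError(f"unexpected input at {text[pos:]!r}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens

def evaluate(text, symbols=None):
    """Evaluate an expression left to right, without operator precedence."""
    symbols = {name.upper(): value for name, value in (symbols or {}).items()}
    tokens = tokenize(text)
    pos = 0

    def factor():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of expression")
        tok = tokens[pos]
        pos += 1
        if tok == "-":                      # unary minus
            return -factor()
        if tok == "<":                      # low-byte operator
            return factor() & 0xFF
        if tok == ">":                      # high-byte operator
            return (factor() >> 8) & 0xFF
        if tok in ("(", "["):               # grouping with either bracket style
            value = expr()
            closing = ")" if tok == "(" else "]"
            if pos >= len(tokens) or tokens[pos] != closing:
                raise ValueError(f"missing {closing!r}")
            pos += 1
            return value
        if tok.startswith("$"):
            return int(tok[1:], 16)
        if tok.isdigit():
            return int(tok)
        if re.fullmatch(r"[A-Za-z_]\w*", tok):
            return symbols[tok.upper()]     # names are case-insensitive
        raise ValueError(f"unexpected token {tok!r}")

    def expr():
        nonlocal pos
        value = factor()
        # Apply each operator as soon as its right-hand factor is known.
        while pos < len(tokens) and tokens[pos] in ("+", "-", "*", "/"):
            op = tokens[pos]
            pos += 1
            rhs = factor()
            if op == "+":
                value += rhs
            elif op == "-":
                value -= rhs
            elif op == "*":
                value *= rhs
            else:
                value //= rhs               # assumption: integer division
        return value

    result = expr()
    if pos != len(tokens):
        raise ValueError(f"unexpected token {tokens[pos]!r}")
    return result
```

With this rule, `evaluate("2+3*4")` gives 20 rather than 14, while `evaluate("2+(3*4)")` gives 14.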
Numbers may be written in any of the common notations:
Type | Format | Example |
hex | $[0-9A-F]+, &[0-9A-F]+, 0x[0-9A-F]+ | $C800, &D080, 0x0801 |
decimal | [0-9]+, 0d[0-9]+ | 123, 0d429 |
octal | @[0-7]+, 0[0-7]+, 0o[0-7]+ | @537, 0773, 0o7370 |
binary | %[01]+, 0b[01]+ | %1101001, 0b0110 |
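The notations in the table map naturally onto a short list of patterns. The following Python sketch shows one way to recognize them. The name `parse_number` is hypothetical, and the order in which the patterns are tried (so that a leading "0" followed only by octal digits reads as octal, and anything else made of digits as decimal) is an assumption of this sketch.

```python
import re

# (pattern, base) pairs, tried in order; the digits are captured in group 1.
NUMBER_FORMATS = [
    (r"(?:\$|&|0x)([0-9a-f]+)", 16),   # $C800, &D080, 0x0801
    (r"0d([0-9]+)", 10),               # 0d429
    (r"(?:@|0o)([0-7]+)", 8),          # @537, 0o7370
    (r"0([0-7]+)", 8),                 # 0773
    (r"(?:%|0b)([01]+)", 2),           # %1101001, 0b0110
    (r"([0-9]+)", 10),                 # 123
]

def parse_number(text):
    """Parse a number literal in any of the listed notations (case-insensitive)."""
    literal = text.strip().lower()
    for pattern, base in NUMBER_FORMATS:
        match = re.fullmatch(pattern, literal)
        if match:
            return int(match.group(1), base)
    raise ValueError(f"not a number: {text!r}")
```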
A few examples of valid definitions:
    SCREEN = 0x4000
    DATA: $4000
    LOOKUP $4040
    OFFSET = DATA-LOOKUP
    VARS = 49152
    SCORE = VARS+OFFSET
    TEMP = VARS+(2*OFFSET)
    TEMP2 = VARS + [3 * OFFSET]
    BT = <-1 ;low-byte of -1 = $FF (comments are ignored)
Use a "W" suffix to declare a symbol for write access only:
    ;Atari 2600 TIA registers
    CXP0FB = $02 ;read register
    WSYNC = $02 w ;for write access only
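The definition syntax (name, separator, value expression, optional "W" suffix, trailing comment) can be split with a single regular expression. The Python sketch below is illustrative only: `parse_definition` is a hypothetical name, truncating names to 8 characters is one reading of "significant length", and ".DATA" statements are not handled here.

```python
import re

# Name, then white space, a colon, or "=" as separator, then the value expression.
DEFINITION = re.compile(r"([A-Za-z_][A-Za-z0-9_]*)\s*(?:[:=]\s*|\s)(.+)")

def parse_definition(line):
    """Split one symbol table line into (name, expression, write_only).

    Returns None for blank or comment-only lines.
    """
    text = line.split(";", 1)[0].strip()      # comments are ignored
    if not text:
        return None
    match = DEFINITION.fullmatch(text)
    if not match:
        raise ValueError(f"not a symbol definition: {line!r}")
    name, expression = match.group(1), match.group(2).strip()
    write_only = False
    # A trailing "W" after white space marks the symbol as write-only.
    # Caveat of this sketch: an expression ending in a symbol named W
    # after white space would be misread as the suffix.
    suffix = re.fullmatch(r"(.*\S)\s+[wW]", expression)
    if suffix:
        expression, write_only = suffix.group(1), True
    # Assumption: only the first 8 characters of a name are significant.
    return name.upper()[:8], expression, write_only
```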
Finally, a symbol table may contain ".DATA" statements declaring an address or a range of addresses as data to be excluded from the disassembly. Code at such addresses will be translated to ".BYTE" pseudo instructions (or the related BBC-style directives, respectively).
This may be useful for drilling down on a disassembly.
    .DATA $2040 ;exclude a single address
    .DATA $2040 ... $240F ;exclude the range $2040 … $240F (inclusive)
    .DATA $2040, $240F ;as above
    .DATA D1 ... D1+4 ;expressions are allowed
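As a rough sketch of how such statements could be read, the following Python function extracts the inclusive address range from a ".DATA" line. The name `parse_data_range` is hypothetical, only "$" hex literals are accepted (full expressions such as "D1+4" are left out for brevity), and treating the keyword as case-insensitive is an assumption.

```python
import re

DATA_STATEMENT = re.compile(
    r"\.data\s+(\$[0-9a-f]+)(?:\s*(?:\.\.\.|,)\s*(\$[0-9a-f]+))?",
    re.IGNORECASE,
)

def parse_data_range(line):
    """Return the inclusive (start, end) address range of a .DATA statement."""
    body = line.split(";", 1)[0].strip()      # drop the trailing comment
    match = DATA_STATEMENT.fullmatch(body)
    if not match:
        raise ValueError(f"not a .DATA statement: {line!r}")
    start = int(match.group(1)[1:], 16)
    # A single address is a range of length one.
    end = int(match.group(2)[1:], 16) if match.group(2) else start
    return start, end
```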
Note on hexadecimal numbers and word-size:
If a symbol is defined by a hexadecimal number with at least 4 hex digits and two leading zeros, or by an expression using such a value, the symbol is considered word-size, regardless of its value.
However, symbols that are used in zeropage address modes in the code are automatically reformatted to 2-hex-digit single-byte values. To ensure compatibility with assemblers with zeropage auto-detection, any single-byte symbols used in an ambiguous context as word-size operands are marked by a "+$0000" extension (or "+&0000" in BBC Micro mode).
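The word-size rule for hexadecimal literals can be expressed as a simple check on the digits as written. This Python sketch covers only the literal case, not expressions that use such a value. The name `is_forced_word` is hypothetical.

```python
import re

def is_forced_word(literal):
    """True if a hex literal has at least 4 digits and two leading zeros."""
    match = re.fullmatch(r"(?:\$|&|0x)([0-9a-f]+)", literal.strip().lower())
    if not match:
        return False                      # not a hexadecimal literal at all
    digits = match.group(1)
    return len(digits) >= 4 and digits.startswith("00")
```

So "$00B8" would mark a symbol as word-size even though its value fits in one byte, while "$B8" would not.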
Further Considerations and Options
Generally, you should be able to use a symbol table directly as it is listed by the virtual 6502 assembler.
The parsed symbol table precedes any output format that uses labels and symbols. Symbols that are used as labels in the disassembly are commented out by a preceding semicolon (";"), again to ensure compatibility with assemblers that may reject assigning a label that has already been defined as a symbol.
If the checkbox "include used symbols only" is checked, only those symbols that are used in addresses and are not included as labels are listed. (This may actually be none.)
The disassembler will always append a table of auto-generated and defined symbols used as labels to a view that includes any labels. You may want to copy this and use it as the basis for a refined symbol table. This may be especially useful for any partial disassemblies that you may want to do later on.
Here are a few symbol tables, which can also be loaded using the drop-down menu found below the symbol table input:
- Commodore 64: c64.sym
- PET 2001 (ROM 2.0): pet2001.sym
- Atari VCS (Atari 2600): vcs.sym
- BBC Micro OS ABI: bbc.sym
The option "generate 'SYM+1' addresses" (default: on) generates labeled operands with a "+1" suffix for any instructions where a symbol has been defined for the address immediately preceding the given address.
E.g.
Object code (start address 0x0800):
    A9 01 85 B8 A9 40 85 B9
Symbol table:
    SETPTR = $0800
    POINTER = $B8
Disassembly:
                          * = $0800
    0800   A9 01   SETPTR LDA #$01
    0802   85 B8          STA POINTER
    0804   A9 40          LDA #$40
    0806   85 B9          STA POINTER+1
                          .END
Disassembly in BBC Micro mode:
                           P% = &0800
                           [
    0800   A9 01   .SETPTR LDA #&01
    0802   85 B8           STA POINTER
    0804   A9 40           LDA #&40
    0806   85 B9           STA POINTER+1
                           ]
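The "SYM+1" lookup in this example can be sketched as follows in Python. The function name `operand_label` and its parameters are hypothetical, and the fallback to a plain "$" hex address when no symbol matches is an assumption about the output rather than documented behavior.

```python
def operand_label(address, symbols, sym_plus_one=True):
    """Render an operand address as a symbol, as 'SYMBOL+1', or as plain hex."""
    by_address = {value: name for name, value in symbols.items()}
    if address in by_address:
        return by_address[address]
    # A symbol defined for the immediately preceding address yields "NAME+1".
    if sym_plus_one and (address - 1) in by_address:
        return by_address[address - 1] + "+1"
    # Assumed fallback: 2 hex digits for zeropage addresses, 4 otherwise.
    width = 2 if address < 0x100 else 4
    return f"${address:0{width}X}"
```

With `{"POINTER": 0xB8}`, address $B9 renders as "POINTER+1" when the option is on and as "$B9" when it is off.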
The option "add cycle counts" (default: off) adds comments with cycle counts for each instruction, where applicable. E.g., for the above example:
                          * = $0800
    0800   A9 01   SETPTR LDA #$01       ;(2)
    0802   85 B8          STA POINTER    ;(3)
    0804   A9 40          LDA #$40       ;(2)
    0806   85 B9          STA POINTER+1  ;(3)
                          .END
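Cycle counts like these come from a per-opcode table. The following Python sketch decodes only the two opcodes that occur in the example (LDA immediate and STA zeropage, whose base cycle counts are 2 and 3) and appends the count as a comment. It is a toy under stated assumptions: `OPCODES` and `disassemble` are hypothetical names, symbols and labels are left out, and extra cycles for page crossings or taken branches are not modeled.

```python
# opcode: (mnemonic, operand format, size in bytes, base cycle count)
OPCODES = {
    0xA9: ("LDA", "#${:02X}", 2, 2),   # LDA immediate
    0x85: ("STA", "${:02X}", 2, 3),    # STA zeropage
}

def disassemble(code, start, add_cycles=False):
    """Disassemble a byte sequence using the tiny table above."""
    lines, i = [], 0
    while i < len(code):
        opcode = code[i]
        if opcode not in OPCODES:
            # Not an "undefined opcode": this sketch simply knows two opcodes.
            raise NotImplementedError(f"opcode ${opcode:02X} is not in this sketch")
        mnemonic, operand_format, size, cycles = OPCODES[opcode]
        if i + size > len(code):
            raise ValueError("truncated instruction at end of code")
        raw = " ".join(f"{byte:02X}" for byte in code[i:i + size])
        line = f"{start + i:04X} {raw} {mnemonic} {operand_format.format(code[i + 1])}"
        if add_cycles:
            line += f" ;({cycles})"
        lines.append(line)
        i += size
    return lines
```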
Disclaimer
This application is provided for free and AS IS, therefore without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose. Use at your own risk.
This application uses either Web Storage technology or, if this is not available, a cookie to store your choice for the preferred color mode for the virtual 6502 suite of applications. (Either the word "light" or the word "dark" is stored.)
For editing or patching up binary files, especially Commodore 8-bit PRG-format files, consider PRG-Edit, a small online hex-editor with support for various text encodings.
© Norbert Landsteiner 2005–2023, mass:werk