PETSCII Revealed
A closer look at the logic behind Commodore ASCII, AKA “PETSCII”, and the PET 2001.
The flavor of ASCII used by the Commodore 8-bit computers, commonly known as PETSCII, calls for a bit of an explanation. PETSCII is a peculiar beast: close to ASCII, but not quite; somewhat compatible, but not really. There are duplicate ranges of characters all over the place, and the special characters lack any recognizable order… But look at all these funny graphics characters!
In order to make sense of this and of how the character set is organized, it may be helpful to have a closer look at it with a particular focus on the PET 2001. After all, this is the very machine on which this character set originated and for which it was designed, with no idea yet that it might become the ancestor of a successful line of home computers. Here, we may discover logic in what must remain a puzzling enigma on the more popular and better-known machines that followed, like the C64.
Implementation details aside, since PETSCII is still Commodore ASCII, we may best start with ASCII.
ASCII
Let’s have a look at the organization of the ASCII code (7-bit) as it ought to be looked at, in groups of 32:
Range (hex / bits / decimal) | 00 | 01 | 02 | 03 | 04 | 05 | 06 | 07 | 08 | 09 | 0A | 0B | 0C | 0D | 0E | 0F | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 1A | 1B | 1C | 1D | 1E | 1F |
00 – 1F 00xxxxx 0 – 31 | NUL | SOH | STX | ETX | EOT | ENQ | ACK | BEL | BS | TAB | LF | VT | FF | CR | SO | SI | DLE | DC1 | DC2 | DC3 | DC4 | NAK | SYN | ETB | CAN | EM | SUB | ESC | FS | GS | RS | US |
20 – 3F 01xxxxx 32 – 63 | SP | ! | " | # | $ | % | & | ' | ( | ) | * | + | , | - | . | / | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | : | ; | < | = | > | ? |
40 – 5F 10xxxxx 64 – 95 | @ | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z | [ | \ | ] | ^ | _ |
60 – 7F 11xxxxx 96 – 127 | ` | a | b | c | d | e | f | g | h | i | j | k | l | m | n | o | p | q | r | s | t | u | v | w | x | y | z | { | | | } | ~ | DEL |
Lo and behold this beauty and this logic!
Easy to discern, there are 4 groups:
- a control group (0x00 – 0x1F),
- a punctuation and numeric group (0x20 – 0x3F),
- an upper-case alphabetic group (0x40 – 0x5F),
- a lower-case alphabetic group (0x60 – 0x7F).
As known from Baudot code and paper tape operations, the very last character, with all bits set, is the DELETE character, nullifying the previous code. (This is, BTW, a strong argument for even parity: on 8-bit paper tape you’d still want to have all rows punched for this.)
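The parity remark can be checked with a little arithmetic. The following is only a sketch: it assumes that the parity bit sits in bit 7 of an 8-bit tape row, and the function names are made up for illustration.

```python
def even_parity_bit(code7: int) -> int:
    """Parity bit that makes the total number of 1-bits even."""
    return bin(code7 & 0x7F).count("1") % 2

def with_even_parity(code7: int) -> int:
    """Assumed layout: parity bit in bit 7, the 7-bit code below it."""
    return (even_parity_bit(code7) << 7) | (code7 & 0x7F)

DEL = 0x7F  # seven 1-bits, so even parity has to add an eighth
print(f"{with_even_parity(DEL):08b}")
```

With odd parity, the eighth bit of DEL would stay clear, leaving one row of the tape unpunched.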
All these groups transform into one another, from a higher order one to a lower order one, in a logical manner:
- Masking bit 5 on a lower case character transforms it to upper case.
Thus, if you have old 6-bit, single-case character equipment, the encoding still works with minor adaptions for the lower case group. Mind that all the traditional, commercial punctuations and special characters, in use since the early days of punch cards, are outside the case specific groups and won’t be affected.
- Dropping bits 5 and 6 transforms a character from the alphabetic group to a control character.
Thus, ^@ becomes NUL, ^A becomes SOH, ^H becomes BS, ^I becomes TAB, and so on:
00 | 01 | 02 | 03 | 04 | 05 | 06 | 07 | 08 | 09 | 0A | 0B | 0C | 0D | 0E | 0F | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 1A | 1B | 1C | 1D | 1E | 1F | |
CTRL + | @ | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z | [ | \ | ] | ^ | _ |
00 – 1F 00xxxxx 0 – 31 | NUL | SOH | STX | ETX | EOT | ENQ | ACK | BEL | BS | TAB | LF | VT | FF | CR | SO | SI | DLE | DC1 | DC2 | DC3 | DC4 | NAK | SYN | ETB | CAN | EM | SUB | ESC | FS | GS | RS | US |
Transforming from lower case to upper case and vice versa is as easy as masking bit 5 (“c & 0xDF”) or OR-ing it (“c | 0x20”), respectively. In terms of electronics and keyboards, implementing the SHIFT key or the CONTROL key is as easy as breaking one or two wires while the modifier key is pressed.
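As a minimal sketch of these bit operations (the function names are invented for this example, and the results are only meaningful for codes in the alphabetic groups):

```python
def to_upper(code: int) -> int:
    """Mask (clear) bit 5: lower case becomes upper case."""
    return code & 0xDF

def to_lower(code: int) -> int:
    """Set bit 5: upper case becomes lower case."""
    return code | 0x20

def to_control(code: int) -> int:
    """Drop bits 5 and 6: a letter becomes its CTRL counterpart."""
    return code & 0x1F

print(chr(to_upper(ord("a"))), chr(to_lower(ord("A"))), to_control(ord("H")))
```

Here, CTRL + H yields code 8 (BS), just as in the table above.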
But it’s even better than that. Have a look at the punctuation and numerals: drop bit 4 on SHIFT and your traditional keyboard layout is complete!
Range (hex / bits / decimal) | 00 | 01 | 02 | 03 | 04 | 05 | 06 | 07 | 08 | 09 | 0A | 0B | 0C | 0D | 0E | 0F |
20 – 2F 010xxxx 32 – 47 | SP | ! | " | # | $ | % | & | ' | ( | ) | * | + | , | - | . | / |
30 – 3F 011xxxx 48 – 63 | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | : | ; | < | = | > | ? |
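The pairing of the two rows can be reproduced by clearing bit 4. This is merely an illustration of the table, with a made-up function name; it is only meant for codes in the range 0x30 – 0x3F.

```python
def shifted(code: int) -> int:
    """Clear bit 4, as a SHIFT key wired to that bit would do."""
    return code & 0xEF

# Print each key of the 0x30 row next to its partner in the 0x20 row.
for ch in "123456789:;<=>?":
    print(ch, "->", chr(shifted(ord(ch))))
```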
PETSCII (Commodore ASCII)
Now that we have an idea of how the ASCII code is laid out, let’s see what Commodore did when it came up with PETSCII for the PET 2001. PETSCII, that is, is not about screen codes, but about the organization of the actual codes as used in strings, about what we get using “CHR$()” and what we read by “ASC()”. While screen codes are often confused with this, those are really just about matching these logical codes with visual representations as stored in the character ROMs. The latter may vary, as with special, national character sets or the ROMs that went with business keyboards, but the internal organization stays the same, regardless of how the characters are represented on the screen.
Since it’s PETSCII as in PET-ASCII, let’s have a look at what we get on the original PET 2001:
Now, this looks familiar and also a bit strange. Let's compare this with the actual ASCII set:
00 | 01 | 02 | 03 | 04 | 05 | 06 | 07 | 08 | 09 | 0A | 0B | 0C | 0D | 0E | 0F | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 1A | 1B | 1C | 1D | 1E | 1F | |
0x00 | N U L | S O H | S T X | E T X | E O T | E N Q | A C K | B E L | B S | T A B | L F | V T | F F | C R | S O | S I | D L E | D C 1 | D C 2 | D C 3 | D C 4 | N A K | S Y N | E T B | C A N | E M | S U B | E S C | F S | G S | R S | U S |
0x20 | ! | " | # | $ | % | & | ' | ( | ) | * | + | , | - | . | / | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | : | ; | < | = | > | ? | |
0x40 | @ | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z | [ | \ | ] | ^ | _ |
0x60 | ` | a | b | c | d | e | f | g | h | i | j | k | l | m | n | o | p | q | r | s | t | u | v | w | x | y | z | { | | | } | ~ | D EL |
00 | 01 | 02 | 03 | 04 | 05 | 06 | 07 | 08 | 09 | 0A | 0B | 0C | 0D | 0E | 0F | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 1A | 1B | 1C | 1D | 1E | 1F | |
0x00 | | | | | | | | | | | | | | CR | | | | DWN | RVS | HOM | DEL | | | | | | | | | RGT | | |
0x20 | ! | " | # | $ | % | & | ' | ( | ) | * | + | , | - | . | / | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | : | ; | < | = | > | ? | |
0x40 | @ | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z | [ | \ | ] | ↑ | ← |
0x60 | ! | " | # | $ | % | & | ' | ( | ) | * | + | , | - | . | / | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | : | ; | < | = | > | ? | |
0x80 | | | | | | | | | | | | | | SCR | | | | UP | ROF | CLR | INS | | | | | | | | | LFT | | |
0xA0 | ▌ | ▄ | ▔ | ▁ | ▏ | ▒ | ▕ | ▄ | ◤ | ▄ | ├ | ▗ | └ | ┐ | ▂ | ┌ | ┴ | ┬ | ┤ | ▎ | ▍ | ▐ | ▔ | ▀ | ▃ | ▟ | ▖ | ▝ | ┘ | ▘ | ▚ | |
0xC0 | ─ | ♠ | │ | ─ | ─ | ▔ | ─ | │ | │ | ╮ | ╰ | ╯ | ▙ | ╲ | ╱ | ▛ | ▜ | ● | ▁ | ♥ | ▏ | ╭ | ╳ | ○ | ♣ | ▕ | ♦ | ┼ | ▌ | │ | π | ◥ |
0xE0 | ▌ | ▄ | ▔ | ▁ | ▏ | ▒ | ▕ | ▄ | ◤ | ▄ | ├ | ▗ | └ | ┐ | ▂ | ┌ | ┴ | ┬ | ┤ | ▎ | ▍ | ▐ | ▔ | ▀ | ▃ | ▟ | ▖ | ▝ | ┘ | ▘ | π |
(HOM … HOME, SCR … SHIFT + CR, ROF … RVS OFF, CLR … CLEAR. Block graphics are Unicode approximations.)
As we may see, the numeric/punctuation group and the upper-case group are mostly the same. As the only exception to this, where ASCII has the caret and the underscore at 0x5E and 0x5F, PETSCII has (still recognizable) an up-arrow and a left-arrow, respectively. (Apparently, since BACKSPACE became delete-to-the-left with common on-screen editors, the original ASCII glyphs weren’t found to be of much use without overprint capabilities. Moreover, PETSCII doesn’t implement BACKSPACE in any way, but rather assigns its own code in place of DC4.)
Update: Curt J. Sampson (@cjs) said in the Retro Computing Forum,
«Actually, what Commodore has are the original ASCII characters, from ASCII-1963. The 1965 revision replaced ‘↑’ and ‘←’ with ‘^’ and ‘_’. It’s not clear to me why Commodore, in 1976, went with ASCII-1963 rather than ASCII-1967. Maybe they just had only old reference books lying around?»
I stand corrected! (Or, at least, complemented, as I still think that the choice makes sense for the PET.)
However, where there are the lower-case characters in ASCII, from 0x60 to 0x7F, PETSCII just repeats the upper-case group. Apparently, Commodore 8-bits are just 6-bit machines, as far as character encoding is concerned. But that isn’t all, since there are all those block graphics symbols as well, from 0xA0 to 0xFF in the upper 8-bit bank, which show the same, peculiar mirroring of the group from 0xA0 to 0xBF.
And, just to add another special flavor (and, because the exception proves the rule), the very last character code, 0xFF, is used for π, normally found at 0xDE. Apparently jammed in, much like a fix to an oversight.
As for control characters, there are, in the original form, just a few that show any effect. Namely, there is the carriage return (CR), the DC-group (device control) from ASCII is repurposed, and the group separator (GS) is used for a cursor key. However, these control codes are also in the upper bank, where they are inverted into their respective functional opposites. (There’s also SHIFT+RETURN, for which there’s no code in ASCII.)
Note: The placement of some of the control characters calls for a bit of an explanation. While it makes sense to use the DC-group for cursor and screen controls, it would have been only logical to have CRSR RIGHT in the position of DC2, just after CRSR DOWN, with the code for reverse video swapped out instead. Maybe using CRSR RIGHT for GS was found to be the least harmful and/or destructive, in case an actual group separator was encountered in a file. As the PET was also intended for (small) business use, which may have involved files that originated on other systems, this may have been worth a consideration.
Later machines, like the VIC-20 and the C64, added further control codes for colors, function keys, and for switching character case. Also, the backslash character had to give way to the British Pound (GBP) currency symbol (“£”).
Having a look at the lower case set of the PET 2001, we may finally recognize, what this is all about:
As we have already seen in a previous discussion of abbreviations in Commodore BASIC, the most significant bit (MSB), bit 7, is used to indicate a shifted character. Therefore, just a single bit has to be checked in PETSCII, using the sign flag of the 6502 processor, where two bits would have had to be checked using ASCII encoding. With the lower-case group of the ASCII code now being of no particular use, the upper-case group is just repeated instead.
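To illustrate the difference in a hedged way: on the 6502, loading a byte copies its bit 7 into the negative (sign) flag, so a single branch can tell shifted from unshifted codes. The sketch below only mimics that logic; the function names are hypothetical, and the ASCII variant assumes that "shifted" means the upper-case group 0x40 – 0x5F.

```python
def petscii_is_shifted(code: int) -> bool:
    """One bit decides: bit 7 set means a shifted character."""
    return (code & 0x80) != 0

def ascii_is_upper_group(code: int) -> bool:
    """Two bits decide: bit 6 set and bit 5 clear (0x40 - 0x5F)."""
    return (code & 0x60) == 0x40

print(petscii_is_shifted(0xC1), petscii_is_shifted(0x41))
print(ascii_is_upper_group(ord("A")), ascii_is_upper_group(ord("a")))
```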
Again, considering business use and the chance of encountering a file from another system, this isn’t a particularly bad choice for what is essentially a single case system. No need for explicit character conversions. While actual processing may require transformations, readability of foreign files is guaranteed out of the box.
Hence, the PET (and any of its 8-bit successors) is much more like a single-case machine with switchable representations of the upper bank than a real upper-case/lower-case machine. This is underlined by the fact that, in this original form, the unshifted characters stay the same in lower-case representation as they are in upper-case mode, with lower-case characters being produced in combination with the shift key! Just the opposite of what we would expect.
In order to work around these arguably unusual operations, Commodore swapped the lower-case and upper-case glyphs in the character ROMs of later machines. The internal implementation of the character sets, however, stayed the same.
BTW, this is how you select character sets on a PET 2001 (by interfacing directly with the hardware by POKEs to 0xE84C, the Peripheral Control Register PCR of the VIA):
POKE 59468,12 :REM USE UPPER-CASE/GRAPHICS (DEFAULT)
POKE 59468,14 :REM USE LOWER-CASE/UPPER-CASE
Still 8 Bits?
This peculiar mirroring at 0x60–0x7F and 0xE0–0xFF may raise concerns. Are Commodore 8-bit machines still real 8-bit machines as far as character encoding is concerned? This really looks more like 7-bit with a bit of magical switching around added behind the scenes. Meaning, are those codes actually unique and just transformed by the output mechanism, or are the mirrored characters substituted as soon as they are processed? Mind that we previously proudly presented an algorithm to implement fast FIFO queues for byte-sized values by the abuse of string operations! Are approaches like this compromised and apt to fail for those mirrored ranges? Let’s check to be sure:
Phew! “CHR$()” just jams a byte into a memory location in string storage and “ASC()” retrieves the very same value again, even for a character in one of the mirrored code ranges. The same is true for any control characters. Moreover, all strings generated by “CHR$()” have a string-length of 1, even the non-printing control characters that do not show any effect when printed onto the screen. As far as BASIC is concerned, these are indeed unique characters!
Familiar Unfamiliarities
If we dare to inspect the graphics characters and their individual locations in the character set a bit more closely, we may reveal a distinct lack of order and organization. Some related characters seem to form a group, with unrelated characters interspersed, while other groups are dispersed all over the place. While there seems to be some sort of order in a few places, this soon falls apart and doesn’t withstand any scrutiny. How could Commodore come up with such a scheme?
If without any clue, compare the principal upper-case/graphics set with the original chiclet keyboard layout of the PET 2001:
The order isn’t in the character set, but on the keyboard! As we’ve seen with ASCII code (especially in the relationship between punctuation and numeric characters), the characters are arranged in a way that they match the unshifted keys on the PET 2001 chiclet keyboard. Also, mind that the graphics characters are particularly arranged so that the most important ones, like those for drawing frames, are available regardless of the character ROM in use. (That is, 0xA0…0xBF, 0xC0, 0xDB, 0xDD.) Admire the logical arrangement!
Something you probably wouldn’t have figured out, if you knew the C64 only!
For an example, consider the order of the frame characters in the code and the corresponding arrangement on the numeric key pad of the PET 2001:
Compare this to the keyboard layout of the VIC-20 or the C64, where we find the frame corners (┌, ┐, └, ┘) mapped to keys A, S, Z, X and the bottom row of the numeric key pad of the PET 2001 (0, ., −, =) dispersed over several rows of the keyboard. While just looking at those machines, we may have a hard time discovering the relation between keys and graphic character codes in the PETSCII set, which is pretty much just a variation on how punctuation and numbers are mapped on an ASCII keyboard: the code of the base key with a bit modifier reflecting the state of the SHIFT key.
And this is also, why this is really PETSCII, as opposed to “Commodore-8bit-SCII”… ;-)
Pi (π)
You may recall that shifted characters, those with the highest bit set, are used as tokens by BASIC. This may be the reason for the peculiar copy of “π” at 0xFF. In a BASIC program, graphics characters are not allowed outside of strings, as their code values conflict with the encoding of the BASIC keywords as tokens. However, there’s need for π as a constant and it’s not in the basic character set. What to do about it?
It will be a special case to be checked by the operating system, much like a token of its own; there’s no way around this. But at least you want that character to be out of the way to avoid any conflicts, like at the opposite end of the range from where the list of BASIC tokens starts. Also, a distinct code value may help, a value that stands out easily. Like 0xFF.
Therefore, only the copy at 0xFF represents the constant, while the original at 0xDE is just another graphics character, as far as BASIC is concerned. Most likely, this was not in the original design, but a fix by Microsoft. Or it may have been an adaption to MS BASIC made by Commodore. Anyway, the copy of π at 0xFF isn’t really to be considered a PETSCII code, but is much more a feature of BASIC.
Screen Codes
As should have become apparent by now, the “magic” (or, maybe, irritating nature) of PETSCII is related rather to the arrangement of the character codes than to their representations in the character ROM. However, one doesn’t go without the other. Therefore we may ask, what is the particular relationship between PETSCII and Commodore screen codes?
Obviously, there’s no need for control characters in the character ROMs. While they are required in order to provide organized output on the screen, once we’re actually displaying something, we’re done with them. (How they are represented in string context in the editor is regulated by the operating system and not directly related to the character set.) Just the same, there’s no need to store the mirrored regions as unique glyphs. By this, the entire set of glyphs can be jammed into a block of 7-bit codes, using the 8th bit to indicate reverse video. The reverse video characters, however, aren’t stored in the ROM, but are generated on the fly by inverting the respective bit patterns. A single character is drawn onto an 8 × 8 pixel matrix (stretched to double height on the later 80-column PETs), thus occupying 8 bytes per character in ROM. As 128 × 8 gives 1024 bytes, a single 1K chip is all that’s needed to store an entire set of screen characters.
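The arithmetic of this scheme may be sketched as follows. Mind that this is an illustration only: the ROM content is a placeholder with a made-up bit pattern (not the actual PET glyph data), and the layout of 8 consecutive bytes per glyph, in order of the lower 7 bits of the screen code, is an assumption.

```python
GLYPHS = 128          # 7-bit glyph index
BYTES_PER_GLYPH = 8   # one byte per pixel row of the 8 x 8 matrix

# Placeholder ROM of 128 * 8 = 1024 bytes; only glyph 1 gets a pattern.
rom = bytearray(GLYPHS * BYTES_PER_GLYPH)
rom[8:16] = bytes([0x18, 0x3C, 0x66, 0x7E, 0x66, 0x66, 0x66, 0x00])

def glyph_rows(rom, screen_code):
    """Return the 8 pixel rows for a screen code, inverted if bit 7 is set."""
    start = (screen_code & 0x7F) * BYTES_PER_GLYPH
    rows = list(rom[start:start + BYTES_PER_GLYPH])
    if screen_code & 0x80:                 # reverse video: flip every pixel
        rows = [row ^ 0xFF for row in rows]
    return rows

for row in glyph_rows(rom, 0x81):
    print(f"{row:08b}".replace("0", ".").replace("1", "#"))
```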
So, is there an obvious relation between the PETSCII code set and the screen codes? Chances are, since this would certainly make things much simpler. At least, this is about a piece of engineering…
As may be observed, PETSCII codes (mirrored regions and control characters removed) map directly to screen codes, but are swapped and relocated in groups of 32. As it happens, PETSCII shows this structural resemblance to ASCII, regarding groups of 32, down to the very lowest levels of implementation. As may be expected, this is handled in exactly the same way for the upper-case/graphics set and the lower-case set.
The screen representations for the control characters follow a similar logic:
Screen Codes Bitwise
Now, let’s have a look at our findings from a bitwise perspective by listing the key transformations of the various zones in binary representation:
Mode     Range     b76543210
PETSCII  4x – 5x:  010xxxxx
→Screen  0x – 1x:  000xxxxx
PETSCII  Cx – Dx:  110xxxxx
→Screen  4x – 5x:  010xxxxx
PETSCII  Ax – Bx:  101xxxxx
→Screen  6x – 7x:  011xxxxx
PETSCII  2x – 3x:  001xxxxx
→Screen  2x – 3x:  001xxxxx
Mind that the goal here is to clear bit 7 in order to use it to encode inverted glyphs in the upper half of the character ROM. How is this achieved? By dropping bit 6 and shifting bit 7 to the right by one position. Bit 7 is now free to encode normal (0) or reverse (1) video.
Mind that this is possible only, because the encodings of bits 5 and 6 are redundant with bit 6 giving always the inverse of bit 5. Hence, we may simply drop bit 6 and shift bit 7 in place, in order to compact the representation. — There’s actually some sense to this apparently random shuffling of character groups!
And what about control codes?
Mode     Range     b76543210
PETSCII  0x – 1x:  000xxxxx
→Screen  8x – 9x:  100xxxxx
PETSCII  8x – 9x:  100xxxxx
→Screen  Cx – Dx:  110xxxxx
As we may see, this adheres to the same transformation, but now bit 7 is also force-jammed to HI (reverse video) in order to distinguish control codes from normal characters in string context.
By this, we can come up with the following maps:
PETSCII to Screen Code
0x00–0x1F → 0x80–0x9F (+0x80)
0x20–0x3F → 0x20–0x3F (±0)
0x40–0x5F → 0x00–0x1F (-0x40)
0x80–0x9F → 0xC0–0xDF (+0x40)
0xA0–0xBF → 0x60–0x7F (-0x40)
0xC0–0xDF → 0x40–0x5F (-0x80)
Duplicate PETSCII code ranges and substitutions:
- PETSCII 0x60–0x7F → PETSCII 0x20–0x3F (-0x40)
- PETSCII 0xE0–0xFF → PETSCII 0xA0–0xBF (-0x40)
- PETSCII 0xFF is substituted by 0xDE (“π” / medium size checkerboard)
Add 0x80 to screen codes for reverse video.
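The ranges and offsets listed above lend themselves to a simple table lookup. The following is a sketch only, not the way the PET ROM does it; the names RANGES and petscii_to_screen_by_range as well as the reverse parameter are invented for this example.

```python
# (first, last, offset), taken from the PETSCII-to-screen-code list above
RANGES = [
    (0x00, 0x1F, +0x80),
    (0x20, 0x3F, 0),
    (0x40, 0x5F, -0x40),
    (0x80, 0x9F, +0x40),
    (0xA0, 0xBF, -0x40),
    (0xC0, 0xDF, -0x80),
]

def petscii_to_screen_by_range(p, reverse=False):
    """Map a PETSCII code (0-255) to a screen code via the range table."""
    if p == 0xFF:                                   # copy of pi
        p = 0xDE
    elif 0x60 <= p <= 0x7F or 0xE0 <= p <= 0xFE:    # mirrored ranges
        p -= 0x40
    for first, last, offset in RANGES:
        if first <= p <= last:
            c = p + offset
            return c | 0x80 if reverse else c       # bit 7: reverse video
    raise ValueError("not a byte-sized PETSCII code")

print(hex(petscii_to_screen_by_range(0x41)), hex(petscii_to_screen_by_range(0xFF)))
```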
Screen Code to PETSCII
0x00–0x1F → 0x40–0x5F (+0x40)
0x20–0x3F → 0x20–0x3F (±0)
0x40–0x5F → 0xC0–0xDF (+0x80)
0x60–0x7F → 0xA0–0xBF (+0x40)
For inverse screen codes in string context (control characters):
0x80–0x9F → 0x00–0x1F (-0x80)
0xC0–0xDF → 0x80–0x9F (-0x40)
Otherwise, subtract 0x80 and switch to reverse video.
Bitwise/Logical Transformations
- PETSCII (p) to Screen Code (c)
First, normalize any duplicate ranges:
if ((p & 0x60) == 0x60) p = p & 0xBF
c = ((p & 0x80) >> 1) | (p & 0x3F)
Set bit 7 for control codes (bits 5 and 6 both clear, i.e., 0x00–0x1F and 0x80–0x9F):
if ((p & 0x60) == 0) c = c | 0x80
(Mind that this only applies to a limited class of printable control codes, which may appear in string context.)
Note: BASIC token 0xFF (π) transforms to PETSCII 0xDE (resulting in screen code 0x5E).
- Screen Code (c) to PETSCII (p)
p = ((c & 0x40) << 1) | ((~c & 0x20) << 1) | (c & 0x3F)
Mind how we can reconstruct bit 6 from the inverse of bit 5, as the encoding is redundant.
Clear bit 6 for control codes:
if (c & 0x80) p = p & 0xBF
(Mind that there is no way to determine other than from context, whether a screen code with bit 7 set to HI represents true reverse video or a control code.)
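For those who want to try the bitwise transformations, here is a sketch of both directions. A few things are assumptions of this example rather than part of the description: both control ranges (0x00–0x1F and 0x80–0x9F) are recognized by bits 5 and 6 being clear, in line with the table of control codes; the copy of π at 0xFF is substituted first; and a hypothetical control parameter supplies the context needed for the way back.

```python
def petscii_to_screen(p: int) -> int:
    """PETSCII code to screen code, following the bit operations above."""
    if p == 0xFF:                        # copy of pi becomes 0xDE
        p = 0xDE
    is_control = (p & 0x60) == 0         # 0x00-0x1F and 0x80-0x9F
    if (p & 0x60) == 0x60:               # mirrored ranges: clear bit 6
        p &= 0xBF
    c = ((p & 0x80) >> 1) | (p & 0x3F)   # drop bit 6, move bit 7 down
    if is_control:
        c |= 0x80                        # shown in reverse video
    return c

def screen_to_petscii(c: int, control: bool = False) -> int:
    """Screen code to PETSCII; 'control' must be supplied from context."""
    # Bit 6 is rebuilt as the inverse of bit 5.
    p = ((c & 0x40) << 1) | ((~c & 0x20) << 1) | (c & 0x3F)
    if control and (c & 0x80):
        p &= 0xBF                        # control codes have bit 6 clear
    return p

# Round trip over the unique printable ranges
for p in list(range(0x20, 0x60)) + list(range(0xA0, 0xE0)):
    assert screen_to_petscii(petscii_to_screen(p)) == p
```

Without the context flag, a screen code with bit 7 set is treated as plain reverse video, and the result is the PETSCII code of the underlying normal character.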
And here are some useful hex-to-decimal conversions, ready for the use in Commodore BASIC:
0x00 .... 0      0x1F .... 31
0x20 .... 32     0x3F .... 63
0x40 .... 64     0x5F .... 95
0x60 .... 96     0x7F .... 127
0x80 .... 128    0x9F .... 159
0xA0 .... 160    0xBF .... 191
0xC0 .... 192    0xDF .... 223
0xE0 .... 224    0xFF .... 255
PETSCII Tables
And here are the tables for PETSCII on the PET 2001 in both character sets, using the closest sensible Unicode representation available for the graphics characters. (Block-graphics are used to indicate the large, border-sided frame characters, some of the patterned block characters have no matching equivalent in Unicode and are substituted by a matching shape.)
The Upper-Case/Graphics Set (PET 2001)
00 | 01 | 02 | 03 | 04 | 05 | 06 | 07 | 08 | 09 | 0A | 0B | 0C | 0D | 0E | 0F | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 1A | 1B | 1C | 1D | 1E | 1F | |
0x00 | | | | | | | | | | | | | | CR | | | | DWN | RVS | HOM | DEL | | | | | | | | | RGT | | |
0x20 | ! | " | # | $ | % | & | ' | ( | ) | * | + | , | - | . | / | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | : | ; | < | = | > | ? | |
0x40 | @ | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z | [ | \ | ] | ↑ | ← |
0x60 | ! | " | # | $ | % | & | ' | ( | ) | * | + | , | - | . | / | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | : | ; | < | = | > | ? | |
0x80 | | | | | | | | | | | | | | SCR | | | | UP | ROF | CLR | INS | | | | | | | | | LFT | | |
0xA0 | ▌ | ▄ | ▔ | ▁ | ▏ | ▒ | ▕ | ▄ | ◤ | ▄ | ├ | ▗ | └ | ┐ | ▂ | ┌ | ┴ | ┬ | ┤ | ▎ | ▍ | ▐ | ▔ | ▀ | ▃ | ▟ | ▖ | ▝ | ┘ | ▘ | ▚ | |
0xC0 | ─ | ♠ | │ | ─ | ─ | ▔ | ─ | │ | │ | ╮ | ╰ | ╯ | ▙ | ╲ | ╱ | ▛ | ▜ | ● | ▁ | ♥ | ▏ | ╭ | ╳ | ○ | ♣ | ▕ | ♦ | ┼ | ▌ | │ | π | ◥ |
0xE0 | ▌ | ▄ | ▔ | ▁ | ▏ | ▒ | ▕ | ▄ | ◤ | ▄ | ├ | ▗ | └ | ┐ | ▂ | ┌ | ┴ | ┬ | ┤ | ▎ | ▍ | ▐ | ▔ | ▀ | ▃ | ▟ | ▖ | ▝ | ┘ | ▘ | π |
The Upper-Case/Lower-Case Set (PET 2001)
00 | 01 | 02 | 03 | 04 | 05 | 06 | 07 | 08 | 09 | 0A | 0B | 0C | 0D | 0E | 0F | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 1A | 1B | 1C | 1D | 1E | 1F | |
0x00 | | | | | | | | | | | | | | CR | | | | DWN | RVS | HOM | DEL | | | | | | | | | RGT | | |
0x20 | ! | " | # | $ | % | & | ' | ( | ) | * | + | , | - | . | / | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | : | ; | < | = | > | ? | |
0x40 | @ | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z | [ | \ | ] | ↑ | ← |
0x60 | ! | " | # | $ | % | & | ' | ( | ) | * | + | , | - | . | / | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | : | ; | < | = | > | ? | |
0x80 | | | | | | | | | | | | | | SCR | | | | UP | ROF | CLR | INS | | | | | | | | | LFT | | |
0xA0 | ▌ | ▄ | ▔ | ▁ | ▏ | ▒ | ▕ | ▄ | ▨ | ▄ | ├ | ▗ | └ | ┐ | ▂ | ┌ | ┴ | ┬ | ┤ | ▎ | ▍ | ▐ | ▔ | ▀ | ▃ | ✓ | ▖ | ▝ | ┘ | ▘ | ▚ | |
0xC0 | ─ | a | b | c | d | e | f | g | h | i | j | k | l | m | n | o | p | q | r | s | t | u | v | w | x | y | z | ┼ | ▌ | │ | ░ | ▧ |
0xE0 | ▌ | ▄ | ▔ | ▁ | ▏ | ▒ | ▕ | ▄ | ▨ | ▄ | ├ | ▗ | └ | ┐ | ▂ | ┌ | ┴ | ┬ | ┤ | ▎ | ▍ | ▐ | ▔ | ▀ | ▃ | ✓ | ▖ | ▝ | ┘ | ▘ | ░ |
(HOM … HOME, SCR … SHIFT + CR, ROF … RVS OFF, CLR … CLEAR.)
Beyond the PET 2001
Finally, since no write-up on PETSCII may be considered even mildly complete without it, here’s how those characters appear on the C64 in its characteristic, bold font:
Mind, how the lower-case group and the upper-case group were swapped in the “shifted” set by a simple modification of the character ROM, as compared to the original character set of the PET 2001.
And another chart, including the control characters and listing codes by decimal values:
Codes 192-223 as codes 96-127
Codes 224-254 as codes 160-190
Code 255 as code 126
— That’s all, folks! —
Addendum
On Feb. 2, 2021 Jason Scott shared an interview with Leonard Tramiel on the subject of the design process of the PETSCII character set. Here are the most significant parts:
- (Preamble)
LEONARD TRAMIEL is the designer of PETSCII, the variant dialect of ASCII that was shipped with Commodore PET computers in 1977 and which eventually became CBM ASCII, appearing in the Commodore C16, C64, C116, C128, CBM-II, Plus/4 and VIC-20. It was designed by Tramiel at the request of designer CHUCK PEDDLE, who asked for Card Suits to be included (for potential use in card games) and a set of graphical characters for text-based artwork. (…)
- Jason Scott:
(…) I think the only real gap in it, really, is understanding those days when you are putting it together under Mr. Peddle’s request. It’s just you and graph paper? Or did you work with anyone familiar with type or typography?
- Leonard Tramiel:
It was just me and graph paper. I don’t really see it as typography and certainly wouldn’t have thought of it that way at the time.
- Jason Scott:
(…) Do you recall having any sources you looked up? Any books or typefaces?
- Leonard Tramiel:
I had quite an interest in typefaces at the time. I had worked on my college yearbook and got really into how different typefaces gave a different feel. But that didn’t, and still doesn’t, feel relevant to the graphics character set. To answer your question directly, no. No books or other sources.
- Jason Scott:
You mentioned (…) that you partially based the designs on how effectively you could render a number of elements using the graphic characters. Any other goals (besides Mr. Peddle saying “do it”)
- Leonard Tramiel:
(…) The goal was to create a character set that would allow as wide a range of images as possible. I chose to concentrate on the kind of subjects I thought were most likely to come up. That’s why there are vertical and horizontal bars of every possible pixel width for bar charts. Then there are elements to make rectangular boxes and lastly all possible 4×4 pixel elements within a 8×8 square. The images I used as a test were the Starship Enterprise from Star Trek that was used in lots of early demos and the Lunar Lander that was used in the game by that name. I don’t remember how many iterations I went through but there were many.
- Jason Scott:
I think that fills in all the gaps I have; thank you so much for your time.
- Leonard Tramiel:
Sure thing.
And these are the sample images mentioned by Leonard Tramiel, the Lunar Lander and the Starship Enterprise. Both are to be found in animated form as part of the “PET DEMO” program on the “PET DEMO” disk (see here for the program running in online emulation).
Mind that there is also a much more sophisticated demo image of the Lunar Lander in PETSCII graphics, really showing the versatility of the character set:
Update (2023): For how characters are read from the keyboard, see “PET Keys — Series 2001 Edition”.
Norbert Landsteiner,
Vienna, 2020-03-12
————
Amended 2021-02-05