BASIC Variables & Strings — with Commodore
Investigations into the memory utilization of Commodore BASIC (PET 2001, VIC-20, C64)
In our last episode, in which we were investigating the storage format for BASIC text for the sake of renumbering, we stumbled over some surprising facts regarding variable management and memory allocation in Commodore BASIC, and the consequences thereof. Namely, we found that there’s a single pointer, ARYTAB, used both for pointing at the next available memory location for the allocation of simple (non-indexed) variables and for marking the start of the memory space used to store arrays. This means that every time the interpreter encounters a new simple variable, the entire block of previously defined arrays has to be moved by 7 bytes in order to provide the required space in memory.
This is certainly something we may want to optimize our programs for. (Define all simple variables first, then allocate array space by “DIM()”.) However, there may be more of note in that quarter. Reason enough to investigate variable management in Commodore BASIC, and especially strings.
BASIC Memory Partitions
Let’s recap how BASIC memory is partitioned and how we may find out at any time how it is currently configured. Here, we concentrate on the PET 2001 (ROM 1.0 and 2.0), the VIC-20 and the C64, which share the same flavor of BASIC. There are 6 pointers in the zeropage, which are used by the system for this, named TXTTAB, VARTAB, ARYTAB, STREND, FRETOP, and MEMSIZ. (In between, there is also a utility pointer FRESPC, which is used for operations involving scratch space and won’t be of further interest here.)
And this is where we find them on each of these systems:
• C64 and VIC-20

label   loc.hex    loc.dec  comment
TXTTAB  002B-002C  43-44    Pointer: Start of BASIC Text
VARTAB  002D-002E  45-46    Pointer: Start of BASIC Variables
ARYTAB  002F-0030  47-48    Pointer: Start of BASIC Arrays
STREND  0031-0032  49-50    Pointer: End of BASIC Arrays (+1)
FRETOP  0033-0034  51-52    Pointer: Bottom of String Storage
FRESPC  0035-0036  53-54    Utility String Pointer
MEMSIZ  0037-0038  55-56    Pointer: Highest Address Used by BASIC

• PET 2001 ROM 2.0 ("new ROM")

label   loc.hex    loc.dec  comment
TXTTAB  0028-0029  40-41    Pointer: Start of BASIC Text
VARTAB  002A-002B  42-43    Pointer: Start of BASIC Variables
ARYTAB  002C-002D  44-45    Pointer: Start of BASIC Arrays
STREND  002E-002F  46-47    Pointer: End of BASIC Arrays (+1)
FRETOP  0030-0031  48-49    Pointer: Bottom of String Storage
FRESPC  0032-0033  50-51    Utility String Pointer
MEMSIZ  0034-0035  52-53    Pointer: Highest Address Used by BASIC

• PET 2001 ROM 1.0 ("old ROM")

label   loc.hex    loc.dec  comment
TXTTAB  007A-007B  122-123  Pointer: Start of BASIC Text
VARTAB  007C-007D  124-125  Pointer: Start of BASIC Variables
ARYTAB  007E-007F  126-127  Pointer: Start of BASIC Arrays
STREND  0080-0081  128-129  Pointer: End of BASIC Arrays (+1)
FRETOP  0082-0083  130-131  Pointer: Bottom of String Storage
FRESPC  0084-0085  132-133  Utility String Pointer
MEMSIZ  0086-0087  134-135  Pointer: Highest Address Used by BASIC
These system pointers enjoy the following use and meaning:
- TXTTAB: points to the very start of the BASIC program text.
- VARTAB: is the start of the variable space. It follows immediately after the end of the program in memory (the final empty line-link). Simple variables are stored here. Each variable occupies 7 bytes of memory, regardless of the type. ARYTAB points to the next available space.
- ARYTAB: is (also) the beginning of the space allocated for array storage. It points to the location following immediately after the last byte allocated for simple variables.
- STREND: marks the end of the array storage (+1) and thus the lower end of the free memory. String literals are stored at the highest available address below FRETOP, growing down from the top. The space starting at STREND and up to FRETOP is the free, available memory.
- FRETOP: marks the top end of the unallocated memory, just below the last allocated string. New string literals will be stored immediately below this address.
- FRESPC: is a utility pointer used internally by BASIC. It is not directly involved in variable management, but is used for string handling.
- MEMSIZ: is the top address of accessible memory. Strings start growing down from here.
In the case of a PET 2001 with ROM 2.0 (“new ROM”) and 8K of RAM (we’ll use this for our demonstrations, since it is readily at hand as the default configuration of our online emulation), this is what these pointers look like just after a fresh start or reset:
PET 2001, 8K, ROM 2.0, freshly initialized

TXTTAB  0028-0029: 01 04  → $0401  (BASIC Text: 00 00)
VARTAB  002A-002B: 03 04  → $0403
ARYTAB  002C-002D: 03 04  → $0403
STREND  002E-002F: 03 04  → $0403
FRETOP  0030-0031: 00 20  → $2000
FRESPC  0032-0033: 44 44  → $4444
MEMSIZ  0034-0035: 00 20  → $2000  (Top of 8K RAM + 1)
So, on a PET 2001, BASIC starts at $0401 (on a C64, it’s $0801, and on a VIC-20 it starts at $1001, or, if there’s a memory extension of at least 8K, at $1201). The minimal BASIC text consists of two zero bytes, representing an empty line-link and thus the end of the program (compare last episode). The next free, available space for a variable to go (which might be used in direct mode) starts immediately after this, at $0403 (VARTAB). Since we have no variables defined yet, this block is of zero length and $0403 is also the beginning of the array space (ARYTAB). As we haven’t defined any arrays yet, this is also the end of it and the beginning of free, unused memory (STREND). Finally, $2000 is hex for 8K (MEMSIZ), and, since we haven’t used any strings either, this is also where any stored string literals would start to grow down from (FRETOP).
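As a cross-check of the reading above, here is a minimal sketch in Python (not BASIC, simply because it is easy to run against the dumps) that decodes the seven zero-page pointers of the freshly initialized PET from their LO-byte, HI-byte pairs. The names `ZP_BYTES`, `read_pointers` and `free_bytes` are made up for this illustration, and taking the gap between STREND and FRETOP as "free memory" follows the description above rather than any ROM routine.

```python
# Zero-page bytes $0028-$0035 of the freshly initialized PET 2001 (8K, ROM 2.0),
# copied from the table above.
ZP_BYTES = bytes.fromhex("01 04 03 04 03 04 03 04 00 20 44 44 00 20")

POINTER_NAMES = ["TXTTAB", "VARTAB", "ARYTAB", "STREND", "FRETOP", "FRESPC", "MEMSIZ"]

def read_pointers(zp):
    """Decode consecutive 2-byte pointers stored in LO-byte, HI-byte order."""
    return {name: zp[2 * i] | (zp[2 * i + 1] << 8)
            for i, name in enumerate(POINTER_NAMES)}

def free_bytes(ptrs):
    """Gap between the end of the arrays and the bottom of string storage."""
    return ptrs["FRETOP"] - ptrs["STREND"]

pointers = read_pointers(ZP_BYTES)
for name, addr in pointers.items():
    print(f"{name} -> ${addr:04X}")
print("free:", free_bytes(pointers), "bytes")
```

The same two helpers work for the C64 and VIC-20 if the bytes are taken from $002B onwards instead.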
Now, let’s load a simple program and RUN it:
10 DIM A(10)
20 FOR I=0 TO 10:A(I)=I:NEXT
30 B$=CHR$(66)
A dump of the memory provides the following image:
We may easily discern some structure in this:
- The BASIC text is stored in a segment from $0401 to $0433, where three zero-bytes mark the end of the program: one zero-byte for the end of the last line, followed by the two zero-bytes of the last, empty link address, which marks the end of the linked list of BASIC lines.
- At $0434, we find the simple variables. The ASCII representation on the right helpfully hints at the characters “I” and “B”, which are also the variable names used by our simple program. Each of the memory representations of these variables starts with the name and occupies 7 bytes, regardless of the type.
- This is followed by the array storage at $0442. Again, we may recognize the variable name (“A”) in the right-hand ASCII representation, which is followed by some opaque bytes. However, we know that this is an array of a single dimension, comprising 11 indexed instances in total (0..10), thanks to our “DIM A(10)” command.
- Finally, at the very top, we find the string literal “B”, which was generated by our “CHR$(66)” command. A closer look at the bytes following the entry for B$ in the simple variable space also provides the address $1FFF (as usual, stored in LO-byte, HI-byte order), which is probably a link to this slice of memory.
- MEMSIZ remains unchanged, since our total RAM is still 8K.
Stored Variables
Let’s have some fun and explore this a bit further.
To begin with, let’s try some simple variables:
10 REM SIMPLE VARIABLES TEST
20 R=1: REM REAL
30 R1=2: REM REAL, DOUBLE CHARACTER NAME
40 I%=1: REM INT
50 I1%=2:REM INT, DOUBLE CHARACTER NAME
60 S$=CHR$(65): REM "A"
70 S1$=CHR$(66):REM "B"
which provides us with the following memory dump:
On the PET, we see $AA for any unused addresses, a result of the RAM test run by the start-up routine in order to determine the size of available memory.
VARTAB → 04B7

04B7:                      52         R
04B8: 00 81 00 00 00 00 52 31  ......R1
04C0: 82 00 00 00 00 C9 80 00  ........
04C8: 01 00 00 00 C9 B1 00 02  ........
04D0: 00 00 00 53 80 01 FF 1F  ...S....
04D8: 00 00 53 B1 01 FE 1F 00  ..S.....
04E0: 00 AA AA AA AA AA AA AA  ........
...
1FF8: AA AA AA AA AA AA 42 41  ......BA
Again, we may recognize a few ASCII characters, hinting at variable names. We already know that each variable occupies 7 bytes, regardless of its type, so we may reformat this as:
R:   52 00 81 00 00 00 00  R......
R1:  52 31 82 00 00 00 00  R1.....
I%:  C9 80 00 01 00 00 00  .......  $C9 = $80 + "I"
I1%: C9 B1 00 02 00 00 00  .......  $B1 = $80 + "1"
S$:  53 80 01 FF 1F 00 00  S......  → $1FFF = "A"
S1$: 53 B1 01 FE 1F 00 00  S......  → $1FFE = "B"
- Ordinary, real variables are easy. As we may observe in the first two bytes, this is just the ASCII representation of the variable name (2 characters max), which is followed by the floating point storage format, consisting of 5 bytes. For the single-character name, the second name byte is just zero.
- Integer variables (I% and I1%) are a bit trickier, since we fail to detect any traces of the ASCII characters corresponding to their names. However, $C9 is the sum of $80 and $49, which is the most significant bit set and the ASCII code for “I”. The second byte of variable I% is also $80, instead of $00, which we may have expected, indicating that for any integer variable the high-bit is set on both of the two name bytes. (And, indeed, having a closer look at the second integer variable, $B1 = $80 + $31, the latter representing the ASCII code for “1”.) This is followed by the 2-byte integer value in HI-LO order (which is often used in arithmetic contexts). The remaining 3 bytes are left unused and are set to zero.
- Strings have the sign-bit set only on the second name byte. This is followed here by a binary 1, which may well be indicating the length of the string (in our example a single character), and a pointer to the address where the corresponding string literal is stored ($1FFF and $1FFE, respectively). The two remaining, unused bytes are set to zero, again.
So normal variables (real) are identified by their name in ASCII (max two bytes), where the second byte is zero for single-character names. Integer variables have the high-bit set on both name bytes and string variables only on the second one.
Real variables use the remaining 5 bytes for the floating point representation of their value. Integers have the value in HI-LO order in their third and fourth byte, strings have their length (max 255) in the third byte, followed by a pointer to the storage location of the PETSCII sequence in the usual LO-HI format. Any remaining space to complete the 7 bytes is filled with binary zeros.
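To make these rules concrete, here is a small Python sketch that decodes the six 7-byte entries from the dump above. It is an illustration under stated assumptions, not a description of the ROM code: the helper names are invented, the floating point decoding assumes an exponent bias of 128 with the sign kept in the top bit of the first mantissa byte (which agrees with the values 1, 2 and 3 seen in the dumps), integers are assumed to be signed, and a name with the high bit set on the first byte only is simply reported as "not covered".

```python
# The 42 bytes from $04B7 to $04E0 of the dump above (six simple variables).
VARS = bytes.fromhex(
    "52 00 81 00 00 00 00  52 31 82 00 00 00 00 "
    "C9 80 00 01 00 00 00  C9 B1 00 02 00 00 00 "
    "53 80 01 FF 1F 00 00  53 B1 01 FE 1F 00 00"
)

def decode_float(b):
    """Decode a 5-byte float (assumed: bias 128, sign in bit 7 of byte 1)."""
    if b[0] == 0:
        return 0.0
    mantissa = ((b[1] | 0x80) << 24) | (b[2] << 16) | (b[3] << 8) | b[4]
    value = mantissa / 2**32 * 2.0 ** (b[0] - 128)
    return -value if b[1] & 0x80 else value

def decode_variable(entry):
    """Return (name, value) for one 7-byte simple-variable entry."""
    if len(entry) != 7:
        raise ValueError("a simple variable occupies exactly 7 bytes")
    n1, n2 = entry[0], entry[1]
    name = chr(n1 & 0x7F) + (chr(n2 & 0x7F) if n2 & 0x7F else "")
    hi1, hi2 = bool(n1 & 0x80), bool(n2 & 0x80)
    if hi1 and hi2:   # integer: value in HI-LO order
        return name + "%", int.from_bytes(entry[2:4], "big", signed=True)
    if hi2:           # string: length, then pointer in LO-HI order
        return name + "$", (entry[2], entry[3] | (entry[4] << 8))
    if not hi1:       # real: 5-byte floating point value
        return name, decode_float(entry[2:7])
    return name, None  # high bit on the first byte only: not covered here

decoded = [decode_variable(VARS[i:i + 7]) for i in range(0, len(VARS), 7)]
for name, value in decoded:
    print(name, value)
```

For string variables the value is reported as a (length, address) pair, since the characters themselves live elsewhere in memory.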
Arrays
What may be more fun than trying the same for arrays? ;-)
10 REM SIMPLE ARRAY TEST
20 I=0:DIM RA(2):DIM IA%(2):DIM SA$(2)
30 FOR I = 0 TO 2
40 RA(I)=1+I:IA%(I)=1+I:SA$(I)=CHR$(65+I)
50 NEXT
Which provides the following memory dump:
VARTAB → 047B
ARYTAB → 0482

047B:          49 00 82 40 00     I..@.
0480: 00 00 52 41 16 00 01 00  ..RA....
0488: 03 81 00 00 00 00 82 00  ........
0490: 00 00 00 82 40 00 00 00  ....@...
0498: C9 C1 0D 00 01 00 03 00  ........
04A0: 01 00 02 00 03 53 C1 10  .....S..
04A8: 00 01 00 03 01 FF 1F 01  ........
04B0: FE 1F 01 FD 1F AA AA AA  ........
...
1FF8: AA AA AA AA AA 43 42 41  .....CBA
Again, we may recognize some ASCII characters hinting at the start of a variable. And, indeed, the same naming conventions apply. (High-bit set on both name bytes for integers and solely on the second one for strings.) Let’s reformat this array segment:
0482: 52 41 16 00 01 00 03   RA(0..2)
      81 00 00 00 00         1.0
      82 00 00 00 00         2.0
      82 40 00 00 00         3.0
0498: C9 C1 0D 00 01 00 03   IA%(0..2)
      00 01                  1
      00 02                  2
      00 03                  3
04A5: 53 C1 10 00 01 00 03   SA$(0..2)
      01 FF 1F               length: 1 → $1FFF ("A")
      01 FE 1F               length: 1 → $1FFE ("B")
      01 FD 1F               length: 1 → $1FFD ("C")
At first glance, we can see that arrays use a much more compact storage format for values. No extra bytes are wasted for padding, since all array members are of the same type and length. However, there are some extra bytes in between the variable name and the list of values, which are, BTW, obviously stored in ascending index order.
- The first two bytes provide the name, which is followed by a two-byte value. This value is stored in the usual LO-HI order ($0016 = 22 for RA), and, with a bit of luck, we may even recognize what it is all about: it is the offset to the next array variable ($0482 + $16 = $0498, which is where IA% starts). Obviously, we wouldn’t mind having this follow the name as immediately as possible, so that we may scan through the list of array variables as quickly as possible when we want to access them.
- The third value (5th byte) is a single-byte value, holding the dimensionality of the array. Since these are all single-dimensional arrays, this is set to 1. (We’ll confirm in a further example below, using multidimensional arrays, that this byte provides the dimensionality of the array, indeed.)
- The 6th and 7th byte provide the number of members in a given dimension, here 3, stored in HI-LO order.
- This is followed by a compact representation of the individual array items in ascending index order.
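The header layout just described is enough to hop from one array to the next without looking at the elements. The following Python sketch does this for the 51 bytes of the array segment from the dump; `walk_arrays` and `ARRAYS` are names invented for this illustration, and the byte orders (offset in LO-HI, sizes in HI-LO) are read off the dump rather than taken from a ROM listing.

```python
ARYTAB = 0x0482
# Array segment $0482-$04B4 from the dump above.
ARRAYS = bytes.fromhex(
    "52 41 16 00 01 00 03 81 00 00 00 00 82 00 00 00 00 82 40 00 00 00 "
    "C9 C1 0D 00 01 00 03 00 01 00 02 00 03 "
    "53 C1 10 00 01 00 03 01 FF 1F 01 FE 1F 01 FD 1F"
)

def walk_arrays(mem, base):
    """Yield (address, name, offset, sizes) for each array header in mem."""
    pos = 0
    while pos < len(mem):
        n1, n2 = mem[pos], mem[pos + 1]
        name = chr(n1 & 0x7F) + (chr(n2 & 0x7F) if n2 & 0x7F else "")
        if n1 & 0x80 and n2 & 0x80:
            name += "%"          # integer array
        elif n2 & 0x80:
            name += "$"          # string array
        offset = mem[pos + 2] | (mem[pos + 3] << 8)        # LO-HI
        ndims = mem[pos + 4]
        sizes = [(mem[pos + 5 + 2 * i] << 8) | mem[pos + 6 + 2 * i]  # HI-LO
                 for i in range(ndims)]
        yield base + pos, name, offset, sizes
        if offset == 0:          # malformed data: avoid looping forever
            break
        pos += offset

headers = list(walk_arrays(ARRAYS, ARYTAB))
for addr, name, offset, sizes in headers:
    print(f"${addr:04X} {name:4} offset={offset} sizes={sizes}")
```

The sizes are reported in the order in which they are stored, which matters once there is more than one dimension.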
And What About Multidimensional Arrays?
10 REM MULTI DIM ARRAY TEST
20 A=0:B=0:DIM I%(2,2)
30 FOR A=0 TO 2
40 FOR B=0 TO 2
60 I%(A,B)=3*B+1*A
70 NEXT B
80 NEXT A
Which provides the following dump:
ARYTAB → 0482

0482:       C9 80 1B 00 02 00    ......
0488: 03 00 03 00 00 00 01 00  ........
0490: 02 00 03 00 04 00 05 00  ........
0498: 06 00 07 00 08 AA AA AA  ........
Reformatted:
0482: C9 80 1B 00 02 00 03 00 03
      00 00 00 01 00 02
      00 03 00 04 00 05
      00 06 00 07 00 08
Two things may be observed: The array receives another descriptor for the extent of the second dimension, added after the first descriptor, which we already know, and the left-most index iterates fastest. Moreover, we may confirm that the 5th byte indeed provides the dimensionality of the array, since it has changed to 2 here (for the two dimensions of I%(2,2)).
Update (Jan. 2023):
The above example is maybe not as conclusive as we would have wished. So, in order to get this right, let’s have a look at a 3-dimensional array:
10 REM MULTI DIM ARRAY TEST
20 A=0:B=0:DIM A$(1,2,3)
30 FOR A=0 TO 1
40 FOR B=0 TO 2
45 FOR C=0 TO 3
60 A$(A,B,C)=CHR$(48+A)+CHR$(48+B)+CHR$(48+C)
65 NEXT C
70 NEXT B
80 NEXT A
04B0: 41 80 53 00 03 00 04 00 03 00 02
Mind that the order of the individual lengths of the subarrays is reversed!
And this is the order of the individual elements in memory, the left-most index rotating fastest:
A$(0,0,0) = "000"   A$(1,0,0) = "100"
A$(0,1,0) = "010"   A$(1,1,0) = "110"
A$(0,2,0) = "020"   A$(1,2,0) = "120"
A$(0,0,1) = "001"   A$(1,0,1) = "101"
A$(0,1,1) = "011"   A$(1,1,1) = "111"
A$(0,2,1) = "021"   A$(1,2,1) = "121"
A$(0,0,2) = "002"   A$(1,0,2) = "102"
A$(0,1,2) = "012"   A$(1,1,2) = "112"
A$(0,2,2) = "022"   A$(1,2,2) = "122"
A$(0,0,3) = "003"   A$(1,0,3) = "103"
A$(0,1,3) = "013"   A$(1,1,3) = "113"
A$(0,2,3) = "023"   A$(1,2,3) = "123"
Or, looking just at the indices (order as in memory, left to right and from top to bottom sequentially):
DIM(A,B,C) :REM A=1,B=2,C=3

C ↓        A →
     B ↓   (0,0,0) (1,0,0)
           (0,1,0) (1,1,0)
           (0,2,0) (1,2,0)

           A →
     B ↓   (0,0,1) (1,0,1)
           (0,1,1) (1,1,1)
           (0,2,1) (1,2,1)

           A →
     B ↓   (0,0,2) (1,0,2)
           (0,1,2) (1,1,2)
           (0,2,2) (1,2,2)

           A →
     B ↓   (0,0,3) (1,0,3)
           (0,1,3) (1,1,3)
           (0,2,3) (1,2,3)
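The ordering shown above boils down to a simple stride calculation. The Python sketch below reproduces it for DIM A$(1,2,3); the function names are hypothetical, and the formulas are inferred from the header bytes and the element order observed above, not from the interpreter’s source.

```python
from itertools import product

def stored_sizes(dims):
    """Sizes as they appear in the header: bound + 1, last dimension first."""
    return [d + 1 for d in reversed(dims)]

def element_index(indices, dims):
    """Position of an element in memory, left-most index varying fastest."""
    pos, stride = 0, 1
    for i, d in zip(indices, dims):
        if not 0 <= i <= d:
            raise IndexError("index out of range")
        pos += i * stride
        stride *= d + 1
    return pos

def memory_order(dims):
    """All index tuples in the order in which their elements are stored."""
    ranges = [range(d + 1) for d in reversed(dims)]
    return [tuple(reversed(t)) for t in product(*ranges)]

def array_bytes(dims, element_size):
    """Total size: name (2), offset (2), dimension count (1), sizes, elements."""
    count = 1
    for d in dims:
        count *= d + 1
    return 5 + 2 * len(dims) + count * element_size

DIMS = (1, 2, 3)                         # DIM A$(1,2,3)
print(stored_sizes(DIMS))                # sizes in header order
print(memory_order(DIMS)[:6])            # first six elements in memory
print(hex(array_bytes(DIMS, 3)))         # 3 bytes per string element
```

With 3 bytes per string element, the computed total agrees with the offset `53 00` found in the header at $04B0.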
Strings
As we’ve already seen, strings are a bit different, insofar as they do not contain their value by themselves, but are rather pointers. Let’s have a closer look at this, since there may be more to it, hidden in the finer details of string processing and referencing…
May we suggest another, small test program?
10 A$="THE QUICK":B$=" BROWN FOX ":C$="JUMPS OVER THE LAZY DOG"
Which produces the following dump:
0401: 42 04 0A 00 41 24 B2     B...A$.
0408: 22 54 48 45 20 51 55 49  "THE QUI
0410: 43 4B 22 3A 42 24 B2 22  CK":B$."
0418: 20 42 52 4F 57 4E 20 46   BROWN F
0420: 4F 58 20 22 3A 43 24 B2  OX ":C$.
0428: 22 4A 55 4D 50 53 20 4F  "JUMPS O
0430: 56 45 52 20 54 48 45 20  VER THE 
0438: 4C 41 5A 59 20 44 4F 47  LAZY DOG
0440: 22 00 00 00 41 80 09 09  "...A...
0448: 04 00 00 42 80 0B 18 04  ...B....
0450: 00 00 43 80 17 29 04 00  ..C..)..
0458: 00 AA AA AA AA AA AA AA  ........

A$: 41 80 09 09 04 00 00   length:  9 → $0409
B$: 42 80 0B 18 04 00 00   length: 11 → $0418
C$: 43 80 17 29 04 00 00   length: 23 → $0429
As may be observed, there are no entries pointing to the string storage area. Rather, the string variables point to the string literals inside the BASIC text. This is actually a great idea, since, this way, quite an amount of memory and runtime for copying may be saved.
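This can be checked mechanically by following each descriptor’s pointer back into the dump. The Python sketch below does so for the three variables; `MEM`, `string_value` and `in_program_text` are names chosen for this illustration, and decoding the bytes as ASCII is an assumption that only holds because the test strings consist of uppercase letters and blanks.

```python
TXTTAB = 0x0401
VARTAB = 0x0444   # first byte after the three terminating zero bytes
# Memory $0401-$0458, copied from the dump above.
MEM = bytes.fromhex(
    "42 04 0A 00 41 24 B2 "
    "22 54 48 45 20 51 55 49  43 4B 22 3A 42 24 B2 22 "
    "20 42 52 4F 57 4E 20 46  4F 58 20 22 3A 43 24 B2 "
    "22 4A 55 4D 50 53 20 4F  56 45 52 20 54 48 45 20 "
    "4C 41 5A 59 20 44 4F 47  22 00 00 00 41 80 09 09 "
    "04 00 00 42 80 0B 18 04  00 00 43 80 17 29 04 00 "
    "00"
)

def string_value(mem, base, var_addr):
    """Follow a string variable's descriptor; return (pointer, text)."""
    i = var_addr - base
    length = mem[i + 2]
    pointer = mem[i + 3] | (mem[i + 4] << 8)   # LO-HI order
    start = pointer - base
    return pointer, mem[start:start + length].decode("ascii")

def in_program_text(pointer):
    """True if the string data lives inside the BASIC text."""
    return TXTTAB <= pointer < VARTAB

for addr in (VARTAB, VARTAB + 7, VARTAB + 14):
    pointer, text = string_value(MEM, TXTTAB, addr)
    print(f"${pointer:04X} in text: {in_program_text(pointer)} {text!r}")
```

All three pointers land between TXTTAB and VARTAB, which is just another way of saying that nothing has been copied to the top of memory.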
However, mind the following program:
10 A$="THE QUICK":B$=" BROWN FOX ":C$="JUMPS OVER THE LAZY DOG"
20 D$=A$+B$
30 E$=D$+C$
Now, this results in quite an amount of string allocation, where we find the stored literals, as composed by the assignments, at the very top of the memory:
1FC0: AA 54 48 45 20 51 55 49  .THE QUI
1FC8: 43 4B 20 42 52 4F 57 4E  CK BROWN
1FD0: 20 46 4F 58 20 4A 55 4D   FOX JUM
1FD8: 50 53 20 4F 56 45 52 20  PS OVER 
1FE0: 54 48 45 20 4C 41 5A 59  THE LAZY
1FE8: 20 44 4F 47 54 48 45 20   DOGTHE 
1FF0: 51 55 49 43 4B 20 42 52  QUICK BR
1FF8: 4F 57 4E 20 46 4F 58 20  OWN FOX 
Notably, we find the sequence “THE QUICK BROWN FOX ” (A$+B$) twice, since string variables can only reference consecutive sequences of memory and there is no way to reuse this. Over time, string literals will pile up in memory as we proceed to compose strings in our program.
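The pile-up can be mimicked with a toy model in Python: every composed string is copied to a fresh slot just below the previous one, and nothing is ever shared. `StringHeap` is a made-up class for this illustration. It only models allocation from the top of an 8K machine downwards and knows nothing about temporary descriptors or garbage collection.

```python
class StringHeap:
    """Toy model: strings are allocated downwards from MEMSIZ, never reused."""

    def __init__(self, memsiz=0x2000):
        self.memsiz = memsiz
        self.fretop = memsiz
        self.data = bytearray()      # contents of fretop .. memsiz - 1

    def alloc(self, text):
        """Store a copy of text below FRETOP; return (length, address)."""
        raw = text.encode("ascii")
        self.fretop -= len(raw)
        self.data[0:0] = raw         # new strings go in front of older ones
        return len(raw), self.fretop

    def used(self):
        return self.memsiz - self.fretop

heap = StringHeap()
# Literals are referenced inside the BASIC text and take no heap space.
a, b, c = "THE QUICK", " BROWN FOX ", "JUMPS OVER THE LAZY DOG"
d = heap.alloc(a + b)                # 20 D$=A$+B$
e = heap.alloc(a + b + c)            # 30 E$=D$+C$
print(f"D$ -> ${d[1]:04X}, E$ -> ${e[1]:04X}, {heap.used()} bytes used")
```

The two addresses produced by the model are where the dump above shows the two copies starting, and the modeled heap contents read the same as the dump from $1FC1 upwards.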
String Functions
So, what about string operations using BASIC’s built-in functions, which access partial strings, like “LEFT$()”, “RIGHT$()”, or “MID$()”? Certainly, these can make use of the very properties of the string variables, by just modifying the pointer address and/or the length?
10 A$="TEST":B$=LEFT$(A$,2)

0401: 1A 04 0A 00 41 24 B2     ....A$.
0408: 22 54 45 53 54 22 3A 42  "TEST":B
0410: 24 B2 C8 28 41 24 2C 32  $..(A$,2
0418: 29 00 00 00 41 80 04 09  )...A...
0420: 04 00 00 42 80 02 FE 1F  ...B....
0428: 00 00 AA AA AA AA AA AA  ........
...
1FF8: AA AA AA AA AA AA 54 45  ......TE

B$: 42 80 02 FE 1F 00 00
Oh no! Rather counter-intuitively, this has generated its own entry in the string storage! Surely, this is so that this can operate on complex expressions, which may be provided as an argument in some kind of buffer or scratch area, without piling up garbage in the string storage?
10 A$=LEFT$("TE"+"ST",3)

0401: 17 04 0A 00 41 24 B2     ....A$.
0408: C8 28 22 54 45 22 AA 22  .("TE"."
0410: 53 54 22 2C 33 29 00 00  ST",3)..
0418: 00 41 80 03 F9 1F 00 00  .A......
...
1FF8: AA 54 45 53 54 45 53 54  .TESTEST
Oh double-no!
There’s both the composed, temporary string and the sequence resulting from the “LEFT$()” operation! Now, this isn’t optimized in any way.
Printing Strings
So, we may ask, are there any consequences for the PRINT statement? Obviously, it’s not a good idea to compose strings just for the sake of printing, as in “C$=A$+" "+B$:PRINT C$”.
But does this also apply to PRINT expressions? Is there any difference between the following two statements?
10 PRINT "THE LAZY "+"DOG"
10 PRINT "THE LAZY ";"DOG"
Let’s give it a try…
10 PRINT "THE LAZY ";"DOG"
RUN
THE LAZY DOG
READY.
█
1FF8: AA AA AA AA AA AA AA AA ........
As expected, not much to be observed here: the string storage area is still empty. Now for the more thrilling test: will the string concatenation by “+” generate a new entry, as we previously observed in assignments?
10 PRINT "THE LAZY "+"DOG"
RUN
THE LAZY DOG
READY.
█
1FF0: AA AA AA AA 54 48 45 20 ....THE
1FF8: 4C 41 5A 59 20 44 4F 47 LAZY DOG
Oops, there it is! Make sure to join your strings by semicolons (“;”) in your PRINT statements!
Other Operations Involving Strings
Just in case you supposed you had seen it all by now, including all possible pitfalls, consider the case of loading a program:
### COMMODORE BASIC ###
7167 BYTES FREE
READY.
LOAD "TESTPROGRAM",8
SEARCHING FOR TESTPROGRAM
LOADING
→
1FF0: AA AA AA AA AA 54 45 53 .....TES
1FF8: 54 50 52 4F 47 52 41 4D TPROGRAM
Yes, there it is. Any string operation like this will generate its own entry in the string storage area. Obviously, direct mode doesn’t use any references to the BASIC input buffer.
But, how about this?
### COMMODORE BASIC ###
7167 BYTES FREE
READY.
10 LOAD "TEST-TAPE"
RUN
PRESS PLAY ON TAPE #1
→
0401: 13 04 0A 00 93 20 22     ..... "
0408: 54 45 53 54 2D 54 41 50  TEST-TAP
0410: 45 22 00 00 00 AA AA AA  E"......
...
1FF8: AA AA AA AA AA AA AA AA  ........
Indeed, when executing a LOAD statement from inside a program, our string storage is still empty.
Lessons Learned
Here, we may finish our explorations. But there are a few things worth keeping in mind when operating with strings. Here they are, in no particular order:
- Garbage collection and memory are not an issue, as long as you reuse strings, which have been already defined, be it as a composition or as a literal in the BASIC text.
- However, string literals in the BASIC text may cause severe problems, if the BASIC text has been altered in any way in the meantime, e.g., by loading a (temporary) code overlay. (In this case, you may want to rather compose your strings from parts, in order to have them allocated separately.)
- There may be performance issues when using “+” for string concatenation in PRINT statements as opposed to using semicolons (“;”). Also, portable listings using “CHR$()” may be less efficient than using PETSCII screen characters embedded as string literals in the BASIC text. In particular, statements like “10 PRINT "A"+" TEST."” will generate a new entry in the string storage!
- Composing strings just for the sake of printing (as opposed to joining the output by semicolons) should be avoided.
- Executing commands involving strings in direct mode will generate a new entry in the string storage area.
Exploiting String Variables
Now, could we use this for something productive, like a nifty exploit? Certainly, we could modify a string on the fly, say, to print a line of a video game screen for a BASIC 10-liner contest or the like. If we put a string definition right at the very beginning of the program, we may easily work out where the actual string sequence starts in the BASIC text. Moreover, since this is also the very first variable defined, it will be easy to work out its location in memory, simply by PEEKing the contents of pointer “VARTAB”.
Let’s have a simple definition like:
10 A$="123456789"
Since this is the very first line, it will start at “TXTTAB”, the start of the BASIC text in memory. On a PET (with ROM 2.0), this is at location $0401 (dec. 1025) and on a C64 at $0801 (dec. 2049). Add 2 locations to this for the link to the next line and another 2 for the line number (stored in binary). Hence, the text of our line starts at TXTTAB + 4 and, adding another 4 for “A$="”, we work out that our string literal starts at TXTTAB + 8, on a PET at $0409, on a C64 at $0809. (This agrees with the pointer $0409 stored for A$ in the dump of the three-literal test program in the section on strings.) This is where we’ll find the character “1”, being the first character in the string literal.
In much the same way, we may work out where the properties of the string variable A$ are stored: Since it’s the very first variable, the length will be at VARTAB + 2, and the reference to the start location of the literal at VARTAB + 3 and VARTAB + 4.
• PET 2001 (ROM 2.0)
  VARTAB: 002A-002B (dec. 42,43)
  VT = PEEK(43)*256+PEEK(42)

• C64, VIC-20
  VARTAB: 002D-002E (dec. 45,46)
  VT = PEEK(46)*256+PEEK(45)
Now consider something like a canyon game, where we print a varying, winding shaft, filling the screen, which is scrolled up procedurally by adding lines to the bottom with a print statement. We may define a string, wider than a screen line, containing 30 fill characters to the left, a passage of 9 spaces in the middle, and another 30 fill characters to the right. Then, we could set the effective length of the string variable to 39 (3rd byte), and adjust any padding to the left (and by this the position of the “shaft” in the middle) by modifying the pointer in the string variable (4th and 5th byte). This way, we may adjust a window for printing inside our larger string literal, a technique which may also be used for horizontal scrolling and the like.
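Before turning to BASIC, the windowing idea can be tried out in a rough Python model. Everything here is a stand-in: `#` replaces the PETSCII fill character, the two link bytes of the program line are left as zero placeholders, and the descriptor is a plain three-element list (length, pointer LO, pointer HI) instead of bytes 3 to 5 of a real variable entry. The start of the literal is located by searching for the opening quote rather than by a fixed offset.

```python
FILL = "#"                                   # stand-in for the fill character
LITERAL = FILL * 30 + " " * 9 + FILL * 30
BASE = 0x0401                                # TXTTAB on a PET 2001

# Program line 10 A$="...": link (placeholder), line number, A $ = " literal " 0
LINE = (bytes([0x00, 0x00, 0x0A, 0x00]) + b"A$" + bytes([0xB2]) +
        b'"' + LITERAL.encode("ascii") + b'"' + bytes([0x00]))

literal_addr = BASE + LINE.index(b'"') + 1   # first character of the literal

def set_window(desc, offset, width=39):
    """Point the descriptor at a width-character window into the literal."""
    if not 0 <= offset <= len(LITERAL) - width:
        raise ValueError("window would leave the literal")
    p = literal_addr + offset
    desc[0], desc[1], desc[2] = width, p & 255, p >> 8

def printed(desc):
    """What PRINT A$ would show for the given descriptor."""
    start = (desc[1] | (desc[2] << 8)) - BASE
    return LINE[start:start + desc[0]].decode("ascii")

descriptor = [len(LITERAL), literal_addr & 255, literal_addr >> 8]
for offset in (15, 14, 13, 14, 15, 16):      # a short winding stretch
    set_window(descriptor, offset)
    print(printed(descriptor))
```

A smaller offset moves the window to the left inside the literal, so more of the right-hand fill shows up and the shaft appears to drift to the right on screen.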
And here is this idea as a BASIC listing in text representation (grey dots in the listing represent blanks):
LIST
10 A$="▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒··
·······▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒"
20 VT=PEEK(43)*256+PEEK(42):A=1025+8
30 POKE VT+2,39:REM LENGTH A$=39
40 P=A+15:REM 15 INTO THE STRING
50 POKE VT+3,P AND 255
60 POKE VT+4,INT(P/256)
70 PRINT A$
READY.
RUN
▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒         ▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒
READY.
█
Here, our tiny proof of concept is drawing a randomly generated canyon by repeatedly printing a sliding window into the string literal defined for variable A$:
— That’s all, folks! —
Update: For yet another variable type, see “The Case of the Missing 4th Commodore BASIC Variable (and the 5th Byte)” (2023).
Norbert Landsteiner,
Vienna, 2020-03-01