UK Vintage Radio Repair and Restoration Discussion Forum

Fun with 6502 Assembler (https://www.vintage-radio.net/forum/showthread.php?t=156829)

Slothie 22nd Jun 2019 4:50 pm

Re: Fun with 6502 Assembler
 
The BASIC on the Commodore PET (and probably most other machines with Microsoft's 6502 BASIC) has a small routine copied into zero page RAM that gets the next byte of the program code:
Code:

.C:00c2  E6 C9       INC $C9        ; bump low byte of the LDA operand below
.C:00c4  D0 02       BNE $00C8
.C:00c6  E6 CA       INC $CA        ; carry into the high byte
.C:00c8  AD 00 04    LDA $0400      ; operand at $C9/$CA is the text pointer
.C:00cb  C9 3A       CMP #$3A       ; ':' or above?
.C:00cd  B0 0A       BCS $00D9      ; yes: return with carry set
.C:00cf  C9 20       CMP #$20
.C:00d1  F0 EF       BEQ $00C2      ; skip spaces
.C:00d3  38          SEC
.C:00d4  E9 30       SBC #$30
.C:00d6  38          SEC
.C:00d7  E9 D0       SBC #$D0       ; carry ends up clear only for '0'..'9'
.C:00d9  60          RTS

It's more complicated than you would expect because it sets various flags depending on the type of byte retrieved: the two subtractions add up to $100, so they leave the byte itself unchanged in A and only clear the carry for numeric digits. It's unclear why Microsoft chose to do it this way rather than just have the code in ROM and a pointer in ZP. It's entirely possible that they just transliterated the code directly from another processor that didn't do indirection [well] and couldn't be bothered to "optimise" it, but it does give the crafty user a way to inject extra code into this critical routine to, for instance, add extra keywords to the command interpreter.
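A rough Python model of the routine above makes the flag behaviour easier to check without a PET to hand. The function name `chrget`, the `memory` argument and the returned tuple are assumptions made for illustration; only the comparisons and subtractions are taken from the listing.

```python
def chrget(memory, ptr):
    """Model of the zero-page routine: returns (ptr, a, carry, zero)."""
    while True:
        ptr = (ptr + 1) & 0xFFFF      # INC $C9 / BNE / INC $CA
        a = memory[ptr]               # LDA through the self-modified operand
        if a >= 0x3A:                 # CMP #$3A / BCS: ':' or above
            return ptr, a, True, a == 0x3A
        if a == 0x20:                 # CMP #$20 / BEQ: skip spaces
            continue
        t = (a - 0x30) & 0xFF         # SEC / SBC #$30
        carry = t >= 0xD0             # SEC / SBC #$D0 borrows only for digits
        a = (t - 0xD0) & 0xFF         # net subtraction of $100: A is unchanged
        return ptr, a, carry, a == 0


# Hypothetical "program text": a dummy byte, a space, a digit, ':', 'A', end marker
program = b"X 7:A\x00"
print(chrget(program, 0))
```

In this model, carry clear means the byte is a digit, and zero set means either ':' or a zero byte.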

julie_m 9th Jul 2019 8:03 am

Re: Fun with 6502 Assembler
 
Spot the schoolgirl error:
Code:

.dotest
LDY offsets,X
LDA (fxb),Y
INX
LDY offsets,X
CMP (mvb),Y
PHP \ save C,N,Z flags
INX \ will trash comparison result
PLP \ restore real result
RTS

In case you need context, it's called from a routine doing a bunch of comparisons on arrays of bytes starting at the addresses in fxb, fxb+1 and mvb, mvb+1; and it's doing them out-of-order. Hence the table of offsets indexed by X.

dominicbeesley 9th Jul 2019 9:30 am

Re: Fun with 6502 Assembler
 
Move the INX before the CMP?

Other comments:
Could use offsets+1 in the second LDY.
Could split the offsets table into a low byte and a high byte table; then only one INX is needed.

I've been doing a lot of 6502 assembler recently, mainly to allow me to load operating system and BASIC ROM images into my Frankenstein BBC Micro with an optional 6809, Z80 or 68008 main processor. I usually prefer 6809 coding, but the limitations of the 6502 often force the use of cunning to get the job done. I had a fun few hours yesterday squeezing the flash ROM programming routines down to a single page.

julie_m 9th Jul 2019 3:10 pm

Re: Fun with 6502 Assembler
 
Ding, ding, we have a winner! This is 2 bytes shorter and 7 cycles faster:
Code:

.dotest
LDY offsets,X
INX
LDA (fxb),Y
LDY offsets,X
INX
CMP (mvb),Y
RTS

I'd still need to INX again sometime if I used offsets+1 in the second LDY -- and bear in mind there must be a conditional branch after the JSR that called this subroutine. The offsets are only 8 bits (actually, they're only 4 bits) but it's not worth unpacking 8 bytes into 16 for this. (The unpacking code would need 4 bytes for the four LSR A instructions if using the high nybble, two for the conditional branch around them and two more for the AND #&F if using the low nybble -- and that's the saving wiped out before we've even done an LDA or a TAX!) The rest of the project makes very heavy use of bit-packed co-ordinates (12-bit signed X and Y values in 3 bytes); and in some places, uses robbed bits to specify a plot mode -- move, draw, triangle or close. But one byte saved per point plotted (and points saved through the use of "close") is worth it.

Splitting the table of offsets would be a slap in the face for anybody trying to reverse-engineer the code in future. (I do plan to publish the Source Code, but you never know .....) So I'm holding off that technique unless and until the code ever gets one byte the wrong side of running on a model B!
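The reason the original .dotest needed PHP/PLP, and the reordered one does not, can be sketched with a small Python model of the flags. The helper names `cmp_flags` and `inx_flags` are made up for this illustration; the behaviour modelled is the standard 6502 rule that CMP sets N, Z and C while INX rewrites only N and Z.

```python
def cmp_flags(a, m):
    """Flags after CMP, as a tuple (N, Z, C)."""
    r = (a - m) & 0xFF
    return bool(r & 0x80), r == 0, a >= m


def inx_flags(x, flags):
    """INX sets N and Z from the new X and leaves C alone."""
    x = (x + 1) & 0xFF
    _, _, c = flags
    return x, (bool(x & 0x80), x == 0, c)


# Buggy order with the PHP/PLP pair left out: CMP first, then INX
flags = cmp_flags(5, 5)                  # equal bytes, so Z is set
x, flags_after = inx_flags(1, flags)     # Z now describes X, not the comparison
print(flags, flags_after)
```

With both INX instructions placed ahead of the CMP, the last flag-setting instruction before the RTS is the comparison itself, so the caller's conditional branch sees the real result.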

julie_m 27th Jul 2019 3:54 pm

Re: Fun with 6502 Assembler
 
If you have a number of pointers stored in Zero Page, each representing the base address of a fixed-size record up to 256 bytes, you probably use something like
Code:

LDA (pcb),Y

to read out the bytes of an individual record, as Y varies from 0 to the length of the record - 1, and then update the pointer to read the next record. Here pcb is a location in ZP, short for "packed co-ordinate base"; pcb and pcb+1 are the pointer to a series of packed co-ordinate pairs in memory. After unpacking the co-ordinates (12 + 12 = 24 bits = 3 bytes), we have four bytes to store at a location pointed to by the value in acb and acb+1 (short for "absolute co-ordinate base").
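The post does not say how the two 12-bit values are arranged within the three bytes, so the following Python sketch assumes a layout purely for illustration: byte 0 holds the low 8 bits of X, byte 1 the low 8 bits of Y, and byte 2 carries X bits 8-11 in its high nybble and Y bits 8-11 in its low nybble. The function names are likewise hypothetical.

```python
def unpack_vertex(b0, b1, b2):
    """Unpack 3 bytes into signed 12-bit (x, y), under the ASSUMED layout."""
    x = b0 | ((b2 >> 4) << 8)       # high nybble of byte 2 -> X bits 8-11
    y = b1 | ((b2 & 0x0F) << 8)     # low nybble of byte 2  -> Y bits 8-11
    if x & 0x800:                   # sign-extend from bit 11
        x -= 0x1000
    if y & 0x800:
        y -= 0x1000
    return x, y


def to_four_bytes(x, y):
    """Store x and y as two 16-bit little-endian two's-complement values."""
    return bytes([x & 0xFF, (x >> 8) & 0xFF, y & 0xFF, (y >> 8) & 0xFF])


x, y = unpack_vertex(0x34, 0xFF, 0x1F)
print(x, y, to_four_bytes(x, y).hex())
```

Whatever the real layout is, the sizes match the text: three packed bytes in, four unpacked bytes out.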

Now to move on to the next vertex, we need to add 3 to the value stored at pcb and pcb+1; and we need to add 4 to the value stored at acb and acb+1. Naïvely, we might write
Code:

.next_vertex
CLC
LDA pcb
ADC #3
STA pcb
LDA pcb+1
ADC #0
STA pcb+1
CLC
LDA acb
ADC #4
STA acb
LDA acb+1
ADC #0
STA acb+1
RTS

That's 27 bytes. We probably can omit the second CLC, because we are not expecting the address to overflow the bounds of memory. But we can save even more by making use of the X register, and fall-through:
Code:

.next_vertex
LDX #pcb
CLC
JSR add_three
LDX #acb
.add_four
SEC \ this will add an extra one
.add_three
LDA #3
.add_A_X
ADC 0,X
STA 0,X
LDA #0
ADC 1,X
STA 1,X
RTS

This takes five bytes less space and also exposes two more general-purpose subroutines: one which can be used to advance any similar Zero Page pointer given in X by 4 bytes, and another which advances the pointer by an amount given in A, plus an extra 1 if C is set. (Not doing the CLC in the subroutine allows us to take advantage of 3 and 4 being exactly 1 apart.)

Note that we don't need to increase X to deal with the high byte; we just offset from 1 instead.
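The fall-through arrangement can be checked with a short Python model in which zero page is a `bytearray` and the carry is passed around explicitly. The zero-page addresses chosen for `PCB` and `ACB` are hypothetical, and the model ignores zero-page wrap-around of the 0,X and 1,X addressing.

```python
def add_a_x(zp, x, a, carry):
    """ADC 0,X / STA 0,X / LDA #0 / ADC 1,X / STA 1,X; returns the final carry."""
    total = zp[x] + a + carry
    zp[x] = total & 0xFF
    total = zp[x + 1] + (total >> 8)
    zp[x + 1] = total & 0xFF
    return total >> 8


def add_three(zp, x, carry):
    return add_a_x(zp, x, 3, carry)     # LDA #3, then fall through


def add_four(zp, x):
    return add_three(zp, x, 1)          # SEC, then fall through: 3 + carry = 4


def next_vertex(zp, pcb, acb):
    add_three(zp, pcb, 0)               # LDX #pcb / CLC / JSR add_three
    add_four(zp, acb)                   # LDX #acb, then fall into add_four


PCB, ACB = 0x70, 0x72                   # hypothetical zero-page locations
zp = bytearray(256)
zp[PCB:PCB + 2] = (0x12FE).to_bytes(2, "little")
zp[ACB:ACB + 2] = (0x30FC).to_bytes(2, "little")
next_vertex(zp, PCB, ACB)
print(hex(int.from_bytes(zp[PCB:PCB + 2], "little")),
      hex(int.from_bytes(zp[ACB:ACB + 2], "little")))
```

Both starting values were picked so that the low byte overflows, which exercises the carry into the high byte.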

Other pointers at different addresses in ZP probably will have different record sizes; so we just need a construct like
Code:

.next_shape
LDX #shp \ shape pointer
LDA #23 \ each "shape" record is 23 bytes long
CLC
BCC add_A_X \ saves over JSR followed by RTS

for each of them. Here, the base address of the list of packed vertex co-ordinates (which will go into pcb and pcb+1) is obtained from a "shape" record which is actually 23 bytes long. After we have stepped through all the vertexes in a shape, unpacking their co-ordinates into a buffer, we can move on to the next shape.

dominicbeesley 29th Jul 2019 11:04 am

Re: Fun with 6502 Assembler
 
Shave another couple of bytes and speed up with an early exit:
Code:

        ADC #xx
        STA 0,X
        BCC ex
        INC 1,X
ex:     RTS
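A quick Python check, using made-up function names, suggests that the early-exit form leaves the same 16-bit pointer as the LDA #0 / ADC 1,X / STA 1,X form for any low byte, addend and incoming carry.

```python
def high_via_adc(lo, hi, a, carry):
    """Add the carry-out to the high byte with ADC #0 semantics."""
    total = lo + a + carry
    return total & 0xFF, (hi + (total >> 8)) & 0xFF


def high_via_inc(lo, hi, a, carry):
    """Early exit: touch the high byte only when the low byte overflowed."""
    total = lo + a + carry
    if total > 0xFF:                 # BCC ex not taken
        hi = (hi + 1) & 0xFF         # INC 1,X
    return total & 0xFF, hi


same = all(
    high_via_adc(lo, hi, a, c) == high_via_inc(lo, hi, a, c)
    for lo in range(256)
    for hi in (0x00, 0x7F, 0xFF)
    for a in range(256)
    for c in (0, 1)
)
print(same)
```

One difference the model does not show: INC leaves the carry flag untouched, so after a page crossing this version returns with carry set. That is harmless for next_vertex, which executes SEC before the second addition anyway, but a caller relying on carry being clear on return would need its own CLC.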


