UK Vintage Radio Repair and Restoration

25th Mar 2023, 3:07 pm   #1
Michael Haardt
Stack and subroutine calls

It is frequently said that the SC/MP does not have a stack, and that Elbug has a software stack with a very high overhead for that reason. But looking at the architecture, the SC/MP and its XPPC instruction always reminded me of the branch and link (BAL) instruction of the IBM/360. Digging through the SC/MP programming and assembler manual, section 6.2 "Stack programming" confirms that: it gives an example of how to call subroutines with statically allocated call frames. Their advantage is that there is no stack that could overflow, but they consume more RAM than stack-allocated frames, and you need to save and restore the frame pointer.

If you do not extend call frames dynamically (and the SC/MP only allows signed byte indexed addressing anyway), there should be an easier way, using call frames on the stack:

Subroutine entry

p3 contains the return address
p2 points to frame:
subroutine arguments addressed with positive offsets
if this subroutine contains calls, save p3 to 0/-1(p2)
locals are stored further down
caller arguments are stored below locals

Returning from subroutine

ld -1(p2) ; Restore p3 if this subroutine made calls
xpah p3
ld 0(p2)
xpal p3

xppc p3 ; Return

Call a subroutine

store callee arguments below locals
ld @-argsoffset(p2) ; decrease p2 to point below arguments
ldi l(subroutine)
xpal p3
ldi h(subroutine)
xpah p3
xppc p3
ld @argsoffset(p2) ; restore caller p2

I did not try this, but it looks like it should work. As long as the call frame stack does not exceed a page, and a single call frame has no more than 127 bytes of passed arguments and 128 bytes of locals and call arguments, it is suitable for independently assembled and linked modules, or even a compiler.
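
The frame layout above can be sanity-checked with a small Python sketch (Python only because it is easy to run; this is not SC/MP code). It computes the displacement of every slot relative to P2 and checks that each one fits a signed-byte displacement. The name frame_layout, the one-byte size of every argument and local, and the choice to always reserve 0/-1(p2) for the saved P3 are assumptions made for illustration. The sketch also treats -128 as unusable, on the assumption that the SC/MP substitutes the E register for a displacement of -128, which would make the real limit slightly tighter than 128 bytes.

```python
DISP_MIN, DISP_MAX = -127, 127  # -128 excluded: assumed to mean "use E" on the SC/MP


def frame_layout(n_args, n_locals, n_callee_args):
    """Displacements relative to P2 for one call frame (one byte per slot)."""
    # Amount P2 is lowered by "ld @-argsoffset(p2)" before calling.
    args_offset = 2 + n_locals + n_callee_args
    layout = {
        # Arguments passed in by the caller sit above P2.
        "args": list(range(1, n_args + 1)),
        # Return address: low byte at 0(p2), high byte at -1(p2).
        "saved_p3": (0, -1),
        # Locals are stored further down.
        "locals": [-2 - i for i in range(n_locals)],
        # Arguments for a callee sit below the locals; after P2 is lowered
        # the callee sees them at +1, +2, ...
        "callee_args": [k - args_offset for k in range(1, n_callee_args + 1)],
        "args_offset": args_offset,
    }
    used = layout["args"] + layout["locals"] + layout["callee_args"] + [-args_offset]
    for disp in used:
        if not DISP_MIN <= disp <= DISP_MAX:
            raise ValueError(f"displacement {disp} does not fit a signed byte")
    return layout


if __name__ == "__main__":
    print(frame_layout(n_args=2, n_locals=3, n_callee_args=1))
```

With two arguments, three locals and one callee argument, P2 has to drop by 6 before the call and rise by 6 after it, which is the argsoffset used in the calling sequence.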

Which makes me wonder: Were there ever any compilers for SC/MP?

Michael

25th Mar 2023, 5:23 pm   #2
Mark1960

Why not just use auto index addressing mode to push and pop values from the stack?

Xpah p3 ; push return address (little endian)
St @-1(p2)
Xpal p3
St @-1(p2)

Ld @1(p2) ; pop return address (little endian)
Xpal p3
Ld @1(p2)
Xpah p3
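
The push and pop sequences above can be traced with a minimal Python model. It is only a sketch: the class ScmpModel and its method names are made up for illustration, and it assumes the usual SC/MP auto-index rule (a negative displacement moves the pointer before the access, a positive one after it) and that pointer arithmetic only touches the low 12 bits.

```python
class ScmpModel:
    """Just enough state to trace the sequences: A, P0..P3 and 64K of memory."""

    def __init__(self):
        self.a = 0
        self.p = [0, 0, 0, 0]
        self.mem = bytearray(0x10000)

    def _auto_index(self, n, disp):
        # Move pointer n by disp within its 4K page and return the address used:
        # the new value for a negative displacement, the old value otherwise.
        old = self.p[n]
        new = (old & 0xF000) | ((old + disp) & 0x0FFF)
        self.p[n] = new
        return new if disp < 0 else old

    def st_auto(self, n, disp):  # ST @disp(Pn)
        self.mem[self._auto_index(n, disp)] = self.a

    def ld_auto(self, n, disp):  # LD @disp(Pn)
        self.a = self.mem[self._auto_index(n, disp)]

    def xpal(self, n):  # exchange A with the low byte of Pn
        low = self.p[n] & 0x00FF
        self.p[n] = (self.p[n] & 0xFF00) | self.a
        self.a = low

    def xpah(self, n):  # exchange A with the high byte of Pn
        high = self.p[n] >> 8
        self.p[n] = (self.a << 8) | (self.p[n] & 0x00FF)
        self.a = high


def push_p3(cpu):
    cpu.xpah(3)
    cpu.st_auto(2, -1)  # high byte goes to the higher address
    cpu.xpal(3)
    cpu.st_auto(2, -1)  # low byte goes to the lower address


def pop_p3(cpu):
    cpu.ld_auto(2, 1)
    cpu.xpal(3)
    cpu.ld_auto(2, 1)
    cpu.xpah(3)


if __name__ == "__main__":
    cpu = ScmpModel()
    cpu.p[2], cpu.p[3] = 0x0FF0, 0x1234
    push_p3(cpu)
    cpu.p[3] = 0x0200  # P3 gets reused, e.g. for a nested call
    pop_p3(cpu)
    print(hex(cpu.p[2]), hex(cpu.p[3]))
```

Note that the push leaves scrambled contents in P3 (the old value of A is swapped in), which does no harm when P3 is about to be loaded with the address of the next subroutine anyway.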

If this is going to be used often enough, then it may be worth having a combined subroutine/interrupt service routine. Keep P3 pointed at the service routine, then:
Xppc p3
Db calleelow
Db calleehi

Service:
St @-1(p2)
; Jump to interrupt service here if Sense A is set, possibly after saving p3 on the stack
Xpah p3
St @-1(p2)
Xpah p3
Xpal p3
St @-1(p2)
Xpal p3
Ld @1(p3)
Xae
Ld @1(p3)
Xpah p3
Xae
Xpal p3
Xppc p3
Jmp service

Maybe use the carry flag to indicate call or return. The service routine then handles calls and returns, and also interrupt service.

This is just a first attempt, could possibly also allocate local storage on the stack and preserve Acc and E in the call to and return from the subroutine.

25th Mar 2023, 8:11 pm   #3
Phil__G

I always thought that the "it doesn't have a stack" comments came from non-users.
I use Nat Semi's method using auto-index via P2. It's a byte stack rather than a word stack like on other processors, but it's easy to push and pop byte values onto and off the stack. Kitbug, for example, makes extensive use of the stack. The 'lost book' from a few threads back has an interesting subroutine management snippet called "The long arm of P3".

25th Mar 2023, 9:37 pm   #4
SiriusHardware

I guess I'm one of the guilty ones who peddle the view that the SC/MP doesn't have a stack, but I think I'm right in the sense that it doesn't have a dedicated stack pointer and it doesn't have the 'traditional' CALL and RET type instructions to go with that.

26th Mar 2023, 1:08 am   #5
Phil__G

Let's compromise and go with "it can perform some stack operations".

26th Mar 2023, 1:31 am   #6
ortek_service

Quote:
Originally Posted by SiriusHardware
I guess I'm one of the guilty ones who peddle the view that the SC/MP doesn't have a stack, but I think I'm right in the sense that it doesn't have a dedicated stack pointer and it doesn't have the 'traditional' CALL and RET type instructions to go with that.

Or not having any push and pull instructions (to transfer registers onto and off the stack), which most other 8-bit uPs had at the time.

The original PIC16xx uCs also had a very limited h/w stack (size/depth-wise), maybe due to their early General Instrument heritage (originally as a mask-programmed-only uP), which might also have made compilers more difficult.

It was often said that a lack of registers (particularly full-word registers that all instructions can use uniformly) made it more difficult to write compilers that produced efficient code. So later processor architectures, like the 68000, had many (Rn) registers that could be used with all instructions, for better compiler support.

With limited memory resources back then, compilers tended to be quite rare and everything was done directly in hand-optimised assembler. The early ones for home computers were mainly there to speed up BASIC, by removing the overhead of interpreting in real time.
C compilers did eventually appear for all 8-bit uPs, and maybe C might be regarded as a bit lower-level than BASIC (especially compared to C++) and closer to assembler.

The SC/MP's 4K page size, and code not being able to flow automatically across 4K boundaries, might also have been a bit problematic, needing workarounds by hand.

31st Mar 2023, 7:59 am   #7
Michael Haardt

Mark1960: You can certainly use a pointer register as stack pointer, just like on the IBM/360, and nobody would say that architecture had no stack just because it did not have CALL/RET but instead swapped a register with the PC. Static call frames, not stack allocated ones, were common back then. The disadvantage of using a register as stack pointer is that the subroutine has to unfold its data flow to match a stack, as if you were using Forth, and even Forth has SWAP, DUP, OVER... which is all working around the problem that a stack offers very limited addressing. If you use the same pointer register as frame pointer instead, you can address data with a signed byte offset in a very efficient way and do not need to unfold your data flow.

Forth code that is well written is very efficient, because the addressing overhead of a stack is minimal and a data flow that can make use of that is extremely efficient. But it makes code of even a few lines an optimization puzzle. Forth has two stacks, because you would want to extend that optimization over the whole program and not create and destroy stack frames, which is again overhead. I tried it and it changed my view of C: C is a horribly inefficient language and that is fine with me.

The pages are indeed in the way of a compiler. You certainly could have a linker that inserts jumps to cross pages, and that encodes jumps as short or long. Variable instruction length encoding is essential for transputers. It makes the linker slow, but it works and results in good code.

The two main troubles for a compiler would be that pointer arithmetic does not cross pages and that pointers are so slow to load and store. The first probably means malloc() would have to be limited to at most a page per object, which is a serious restriction. Further, the stack is limited to a single page, which is probably ok. For the speed I don't have a solution.
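
The page restriction can be shown with a short Python sketch. It assumes the usual description of the SC/MP address arithmetic: only the low 12 bits of a pointer take part in the addition and the top 4 bits (the page) are left alone. The function names effective_address and fits_in_page are made up for illustration.

```python
PAGE_MASK = 0xF000    # top 4 bits select one of sixteen 4K pages
OFFSET_MASK = 0x0FFF  # low 12 bits are the offset within the page


def effective_address(pointer, displacement):
    """Add a displacement the way the SC/MP is assumed to: no carry into the page bits."""
    return (pointer & PAGE_MASK) | ((pointer + displacement) & OFFSET_MASK)


def fits_in_page(base, size):
    """True if an object of `size` bytes at `base` can be walked with pointer arithmetic."""
    return (base & OFFSET_MASK) + size <= 0x1000


if __name__ == "__main__":
    print(hex(effective_address(0x1FFF, 1)))   # wraps to the start of page 1
    print(hex(effective_address(0x2000, -1)))  # wraps to the end of page 2
    print(fits_in_page(0x1F00, 0x200))         # this object would straddle a page
```

An allocator for such a machine would need a check like fits_in_page for every object, which is where the one-page limit per malloc() object comes from.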

31st Mar 2023, 10:55 am   #8
jjl

Modern ARM processors don't quite have CALL and RET equivalent instructions. Instead, the BL instruction loads the return address into the LR register (R14) and the B LR instruction loads the program counter from LR. LR can be saved to and restored from the stack when nested subroutines are used.

John

31st Mar 2023, 4:06 pm   #9
Michael Haardt

It's funny they named it B LR, because the register addressed version on IBM/360 is BALR and by convention the return address is also stored in R14:

https://faculty.cs.niu.edu/~hutchins...40/more-br.htm

It actually makes sense, because if you do not have to spill the register to memory, you can save two memory accesses, unlike CALL/RET. I don't think that's why they did that for the SC/MP, though, and guess they just followed a contemporary pattern for static call frame linkage.

31st Mar 2023, 6:31 pm   #10
Mark1960

There are a couple of things that might be a problem if the local data of the subroutine is below the pointer.

Interrupts would not be able to use that pointer, as the interrupt service routine would not know the size of the locals area.

Each call from the subroutine would need to move the frame pointer, as the routine called would not know the size of the calling routine's locals area.

It's not just ARM that uses the branch and link method for subroutine calls. This is standard for all RISC type ISAs, as it works better with pipelined processors.

The disadvantage for the 8060 is that it only has four pointer registers. One is the PC, and P3 is used for branch and link, but also points to the interrupt service routine. If P2 is used as the stack or frame pointer, then only one remains for general purpose use. Copying or sorting data often needs two pointers, sometimes more, so there is quite a lot of swapping the pointer out to RAM, and that has to go through the accumulator. P3 can be used if interrupts are disabled, so long as the impact on interrupt latency is not a problem.

For larger programs it might be better to implement an emulator for a better ISA. CP/M on an 8060 would be interesting.

1st Apr 2023, 1:12 pm   #11
Michael Haardt

Good point that having the frame pointer on top avoids moving it for each call, but then the subroutine would have to adjust it on entry and return, so what is saved for the caller becomes a cost in the callee. Which of course means you could do it either way.

Interrupts are weird on the SC/MP. Basically you must not use P3, because it holds the interrupt vector, which is an incredible cost for this register-starved design. To me that sounds like a choice between rather high-level code with call frames, or very low-level optimized code that keeps P3 reserved and allows interrupts. It might be possible to introduce interrupt scheduling points where you load P3, enable interrupts and disable them again right away. That increases the latency, but you could do it at a time when A, E and P1 may be destroyed, which makes the handler faster.

A bytecode interpreter trades speed for lower memory usage. NIBL shows that: small, but slow. It is best to implement a virtual machine (a stack machine, a register machine, or something in between like AcheronVM for the 6502) instead of an actual existing CPU with all of its details. From my experience, addressing quickly becomes the bottleneck in VMs. AcheronVM is brilliant there. Typically you seek to make use of whatever the architecture can do very fast, like the zero page access for the 6502. Is there anything the SC/MP can do fast?

I never thought about branch and link playing better with a pipelined architecture. It is interesting that Transputers did not use that, but I guess the thread scheduling points were seriously in the way of a register holding the return address. I never programmed any other RISC architecture in assembler, only CISC.

1st Apr 2023, 5:00 pm   #12
ortek_service

Quote:
Originally Posted by Michael Haardt
Typically you seek to make use of whatever the architecture can do very fast, like the zero page access for the 6502. Is there anything the SC/MP can do fast?

Yes, the fast zero-page address mode on the 6502 did help make up for its lack of registers (although there was often a lot of transferring between A, X and Y, and pushing and pulling on its stack).

And the 6502, with its relatively simple, reduced instruction set and low transistor count (under 3,500), designed by a small team, was very much the inspiration for Acorn's original ARM designers. They had been disappointed by the performance of National Semi's 32016 processor (which was so complex that it had 100 people working on it and still had MMU issues) and of the 68000 (which Sophie Wilson illustrated in a talk was slower than the 6502 at the same clock speed for a simple 8-bit addition), due to all the microcoding required to implement these.
(Although the 68000's compiler-friendlier abundance of registers, and its universal move instruction between any of them, did also feature in the ARM, to overcome the 6502's lack of these.)
- The microcoding on the original 8051 meant even a NOP took 12 clock cycles! And even the Philips etc. enhanced version took 6 cycles, until SiLabs boosted the 8051's popularity a lot with their large range of small, low-power, high-speed single-cycle 8051-core microcontrollers.
Acorn's Dr Steve Furber described how they ran typical programs by hand, using cards, through their ARM architecture to design out any bottlenecks in it. He had also done an 800-line BBC BASIC simulation of the original ARM, which he reckoned ARM still wouldn't let him release into the public domain.

When the ARM was first launched, it could outperform most other microprocessors, being up there with the much more expensive highest-end PC processors. It probably overtook many of these when DEC used their Alpha processor technology on the StrongARM to boost clock speeds to hundreds of MHz, at a time when the original lack of a floating-point coprocessor had started to hold it back in some PC applications.

Regarding the SC/MP's architecture, I found some interesting discussions and links here: https://groups.google.com/g/comp.arch/c/uE8CDTtNhwM
- Probably moving on from: https://en.wikipedia.org/wiki/Nation...onductor_SC/MP
https://www.cl.cam.ac.uk/teaching/20...tory.html#SCMP
- Where it seems NS's slightly more successful COP series were successors to the SC/MP, but with added stack support.

I wonder how many transistors the SC/MP had, compared to the <3,500 in the 6502, given that some parts of it are serial to save complexity.
(BTW, I only recently discovered some early attempts at '1 bit' serial computer designs: https://en.wikipedia.org/wiki/Motorola_MC14500B - although the 4-bit instructions, 16 in all, were fed into this in parallel.)

1st Apr 2023, 7:15 pm   #13
Phil__G

My own SC/MP wishlist would have long jumps before a stack pointer!

1st Apr 2023, 7:25 pm   #14
Mark1960

One advantage of the SC/MP is the auto increment and decrement of the pointers. This could be used in subroutines to both initialise a local variable and allocate space for it on the stack in a single instruction.

The 4K page limit for a stack is not as limiting as the 256-byte stack on the 6502, though zero page pointers can be used on the 6502 to implement a separate data stack, making the 6502 good for Forth.

1st Apr 2023, 8:48 pm   #15
Phil__G

That comp.arch thread has so many errors!

3rd Apr 2023, 9:34 pm   #16
River25

Hi,

You can use the "default" (i.e. as Nat Semi proposed) P2 as your stack pointer, but instead of continually using it with the +/- option, just add your hard-coded index for most uses. This is good for saving variables, registers etc. without having to change P2. Within certain small routines, such as (for example) serial input and output, you can use the +/- capability, but just make sure you return P2 to its original value when you exit the routine. While this is a bit of a kludge, it does allow you to save all registers except P2 on a system where the default end-of-first-page memory is not RAM. However, as you know what your original P2 is, you can save all registers and bung in your P2, and thus can do some debugging if required. As mentioned, it is not the neatest solution, but it can work if your system has a different memory map from the default Kitbug sort of expectation.

river

5th Apr 2023, 3:28 am   #17
tritone

Use C
it's non object.

5th Apr 2023, 4:08 am   #18
tritone

It's the very middle white note on a piano, beside the two black ones, called 'Middle C'.

From there, use it as your compass for direction on the quantim.

6th Apr 2023, 11:59 pm   #19
River25

Hi,

Quote:
Originally Posted by ortek_service
Quote:
Originally Posted by Michael Haardt
Typically you seek to make use of whatever the architecture can do very fast, like the zero page access for the 6502. Is there anything the SC/MP can do fast?
Yes, the fast zero-page address mode on the 6502 did help make-up for its lack of registers (although there was often a lot of transferring between A, X &Y / pushing & pulling onto its stack).

I remember when the first CPU arguments erupted over the 6502, with its 8-bit-only registers and its lack of them. The x80/Z80 people could not believe the lack of registers and, on paper, the 6502 looks like rubbish. However, it isn't, and it is a wonderful little CPU.

The best thing I found from working on CPUs like the SC/MP and, to a slightly lesser extent, the Signetics 2650, is that segmentation on the x86 is nothing. It's a walk in the park. On paper the x86 looks way better than a 6502, but it does cop a lot of derision due to segmentation. Seriously... a "4-bit shift-left and add" is difficult? Well, not after dealing with a SC/MP or 2650 it's not.
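
For reference, the "4-bit shift-left and add" of x86 real-mode segmentation fits in one line of Python. This is a minimal sketch: the function name physical_address is made up, and the 20-bit mask models the address lines of the original 8086/8088 (later CPUs can behave differently at the 1 MB boundary).

```python
def physical_address(segment, offset):
    """Real-mode x86: shift the 16-bit segment left by 4 bits and add the 16-bit offset."""
    return ((segment << 4) + offset) & 0xFFFFF  # 20 address lines on the 8086/8088


if __name__ == "__main__":
    # Different segment:offset pairs can name the same physical byte.
    print(hex(physical_address(0x1234, 0x0010)))
    print(hex(physical_address(0x1235, 0x0000)))
```

Unlike on the SC/MP, the offset is a full 16-bit value that carries into the upper address bits, so a pointer can walk across a 4K boundary without any special handling.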

river

7th Apr 2023, 12:12 am   #20
Mark1960

For the 4K pages on the SC/MP I have been wondering if it would be worth adding a memory mapper, maybe a 74LS/HCT612, as this also has 4K pages. Taking that one stage further, with a couple of 74x283 adders the stack page could scroll over a larger range of memory.
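
What such a mapper would do to an address can be sketched in Python. This assumes the usual description of the 74LS612 (sixteen 12-bit mapping registers, selected by the top four address bits, with the low 12 bits passed straight through); the class PageMapper and its methods are hypothetical names for illustration, and the adder idea is not modelled.

```python
class PageMapper:
    """Model of a 16-entry page mapper: 16-bit logical address in, 24-bit physical out."""

    def __init__(self):
        # Start with an identity map: logical page n -> physical page n.
        self.regs = list(range(16))

    def set_page(self, logical_page, physical_page):
        if not 0 <= logical_page < 16:
            raise ValueError("logical page must be 0..15")
        if not 0 <= physical_page < 4096:
            raise ValueError("physical page must fit in 12 bits")
        self.regs[logical_page] = physical_page

    def translate(self, address):
        page = (address >> 12) & 0xF  # top 4 bits pick the mapping register
        offset = address & 0x0FFF     # low 12 bits pass through unchanged
        return (self.regs[page] << 12) | offset


if __name__ == "__main__":
    mapper = PageMapper()
    mapper.set_page(1, 0x0AB)  # logical page 1 now shows physical page 0x0AB
    print(hex(mapper.translate(0x1234)))
```

Because the split falls on the same 12-bit boundary as the SC/MP's own pages, the CPU's pointer arithmetic never carries into the bits that select a mapping register.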