Many German publications of that time had the peculiarity of using German terms for everything, and Siemens was the most radical about it.
For example, the stack was often referred to as "Kellerstapelspeicher"[1], and the stack pointer as "Kellerstapelspeicherzeiger".
The whole stack terminology revolved around the cellar metaphor (Keller = cellar): push was "einkellern" (put into the cellar) and pop "auskellern" (take out of the cellar). I think this was influenced by Friedrich L. Bauer's work.
[1] The linked manual is quite lax and uses just "Stapelspeicher" and "Stapelzeiger". But at least it has "Stapelzeigeradresse", which is a nice, long compound.
German is the Java of human languages (at least when it comes to naming).
Relevant: https://youtube.com/watch?v=ADqLBc1vFwI
I’ve only recently scratched the surface of CPU architecture/hardware, and one of the challenges was picturing the register stack. For some reason, reimagining it as a single-doorway Keller (cellar) made the whole first-in, last-out arrangement much easier to visualise! Strange how words can be transformative!
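In case the cellar picture helps anyone else, here's a minimal C sketch of that single-doorway idea (names and sizes invented for illustration): whatever goes in last has to come back out first.

    #include <stdio.h>

    /* A tiny fixed-size stack: one "doorway" (the top), so the last item
       carried in is the first one carried back out (first-in, last-out). */
    #define CELLAR_SIZE 8

    static int cellar[CELLAR_SIZE];
    static int top = 0;               /* index of the next free slot */

    static void einkellern(int value) /* "push": carry an item into the cellar */
    {
        if (top < CELLAR_SIZE)
            cellar[top++] = value;
    }

    static int auskellern(void)       /* "pop": carry the most recent item back out */
    {
        return (top > 0) ? cellar[--top] : -1;
    }

    int main(void)
    {
        einkellern(1);
        einkellern(2);
        einkellern(3);
        printf("%d ", auskellern());  /* 3 - the last one in leaves first */
        printf("%d ", auskellern());  /* 2 */
        printf("%d\n", auskellern()); /* 1 */
        return 0;
    }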
And Siemens still does it, even though they now translate the words to English. Using TIA Portal, one "downloads" a program from the host to the PLC.
That's pretty common terminology for PLCs, though. You also download to Koyo PLCs from Automation Direct.
Same with Rockwell (Allen-Bradley) PLCs. You download a program to the PLC, and upload a program from a PLC.
I always assumed the naming confusion there was just a matter of perspective: whether you think of a "download" from the perspective of the PLC receiving the file or of the user sending it.
Well, at least, they don't "play them in" :)
Can you back this up?
Because googling for "Kellerstapelspeicherzeiger" yields only a single result: a Stack Exchange answer that repeats the same legend without providing a source and has been downvoted to -3.
As a matter of fact, the submission links the full manual, and it too uses just "Stapelspeicher", not "Kellerstapelspeicher". And that manual is by Siemens and from the 80s.
"Stapelzeiger" and "Stapelspeicher" otoh yield many results.
Googling for "Kellerstapelspeicher" only yields results for "Keller, Stapelspeicher", indicating that the two are not used as one word but as synonyms.
You are right to be skeptical. Unfortunately I could not find the "Handbuch für Datenverarbeitung" mentioned in the Stack Exchange post. Amazon lists it with ISBN 3927892009, but unsurprisingly it is not available. Neither libgen nor archive.org seems to have it.
Apart from several humorous mentions over the decades, I found one serious reference in an exam text from 1997 [1].
[1] https://www.telle-online.de/fernuni/ruf/klausur/1704-97.html...
My first impression was that this was another sarcastic variation on the German tapeworm-word theme.
The German-language Wikipedia article on Stapelspeicher [1] points to a patent registration [2].
[1] https://de.wikipedia.org/wiki/Stapelspeicher [2] https://worldwide.espacenet.com/patent/search/family/0069672...
It's unlikely to appear in Bauer's publications, because he coined and used the Keller metaphor. It is much more likely to appear in later publications that tried to translate "stack" but at the same time wanted to pay tribute to Bauer, hence the use of both Keller and Stapel.
Which supports what I am saying.
That's not a peculiarity. "Stack" is a metaphor just like "Keller" (cellar) is. And in fact "Keller" is the apparently older term, as it was already used in the original Samelson/Bauer patent. See Wikipedia:
> Klaus Samelson and Friedrich L. Bauer of Technical University Munich proposed the idea of a stack called Operationskeller ("operational cellar") in 1955[6][7] and filed a patent in 1957.[8][9][10][11] In March 1988, by which time Samelson was deceased, Bauer received the IEEE Computer Pioneer Award for the invention of the stack principle.[12][7]
https://en.wikipedia.org/wiki/Stack_(abstract_data_type)#His...
In this thread: Americans bewildered that someone might prefer to speak their native language instead of English.
They were also discussed at length in the German magazine "64er", for example here: https://www.64er-magazin.de/8503/opcodes.html
I remember the knowledge/idea of illegal opcodes for the C64 slowly making its way through the hacker/cracker community, like secret knowledge being passed around via BBSes back then.
They were quite useful. I distinctly remember there were undocumented NOPs consuming a different number of cycles than the default one, $EA. These were used, for example, to pad raster line interrupt routines with cycle precision. (Damn, I'm old.)
IIRC the "normal" NOP took 2 cycles, so we used the illegal ones to waste an odd number of cycles for accurate raster line timing.
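To make the padding arithmetic concrete, here's a rough C sketch (cycle counts from memory, so verify against a 6502 reference): with only the 2-cycle $EA you can never waste an odd number of cycles, but a single 3-cycle undocumented NOP, such as the zero-page variant $04, fixes the parity.

    #include <stdio.h>

    /* Cycle-exact padding plan, assuming (from memory - please double-check):
     *   documented NOP $EA        : 1 byte, 2 cycles
     *   undocumented NOP $04 (zp) : 2 bytes, 3 cycles
     * Odd targets need exactly one 3-cycle NOP; the rest is filled with $EA. */
    static int plan_padding(int cycles, int *twos, int *threes)
    {
        if (cycles < 2)
            return -1;                      /* can't waste fewer cycles than one NOP */
        *threes = (cycles % 2) ? 1 : 0;     /* one 3-cycle NOP fixes odd parity */
        *twos   = (cycles - 3 * *threes) / 2;
        return 0;
    }

    int main(void)
    {
        for (int c = 2; c <= 9; c++) {
            int twos, threes;
            if (plan_padding(c, &twos, &threes) == 0)
                printf("%d cycles: %d x NOP ($EA) + %d x NOP zp ($04)\n",
                       c, twos, threes);
        }
        return 0;
    }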
Throughout the entire NES game library, almost no games use these illegal opcodes. Apparently, as part of the licensing process, Nintendo would verify that games only used the official instructions.
I wonder how they tested that, though? I don't think developers had to submit their source code to Nintendo, so they would have had to analyse the binaries in some way?
> I wonder how they tested that, though?
The QA process might have been performed on an emulator that doesn't support illegal opcodes.
I don't know if they even had an emulator at the time - I don't think a 1980s PC could run an NES emulator at a reasonable speed.
Another possibility is that they used a hardware device. Perhaps something that watches the 6502 `sync` pin to know when an opcode byte is being read, and verifies that the data bus contains a legal value.
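Purely speculative, but such a checker would boil down to something like the following C sketch: a table marking the officially documented opcodes (only a handful are filled in below; a real table would cover all 151), consulted whenever the sync signal says an opcode byte is on the data bus.

    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    /* Hypothetical "legal opcode watchdog". */
    static bool legal[256];

    static void init_legal_table(void)
    {
        /* Tiny illustrative subset - not the full official set! */
        uint8_t some_official[] = { 0xEA /* NOP */, 0xA9 /* LDA # */,
                                    0x8D /* STA abs */, 0x4C /* JMP abs */,
                                    0x60 /* RTS */ };
        for (size_t i = 0; i < sizeof some_official; i++)
            legal[some_official[i]] = true;
    }

    /* Called once per bus cycle by the (imaginary) monitoring hardware. */
    static bool check_cycle(bool sync, uint8_t data_bus)
    {
        if (sync && !legal[data_bus]) {
            printf("illegal opcode fetched: $%02X\n", data_bus);
            return false;
        }
        return true;
    }

    int main(void)
    {
        init_legal_table();
        check_cycle(true, 0xEA);   /* fine: documented NOP */
        check_cycle(true, 0x04);   /* flagged: undocumented NOP zp */
        return 0;
    }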
Ah, when German engineering actually meant something (like putting some work in)… :'(
It's weird how the x86/PC/WinDOS combination was inferior to almost anything else contemporary, and yet it killed every competing branch of general consumer computing so dead that we can't even imagine anything else anymore.
The market was desperate for any kind of standard, but computer manufacturers were rabidly against working with competitors. Then IBM, the gorilla in the room, watched like a hawk by everybody, accidentally delivered something open-ish, even if ugly. The whole market recrystallized around it, while IBM frantically tried to close the ecosystem and locked themselves out of it. Meanwhile, Microsoft and Intel had no problem with anybody giving them money, developers could deliver one version of an application to everyone, and consumers knew what to choose.
And now we are back to Big Tech all trying to create their own ecosystems again, even holding users hostage to their whims. It's now all about control, not just over users but over developers too.
> The instructions LAX Immediate, AAX X reg $02 and AAX X reg accu $02 are not always processed correctly.
Maybe that was the reason they were undocumented?
In these older chips⁰ such instructions are undocumented because they were never intended to exist, not because they were intended but did not work in all circumstances.
It is a side effect of how the instructions are decoded into which parts of the CPU are involved: if this bit is set then affect the X register, unless this other bit is set in which case affect Y, and if one of these bits is set use the ALU, … This results in other instructions magically appearing, because it isn't a case of every valid instruction being directly listed¹ and every invalid one having no definition at all. It is also why the opcodes seem strewn around the instruction space arbitrarily, sometimes in obvious groups and sometimes apparently spread widely. And it's why it is risky to use them: if a later revision of the CPU adds any new instructions, the accidentally useful opcodes could become something quite² different, as the designers only need to make sure the official ones keep functioning as they did before. (There's a small sketch of the idea after the footnotes.)
--------
[0] And probably more current ones too, though I've not kept a close eye on CPU design like the university-aged and younger versions of me used to.
[1] There usually isn't a straight table of “this code means do that to that with this and store there” — that would take more silicon, and likely be notably slower, than the pass-the-bits-through-a-bunch-of-logic-gates approach.
[2] Or, perhaps worse, subtly different!
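Here's a deliberately tiny, made-up illustration in C of that kind of decoding (not the actual 6502 logic): when opcode bits feed straight into "which register" and "use the ALU" signals, every bit pattern decodes to something, whether or not it was ever listed in a manual.

    #include <stdint.h>
    #include <stdio.h>

    /* A made-up 4-bit "CPU" whose decoder is wiring, not a lookup table:
     *   bit 0    : target is X (0) or Y (1)
     *   bit 1    : route through the ALU (add) instead of a plain load
     *   bits 2-3 : small immediate operand
     * All 16 patterns do *something*, including combinations the imaginary
     * manual never mentions - roughly how undocumented opcodes fall out of
     * real hard-wired decoders. */
    struct cpu { int x, y; };

    static void execute(struct cpu *c, uint8_t op)
    {
        int *target = (op & 0x01) ? &c->y : &c->x;   /* bit 0 picks the register */
        int operand = (op >> 2) & 0x03;              /* bits 2-3 are the operand */

        if (op & 0x02)
            *target += operand;                      /* bit 1: use the ALU */
        else
            *target = operand;                       /* otherwise: plain load */
    }

    int main(void)
    {
        struct cpu c = { 0, 0 };
        execute(&c, 0x0C);            /* "documented": load X with 3 */
        execute(&c, 0x0F);            /* "undocumented" combination: add 3 to Y */
        printf("x=%d y=%d\n", c.x, c.y);
        return 0;
    }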
There isn't any dedicated instruction decoding for these "illegal" instructions at all. Generally, two regular instructions just happen to be executed at once. Some of these turn out to be reliable and somewhat useful, others result in the processor becoming stuck (the so-called JAM or KIL instructions). Some others kind of work but don't yield any external result (much like a multi-byte, multi-cycle NOP), e.g., if we attempt what would amount to storing a register with immediate addressing mode. (Storing doesn't set any flags and there is no viable write address, so there isn't any external result to the operation.)
Generally speaking, there are no legal opcodes with both of the two lowest bits set, and such an opcode triggers the decoding of both the instruction with the lowest bit set and the one with the second-lowest bit set. (There are some more outside this pattern, but those result in a "jammed" CPU more often than not.)
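A concrete example, with the details as given in the usual illegal-opcode write-ups: $AD is LDA absolute (…01) and $AE is LDX absolute (…10); $AF, with both low bits set, fires both paths and behaves as LAX. In emulator terms, roughly:

    #include <stdint.h>
    #include <stdio.h>

    /* Combined effect of the "illegal" $AF (LAX abs): the LDA and LDX halves
     * both load the same byte, and N/Z are set from it, as per the common
     * illegal-opcode references. */
    static uint8_t a, x, flag_z, flag_n;

    static void lax_abs(const uint8_t *mem, uint16_t addr)
    {
        uint8_t value = mem[addr];
        a = value;                    /* the LDA half */
        x = value;                    /* the LDX half */
        flag_z = (value == 0);
        flag_n = (value & 0x80) != 0;
    }

    int main(void)
    {
        static uint8_t mem[0x10000];
        mem[0x1234] = 0x42;
        lax_abs(mem, 0x1234);
        printf("A=$%02X X=$%02X Z=%d N=%d\n", a, x, flag_z, flag_n);
        return 0;
    }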
Some rambling...
I always wonder if something like these undocumented opcodes could be used as a concept in more modern processors. Back then, transistors were a precious resource, and the result was those opcodes. Nowadays, instruction encoding space is more precious because of pressure on the instruction cache. Decoding performance might also be relevant.
The result of these thoughts is something I called "PISC", a programmable instruction set computer, which basically means an unchanged back-end (something like RISC + vector) but with a programmable decoder in front of it. Then different pieces of code could use different encodings, each optimized for its own case.
...which you get in RISC with subroutines + instruction cache, if you regard the CALL instructions as "encoded custom instructions", but not quite, because CALLs waste a lot of bits and you need additional instructions to pass arguments.
For pure RISC, all of this would at best take some pressure off the instruction cache, so it's probably not worth it. It might be more interesting for VLIW backends.
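Just to sketch what I mean by a programmable decoder (everything here is invented for illustration): a small, loadable table maps short custom opcodes onto a fixed set of backend operations, so each piece of code can install the encoding that suits it best.

    #include <stdint.h>
    #include <stdio.h>

    /* Hypothetical "PISC": a fixed RISC-ish backend plus a loadable decode
     * table. A 4-bit custom opcode means whatever the current table says. */
    enum backend_op { OP_NOP, OP_ADD, OP_SUB };

    struct decoded { enum backend_op op; int rd, rs1, rs2; };

    static struct decoded decode_table[16];   /* the programmable part */

    static void install(uint8_t custom_opcode, struct decoded meaning)
    {
        decode_table[custom_opcode & 0x0F] = meaning;
    }

    static void execute(int regs[], struct decoded d)   /* the fixed backend */
    {
        switch (d.op) {
        case OP_ADD: regs[d.rd] = regs[d.rs1] + regs[d.rs2]; break;
        case OP_SUB: regs[d.rd] = regs[d.rs1] - regs[d.rs2]; break;
        default: break;
        }
    }

    int main(void)
    {
        int regs[8] = { 0, 10, 3 };

        /* This piece of code decides that opcode 0x1 means r0 = r1 + r2. */
        install(0x1, (struct decoded){ OP_ADD, 0, 1, 2 });

        uint8_t code[] = { 0x1 };             /* one nibble-sized instruction */
        for (size_t i = 0; i < sizeof code; i++)
            execute(regs, decode_table[code[i] & 0x0F]);

        printf("r0 = %d\n", regs[0]);         /* 13 */
        return 0;
    }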
Sounds a lot like what Transmeta was doing.
ARM has the Thumb opcodes, which aren't your "PISC" concept (that's a limited form of loadable microcode, another thing that's been done) but special, shorter encodings of a subset of ARM opcodes, which the CPU recognizes after an instruction flips an internal mode bit. There's also Thumb-2, which mixes short (16-bit) and full-size (32-bit) opcodes to fix some performance problems of the original Thumb concept:
https://developer.arm.com/documentation/dui0473/m/overview-o...
> ARMv4T and later define a 16-bit instruction set called Thumb. Most of the functionality of the 32-bit ARM instruction set is available, but some operations require more instructions. The Thumb instruction set provides better code density, at the expense of performance.
> ARMv6T2 introduces Thumb-2 technology. This is a major enhancement to the Thumb instruction set by providing 32-bit Thumb instructions. The 32-bit and 16-bit Thumb instructions together provide almost exactly the same functionality as the ARM instruction set. This version of the Thumb instruction set achieves the high performance of ARM code along with the benefits of better code density.
So, in summary:
https://developer.arm.com/documentation/ddi0210/c/CACBCAAE
> The Thumb instruction set is a subset of the most commonly used 32-bit ARM instructions. Thumb instructions are each 16 bits long, and have a corresponding 32-bit ARM instruction that has the same effect on the processor model. Thumb instructions operate with the standard ARM register configuration, allowing excellent interoperability between ARM and Thumb states.
> On execution, 16-bit Thumb instructions are transparently decompressed to full 32-bit ARM instructions in real time, without performance loss.
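To make "decompressed to full 32-bit ARM instructions" concrete, here's a C sketch for a single Thumb format (the add/subtract register form); the bit layouts are recalled from the ARM7TDMI documentation, so double-check them before relying on this.

    #include <stdint.h>
    #include <stdio.h>

    /* Expand one Thumb format into the equivalent ARM instruction
     * (encodings from memory of the ARM7TDMI docs - verify before use):
     *   Thumb format 2, register form: 0001100 Rm Rn Rd   -> ADDS Rd, Rn, Rm
     *   ARM data-processing ADD, S=1 : 1110 000 0100 1 Rn Rd 00000000 Rm   */
    static uint32_t expand_thumb_adds(uint16_t thumb)
    {
        uint32_t rm = (thumb >> 6) & 0x7;
        uint32_t rn = (thumb >> 3) & 0x7;
        uint32_t rd = thumb & 0x7;
        return 0xE0900000u | (rn << 16) | (rd << 12) | rm;
    }

    int main(void)
    {
        uint16_t thumb = 0x1888;   /* ADDS r0, r1, r2 in Thumb */
        printf("Thumb 0x%04X -> ARM 0x%08X\n", thumb, expand_thumb_adds(thumb));
        /* Expected: 0xE0910002, i.e. ADDS r0, r1, r2 in ARM state */
        return 0;
    }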
For an example of how loadable microcode worked in practice, look up the Three Rivers PERQ:
https://en.wikipedia.org/wiki/PERQ
> The name "PERQ" was chosen both as an acronym of "Pascal Engine that Runs Quicker," and to evoke the word perquisite commonly called a perk, that is an additional employee benefit
And for a description of how to build a computer with loadable microcode you could start with Mick and Brick [1] (PDF). It describes using AMD 2900 series [2] components; the main alternative at the time was to use the TI 74181 ALU [3] and build your own microcode engine.
[1] https://www.mirrorservice.org/sites/www.bitsavers.org/compon... [2] https://en.wikipedia.org/wiki/AMD_Am2900 [3] https://en.wikipedia.org/wiki/74181
> On execution, 16-bit Thumb instructions are transparently decompressed to full 32-bit ARM instructions in real time, without performance loss.
That quote is from the ARM7TDMI manual - the CPU used in the Game Boy Advance, for example. I believe later processors contained entirely separate ARM and Thumb decoders.