A few weeks ago I found this topic about recreating an FPGA version of the ST's blitter.
I already posted some thoughts there, but that project is obviously about recreating a feature-exact FPGA replacement, because blitters are now hard to find - and some people like to add them to STFs!
However, to speed up the blitter, my reasoning was that it would need enhanced features, like being able to read from two RAM sources instead of one (to speed up masking, for instance).
Of course that kind of new feature is a completely different project, since it wouldn't benefit any legacy software. Hence this post opening a different thread.
Also, I must add that I have no FPGA programming skills whatsoever. I'm more of a software than a hardware guy, so everything I say here is just ideas that I guess won't come to fruition any time soon. But here they are!
First thoughts I had...
Why not make a blitter that, while being 100% backward compatible with the ST blitter (and cycle accurate), would have another mode that enables more features, like the Amiga's:
- Two RAM sources instead of one
- Different stride for source 1 and source 2 (to be able to mask & draw sprites in 1 pass on 4 bitplanes, for instance)
- Or alternatively, indicating how many bitplanes the sprite has, but that means 4 shift registers!
- Line drawing
- Polygon fill
Okay, so far it looks pretty much like a more complex Amiga blitter. At this point I realized it would make a potential FPGA design even more complex - because of the handling of 4 bitplanes in 1 pass.
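To illustrate the "two sources with different strides" idea from the list above, here is a minimal software sketch in Python (not HDL). It assumes ST-style interleaved bitplanes, a 1-bitplane mask with one word per 16-pixel group, and a mask bit of 1 meaning "take the sprite pixel". The function name `masked_blit` and its parameters are made up for this example.

```python
def masked_blit(dest, sprite, mask, groups, planes=4):
    """Merge a sprite into a destination through a mask, in one pass.

    dest, sprite: lists of 16-bit words with interleaved bitplanes
                  (`planes` words per 16-pixel group).
    mask:         one 16-bit word per 16-pixel group (1 = sprite pixel).
    """
    out = list(dest)
    for g in range(groups):
        # Source 2 (mask): stride of one word per 16-pixel group.
        m = mask[g] & 0xFFFF
        for p in range(planes):
            # Source 1 (sprite) and destination: stride of `planes` words.
            i = g * planes + p
            out[i] = (dest[i] & ~m & 0xFFFF) | (sprite[i] & m)
    return out


background = [0xFFFF] * 4                      # one 16-pixel group, 4 planes
sprite = [0x0000, 0x00F0, 0x0F00, 0xF000]
result = masked_blit(background, sprite, [0x0FF0], groups=1)
print([hex(w) for w in result])
```

The same mask word is reused for all four planes of a group, which is why the two sources need different strides.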
My second thoughts were: wait, but the blitter is also used to change colors, so it has access to all addresses that are available on the ST bus.
So why not add:
- Changing whatever color is needed through a prepared list in RAM
It starts to look like a Copper list now.
Then I found another project - I can't remember where though - of someone making a 68000 replacement for the Amiga out of a Raspberry Pi. So far it's at quite an early stage, but the Amiga seems to boot, so I guess it's possible to make a new blitter from a Raspberry Pi. I don't know how; I guess some interface card would be needed between the Raspi and the blitter socket.
In any case, it means that maybe (maybe!) we can make something even more complex.
So... maybe we could somehow make this new blitter aware of the CRT beam position? Perhaps connecting it to the video circuit that handles Vsync and Hsync would let it detect Vsync and, from there, count lines and/or cycles. The only catch is that it couldn't tell whether the display is in 50 or 60 Hz until the next Vsync - unless the CPU tells the blitter which mode is actually active.
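A minimal sketch of that beam-tracking idea, in Python rather than HDL. The `BeamTracker` class and its method names are hypothetical, and the line counts (313 lines per frame at 50 Hz, 263 at 60 Hz) are the figures usually quoted for the ST's colour modes, taken here as an assumption. It shows why the refresh rate is only known once a full frame has gone by.

```python
# Assumed lines-per-frame figures for the ST's colour modes.
LINES_PER_FRAME = {313: 50, 263: 60}


class BeamTracker:
    """Counts Hsync pulses between Vsync pulses to follow the beam."""

    def __init__(self):
        self.line = 0
        self.refresh_hz = None      # unknown until one full frame was seen
        self._seen_vsync = False

    def on_hsync(self):
        self.line += 1

    def on_vsync(self):
        # The first Vsync only gives a starting point; the count before
        # it is partial and tells nothing about the refresh rate.
        if self._seen_vsync:
            self.refresh_hz = LINES_PER_FRAME.get(self.line)
        self._seen_vsync = True
        self.line = 0


tracker = BeamTracker()
tracker.on_vsync()
for _ in range(313):
    tracker.on_hsync()
tracker.on_vsync()
print(tracker.refresh_hz)
```

A frame with an unexpected line count leaves `refresh_hz` at `None`, which is where a CPU-written "current mode" register would help.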
Now that our new blitter knows where the CRT beam is, it can:
- Automatically change colors based on a list in RAM
- Change the video counter based on a list in RAM
Now we have our Copper list. Actually, it could work the exact same way.
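As a rough sketch of how such a list could be processed, the Python below walks the beam line by line and performs the register writes due on each line. The list format (line, address, value), the function `run_copper_list` and the `write_register` callback are all assumptions made for the example. The address 0xFF8240 is the ST's first palette register.

```python
def run_copper_list(copper_list, total_lines, write_register):
    """Apply (line, address, value) entries as the beam reaches each line.

    write_register(address, value) stands in for the bus write the
    blitter would perform. Entries for the same line keep their order.
    """
    pending = sorted(copper_list, key=lambda entry: entry[0])
    idx = 0
    for line in range(total_lines):
        while idx < len(pending) and pending[idx][0] <= line:
            _, address, value = pending[idx]
            write_register(address, value)
            idx += 1


writes = []
run_copper_list(
    [(100, 0xFF8240, 0x700), (0, 0xFF8240, 0x000)],
    total_lines=313,
    write_register=lambda address, value: writes.append((address, value)),
)
print(writes)
```

Real hardware would wait for the beam instead of looping, and would need finer timing than whole lines to change colors mid-line.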
The CPU would then have to start this mode, and the blitter would interrupt the CPU only when needed - and even interrupt its own transfers in that case. A kind of autonomous mode.
The only drawback is that it wouldn't work while a disk DMA transfer is in progress - the disk DMA has priority over the blitter. That could be handled by a register asking the blitter to stop every active autonomous mode. In any case, it would be up to the programmer to take care of this.
And of course, it wouldn't work in demos that require overscan, since the CPU needs to count cycles accurately.
...unless we add:
- Automatically sending fullscreen switches to the Shifter and GLUE to open the required borders (top/left/right/bottom)
And guess what, that means you can now make fullscreen demos/games even when you use an accelerated CPU (Mega STE, TF cards, you name it). Even the desktop AFAIK, but fullscreen would have to be disabled during disk access.
So why stop here? We have a Raspberry Pi up there? Okay, why not:
- Audio PCM sample channel mix
There you are, a soundtrack mix as fast as you can get since the blitter only needs to read/write samples from RAM.
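For reference, the mixing itself is simple. Here is a minimal Python sketch assuming signed 8-bit samples, as the STE's DMA sound uses. The function name `mix_channels` is made up, and a real mixer would probably also handle per-channel volume and resampling.

```python
def mix_channels(channels):
    """Mix several signed 8-bit sample streams into one.

    Samples are summed and clamped to the signed 8-bit range,
    so loud passages saturate instead of wrapping around.
    """
    if not channels:
        return []
    length = min(len(c) for c in channels)
    mixed = []
    for i in range(length):
        total = sum(c[i] for c in channels)
        mixed.append(max(-128, min(127, total)))
    return mixed


print(mix_channels([[100, -100, 10], [100, -100, -20]]))
```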
And well actually, putting a powerful ARM computer on the blitter socket makes all the ST RAM available to it... so we can reverse the roles:
- Mixing the soundtrack on the Raspi and sending it to a short looping audio ST-RAM buffer (2 bytes)
- Hosting the "copper"-like list in Raspi memory, making color/video counter changes every 4 cycles
- Drawing virtually anything to screen
- Hard-disk driver through the blitter
- Packing/unpacking data
So everything may be possible as long as there are cycles available to access RAM. The Raspberry Pi could even act as the main computer, with the ST handled as a mere display...
I must also say I'm a fan of the STE - I don't have any STF - so I'm naturally inclined to think about what would be possible on the blitter socket. But of course I guess everything would be possible on the 68000 socket as well. It's just that STF projects usually don't work on the STE because of the 68000 socket difference. And also, it may get complicated to add accelerators *and* a new blitter on the same socket.
Well, that's it :)
Enhanced Blitter
-
exxos
- Site Admin

- Posts: 28361
- Joined: 16 Aug 2017 23:19
- Location: UK
Re: Enhanced Blitter
The problem with a lot of that is you're wanting to extend the "tricks" an ST can do rather than just solving the problems properly in the first place. It all boils down to RAM speed and how fast the CPU can access it.
It's been talked about many times already: if you have more RAM bandwidth, you open up a desktop of higher resolution, so you don't need overscan or opening the borders, as basically those are hacks, not solutions. Same with colours: the more RAM bandwidth you have, the better video modes you can have.
As I've said many times already, extra features can be added, but the blitter is really the wrong place. It needs to steal cycles from DMA or CPU etc. If a better MMU is developed, we could have new functions for screen copy and clearing blocks of RAM fast. With some minor patches to TOS, it could use those new features for all desktop apps. It really makes the blitter obsolete in the long run.
Hard disk drivers are already out there and in use with the blitter, I even posted about it yesterday in fact.
-
sporniket
- Site sponsor

- Posts: 1164
- Joined: 26 Sep 2020 21:12
- Location: France
Re: Enhanced Blitter
fenarinarsa wrote: 22 Dec 2020 22:25
- Or alternatively, indicating how many bitplanes the sprite has, but that means 4 shift registers!
[...] because of the handling of 4 bitplanes in 1 pass.

The problem with that is that it would be limited to 4-bitplane mode, to keep it simple. There are other graphical modes, like in the Falcon, which also has a blitter (but let's restrict ourselves to the ST, because there is a big problem that arises when dealing with the STE and later models, see below).
fenarinarsa wrote: 22 Dec 2020 22:25
Then I found out another project - I can't remember where though - of someone making a 68000 replacement for Amiga, from a Raspberry Pi. So far it's quite in early stage, but the Amiga seems to boot, so I guess it's possible to make a new blitter from a Raspberry Pi - I don't know how, I guess there's some interface card to make between the Raspi and the blitter socket.
In any case, it means that maybe (maybe!) we can make something even more complex.

I understand that it appeals to you as you are more of a software than a hardware guy (I'm the same), but to me the only things that prevent one from making "more complex" things in HDL vs. a software program are lack of knowledge and experience with HDL, and the fact that software programming lets you stay at a high level of abstraction.
fenarinarsa wrote: 22 Dec 2020 22:25
I also must say I'm a fan of STE - I don't have any STF - so I'm naturally inclined to think about what would be possible on the blitter socket. But of course I guess everything would be possible to make on the 68000 socket as well. It's just that STF projects usually don't work on STE because of the 68000 socket difference. And also, it may start to be complex to add accelerators *and* a new blitter on the same socket.
Well, that's it :)

From the STE onward, I believe that the Blitter is embedded in the video chip (although there is a socket for a separate Blitter on early models of the STE). So you would have to mutate your super-blitter project into a super-shifter (STE) and super-videl (Falcon)... Or even integrate the MCU/Shifter or Combel/Videl duo into a super chip! (Those are my HDL fantasies by the way, that and mutating a 68k CPU design.)
-
fenarinarsa
- Posts: 37
- Joined: 18 Dec 2017 16:03
Re: Enhanced Blitter
exxos wrote: 23 Dec 2020 01:26
As I've said many times already, extra features can be added, but the blitter is really the wrong place.. It needs to steal cycles from DMA or CPU etc., if a better MMU is developed, we could have new functions for screen copy and clearing blocks of ram fast.. With some minor patches to TOS, it could use those new features for all desktop apps. It really makes the blitter obsolete in the long run.

These are two different projects. You're talking about making a different ST, and I might say the remake project is beautiful.
Changing the MMU or adding another video chip is like making a new TOS-compatible computer, or a new ST+, if you want. Because after that, why not put in a hardware-accelerated VGA card, or even a modern 3D PCI card? The sky is the limit.
However, I'm thinking of it more as a graphics/audio hardware acceleration board that would fit on any blitter socket. That means a lot of STFs, Mega STs and STEs could take advantage of it just by putting this on the blitter socket. So yes, that involves hacking and working around the bus limitations.
Actually my first thought is that it would fit very well in my Mega STE, so that I could activate the 16 MHz mode and do a fullscreen demo anyway XD (I'm still wondering why there weren't more accelerator boards based on an associative cache, it works really well).
In any case, I'm not naive, it would be useful only to a handful of people - most ST users just want to play their old games or demos. They're not into hacking. And projects of that kind are often one-shots anyway - proof of concept, make raytracing and textured 3D in 4096 colors, amazing, but you're the only one able to run it :p
All those projects are hobbyist projects, and as I said I'm just throwing out ideas, I don't expect them to come to life anytime soon.
sporniket wrote: 23 Dec 2020 05:57
I understand that it appels you as you are more a software than a hardware guy (I'm the same), but to me the only thing that prevent one to make "more complexes" thing in HDL vs software program is lack of knowledge and experience with HDL, and the fact that software programming allows to stay at a high level of abstraction.

Well, actually there are now mixed solutions it seems, like the DE10 card that combines an ARM and an FPGA array. It seems very powerful, to say the least.
But like everyone, my time is limited, days are only 24h long and I have a job and a family. I already struggle to find time to recap all my ST/Amiga/Consoles :roll:
sporniket wrote: 23 Dec 2020 05:57
From the STE onward, I believe that the Blitter is embedded in the video chip (although there is a socket for a separate Blitter for early models of STE). So you would have to mutate your project of super-blitter into a super-shifter (STE) and super-videl (Falcon)... Or even integrate the duo MCU/Shifter or Combel/Videl into a super chip ! (that's my HDL fantasies by the way, that and mutate a 68k CPU design)

Again, that's making another completely new computer IMO. Also, in that case it would be better to make a completely new motherboard like the remake project does.
Or make the whole computer in FPGA, for that matter.
I don't know the % of STEs that have an integrated blitter. Mine is a late TOS 1.62 but has an external blitter.
I don't take the Falcon into account, a simple (but expensive) accelerator board is enough for a Falcon ;)
-
exxos
- Site Admin

- Posts: 28361
- Joined: 16 Aug 2017 23:19
- Location: UK
Re: Enhanced Blitter
There are already whole computers in FPGA: Suska, mist, mister, forever etc. If you wanted to add more features without a redesign, then that's probably the best way of doing it.
I am working on a whole new audio system anyway, but I just have so many ongoing things and my store soaks up pretty much all my time, so everything happens really slowly. I know nothing about FPGA stuff, so I have to learn that before I can even think about anything else.
But there are huge problems in developing on original machines, just look at the list of fixes. I've probably spent 95% of hardware development time in fixing and finding faults with original machines. It's why I did a new motherboard. Now we have a solid foundation to build on and that's the way forward.
There is the issue of new features: unless someone writes code for them, they become pointless. It's why I gave up on the idea of adding a DSP sound system like the Falcon has. But it goes for pretty much any hardware add-on. Of course anyone can develop on the remake platform and do what they want.
So I'm not trying to squash any ideas for anything, though I think anything new has to work with original software as much as possible. This way the hardware isn't depending on people to write software for it. For example, TOS has routines to do graphics on the CPU, or on the blitter if installed. NVDI replaces a lot of stuff and runs very fast on the CPU. But if we have new hardware built into the MMU which can block copy RAM faster than the blitter, then some patches in TOS can be done to use the new hardware instead. Once TOS is aware of the new hardware, let's just call it a super blitter, then all TOS apps could make use of it. I'm sure demo coders would love such fast copy functions.
As for sound, there is nothing much the ST can't do already. But it's limited to what the CPU and RAM can do, so playing audio eats up all the speed. This was improved on the STE, though my audio system is basically on its own bus, like the Falcon DSP has its own RAM. So it doesn't take up CPU or RAM time. The system is simple enough to even patch games to use it.
So there's a lot of ongoing work with all this. With a faster CPU, you can do a lot more cool stuff than you could on a stock machine. I don't know what hardware features may help in relation to the blitter vs the CPU doing it. Pure logic functions could be done, but I think the blitter does a lot of those anyway. Though I'm not a programmer so can't really say.
I mean we have byteswap on the hard drive, that's a lot faster in pure hardware than on the CPU. Adding a new feature like that into a PLD would be simple, maybe bit flips, inversions, shifts, etc. A PLD could do a lot at 100 MHz speeds. But the CPU overhead of setting up the new hardware may well mean it ends up being quicker on the CPU anyway... It really isn't easy to decide what directions to move in.
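For readers who haven't met it, the byteswap mentioned here just exchanges the two bytes of every 16-bit word. A minimal software sketch in Python follows (the function name `byteswap_words` is made up for the example). In hardware the same thing is only a matter of crossing the two halves of the data bus, which is why it costs nothing there.

```python
def byteswap_words(data: bytes) -> bytes:
    """Swap the high and low byte of every 16-bit word in a buffer."""
    if len(data) % 2:
        raise ValueError("buffer length must be a multiple of 2")
    swapped = bytearray(len(data))
    swapped[0::2] = data[1::2]   # odd bytes move to even positions
    swapped[1::2] = data[0::2]   # even bytes move to odd positions
    return bytes(swapped)


sector = bytes([0x12, 0x34, 0xAB, 0xCD])
print(byteswap_words(sector).hex())
```

Applying the function twice gives back the original buffer.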
-
sporniket
- Site sponsor

- Posts: 1164
- Joined: 26 Sep 2020 21:12
- Location: France
Re: Enhanced Blitter
I am aware of whole computers in FPGA like suska/mist/mister. This summer I had to choose between the H5 project that I had just discovered and the Mister FPGA project. The price tag of the H5, and the fact that my hard drive was too full to install the Altera IDE, pushed me toward the hardware route :D
Well, my fantasies will be addressed in time by myself, and the H5 will be more helpful as an experimental test bed than the Mister project would have been. And the H5 by itself gave way to new fantasies to explore too.
exxos wrote: 24 Dec 2020 02:26
I don't know what hardware features may help in relation to then blitter vs the cpu doing it. Pure logic functions could be done , but I think the blitter does a lot of those anyway.. Though I'm not a programmer so can't really say..

For me, on an ST(e) with a 68000 CPU, the blitter helps a lot with its shifting abilities, which allow using sprites without preshifting. Plus there's the possibility to program big memory moves without the "movem dance", either in hog mode or in a kind of multitasking. Same thing for the DMA sound: no need to write a routine that hogs the CPU to drive the YM. Plus I keep the YM to do bleeps and blops.
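To show what shifting without preshifting means in practice, here is a minimal Python sketch of moving one sprite row right by 0 to 15 pixels, with bits spilling from each 16-bit word into the next. The function `shift_row_right` is made up for this example and only models a single bitplane. It imitates the idea of a two-word window that gets shifted, not the blitter's actual registers.

```python
def shift_row_right(words, shift):
    """Shift a row of 16-bit words right by `shift` pixels (0..15).

    The result is one word longer than the input, because the
    pixels pushed out of the last word need somewhere to go.
    """
    if not 0 <= shift <= 15:
        raise ValueError("shift must be between 0 and 15")
    out = []
    previous = 0
    for word in words:
        word &= 0xFFFF
        # 32-bit window: previous word on top, current word below.
        window = (previous << 16) | word
        out.append((window >> shift) & 0xFFFF)
        previous = word
    # Flush the bits left over from the last word.
    out.append(((previous << 16) >> shift) & 0xFFFF)
    return out


print([hex(w) for w in shift_row_right([0x8001, 0x0000], 1)])
```

Without this, the CPU either does the shifting itself or stores 16 preshifted copies of every sprite in RAM.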