Anyway, it started out way back in my original booster thread on my website https://exxosforum.co.uk/atari/last/16mhz/index.htm
The long and short of it is basically cutting the 16MHz input to the MMU and feeding in 32MHz. Then everything "down the line" gets double clocked. This also needs faster ST-RAM! Then the GLUE needs some work to fix some video issues.
The end result is basically this before fixing the video timings.
Then once you resync the DE signal from the GLUE you can get this.
And as mentioned in this thread the driver actually already exists which Troed experimented with.
viewtopic.php?p=8331#p8331
640x200x16 becomes 320x400x16
1280x200x4 becomes 640x400x4
1280x400x2 becomes 2560x200x2
Now before anyone gets excited (again) this modification was abandoned for various reasons. I know people got upset about this decision, I'm not pleased about it myself but there was really no choice.
The problem is the ST chipset has various "wake up states". When the MMU, for example, takes 32MHz and divides it down to 8MHz etc, there is nothing to reset the clock division flip-flops. So the synchronisation between the 32MHz and 8MHz clocks is somewhat "random". It is very difficult to start running things faster when the foundation is basically unstable with random timings.
In terms of the wake up states, this is off the top of my head so may not be entirely accurate, but this is basically the overall gist of the problem.
The modification would only work reliably with, I believe, WS2. WS1, for example, would mean DTACK would arrive too fast and RAM wouldn't be ready yet, so it crashes. With WS3, for example, DTACK would arrive too late and the system would inherently slow down because RAM access was slower. So you cannot do anything but deliberately put a delay on DTACK to make sure it is always on time or too late. This means it was like 50:50 whether you would end up with 200% bus speed or something slower like 180% speed. Obviously this makes it very difficult to generate resolutions based on unstable timings.
There is basically a one in four chance of it working correctly. If your machine always favoured WS2, for example, you would be pleased that it all works every time. However, the machine may fire up in WS1, which would be a constant failure. But the machine could also power up in WS3 and always be too slow. And because the timings are random, it is anyone's guess how it would power up. One day it could work perfectly fine, and the next day it would refuse to work at all.
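To put the "one in four" idea into something you can play with, here is a rough sketch in Python. It is only a toy model, not a measurement of real hardware: it assumes a 2-bit divider (32MHz down to 8MHz) that powers up in any of its four states with equal probability, and the mapping of counter state to wake state is made up for illustration. WS4 is not described above, so the sketch simply lumps it in with the "slow" case. The names `power_up`, `outcome` and `simulate` are hypothetical.

```python
import random


def power_up(rng):
    """Return the wake state (1-4) a simulated machine lands in.

    Models a 2-bit clock divider whose flip-flops have no reset, so the
    starting count is whatever they happen to settle to at power-on.
    ASSUMPTION: all four starting counts are equally likely.
    """
    counter = rng.randrange(4)   # unknown initial flip-flop state, 0..3
    return counter + 1           # hypothetical mapping: 0..3 -> WS1..WS4


def outcome(wake_state):
    """Classify a wake state using the behaviour described in the post."""
    if wake_state == 2:
        return "ok"      # DTACK lines up with RAM, full speed
    if wake_state == 1:
        return "crash"   # DTACK arrives before RAM is ready
    return "slow"        # DTACK arrives late (WS4 lumped in as an assumption)


def simulate(boots, seed=0):
    """Power the toy machine up `boots` times and count the outcomes."""
    rng = random.Random(seed)
    counts = {"ok": 0, "crash": 0, "slow": 0}
    for _ in range(boots):
        counts[outcome(power_up(rng))] += 1
    return counts


if __name__ == "__main__":
    print(simulate(10000))
```

Over a large number of simulated power-ups, roughly a quarter land in the working state, which is the whole problem: a real machine may favour one state, but you cannot rely on it.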
To further compound the problem, the GLUE also seemed to have its own 2 wake up states. It was 50-50 if the video would be "wrapped" with the first 16 pixels on screen. Read it as screen corruption if you like.
To compound the problem even further, the rise and fall times of the clocks the MMU generates are really bad and need buffering, which can then also upset the timings if you are not careful. On top of that, as we all know, the original machine is not exactly stable to start with. You're fighting with ground bounce problems and ringing on signals, which then need more attention. It really is a world of pain.
To compound the problem yet again, there is explaining all this to the "general public" who will be trying to do this modification themselves. It will be very difficult to work out whether they have a problem with the motherboard, with how they built the modification, or whether it simply does not work because of everything I said above.
Certainly from a "commercial" point of view in selling a modification board for people to wire in, like Nemesis if you like, even if people have a working product to start with, there is still a lot of user error and points of failure. Trying to diagnose these problems would be an ongoing nightmare. It is why I was buying the SMT MMUs, as at least everyone would start off with something "similar". But again it was just way too much work.
But also people have to realise "what's next". Assuming we did get all this 16MHz working, and got it all reasonably stable, what's next? There is no way we could scale up to 32MHz, for example. So I made the decision to just abandon that direction totally and work on the FPGA chipset instead and put all the time and effort into that project. This way we can control every chip and bend it to our desire. We basically have an open door for possibilities in that respect.
So why am I babbling on about higher bus speeds, I hear you ask?
Because the shifter itself does not really do much video wise. What happens is the MMU dumps 4x16bit data to the shifter, which is the four bit planes. The shifter takes one bit from each of those four words to make a pixel, does a bit of internal voodoo to convert that into a colour, and literally dumps it onto the screen based on the 32MHz pixel clock. One thing I did fail to mention is that the shifter clock has to be double clocked for higher horizontal resolutions because you have to output the data twice as fast. But the shifter itself does not really know anything about resolutions.
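To show what "one bit from each of the four words" means in practice, here is a small Python sketch of the bit plane to pixel conversion for one group of 16 pixels. It only models the data shuffling, not the real chip's timing or palette lookup, and the function name `plane_words_to_pixels` is made up for this example.

```python
def plane_words_to_pixels(words):
    """Convert one group of bit-plane words into 16 colour indices.

    words: sequence of 16-bit integers, one per bit plane, plane 0 first.
    The leftmost pixel comes from bit 15 of every word, and plane 0
    supplies the least significant bit of the colour index.
    """
    pixels = []
    for bit in range(15, -1, -1):          # bit 15 is the leftmost pixel
        index = 0
        for plane, word in enumerate(words):
            index |= ((word >> bit) & 1) << plane
        pixels.append(index)
    return pixels


if __name__ == "__main__":
    # Leftmost pixel set in plane 0 only -> colour 1, everything else colour 0.
    print(plane_words_to_pixels([0x8000, 0x0000, 0x0000, 0x0000]))
```

Four words in gives 16 pixels out with 16 possible colours each. The same loop works for two words (4 colours) or one word (2 colours), which is why the shifter does not need to "know" the resolution, only how fast to clock the pixels out.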
Basically it all boils down to memory bandwidth (aka bus speed). But I believe what Atari did on the Falcon was to cripple the bus with a 16bit CPU (well, it's 32bit but only 16bit was used!), while the actual video circuit ran at 32bit. The problem is that when you change your desktop to true colour, for example, it can easily take a couple of seconds to "draw" the background, even though the image itself is in true colour and high resolution. The problem lies in the CPU not being able to write to the video RAM fast enough.
Now I don't know exactly what the ET4000 etc graphics cards do. But I assume it is a similar problem because, at least from a stock machine point of view, the CPU cannot keep up. Basically the graphics card would display the image at whatever resolution or colour depth you want. But of course the CPU cannot magically write that data fast enough.
For example, for argument's sake, say the CPU could generate an image at 50 frames per second at 320x200x16. If we then wanted to double the horizontal resolution to 640x200x16, the CPU simply does not have the bandwidth to do that, so the frames per second would inherently drop to 25. Then if we doubled the vertical resolution to 640x400x16, we would still get that on the screen, but the CPU and its limitation means we are dropping to about 12 frames per second. Then when you start increasing the colour depth, you basically end up with frames which can take a second or even several seconds to update.
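The arithmetic behind that example can be sketched in a few lines of Python. This assumes the only limit is how many bytes per second the CPU can write to screen memory, and it calibrates that budget from the "50 frames per second at 320x200x16" figure used for argument's sake above. It is not a measured number, and the function names are made up for this sketch.

```python
def frame_bytes(width, height, colours):
    """Size of one frame in bytes for a given number of colours."""
    bits_per_pixel = (colours - 1).bit_length()   # 16 colours -> 4 bits
    return width * height * bits_per_pixel // 8


def fps_at(width, height, colours, bytes_per_second):
    """Frames per second if the CPU can write bytes_per_second to the screen."""
    return bytes_per_second / frame_bytes(width, height, colours)


# ASSUMPTION: the CPU write budget is whatever gives 50 fps at 320x200x16.
budget = 50 * frame_bytes(320, 200, 16)

if __name__ == "__main__":
    for mode in [(320, 200, 16), (640, 200, 16), (640, 400, 16), (640, 400, 256)]:
        print(mode, fps_at(*mode, budget))
```

With that budget, 640x200x16 works out at 25 frames per second and 640x400x16 at 12.5, and going to 256 colours at 640x400 halves it again. Every doubling of the data per frame halves the frame rate unless the bandwidth goes up with it.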
So therein lies the problem: the higher resolutions and higher colour depths have to get the bandwidth from somewhere. In terms of a stock machine, such graphics cards are only going to be useful for displaying pictures or basic DTP applications. If you want to get around such limitations, you then have to increase the CPU speed, which is then getting into accelerators.
But from my point of view adding a graphics card and an accelerator is a bit of a "hack job" because it does not fix the fundamental problem that the bus speed needs to be increased. If the bus speed is running 4x faster, we can get resolutions relatively easily and we don't need accelerators and we don't need graphics cards.
It is also partly why I have not had much enthusiasm for the 64MHz SEC booster in recent years. On one hand having a faster CPU is great. The 68HC000 is going to max out at 32MHz. So that CPU becomes a bottleneck.
But the problem is you cannot just simply change the CPU and double the clock speed from 32MHz to 64MHz. In terms of original machines you might as well forget it, as they are just so inherently unstable to start with. What I did with the later SEC booster is to add bus isolators to physically disconnect the ST bus so the CPU is not having to drive the entire motherboard's capacitance on each signal.
I should probably touch on this as well. The H4/H5 series uses solid power planes (0V and 5V), which increases stability no end. The problem is that every signal running across the board then has capacitive coupling to the ground plane, which increases its capacitance, i.e. makes it harder to drive at high speeds. It is why I did this design, because then it would have its own fast-RAM running at 64MHz on its own "fast bus" which is physically disconnected from the "ST side bus". This way the CPU isn't having to drive 300pF of signal capacitance at a higher clock rate.
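As a very rough feel for why 300pF is a problem at 64MHz, the sketch below uses the standard first-order RC approximation, where the 10% to 90% rise time is about 2.2 x R x C. The 300pF figure is from above, but the 50 ohm driver output resistance is purely an assumed number for illustration, not a measured value for any CPU or board, and real signals with ringing and ground bounce behave worse than a simple RC model.

```python
def rise_time_ns(resistance_ohms, capacitance_pf):
    """Approximate 10%-90% rise time of a first-order RC load, in ns.

    Uses t_rise ~= 2.2 * R * C. Ohms times picofarads gives picoseconds,
    so divide by 1000 to get nanoseconds.
    """
    return 2.2 * resistance_ohms * capacitance_pf / 1000.0


def clock_period_ns(mhz):
    """Length of one clock period in nanoseconds."""
    return 1000.0 / mhz


if __name__ == "__main__":
    assumed_driver_ohms = 50   # ASSUMPTION: illustrative output resistance only
    print("rise time:", rise_time_ns(assumed_driver_ohms, 300), "ns")
    print("64MHz period:", clock_period_ns(64), "ns")
```

With those assumed numbers the rise time comes out at roughly 33ns against a 64MHz clock period of about 15.6ns, so the signal would not even finish rising within one clock. Cutting the load down with bus isolators attacks the C in that equation.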
Aside from the usual lack of time for projects lately, as I have said many times before, accelerators are basically obsolete now anyway. Why bother having TT-RAM running at 32MHz or whatever when you can run the entire bus at that speed? In terms of a 68000, we can simply have ~14MB of ST-RAM running at 32MHz. There is no need to mess around with "alt-ram", which undoubtedly has compatibility issues.
When we get up to 32-bit CPUs we then have the issues we have seen on the Falcon and others, where the shifter, blitter, DMA etc do not have access to memory over the 16MB mark. This again causes compatibility issues and another world of pain. This is why we are working on our own blitter, because hopefully we can simply do a 32-bit version and do away with those problems as well.
So I hope this at least clears things up as to the "entire problem": why I abandoned the 16MHz mod and went for a full FPGA rebuild of the ST's chipset, and how it all relates to video resolutions etc. Plus all the related problems and also "what's next, and what's next after that". There is no use ploughing thousands of hours into the 16MHz mod knowing it is never going to be stable and knowing it is a total brick wall after all that work anyway. A guy at work always used to say, it takes just as long to bodge it as it does to do it properly.
Overall, while we could have rare graphics cards and accelerators to get higher resolutions and colours and faster screen updates etc etc, it is still really just a hack job in my opinion. If the chipset is recreated and built properly from the ground up, we would not need all those "hack jobs". We could just do things properly from the ground up and then take things from there.


