TF536 - Atari firmware - Rev2 TF536

agranlund · Post by **agranlund** » Mon Jul 25, 2022 3:38 pm

Badwolf wrote: ↑Mon Jul 25, 2022 11:30 am I have no idea how it'd know what's changed and not! Getting into too advanced techniques for me there

I wasn't thinking of anything fancy, just a brute-force compare against the destination on copy (cheap to read because of the L2) and hoping that the total overhead of testing is less than the cost of writing.

I may have misunderstood the instruction timings and my numbers and calculations might be (probably are) completely flawed..
Plus you'd have to somehow factor in what difference in speed building the screen in tt-ram + copy makes compared to stock Frontier.
I really have no idea...

Assuming your fast->st copy goes somewhat like this:

	// copy stuff
	movem.l	(a0)+,d0-d7		// 12 + 64:fast
	movem.l d0-d7,x(a1)		// 12 + 64:slow

ref: 88 @50mhz + 64 @8mhz

Some kind of brute force copy-if-different:

	// compare-and-copy the same amount of stuff
	movem.l	(a0)+,d0-d7		// 12 + 64 
	// ---- 
	cmp.l	0(a1),d0		// 12
	beq.s	.skip0			// 10 taken, 8 not taken
	move.l	d0,0(a1)		// 8 + 8:slow
.skip0: // ---- etc 7 more times

best: 252 @50mhz
worst: 300 @50mhz + 64 @8mhz

And then assuming (yes, there's an awful lot of assuming going on here) that one 8mhz cycle is "worth" about 7(?) 50mhz cycles?

best:   252
ref:   ~536
worst: ~748

Copy procedure should at least be faster when nothing was changed and slower when everything was changed.
..until someone with a better understanding of 68k instruction timings points out all the stuff I got wrong.
I suspect the timings for reading from fastram would be shorter still due to 32bit access, burstmode, cache and whatnot.

Wonder if it would be worth it to try and make the L2 cache more clever and only write-through to ST-RAM on changed data?
I think the overhead would kill you, but an interesting experiment if you could make the hardware do it.

I was thinking the hardware would do that on st-ram writes. There would be an overhead of at least one fastram read on every st-ram write.
Probably a huge benefit when a lot of writes can be early-outed, but more expensive if the opposite is true. I don't really know how to implement that though, but it would be an interesting experiment.

Cyprian · Post by **Cyprian** » Mon Jul 25, 2022 4:29 pm

I wonder whether it would be possible to replace the ST-RAM with a dual-port memory.
In that case the accelerator could have fast, non-blocked, 50MHz access to the ST-RAM.

Matej · Post by **Matej** » Mon Jul 25, 2022 4:37 pm

I want to buy a TF536 flashed with the ST firmware. Is anyone selling those? I got my TT upgraded and now I want an STFM upgrade.

agranlund · Post by **agranlund** » Mon Jul 25, 2022 8:57 pm

exxos wrote: ↑Mon Jul 25, 2022 10:31 am Maybe games like Xenon with a scrolling background which is always changing may not show much speed improvement. Only a guess of course.

Yep, if everything changes all the time then it'll end up more expensive trying to be clever about what to write.

I don't know about you.. maybe I'm getting old but quite a few of these old 2D games are already running at unplayable speeds

(The ones that do care about timing are indeed very smooth now)

Giana Sisters with software fine-scrolling runs very smoothly with the L2; it's basically on par with the blitter scroll version now.
And the Dungeon Master door opens really really really fast!

I tested quite a few games loading off the Ultrasatan and the dma-fix/hack seems to work fine. Should try some more from floppy too.
Many games didn't run of course but that's always been the case and it's hard to keep track of which ones prefer TOS or EmuTOS.
From what I can tell the KLZ releases seemed to have a high success rate and the PPera ones were a bit more hit and miss (at least now, running on EmuTOS)

Badwolf wrote: ↑Sun Jul 24, 2022 8:55 pm I did a special version of Frontbench for someone (sorry, I forget who) that did all screen writes to TT-RAM and only blitted the end result to ST-RAM.
It turned out to be slower on the Falcon, but on the ST where writes are sub-4MB/s the overhead may be worth it? I wonder if I can find it.

That version shows 4033 on my ST, so a bit slower on the ST too but not by much.
Looking at how much of the screen is identical between each frame in Frontbench, I'm willing to bet (a very small bet) that it may well turn out faster on the ST to spend more (fast) cycles on a brute-force compare-before-write in order to avoid more of those (slow) write cycles.

Edit: When doing the same comparison with L2 disabled, your special version actually comes out on top. Only by about 100 points or so but still.. goes to show how slow the bus is on the ST

Post by **exxos** » Mon Jul 25, 2022 9:00 pm

agranlund wrote: ↑Mon Jul 25, 2022 8:57 pm I don't know about you.. maybe I'm getting old but quite a few of these old 2D games are already running at unplayable speeds
(The ones that do care about timing are indeed very smooth now )

Try castle master

Maybe we should get @Badwolf to do a frame counter on that as well

agranlund · Post by **agranlund** » Mon Jul 25, 2022 9:24 pm

@exxos, I'm not sure how Castle Master normally runs but those sharks sure swim fast

Post by **exxos** » Tue Jul 26, 2022 6:56 pm

agranlund wrote: ↑Mon Jul 25, 2022 9:24 pm @exxos, I'm not sure how Castle Master normally runs but those sharks sure swim fast

It runs at like 0.5FPS normally

So I can soon retire, now that I can (almost) play Castle Master at a reasonable frame rate

Badwolf · Post by **Badwolf** » Tue Jul 26, 2022 7:18 pm

agranlund wrote: ↑Mon Jul 25, 2022 3:38 pm Some kind of brute force copy-if-different:
	// compare-and-copy the same amount of stuff
	movem.l	(a0)+,d0-d7		// 12 + 64 
	// ---- 
	cmp.l	0(a1),d0		// 12
	beq.s	.skip0			// 10 taken, 8 not taken
	move.l	d0,0(a1)		// 8 + 8:slow
.skip0: // ---- etc 7 more times

I've implemented a quick hack (more overhead than the above, but was quick to do at lunchtime) and sent it to Anders. Will be fun to find the results!

BW

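    move.w #15999,d0            ; word counter: 16000 words = one 32000-byte screen
screenloop:
    move.w (a1)+,d1             ; next word from the source buffer
    cmp.w (a0),d1               ; same as what is already on screen?
    beq screenskipwrite         ; if eq, skip
    ; else
    move.w d1,(a0)+             ; write to actual screen
    dbra.w d0,screenloop
    bra.b col_end
screenskipwrite:
    addq.l #2,a0                ; step over the unchanged word
    dbra.w d0,screenloop
col_end: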

ijor · Post by **ijor** » Tue Jul 26, 2022 8:48 pm

agranlund wrote: ↑Sun Jul 24, 2022 6:22 pm Yep exactly that!
I basically just bulk copy the DMA'ed data into the shadow on end-of-DMA.

The cpld helps by shadowing all writes to hardware registers (I had plenty of unused space in the reserved region anyway).
The interrupt handler gets hold of the start address by reading the shadowed DMA address counter, and the end address is gotten by reading the real DMA address counter. Then copy. It ended up being quite a small piece of code.

Very nice!

But why are you using an interrupt routine to copy the data? Can't you just write to fast RAM on the fly? Or do you not have enough CPLD resources to implement that?

agranlund · Post by **agranlund** » Wed Jul 27, 2022 2:21 am

Latest firmware with the L2 stuff:

tf536_2022_07_27_ATARI_alpha.zip (107.38 KiB)

Also support tools:

maprom_220727.zip (20.96 KiB)

(or on github: https://github.com/agranlund/tftools )

These are new re-implementations of fastram.prg and maprom.prg and they are meant to completely replace the old versions from now on.
I'm no longer building the old ones but I'll be keeping the source code around under "src/maprom_old".

The new maprom is required in order to use TF536r2 firmware features, but the program is not specifically tied to this card and firmware.
It *should* work with TF534 and TF536 just as before but with some added features and most importantly: hopefully much easier to maintain, and for other people to modify, as it's mostly in C now.

It creates a "maprom.inf" file on the root of the boot drive at first launch. This is the settings file and it can be edited in a text editor of choice (there's no options GUI yet).

PS. @Badwolf, I hope I am detecting the DFB1 correctly?
It doesn't do anything special for it, but it's nice if the card name shows up correctly

https://github.com/agranlund/tftools/bl ... ard_DFB1.c

I wanted to change the name to something other than maprom but I couldn't come up with anything..
It's not just for putting the rom in fastram after all, it does a bunch of other stuff too
