PiStorm Chat

General discussions or ideas about hardware.
Badwolf
Posts: 1313
Joined: Tue Nov 19, 2019 12:09 pm

Re: PiStorm Chat

Post by Badwolf »

alexh wrote: Wed Jul 13, 2022 11:35 pm Nice. How did you fix processors that have more than 24 address bits?
The emulator was refusing to attempt to pass through any addresses with the top byte set. Of course we know that 0xFFnnnnnn needs to map to 0x00nnnnnn, so I put in a translation for that before the top byte check*.

Not efficient, needs to be moved to platform specific code at some point, but good enough to show them working.
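The translation described above can be sketched roughly as follows. This is only an illustration in Python, not the emulator's actual code: the name `translate_address` and the use of `None` for "do not pass through" are assumptions made for the example.

```python
from typing import Optional

ADDR24_MASK = 0x00FFFFFF  # the 24 address bits that reach the ST bus


def translate_address(addr: int) -> Optional[int]:
    """Return the 24-bit bus address, or None if it must not be passed through.

    Hypothetical helper: folds 0xFFnnnnnn onto 0x00nnnnnn first, then applies
    the "top byte must be clear" check described in the post.
    """
    addr &= 0xFFFFFFFF          # treat the input as a 32-bit address
    if (addr >> 24) == 0xFF:    # 0xFFnnnnnn mirrors 0x00nnnnnn
        addr &= ADDR24_MASK
    if (addr >> 24) != 0:       # any other top byte is not a bus address
        return None
    return addr
```

With this ordering, an access such as 0xFFFF8240 ends up on the bus as 0x00FF8240, while an address with any other non-zero top byte is still refused.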

No further speed gain once we get to the 020, but it's nice to have an STE with a full 040 processor and 516MB of RAM. :lol:

BW

* EmuTOS makes extensive use of 0xFFFFnnnn registers before any MMU is set up. I presume this is deliberate and expected to be handled in hardware.
DFB1 Open source 50MHz 030 and TT-RAM accelerator for the Falcon
DSTB1 Open source 16MHz 68k and AltRAM accelerator for the ST
Smalliermouse ST-optimised USB mouse adapter based on SmallyMouse2
FrontBench The Frontier: Elite 2 intro as a benchmark
Badwolf
Posts: 1313
Joined: Tue Nov 19, 2019 12:09 pm

Re: PiStorm Chat

Post by Badwolf »

OK, so after some truly heinous hacking, a *partially* working bus error emulation seems to have fallen out.

PiStormST can now boot from stock OS images. Here's 2.06 doing its thing.

rsz_fylao_-xoaeaox2.jpg (72.46 KiB)
rsz_fylapaixoairm_5.jpg (78.81 KiB)

Next step will be to try to understand what's going on and tidy this whole thing up. Then it'll be looking into why ST-RAM accesses are so slow. My gut feeling is it's protocol limitations, but we'll see.

There's an alternative asynchronous firmware called PiStormX which I fancy is worth looking at. Ties in better with how I approach bus cycles in my other firmwares.

Cheers,

BW.
agranlund
Posts: 774
Joined: Sun Aug 18, 2019 10:43 pm
Location: Sweden

Re: PiStorm Chat

Post by agranlund »

Badwolf wrote: Fri Jul 22, 2022 12:30 pm OK, so after some truly heinous hacking, a *partially* working bus error emulation seems to have fallen out.

PiStormST can now boot from stock OS images. Here's 2.06 doing its thing.
Awesome progress!!
Cyprian
Posts: 277
Joined: Fri Dec 22, 2017 9:16 am
Location: Poland

Re: PiStorm Chat

Post by Cyprian »

great
Mega ST 1 / 7800 / Portfolio / Lynx II / Jaguar / TT030 / Mega STe / 800 XL / 1040 STe / Falcon030 / 65 XE / 520 STm / SM124 / SC1435
DDD HDD / AT Speed C16 / TF536 / SDrive / PAK68/3 / Lynx Multi Card / LDW Super 2000 / XCA12 / SkunkBoard / CosmosEx / SatanDisk / UltraSatan / USB Floppy Drive Emulator / Eiffel / SIO2PC / Crazy Dots / PAM Net
Hatari / Steem SSE / Aranym / Saint
http://260ste.atari.org
Badwolf
Posts: 1313
Joined: Tue Nov 19, 2019 12:09 pm

Re: PiStorm Chat

Post by Badwolf »

IMG_5688.jpg (159.47 KiB)

Treating the symptom rather than the cause, but stability greatly improved by a late-in-the-day check for IPL mismatch.

The issue I was seeing repeatedly was that the interrupt acknowledge cycle was often generating a bus error because there was no actual interrupt to acknowledge!

The IRQ system in PiStorm is very opaque and any change to what often appear to be magic numbers causes it all to fall apart. So I've kind of admitted defeat for now and implemented this check. I have still seen a couple of errors, but things are better.
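As a rough illustration of what a late IPL check of this kind might look like, here is a minimal Python sketch. It is not PiStorm's code: `ipl_still_pending`, `service_interrupt`, `read_ipl` and `run_iack` are hypothetical names, and the IPL value is assumed to be already decoded into a level from 0 to 7, with 0 meaning no interrupt.

```python
from typing import Callable

IPL_MASK = 0x7  # 68000 interrupt priority level is a 3-bit value


def ipl_still_pending(latched_level: int, read_ipl: Callable[[], int]) -> bool:
    """Re-sample the IPL and confirm it still matches the level we latched."""
    current = read_ipl() & IPL_MASK
    return current != 0 and current == (latched_level & IPL_MASK)


def service_interrupt(latched_level: int,
                      read_ipl: Callable[[], int],
                      run_iack: Callable[[int], None]) -> bool:
    """Run the interrupt acknowledge cycle only if the interrupt is still there.

    Returns True if the acknowledge cycle was run, False if it was skipped.
    """
    if not ipl_still_pending(latched_level, read_ipl):
        return False  # mismatch: skip IACK rather than risk a bus error
    run_iack(latched_level)
    return True
```

The point of the check is simply ordering: the IPL is read again immediately before the acknowledge cycle, so an interrupt that has gone away in the meantime is dropped rather than acknowledged.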

The big issue is still bus access speed. Unfortunately, the gap between bus cycles is too large: one cycle to the next is ~1us, whereas on a proper 8MHz 68000 it's ~500ns. Hence I'm seeing 50% speed (the figures above should be 3.7MB/s).
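The 50% figure follows directly from the cycle-to-cycle times. A small worked example, assuming each bus cycle moves one 16-bit word and nothing else competes for the bus (so the result is an idealised upper bound, which is why it comes out slightly above the 3.7MB/s reference figure); the function name is made up for illustration:

```python
def bus_throughput_mb_s(cycle_to_cycle_ns: float, bytes_per_cycle: int = 2) -> float:
    """Idealised throughput in MB/s for back-to-back bus cycles.

    bytes per nanosecond is GB/s, so multiply by 1000 to get MB/s.
    """
    return bytes_per_cycle * 1000.0 / cycle_to_cycle_ns


real_68000 = bus_throughput_mb_s(500)    # ~500ns per cycle on an 8MHz 68000
pistorm_st = bus_throughput_mb_s(1000)   # ~1us per cycle as measured in the post
relative_speed = pistorm_st / real_68000
```

Doubling the gap between cycles halves the throughput, whatever the exact per-cycle payload is.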

There are two contributory factors to this. Firstly, turning on bus (and address) errors costs time. Secondly, the Amiga asserts its data earlier in the cycle, so PiStorm can normally abandon the cycle early, which brings the cycle-to-cycle time down. We can't do that on the ST, as data is only asserted at the 'correct' time, so we have to go through a full three-clock bus cycle.

Of course, turning on the features (TT-RAM, 020 processor, virtual ROM) mitigates this to some extent on OS-based apps (see below), but the RAM speed still never gets above 57%.

I might have to learn how to do profiling on a multithreaded Linux app to know where to go next.

BW

IMG_5689.jpg (194.54 KiB)
exxos
Site Admin
Posts: 18962
Joined: Wed Aug 16, 2017 11:19 pm
Location: UK

Re: PiStorm Chat

Post by exxos »

That sounds all rather painful!
https://www.exxosforum.co.uk/atari/ All my hardware guides - mods - games - STOS
https://www.exxosforum.co.uk/atari/store2/ - All my hardware mods for sale - Please help support by making a purchase.
viewtopic.php?f=17&t=1585 Have you done the Mandatory Fixes ?
Just because a lot of people agree on something, doesn't make it a fact. ~exxos ~
People should find solutions to problems, not find problems with solutions.
alexh
Posts: 238
Joined: Tue Oct 17, 2017 4:51 pm
Location: Oxfordshire

Re: PiStorm Chat

Post by alexh »

If you think that it is related to the performance of the software, can you prove this by overclocking (or underclocking) your Pi?
Principal ASIC Engineer - SystemVerilog, VHDL
Thalion Webshrine - http://thalion.atari.org
STfm,STe,MegaSTe,Falcon060
A500+,A600,A4000/060,CD32,CDTV
Badwolf
Posts: 1313
Joined: Tue Nov 19, 2019 12:09 pm

Re: PiStorm Chat

Post by Badwolf »

alexh wrote: Wed Aug 03, 2022 9:47 am If you think that it can be solved by increasing performance of the software can you prove this by over clocking your Pi?
I don't think the RPi3 supports overclocking and, if I had to guess, I'd say it's unlikely we can bridge that gap as things stand.

That said, it's an untested theory ATM. Finding a way to log the cycle-to-cycle times when using 'virtual' memory (e.g. Pi-based TT-RAM) would give us more evidence.
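One way such logging could be analysed, sketched in Python under the assumption that a nanosecond timestamp has been recorded at the start of each memory access (for instance from a monotonic clock). The function names and the 100ns bucket size are arbitrary choices for the example, and taking the timestamps would itself add some overhead to each cycle.

```python
from collections import Counter
from typing import Counter as CounterType, List, Sequence


def cycle_gaps_ns(timestamps_ns: Sequence[int]) -> List[int]:
    """Gap between each access and the next, in nanoseconds."""
    return [b - a for a, b in zip(timestamps_ns, timestamps_ns[1:])]


def gap_histogram(timestamps_ns: Sequence[int], bucket_ns: int = 100) -> CounterType[int]:
    """Count gaps per bucket; keys are the lower edge of each bucket in ns."""
    return Counter((gap // bucket_ns) * bucket_ns
                   for gap in cycle_gaps_ns(timestamps_ns))
```

A histogram rather than an average is used here because it shows whether the gaps cluster around one value or spread out, which a single mean would hide.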

The fact that emulated TT-RAM was only clocking similar figures to DFB1 (and lower than TF536) in my screenshot above when internal RAM accesses are effectively instant does suggest to me a fundamental bottleneck in the emulation ATM.

Emu68 (the bare metal software for PiStorm) might be an alternative way of circumventing this. It does have the downside that nice-to-have features like network and RTC become orders of magnitude harder, though (drivers have to be written on *both* sides of the PiStorm board).

BW
alexh
Posts: 238
Joined: Tue Oct 17, 2017 4:51 pm
Location: Oxfordshire

Re: PiStorm Chat

Post by alexh »

Badwolf wrote: Wed Aug 03, 2022 9:54 am
alexh wrote: Wed Aug 03, 2022 9:47 am If you think that it can be solved by increasing performance of the software can you prove this by over clocking your Pi?
I don't think the RPi3 supports overclocking and
Yes? I was sure it did, but maybe not. Perhaps turn off "turbo"?
Badwolf wrote: Wed Aug 03, 2022 9:54 am if I had to guess, I'd say it's unlikely we can bridge that gap as things stand.
I wasn't suggesting overclocking as a fix, more... if extra/fewer CPU cycles improve/degrade performance then you can look to optimise the code. If they don't, then maybe it is latency in the Pi<->CPLD interface, which is independent of emulation performance?
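The experiment can be reduced to a single number: how much the measured speed changes relative to how much the Pi's clock changed. A minimal Python sketch, with a made-up function name and no real measurements:

```python
def clock_sensitivity(clock_a_mhz: float, perf_a: float,
                      clock_b_mhz: float, perf_b: float) -> float:
    """Fractional change in performance per fractional change in CPU clock.

    Around 1.0 suggests the benchmark scales with the Pi's CPU (so optimising
    the code should help); around 0.0 suggests the limit lies elsewhere, such
    as the Pi<->CPLD interface.
    """
    clock_change = clock_b_mhz / clock_a_mhz - 1.0
    if clock_change == 0:
        raise ValueError("the two runs must use different clock speeds")
    perf_change = perf_b / perf_a - 1.0
    return perf_change / clock_change
```

Any consistent performance metric works for `perf_a` and `perf_b` (for example a RAM benchmark figure in MB/s), as long as both runs use the same one.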
Badwolf wrote: Wed Aug 03, 2022 9:54 am The fact that emulated TT-RAM was only clocking similar figures to DFB1 (and lower than TF536) in my screenshot above when internal RAM accesses are effectively instant does suggest to me a fundamental bottleneck in the emulation ATM.
TF536 uses SDR DRAM? Could it be the Pi's DDR3 latency? DDR3's higher latency is usually compensated for with CPU caches etc. But if the emulation is somehow doing something which negates the effect of the cache then it might have an effect? Probably not but I thought I'd throw it out there.
Badwolf
Posts: 1313
Joined: Tue Nov 19, 2019 12:09 pm

Re: PiStorm Chat

Post by Badwolf »

alexh wrote: Wed Aug 03, 2022 12:05 pm Perhaps turn off "turbo"?
Can try that as an experiment, but the cycles are quite quantised, so it would need a big change to make any noticeable difference, I suspect.
Badwolf wrote: Wed Aug 03, 2022 9:54 am if I had to guess, I'd say it's unlikely we can bridge that gap as things stand.
I wasn't suggesting overclocking as a fix, more... if extra/fewer CPU cycles improve/degrade performance then you can look to optimise the code. If they don't, then maybe it is latency in the Pi<->CPLD interface, which is independent of emulation performance?
I wasn't referring to overclocking. I suspect it's unlikely we'll find a 2x speed gain via any cumulative mechanism, unless I have a real head-slap moment and find a silly mistake.

BW