PiStorm Chat

General discussions or ideas about hardware.
Badwolf
Site sponsor
Posts: 3043
Joined: 19 Nov 2019 12:09

Re: PiStorm Chat

Post by Badwolf »

alexh wrote: 13 Jul 2022 23:35 Nice. How did you fix processors that have more than 24 address bits?
The emulator was refusing to attempt to pass through any addresses with the top byte set. Of course we know that 0xFFnnnnnn needs to map to 0x00nnnnnn, so I put in a translation for that before the top byte check*.
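For anyone following along, the translation described above could look roughly like the sketch below. This is only an illustration: `fold_top_byte`, `is_bus_address` and `to_bus_address` are hypothetical names, not functions from the PiStorm source, and it assumes the emulator sees a full 32-bit address and only the lower 24 bits reach the physical bus.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: fold the 0xFFnnnnnn alias onto 0x00nnnnnn so that
 * register accesses such as 0xFFFF8240 land on the 24-bit bus. */
uint32_t fold_top_byte(uint32_t addr)
{
    if ((addr & 0xFF000000u) == 0xFF000000u)
        return addr & 0x00FFFFFFu;
    return addr;
}

/* Hypothetical top-byte check, applied after folding: only addresses
 * that fit in 24 bits are passed through to the real bus. */
bool is_bus_address(uint32_t addr)
{
    return (fold_top_byte(addr) & 0xFF000000u) == 0;
}

/* Returns the 24-bit address to drive; caller must have checked
 * is_bus_address() first. */
uint32_t to_bus_address(uint32_t addr)
{
    assert(is_bus_address(addr));
    return fold_top_byte(addr);
}
```

With this ordering, 0xFFFF8240 is folded to 0x00FF8240 and passed through, while something like 0x01000000 still fails the top-byte check and stays on the emulated side.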

Not efficient, needs to be moved to platform specific code at some point, but good enough to show them working.

No further speed gain once we get to the 020, but it's nice to have an STE with a full 040 processor and 516MB of RAM. :lol:

BW

* EmuTOS makes extensive use of 0xFFFFnnnn registers before any MMU is set up. I presume this is deliberate and expected to be handled in hardware.
DFB1 Open source 50MHz 030 and TT-RAM accelerator for the Falcon
Smalliermouse ST-optimised USB mouse adapter based on SmallyMouse2
FrontBench The Frontier: Elite 2 intro as a benchmark
Badwolf
Site sponsor
Posts: 3043
Joined: 19 Nov 2019 12:09

Re: PiStorm Chat

Post by Badwolf »

OK, so after some truly heinous hacking, a *partially* working bus error emulation seems to have fallen out.

PiStormST can now boot from stock OS images. Here's 2.06 doing its thing.

rsz_fylao_-xoaeaox2.jpg
rsz_fylapaixoairm_5.jpg

Next step will be to try to understand what's going on and tidy this whole thing up. Then it'll be looking into why ST-RAM accesses are so slow. My gut feeling is it's protocol limitations, but we'll see.

There's an alternative asynchronous firmware called PiStormX which I fancy is worth looking at. Ties in better with how I approach bus cycles in my other firmwares.

Cheers,

BW.
agranlund
Site sponsor
Posts: 1752
Joined: 18 Aug 2019 22:43
Location: Sweden

Re: PiStorm Chat

Post by agranlund »

Badwolf wrote: 22 Jul 2022 12:30 OK, so after some truly heinous hacking, a *partially* working bus error emulation seems to have fallen out.

PiStormST can now boot from stock OS images. Here's 2.06 doing its thing.
Awesome progress!!
Cyprian
Posts: 542
Joined: 22 Dec 2017 09:16
Location: Warszawa, Poland

Re: PiStorm Chat

Post by Cyprian »

great
ATW800/2 / V4sa / Lynx I / Mega ST 1 / 7800 / Portfolio / Lynx II / Jaguar / TT030 / Mega STe / 800 XL / 1040 STe / Falcon030 / 65 XE / 520 STm / SM124 / SC1435
DDD HDD / AT Speed C16 / TF536 / SDrive / PAK68/3 / Lynx Multi Card / LDW Super 2000 / XCA12 / SkunkBoard / CosmosEx / SatanDisk / UltraSatan / USB Floppy Drive Emulator / Eiffel / SIO2PC / Crazy Dots / PAM Net
http://260ste.atari.org
Badwolf
Site sponsor
Posts: 3043
Joined: 19 Nov 2019 12:09

Re: PiStorm Chat

Post by Badwolf »

IMG_5688.jpg

Treating the symptom rather than the cause, but stability greatly improved by a late-in-the-day check for IPL mismatch.

The issue I was seeing repeatedly was that the interrupt acknowledge cycle was often generating a bus error because there was no actual interrupt to acknowledge!

The IRQ system in PiStorm is very opaque, and any change to what often appear to be magic numbers causes it all to fall apart. So I've kind of admitted defeat for now and implemented this check. I have still seen a couple of errors, but things are better.
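A late IPL mismatch check of the kind described above might be sketched as follows. This is a guess at the shape of such a check, not the actual PiStormST code: `should_acknowledge`, `latched_ipl` and `current_ipl` are hypothetical names, and it assumes the emulator can re-read the IPL lines just before starting the interrupt acknowledge cycle.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical: decide whether to start an interrupt acknowledge cycle.
 * latched_ipl is the level the emulator decided to service earlier;
 * current_ipl is the level re-read from the bus just before the cycle. */
bool should_acknowledge(uint8_t latched_ipl, uint8_t current_ipl)
{
    assert(latched_ipl <= 7 && current_ipl <= 7); /* IPL is a 3-bit level */

    if (current_ipl == 0)
        return false;                  /* nothing pending any more */

    return current_ipl == latched_ipl; /* on mismatch, skip and re-evaluate */
}
```

Skipping the cycle when the levels disagree avoids acknowledging an interrupt that is no longer there, which is the case that was producing the bus errors. It does not explain why the levels disagree in the first place.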

The big issue is still bus access speed. Unfortunately the gap between bus cycles is too large. One cycle to the next is ~1us whereas on a proper 8MHz 68000 it's ~500ns. Hence I'm seeing 50% speed (the figures above should be 3.7MB/s).
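The 50% figure follows directly from the cycle-to-cycle times. A small sketch of the arithmetic, assuming a 16-bit bus that moves 2 bytes per cycle (the function names are made up for illustration):

```c
#include <assert.h>

/* Bytes moved per second if every bus cycle transfers bytes_per_cycle
 * and consecutive cycles start cycle_ns nanoseconds apart. */
double bus_bytes_per_second(double bytes_per_cycle, double cycle_ns)
{
    assert(cycle_ns > 0.0);
    return bytes_per_cycle * 1e9 / cycle_ns;
}

/* Speed relative to a reference machine, from cycle-to-cycle times. */
double relative_speed(double reference_cycle_ns, double observed_cycle_ns)
{
    assert(observed_cycle_ns > 0.0);
    return reference_cycle_ns / observed_cycle_ns;
}
```

Under that assumption, `bus_bytes_per_second(2.0, 500.0)` gives a ceiling of 4,000,000 bytes/s for back-to-back cycles on a real 68000, in the same region as the 3.7MB/s mentioned above, and `relative_speed(500.0, 1000.0)` gives 0.5.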

Two factors contribute to this. Firstly, turning on bus (and address) errors costs time. Secondly, the Amiga asserts its data earlier in the cycle, so PiStorm can normally just abandon the cycle early, which brings the cycle-to-cycle time down. We can't do that on the ST, as data is only asserted at the 'correct' time, so we have to go through a full three-clock bus cycle.

Of course, turning on the features (TT-RAM, 020 processor, virtual ROM) mitigates this to some extent in OS-based apps (see below), but the RAM speed still never gets above 57%.

I might have to learn how to do profiling on a multithreaded Linux app to know where to go next.

BW

IMG_5689.jpg
exxos
Site Admin
Posts: 28367
Joined: 16 Aug 2017 23:19
Location: UK

Re: PiStorm Chat

Post by exxos »

That sounds all rather painful!
alexh
Site sponsor
Posts: 1340
Joined: 17 Oct 2017 16:51
Location: Oxfordshire

Re: PiStorm Chat

Post by alexh »

If you think that it is related to the performance of the software, can you prove this by overclocking (or underclocking) your Pi?
Senior Principal ASIC Engineer - SystemVerilog, VHDL
Thalion Webshrine - http://thalion.atari.org
ST,STf,STfm,STe,MegaST,MegaSTe,Falcon060
A500+,A600,A4000/060,CD32,CDTV
Badwolf
Site sponsor
Posts: 3043
Joined: 19 Nov 2019 12:09

Re: PiStorm Chat

Post by Badwolf »

alexh wrote: 03 Aug 2022 09:47 If you think that it can be solved by increasing performance of the software can you prove this by over clocking your Pi?
I don't think the RPi3 supports overclocking and, if I had to guess, I'd say it's unlikely we can bridge that gap as things stand.

That said, it's an untested theory ATM. Finding a way to log the cycle-to-cycle times when using 'virtual' memory (e.g. Pi-based TT-RAM) would give us more evidence.
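One low-overhead way to log cycle-to-cycle times would be a histogram of gaps between timestamps taken with POSIX `clock_gettime(CLOCK_MONOTONIC, ...)`. The sketch below is an assumption about how that could be done, not something taken from the emulator: `gap_log`, `now_ns` and `gap_log_record` are hypothetical names, and the 100ns bucket width is arbitrary.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

#define GAP_BUCKETS 16 /* 100 ns per bucket; the last bucket is overflow */

typedef struct {
    bool     have_last;            /* false until the first cycle is seen */
    uint64_t last_ns;              /* timestamp of the previous cycle */
    uint64_t counts[GAP_BUCKETS];  /* histogram of cycle-to-cycle gaps */
} gap_log;

/* Current monotonic time in nanoseconds. */
uint64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Record one emulated memory access that started at t_ns. */
void gap_log_record(gap_log *log, uint64_t t_ns)
{
    assert(log != NULL);
    if (log->have_last) {
        uint64_t bucket = (t_ns - log->last_ns) / 100;
        if (bucket >= GAP_BUCKETS)
            bucket = GAP_BUCKETS - 1;
        log->counts[bucket]++;
    }
    log->last_ns = t_ns;
    log->have_last = true;
}
```

Calling `gap_log_record(&log, now_ns())` at the start of each emulated access and dumping `counts` on exit would show whether the gaps cluster around a fixed floor. Bear in mind that the timestamp call itself adds some overhead to every access.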

The fact that emulated TT-RAM was only clocking similar figures to DFB1 (and lower than TF536) in my screenshot above when internal RAM accesses are effectively instant does suggest to me a fundamental bottleneck in the emulation ATM.

Emu68 (the bare-metal software for PiStorm) might be a way of circumventing this. It does have the downside that nice-to-have features like network and RTC become orders of magnitude harder, though (drivers have to be written on *both* sides of the PiStorm board).

BW
alexh
Site sponsor
Posts: 1340
Joined: 17 Oct 2017 16:51
Location: Oxfordshire

Re: PiStorm Chat

Post by alexh »

Badwolf wrote: 03 Aug 2022 09:54
alexh wrote: 03 Aug 2022 09:47 If you think that it can be solved by increasing performance of the software can you prove this by over clocking your Pi?
I don't think the RPi3 supports overclocking and
Yes? I was sure it did, but maybe not. Perhaps turn off "turbo"?
Badwolf wrote: 03 Aug 2022 09:54 if I had to guess, I'd say it's unlikely we can bridge that gap as things stand.
I wasn't suggesting overclocking as a fix, more..... if extra/fewer CPU cycles improves/degrades performance then you can look to optimise the code. If it doesn't then maybe it is latency in the Pi<->CPLD interface which is independent of emulation performance?
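The experiment suggested here can be reduced to one number: how much the measured transfer rate moves relative to how much the Pi clock moved. A minimal sketch, where `clock_sensitivity` is a made-up helper and any inputs would have to be measured:

```c
#include <assert.h>

/* Ratio of relative throughput change to relative clock change.
 * Near 1.0: throughput tracks the Pi clock (software-bound).
 * Near 0.0: throughput ignores the Pi clock (bound elsewhere,
 * e.g. by the Pi<->CPLD interface). */
double clock_sensitivity(double clk_a_mhz, double rate_a,
                         double clk_b_mhz, double rate_b)
{
    assert(clk_a_mhz > 0.0 && rate_a > 0.0 && clk_a_mhz != clk_b_mhz);
    double clock_change = (clk_b_mhz - clk_a_mhz) / clk_a_mhz;
    double rate_change  = (rate_b - rate_a) / rate_a;
    return rate_change / clock_change;
}
```

For example, halving the clock and seeing the rate halve gives a value of 1, while halving the clock with no change in rate gives 0. Real results would presumably fall somewhere in between.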
Badwolf wrote: 03 Aug 2022 09:54 The fact that emulated TT-RAM was only clocking similar figures to DFB1 (and lower than TF536) in my screenshot above when internal RAM accesses are effectively instant does suggest to me a fundamental bottleneck in the emulation ATM.
TF536 uses SDR DRAM? Could it be the Pi's DDR3 latency? DDR3's higher latency is usually compensated for with CPU caches etc. But if the emulation is somehow doing something which negates the effect of the cache then it might have an effect? Probably not, but I thought I'd throw it out there.
Badwolf
Site sponsor
Posts: 3043
Joined: 19 Nov 2019 12:09

Re: PiStorm Chat

Post by Badwolf »

alexh wrote: 03 Aug 2022 12:05 Perhaps turn off "turbo"?
Can try that as an experiment, but the cycles are quite quantised, so it would need a big change to make any noticeable difference, I suspect.
Badwolf wrote: 03 Aug 2022 09:54 if I had to guess, I'd say it's unlikely we can bridge that gap as things stand.
alexh wrote: 03 Aug 2022 12:05 I wasn't suggesting overclocking as a fix, more..... if extra/fewer CPU cycles improves/degrades performance then you can look to optimise the code. If it doesn't then maybe it is latency in the Pi<->CPLD interface which is independent of emulation performance?
I wasn't referring to overclocking. I suspect it's unlikely we'll find a 2x speed gain via any cumulative mechanism, unless I have a real head-slap moment and find a silly mistake.

BW