
PiStorm Chat

General discussions or ideas about hardware.
dad664npc
Posts: 162
Joined: Mon Sep 12, 2022 2:32 pm
Location: South East

Re: PiStorm Chat

Post by dad664npc »

ijor wrote: Fri Jan 09, 2026 12:32 pm I might have been too harsh. I apologize if I was, especially to @dad664npc . But I don't like to be accused of not contributing or not loving the platform. This is certainly not fair. Not for me, and not for most us here. But again, I apologize if I was too harsh. Respect and good luck.
Nothing to apologise for - I didn't intend to accuse anyone of not contributing, I was just venting my frustration at the lack of interest over the years. Reading recent posts, it seems I'm not alone :)
ATARI STfm, STe, Mega ST, TT
Amstrad CPC464, CPC6128
PiStorm dev - https://github.com/gotaproblem/pistorm-atari
Pico HDC - https://bbansolutions.co.uk
ijor
Posts: 760
Joined: Fri Nov 30, 2018 8:45 pm

Re: PiStorm Chat

Post by ijor »

dad664npc wrote: Fri Jan 09, 2026 4:26 pm The pi4 can produce full performance.
Ah, this is very good. It means that there is, at least, hope :)

Do you have an idea about the Musashi "raw" performance when not limited by PiStorm?
The Musashi 68000 emulation does include prefetch.
I probably didn't phrase my question correctly. My question was not about the effect of self-modifying code, if that's what you meant by supporting prefetch. My question was related to what we discussed previously with @Badwolf :

Can Musashi start a new bus cycle without waiting for the result of the previous one? This is probably critical for performance, and it should be doable. Of course, this is not always possible. Obviously a conditional branch does depend on the previous result. But otherwise a new prefetch can always be, at least, started beforehand.
The firmware should work with both latch types but somewhere during development that functionality got broken.
I can understand that one type of latch would work and the other would not. But what doesn't make much sense is that the performance would be different. Never mind. This is not very important at this point. Which type of latch is the one that is currently working (better)?

Anyway, the logic associated with the latches must be completely rewritten. At least according to what I see at the repository. What I see there is some ripple logic that seems to be the perfect example of how not to generate a pulse.

First thing to do is, IMHO, to bring back the high frequency clock. The Atari interface might still use the 8 MHz clock, at least for the time being. But the PI interface should, as much as possible, be modified to be synchronous. And for that we need a high frequency clock.
I have rewritten the firmware and have repurposed the first 8 gpio pins. The idea behind this was to improve the pistorm protocol and incorporate BERR, RESET, IPL1 and IPL2 ...
BERR and RESET signal detection had large processing overheads within the emulator. I eventually managed to repurpose the GPIO layout to include these two signals. Reduces read/write overheads as in the case of BERR, it has to be checked for each I/O
I understand, but not sure you need a dedicated pin for each of these signals. Certainly not for RESET. You probably don't even need a dedicated BERR pin. You don't need to check for BERR on every I/O.

For most bus cycles, when accessing RAM, you already know if BERR would be triggered. Even for ROM as well. For I/O access, BERR behavior is more complicated, it depends on the specific chipset version, among other things. So probably better not to be too smart when accessing I/O. But then, you don't care about performance too much when accessing I/O.

Perhaps not critical, but seems there is no support for TAS bus cycles?
The Emu68 developer (Michael Schulz I believe) has stated that it should work with Atari ST but there will need to be a lot of work to get stuff booting etc. (A lot of 68000 machine coding ).
I don't understand. I thought that Emu68 doesn't support the 68K features not used in the Amiga, such as BERR, FC signals, etc. So how would it work at all with the ST?
But if you think you can get a stable firmware then brilliant - I can send you the latest stuff I'm working on
I am willing to help. But as I said, I can't commit myself to write the whole firmware from scratch. And to be honest, from what I've seen, I'm not sure this would not be needed. I certainly can't test it.

But do send me your latest snapshot and we'll see what we can do. You can send me a PM, but if you don't want to make it public for the time being, it is probably better to open a new private GitHub repository and give me access. In any case, include not only the firmware code but everything associated with it: schematics for the hardware you are using and the PI protocol code. I understand that there is a program to exercise the PI interface without the emulator; that would be useful as well.
http://github.com/ijor/fx68k 68000 cycle exact FPGA core
FX CAST Cycle Accurate Atari ST core
http://pasti.fxatari.com
dad664npc
Posts: 162
Joined: Mon Sep 12, 2022 2:32 pm
Location: South East

Re: PiStorm Chat

Post by dad664npc »

Do you have an idea about the Musashi "raw" performance when not limited by PiStorm?
It varies a lot - I would estimate between 10 MB/s and 16 MB/s from what I remember.
Can Musashi start a new bus cycle without waiting for the result of the previous one? This is probably critical for performance, and it should be doable. Of course, this is not always possible. Obviously a conditional branch does depend on the previous result. But otherwise a new prefetch can always be, at least, started beforehand.
The emulator (Musashi) issues many ps_reads and ps_writes. These are blocking io requests handled in ps_protocol.c. So no to your question at the moment. It would require significant pistorm protocol performance increase (I did try buffering commands but could not get it to work). If you look at a ps_read_16 () for example, you will see the number of io transfers to make one read.
First thing to do is, IMHO, to bring back the high frequency clock. The Atari interface might still use the 8 MHz clock, at least for the time being. But the PI interface should, as much as possible, be modified to be synchronous. And for that we need a high frequency clock.
It's doable but would need a rewrite. If that's the way to go then so be it, but is 200MHz really needed? I can understand a multiple of 8MHz.
I understand, but not sure you need a dedicated pin for each of these signals. Certainly not for RESET. You probably don't even need a dedicated BERR pin. You don't need to check for BERR on every I/O.
If RESET isn't on a GPIO then how to check for it? It was done by reading a status word (as seen in ps_read_stat_reg ()). The problem then is it becomes another overhead (Pi4 reads are around 100ns I believe), so the emulator performance decreases a little. I have tried putting the RESET check in another thread (CPU) but it soon becomes unstable.
For most bus cycles, when accessing RAM, you already know if BERR would be triggered. Even for ROM as well. For I/O access, BERR behavior is more complicated, it depends on the specific chipset version, among other things. So probably better not to be too smart when accessing I/O. But then, you don't care about performance too much when accessing I/O.
It is (well for me I decided it is) easier to check for BERR with every read/write, but if there is a deterministic way to determine if the check is needed then that would increase performance.
Perhaps not critical, but seems there is no support for TAS bus cycles?
I haven't looked at this - I've had enough to deal with with the firmware alone ;)
I don't understand. I thought that Emu68 doesn't support the 68K features not used in the Amiga, such as BERR, FC signals, etc. So how would it work at all with the ST?
I might be wrong, but I think it does now support FC and BERR but may not support interrupts. I think it has been, or is being, developed for the 68000 Apple Mac.
But do send me your latest snapshot and we'll see what we can do. You can send me a PM, but if you don't want to make it public for the time being, it is probably better to open a new private GitHub repository and give me access. In any case, include not only the firmware code but everything associated with it: schematics for the hardware you are using and the PI protocol code. I understand that there is a program to exercise the PI interface without the emulator; that would be useful as well.
I can PM you my current stuff later and will look into setting up a private GitHub repository.
I wrote ataritest.c to check for basic pistorm functionality between the Pi and Atari.

Great stuff
thx
alexh
Site sponsor
Posts: 1200
Joined: Tue Oct 17, 2017 4:51 pm
Location: Oxfordshire

Re: PiStorm Chat

Post by alexh »

I wasn't aware of anyone using Emu68 natively on a 68k Mac.

The examples I've seen of Emu68 running MacOS are on an Amiga running Shapeshifter (a Mac emulator). Even the latest video from Emu68 appears to show the new dumpster when using a Mac emulator running on an Amiga.
Principal ASIC Engineer - SystemVerilog, VHDL
Thalion Webshrine - http://thalion.atari.org
STf,STfm,STe,MegaST,MegaSTe,Falcon060
A500+,A600,A4000/060,CD32,CDTV
Cyprian
Posts: 522
Joined: Fri Dec 22, 2017 9:16 am
Location: Warszawa, Poland

Re: PiStorm Chat

Post by Cyprian »

alexh wrote: Mon Jan 12, 2026 4:00 pm I wasn't aware of anyone using Emu68 natively on a 68k Mac.
I read on the PiStorm Discord server that the Mac, like the Atari, needs proper 68k emulation.
Lynx I / Mega ST 1 / 7800 / Portfolio / Lynx II / Jaguar / TT030 / Mega STe / 800 XL / 1040 STe / Falcon030 / 65 XE / 520 STm / SM124 / SC1435
ATW800/2 / SUBcart / FujiNet / DDD HDD / AT Speed C16 / TF536 / SDrive / PAK68/3 / Lynx Multi Card / LDW Super 2000 / XCA12 / SkunkBoard / CosmosEx / SatanDisk / UltraSatan / USB Floppy Drive Emulator / Eiffel / SIO2PC / Crazy Dots / Mach32 / ET4000 VME / PAM Net
http://260ste.atari.org
alexh
Site sponsor
Posts: 1200
Joined: Tue Oct 17, 2017 4:51 pm
Location: Oxfordshire

Re: PiStorm Chat

Post by alexh »

Cyprian wrote: Mon Jan 12, 2026 5:33 pm I read on the PiStorm Discord server that the Mac, like the Atari, needs proper 68k emulation.
Like the Atari ST, it probably needs features of the 68k CPU that the Amiga didn't use and that PiStorm / Emu68 don't normally provide.

FC (processor states),
BERR (Bus error),
BREQ (Bus request),
BGNT (Bus Grant),
IRQ (Interrupt Request latency)

Probably other things.

However an emulated Mac on the Amiga probably doesn't need these things.

I just checked the PiStorm Discord Mac section. There's an example of someone modifying the Musashi build to run on an SE, but there is currently no support for BREQ or BGNT, so no SCSI. No mention of Emu68.
ijor
Posts: 760
Joined: Fri Nov 30, 2018 8:45 pm

Re: PiStorm Chat

Post by ijor »

dad664npc wrote: Mon Jan 12, 2026 1:30 pm
Do you have an idea about the Musashi "raw" performance when not limited by PiStorm?
It varies a lot - I would estimate between 10 MB/s and 16 MB/s from what I remember.
Hmm. So you are saying that, in the best case, performance is never better than just a few times faster than stock hardware? To be honest, that sounds quite disappointing. I thought it would fly, otherwise what's the point? Or am I missing something?
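For reference, the stock baseline behind "a few times faster" can be worked out with a little arithmetic. This is only a rough sketch: the nominal 8 MHz clock, the 4-clock minimum bus cycle and the 16-bit data bus are assumptions about a stock ST, and bus_mb_per_s is a made-up helper name.

```c
/* Peak bus throughput in MB/s (1 MB = 10^6 bytes here).
 * clock_mhz:        bus clock in MHz (assumed 8 for a stock ST)
 * clocks_per_cycle: clocks per bus cycle (assumed 4, the 68000 minimum)
 * bytes_per_cycle:  bytes moved per bus cycle (2 on a 16-bit bus) */
double bus_mb_per_s(double clock_mhz, double clocks_per_cycle,
                    double bytes_per_cycle)
{
    return clock_mhz / clocks_per_cycle * bytes_per_cycle;
}
```

With those assumptions bus_mb_per_s(8, 4, 2) gives 4 MB/s, so the quoted 10 to 16 MB/s would be roughly 2.5x to 4x stock.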
It's doable but would need a rewrite. If that's the way to go then so be it, but is 200MHz really needed? I can understand a multiple of 8MHz.
It doesn't have to be 200 MHz. I just mentioned that frequency because that's what I thought was used before. I thought then it would be the easiest. It has to be a rather high frequency so that it allows finer granularity when transferring between different clock domains. I can't say what the ideal value would be at this time. But this can be fine tuned later on.

What would be the problem to make it 200 MHz as it was originally? Isn't this completely implemented by the hardware without affecting the PI performance?
If RESET isn't on a GPIO then how to check for it? It was done by reading a status word (as seen in ps_read_stat_reg ()). The problem then is it becomes another overhead (Pi4 reads are around 100ns I believe), so the emulator performance decreases a little.
I'm not sure I understand why it would affect performance. Aren't you issuing a read status anyway?
It is (well for me I decided it is) easier to check for BERR with every read/write, but if there is a deterministic way to determine if the check is needed then that would increase performance.
Yes, of course. E.g., a read access to ROM never provokes a BERR, every write to ROM does.
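To make the idea concrete, here is a minimal sketch of classifying a bus cycle before it is issued. Everything in it is an assumption for illustration: the address ranges describe a hypothetical 4 MB machine with a 192 KB ROM window, low memory is left to the hardware because it has special protection, and predict_berr and berr_hint are made-up names, not part of the PiStorm code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed memory map, for illustration only. Real values depend on the
 * machine and TOS version. */
#define PROTECTED_TOP 0x000800u  /* assumed: low memory with special protection */
#define RAM_TOP       0x400000u  /* assumed: 4 MB of ST-RAM fitted */
#define ROM_BASE      0xFC0000u  /* assumed: start of the ROM window */
#define ROM_TOP       0xFF0000u  /* assumed: end of the ROM window (exclusive) */

typedef enum {
    BERR_NEVER,    /* outcome known: the cycle completes normally */
    BERR_ALWAYS,   /* outcome known: raise the bus error in the emulator */
    BERR_ASK_BUS   /* outcome unknown: run the real cycle and check BERR */
} berr_hint;

/* Hypothetical helper: predict whether a bus cycle will end in BERR. */
berr_hint predict_berr(uint32_t addr, bool is_write)
{
    addr &= 0xFFFFFFu;                       /* 24-bit address bus */

    if (addr >= ROM_BASE && addr < ROM_TOP)  /* ROM: reads fine, writes fault */
        return is_write ? BERR_ALWAYS : BERR_NEVER;

    if (addr >= PROTECTED_TOP && addr < RAM_TOP)  /* ordinary RAM */
        return BERR_NEVER;

    return BERR_ASK_BUS;                     /* I/O and everything else */
}
```

Only the BERR_ASK_BUS case would still need the per-access check, and that is mostly I/O, where performance matters less.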
...

Now, stupid question regarding performance: Why do you read from ST-RAM at all? Can't you just cache the entire 4 MB range? The only time that you should need to read from RAM is when the system performs DMA, which might not even be needed if you are using local PI storage.

You still need to write to RAM though. You need to write to video RAM and it can be relocated anywhere. But writes are far easier than reads. They don't even need to be "real time". You can post writes to a buffer and flush it later, maybe even on a separate thread.
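A posted-write buffer along those lines could look roughly like the sketch below. It is single-threaded, and all names (write_queue, wq_post, wq_pop) are made up for illustration. Draining from a separate thread would additionally need atomics or a lock, which is not shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define WQ_SIZE 256u   /* arbitrary depth for the sketch */
static_assert((WQ_SIZE & (WQ_SIZE - 1u)) == 0, "WQ_SIZE must be a power of two");

typedef struct { uint32_t addr; uint16_t data; } posted_write;

typedef struct {
    posted_write slot[WQ_SIZE];
    size_t head;   /* count of writes ever posted */
    size_t tail;   /* count of writes ever drained */
} write_queue;

/* Queue a write and return at once; false means the queue is full and
 * the caller has to drain it before posting again. */
bool wq_post(write_queue *q, uint32_t addr, uint16_t data)
{
    if (q->head - q->tail == WQ_SIZE)
        return false;
    q->slot[q->head % WQ_SIZE] = (posted_write){ addr, data };
    q->head++;
    return true;
}

/* Take the oldest pending write, preserving program order;
 * false means nothing is pending. */
bool wq_pop(write_queue *q, posted_write *out)
{
    if (q->tail == q->head)
        return false;
    *out = q->slot[q->tail % WQ_SIZE];
    q->tail++;
    return true;
}
```

The flush is then just a loop that pops entries and hands each one to the real blocking bus write. Anything that must observe the writes, an I/O read for example, would have to drain the queue first to keep the ordering correct.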

Of course, you still need to read (and write) from I/O. But that should not affect performance too much.
Badwolf
Site sponsor
Posts: 2994
Joined: Tue Nov 19, 2019 12:09 pm

Re: PiStorm Chat

Post by Badwolf »

ijor wrote: Mon Jan 12, 2026 4:33 am
dad664npc wrote: Fri Jan 09, 2026 4:26 pm The firmware should work with both latch types but somewhere during development that functionality got broken.
I can understand that one type of latch would work and the other would not. But what doesn't make much sense is that the performance would be different. Never mind. This is not very important at this point. Which type of latch is the one that is currently working (better)?
The original firmware supports 74373s (latches) and 74374s (flipflops). Footprint equivalent. During a read a pulse is generated over one half-cycle of the 68k bus cycle. The 374s would latch on the rising edge and the 373s on the trailing edge (nb. actually it may have been a pulse on the 200MHz clock -- sorry, I forget). Obviously you choose a point where the data is valid on both edges. You're right: it shouldn't affect throughput, but I can fully understand dad664's approach of only worrying about one type during development.
You don't need to check for BERR on every I/O.
For most bus cycles, when accessing RAM, you already know if BERR would be triggered.
That's an excellent point and one I hadn't thought about: generate the BERR in the emulator. Imperfect, but probably good enough for 99% of use cases.
Perhaps not critical, but seems there is no support for TAS bus cycles?
Is that the instruction that produces the RMW cycle? I don't think my initial port handled it originally, no.
ijor wrote: Mon Jan 12, 2026 5:53 pm Hmm. So you are saying that, in the best case, performance is never better than just a few times faster than stock hardware? To be honest, that sounds quite disappointing. I thought it would fly, otherwise what's the point? Or am I missing something?
I think dad664's memory is right. I think I had emulated fast RAM clocking in at between 4 and 4.5x ST-RAM speeds. A 'bare metal' version he was working on came in at about 13x speed.

Compiling in the bus error handling into the emulator does have quite an impact on performance, but at its simplest the idea of an inexpensive upgrade that adds considerable acceleration (we're just talking about memory speeds here, remember), hard disc emulation, graphics card emulation, network card emulation -- whatever additional emulation one cares to write into the software is still an attractive prospect.

If it can be made to work reliably.
Now, stupid question regarding performance: Why do you read from ST-RAM at all? Can't you just cache the entire 4 MB range? The only time that you should need to read from RAM is when the system performs DMA, which might not even be needed if you are using local PI storage.
The 'write through cache' was actually implemented quite early in development. More as a way to demonstrate where the bottleneck was than to work around it, IIRC. I think it added a 'DMA has happened' flag to the status bits which simply flushed the whole cache. And yes, it improved things *considerably*.
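As a rough illustration of that scheme (not the code that was actually in the tree), a write-through cache with a whole-cache flush on DMA might be sketched like this. The 4 MB size is an assumption, and st_ram, bus_read16 and bus_write16 are stand-ins for the real blocking bus accesses so that the sketch can be tried without hardware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define RAM_BYTES 0x400000u            /* assumed: 4 MB of ST-RAM */
#define RAM_WORDS (RAM_BYTES / 2u)

/* Stand-in for the real bus. In the emulator these would be the blocking
 * PiStorm accesses; the counter only shows what the cache saves. */
static uint16_t st_ram[RAM_WORDS];
static unsigned bus_reads;

static uint16_t bus_read16(uint32_t addr)
{
    bus_reads++;
    return st_ram[addr / 2u];
}

static void bus_write16(uint32_t addr, uint16_t v)
{
    st_ram[addr / 2u] = v;
}

static uint16_t cache[RAM_WORDS];
static bool     cached[RAM_WORDS];     /* true once a word is held locally */

/* Read a word, touching the bus only the first time it is needed. */
uint16_t ram_read16(uint32_t addr)
{
    assert(addr < RAM_BYTES);
    uint32_t i = addr / 2u;
    if (!cached[i]) {
        cache[i] = bus_read16(i * 2u);
        cached[i] = true;
    }
    return cache[i];
}

/* Write-through: update the local copy and the real RAM together. */
void ram_write16(uint32_t addr, uint16_t v)
{
    assert(addr < RAM_BYTES);
    uint32_t i = addr / 2u;
    cache[i] = v;
    cached[i] = true;
    bus_write16(i * 2u, v);
}

/* Call when the status bits report DMA: RAM may have changed behind the
 * emulator's back, so every cached word is dropped. */
void ram_cache_on_dma(void)
{
    memset(cached, 0, sizeof cached);
}
```

Dropping the whole cache is crude but cheap to reason about. Tracking which range the DMA touched would keep more of the cache warm, at the cost of more logic on both sides of the interface.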

I'm hazy over the whole reset thing, so I'll not add noise to that discussion other than to say I did try implementing it as a special case interrupt, but that really demonstrated how broken the interrupt logic was on both ends of the interface. I'm fairly confident that brokenness was inherited rather than introduced by the Atari chaps.

BW
DFB1 Open source 50MHz 030 and TT-RAM accelerator for the Falcon
Smalliermouse ST-optimised USB mouse adapter based on SmallyMouse2
FrontBench The Frontier: Elite 2 intro as a benchmark
SteveBagley
Posts: 29
Joined: Fri Jul 26, 2024 3:53 pm

Re: PiStorm Chat

Post by SteveBagley »

Badwolf wrote: Tue Jan 13, 2026 10:25 am
ijor wrote: Mon Jan 12, 2026 4:33 am You don't need to check for BERR on every I/O.
For most bus cycles, when accessing RAM, you already know if BERR would be triggered.
That's an excellent point and one I hadn't thought about: generate the BERR in the emulator. Imperfect, but probably good enough for 99% of use cases.
Given, IIRC, GLUE in the ST generates BERR if it hasn't seen DTACK for a certain time period (500ns?), is there any reason you couldn't just implement the same logic in software on the PI to generate BERR synthetically? (i.e. effectively if you timeout on waiting for DTACK, then it must be a bus error).
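In its simplest form the timeout idea might look like the sketch below. The 500 ns value is just the unverified figure from above, and wait_dtack_or_berr, dtack_probe_fn and fake_dtack are made-up names: the probe stands in for however the Pi side actually observes DTACK.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L        /* for clock_gettime */
#endif
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

#define BERR_TIMEOUT_NS 500u           /* the "500ns?" guess; unverified */

/* Hypothetical probe: returns true once DTACK has been seen. */
typedef bool (*dtack_probe_fn)(void *ctx);

static uint64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000u + (uint64_t)ts.tv_nsec;
}

/* Poll for DTACK until the deadline. Returns true if the cycle was
 * acknowledged, false if it timed out, in which case the emulator would
 * take a bus error instead of completing the access. */
bool wait_dtack_or_berr(dtack_probe_fn dtack_seen, void *ctx,
                        uint64_t timeout_ns)
{
    uint64_t deadline = now_ns() + timeout_ns;
    do {
        if (dtack_seen(ctx))
            return true;
    } while (now_ns() < deadline);
    return false;
}

/* Stand-in probe for trying the sketch without hardware: reports DTACK
 * after *ctx further polls, or never if *ctx is negative. */
static bool fake_dtack(void *ctx)
{
    int *polls_left = ctx;
    if (*polls_left < 0)
        return false;
    return (*polls_left)-- == 0;
}
```

Reading the clock on every poll has a cost of its own that is not small next to a 500 ns window, so a real implementation would more likely lean on a free-running counter or a hardware timer.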

Steve
Badwolf
Site sponsor
Posts: 2994
Joined: Tue Nov 19, 2019 12:09 pm

Re: PiStorm Chat

Post by Badwolf »

SteveBagley wrote: Tue Jan 13, 2026 1:18 pm Given, IIRC, GLUE in the ST generates BERR if it hasn't seen DTACK for a certain time period (500ns?), is there any reason you couldn't just implement the same logic in software on the PI to generate BERR synthetically? (i.e. effectively if you timeout on waiting for DTACK, then it must be a bus error).
Ooo, you're tickling a memory of some kind of experiment I did along those lines a while back, but yes: good idea.

With use of a timer and an interrupt you could probably do it with very little overhead too. Not quite as little as synthesising it yourself, but probably only one branch test per bus cycle as the set-up could all be done in the period you're waiting for DTACK anyway.

BW