
REV 3 - REV 5 - The beginning (ST536)

All about the ST536 030 ST booster.
User avatar
exxos
Site Admin
Site Admin
Posts: 27530
Joined: Wed Aug 16, 2017 11:19 pm
Location: UK
Contact:

Re: REV 3 - REV 5 - The beginning (ST536)

Post by exxos »

ijor wrote: Thu Jan 01, 2026 3:27 pm Why would this be a problem? As you are saying, the slow rising edge is perfectly normal when GLUE tristates the bus after completing a DMA transaction. It shouldn't cause any conflict, even with a weaker pull-up. The CPU will actively drive the signal when starting a new bus cycle. During the slow rise time the bus is idle.
What I mean is that, from the point of view of a normal slow ST RAM cycle, nothing would probably care; if it did, the system would simply not work in the first place.

What I'm suggesting is, if the CPU entered fast mode while RW was still transitioning from the previous bus cycle, it could conflict with that slow signal upsetting the start of the fast SDRAM cycle.

I just posted a bit more information after I saw your post..

The LS373 might take ~20ns to completely release the bus. But this delay is from the RDAT trailing edge, not from the end of the bus cycle. MMU deasserts RDAT after S7. By the next S0 the bus should be tristated. The CPU would not start driving the bus until at least another full cycle has passed.
Previously I thought it was an address or data problem, but I don't really get that impression at the moment. The data bus is isolated, which would mostly leave only the address bus as a suspect.. While this could still be a factor, I'm not getting the impression that it is the root cause, which is why I'm thinking about RW instead.

The question really remains what happens after any 8MHz cycle on the ST side.. If something is driving such as the blitter, how fast does it release that RW signal..

It's a bit difficult to prove because RW is hardwired on the 536, otherwise I could just have isolated the ST side RW to test the theory out..
ijor
Posts: 759
Joined: Fri Nov 30, 2018 8:45 pm

Re: REV 3 - REV 5 - The beginning (ST536)

Post by ijor »

exxos wrote: Thu Jan 01, 2026 3:41 pm However, I really wonder if it's a side effect of RW again. If the ST side RW is driven low, and the CPU enters high-speed mode and is trying to do a write, it would clash on the first cycle.
...
What I'm suggesting is, if the CPU entered fast mode while RW was still transitioning from the previous bus cycle, it could conflict with that slow signal upsetting the start of the fast SDRAM cycle.
I'm not sure this conflict could happen, even when switching to high speed. If GLUE (or Blitter) is driving RW, it means that a DMA transaction is going on and the CPU has granted the bus. The CPU won't reacquire the bus immediately. The CPU takes at least one full cycle to take back bus mastership. And then the CPLD probably takes another cycle, or at least half a cycle, to process the CPU request and start an SDRAM access.

This would give you at least two cycles since GLUE started tristating the bus. I don't know how exactly the clock switch is processed. But even at high speed, that would mean ~40ns. That should be plenty of time to avoid any conflict.
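The ~40ns figure can be checked with a quick calculation. This is only a sketch: the 50 MHz fast-mode clock is an assumption inferred from the numbers in this thread, not a confirmed board parameter, and `cycles_to_ns` is a hypothetical helper name.

```python
def cycles_to_ns(cycles: float, clock_mhz: float) -> float:
    """Convert a number of clock cycles to nanoseconds."""
    return cycles * 1000.0 / clock_mhz

# Assumed clock rates: 8 MHz ST bus, 50 MHz fast mode (hypothetical).
for mhz in (8.0, 50.0):
    print(f"2 cycles at {mhz:g} MHz = {cycles_to_ns(2, mhz):.0f} ns")
```

Under those assumptions, two cycles in fast mode are far shorter than two cycles at 8 MHz, which is why the margin only becomes a question once the clock switches.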
If something is driving such as the blitter, how fast does it release that RW signal..
Blitter releases RW even earlier than GLUE. In theory even too early. Anyway, I don't think that the very slow rising edge you are seeing is because GLUE is tristating RW that slowly. The scope you posted shows that the rising edge takes almost four cycles. Certainly GLUE can't be that slow. It is the pull-up and the bus capacitance that take so long.
It's a bit difficult to prove because RW is hardwired on the 536, otherwise I could just have isolated the ST side RW to test the theory out..
I suppose you can try adding a delay when the bus is returned to the CPU. But I think you should proceed methodically. Can you detect when a transaction fails? Can you reproduce it somehow? If so, you should be able to debug the failure even without isolating the RW signal. You might need to hook up a logic analyzer (which I know you don't like too much :)
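To give a feel for how a pull-up and bus capacitance alone can produce a rise time of hundreds of nanoseconds, here is a rough sketch using the standard RC charging formula. The resistor, capacitance and threshold values are hypothetical and are not measurements from the ST or the 536.

```python
import math

def rc_rise_time_ns(r_ohms: float, c_farads: float,
                    v_threshold: float, v_supply: float = 5.0) -> float:
    """Time for an undriven node, pulled up through R into C and
    starting at 0 V, to reach v_threshold. Result in nanoseconds."""
    tau = r_ohms * c_farads
    return -tau * math.log(1.0 - v_threshold / v_supply) * 1e9

# Hypothetical values: 4.7k pull-up, 100 pF of bus capacitance,
# 2.0 V logic-high threshold on a 5 V rail.
print(f"{rc_rise_time_ns(4700, 100e-12, 2.0):.0f} ns")
```

An actively driven output is not limited by this time constant, so the same signal can show two very different edges depending on whether something is driving it.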
http://github.com/ijor/fx68k 68000 cycle exact FPGA core
FX CAST Cycle Accurate Atari ST core
http://pasti.fxatari.com

Re: REV 3 - REV 5 - The beginning (ST536)

Post by exxos »

ijor wrote: Thu Jan 01, 2026 4:47 pm
I'm not sure this conflict could happen, even when switching to high speed...
It shouldn't happen, but something is happening.. I have got a 40ns delay there, as the CPU clock is already skewed from the ST 8MHz clock. The ST bus should be totally "free" by that point. But something trips up.

It could be something slightly different: that RW is too much of a load with the H5 with the CPU driving it. I need to look into that more as well.

Blitter releases RW even earlier than GLUE. In theory even too early. Anyway, I don't think that the very slow rising edge you are seeing is because GLUE is tristating RW that slowly. The scope you posted shows that the rising edge takes almost four cycles. Certainly GLUE can't be that slow. It is the pull-up and the bus capacitance that take so long.
But that is not the case if you look at the image above.. same signals.. If it was solely pull-up and capacitance, both signals should be equal.. They are not..

The rise time on the first image is about 20ns. On the second image it takes more like 500ns to go high. But why? That "long to high" only seems to trigger around when the floppy is accessed.

I suppose you can try adding a delay when the bus is returned to the CPU. But I think you should proceed methodically.
Been doing all that for literally weeks now :P I am literally going round in increasing circles all the time.
Can you detect when a transaction fails? Can you reproduce it somehow? If so, you should be able to debug the failure even without isolating the RW signal. You might need to hook up a logic analyzer (which I know you don't like too much :)
It would be monumentally easier if I did have an LA with all the hookup cables etc. But considering I'm dealing with sub-10ns issues, it's probably not going to help all that much.

What seems to happen is that I have to delay AS30 by a clock before letting the SDRAM module see it.. If I delay the CPU clock by 2ns, it ~90% works.. I just keep suspecting that the address bus isn't stable by the time it's gone through the PLD logic.. But removing the ROM address translation should show an improvement even if it only took 1ns, and it doesn't.

I was checking the address lines a few days ago and the addresses are stable about 10ns before AS30 goes low. It's a 6ns PLD, so it can't be any faster. The 10ns PLD on the original 536 "worked".. but maybe, inherently because it's slower, it may not even show this issue.. Although if I slow down AS30 by a clock, I lose some TTram speed.. so it's all very borderline somewhere.

Fast slew works best, at least with 2ns on the CPU clock; with slow slew it failed miserably.. That is what I've been testing all afternoon, in fact, all the combinations...

Re: REV 3 - REV 5 - The beginning (ST536)

Post by ijor »

exxos wrote: Thu Jan 01, 2026 5:06 pm But that is not the case if you look at the image above.. same signals.. If it was solely pull-up and capacitance, both signals should be equal.. They are not..

The rise time on the first image is about 20ns. On the second image it takes more like 500ns to go high. But why? That "long to high" only seems to trigger around when the floppy is accessed.
Because RW is tristated only on DMA transactions. During "normal" operation, all the control signals are actively driven by the bus master, either the CPU, GLUE or BLITTER. These are not open drain signals. The first image, with the fast rise time, is surely from when RW was actively driven.

As long as the bus is not granted, the CPU keeps driving all the control signals. Even between consecutive bus cycles. The CPU doesn't release any control signal until granting the bus. And both GLUE and BLITTER do the same.
It would be monumentally easier if I did have an LA with all the hookup cables etc. But considering I'm dealing with sub-10ns issues, it's probably not going to help all that much.
An LA would obviously not detect analog issues. But it would tell you when the problem happens. In the worst case you can combine an LA with a scope using the scope's external trigger. There are affordable LAs that can capture at 500 MHz or even 1 GHz.

I would recommend this:
https://www.dreamsourcelab.com/shop/log ... c-u3pro16/
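To put the capture rates mentioned above into perspective, here is a small sketch of the timing resolution they give against a fast bus clock. The 50 MHz signal clock is an assumption used for illustration, and the function names are hypothetical.

```python
def resolution_ns(sample_rate_mhz: float) -> float:
    """Time between two consecutive analyzer samples, in nanoseconds."""
    return 1000.0 / sample_rate_mhz

def samples_per_half_cycle(sample_rate_mhz: float, signal_clock_mhz: float) -> float:
    """How many analyzer samples land inside half a period of the signal clock."""
    return sample_rate_mhz / (2.0 * signal_clock_mhz)

# Assumed 50 MHz signal clock; 500 MHz and 1 GHz capture rates as mentioned above.
for rate in (500.0, 1000.0):
    print(f"{rate:g} MHz capture: {resolution_ns(rate):g} ns per sample, "
          f"{samples_per_half_cycle(rate, 50.0):g} samples per half cycle")
```

Several samples per half cycle is enough to order events against the clock, even though it says nothing about the analog shape of an edge.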

Re: REV 3 - REV 5 - The beginning (ST536)

Post by exxos »

ijor wrote: Thu Jan 01, 2026 5:38 pm Because RW is tristated only on DMA transactions. During "normal" operation, all the control signals are actively driven by the bus master,
Makes sense..
The site doesn't load, but I do have a 34 channel analyser somewhere. I just never got round to doing anything with it, and I'm not really sure how accurate it would be over all those channels.. But after my last ordeal with analysers it just turned into a full-time job trying to get the thing to work in the first place.. Nothing is ever simple.

Re: REV 3 - REV 5 - The beginning (ST536)

Post by ijor »

exxos wrote: Thu Jan 01, 2026 6:50 pm The site doesn't load, but I do have a 34 channel analyser somewhere. I just never got round to doing anything with it, and I'm not really sure how accurate it would be over all those channels..
In general they are pretty accurate. Of course, they are not meant to be as accurate as a scope with a similar sampling rate. But that's not the purpose. The idea is that you should be able to see the state of each signal, at the very least, within half-cycle resolution, even at 50 MHz.
But after my last ordeal with analysers it just turned into a full-time job trying to get the thing to work in the first place.. Nothing is ever simple.
Yeah, if you are not familiar with working with an LA, it might involve a learning curve. It might be worth the time though.

Re: REV 3 - REV 5 - The beginning (ST536)

Post by exxos »

Even though everyone hates AI.. I think it's got the best theory..

The issue stems from clock domain crossing between the CPU signals (AS30, DS30, and likely A_IN, ACCESSX, etc.) and the SDRAM controller's CLK domain, which is implied to be a high-speed clock (100MHz based on comments). AS30 and DS30 originate in the CLKCPU domain, making them asynchronous to CLK. Directly sampling async signals in a synchronous FSM can cause metastability, where flip-flops enter undefined states, leading to erratic behavior like missed access triggers, incorrect latching, or glitches in conditions like (ACCESS | AS30_s | DS30_s) == 1'b0.

Delaying AS30 and DS30 by one CLK cycle acts as a rudimentary synchronizer, giving the signals time to stabilize before evaluation in the FSM. This reduces (but doesn't eliminate) metastability risk, explaining why the module "works" only with that delay. The address bus (A_IN) appears stable because the problem is primarily with the timing of the strobe/control signals that gate when the address is sampled—not the address values themselves. If A_IN changes near a CLK edge, but the strobes are unreliable, the latch timing (e.g., col_hi_l, etc.) can still fail even if waveforms look clean at a high level.
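For readers unfamiliar with the term, the sketch below is a purely behavioral Python model of a conventional two-flop synchronizer. It only shows the latency such a stage adds; it does not model metastability itself and it is not the actual 536 PLD logic. The class name and the sample values are hypothetical, and `as30_s` simply echoes the `AS30_s` name quoted above.

```python
class TwoFlopSynchronizer:
    """Two flip-flops in series, both clocked by the destination clock."""

    def __init__(self, reset_value: int = 1):
        self.stage1 = reset_value
        self.stage2 = reset_value

    def clock(self, async_in: int) -> int:
        """Advance one destination clock edge and return the synchronized output."""
        # Both flops update on the same edge, so stage2 takes the OLD stage1.
        self.stage2, self.stage1 = self.stage1, async_in
        return self.stage2

sync = TwoFlopSynchronizer(reset_value=1)
as30 = [1, 1, 0, 0, 0, 1, 1]            # active-low strobe, hypothetical samples
as30_s = [sync.clock(v) for v in as30]  # what the FSM would be allowed to see
print(as30_s)
```

The FSM sees each change one edge after the first flop has captured it, which is the time the first flop is given to settle.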

Re: REV 3 - REV 5 - The beginning (ST536)

Post by exxos »

Just had a revelation of sorts.. I was delaying by 20ns, hence the TTram speed drop. Delaying by 10ns is just as good, so TTram only dropped 1% from the "normal" 847% and it's still stable! :hide:

[Attached images: IMG_4364.JPG, IMG_4365.JPG]

So in order to save my sanity, it's just going to have to stop like that.

The "slow down" when BLTFIX is used seems to stem from when I compiled TOS with the cache clear logic. It's a cache problem: the original TOS206 didn't have the cache logic, and didn't have the speed drop. Some experienced coder can trace that issue if he wants ;) But I assume it's not fixable and is there for a reason.

Re: REV 3 - REV 5 - The beginning (ST536)

Post by ijor »

exxos wrote: Thu Jan 01, 2026 9:13 pm Even though everyone hates AI.. I think its got the best theory..
The issue stems from clock domain crossing between the CPU signals (AS30, DS30, and likely A_IN, ACCESSX, etc.) and the SDRAM controller's CLK domain, which is implied to be a high-speed clock (100MHz based on comments). AS30 and DS30 originate in the CLKCPU domain, making them asynchronous to CLK. Directly sampling async signals in a synchronous FSM can cause metastability ...
As almost always, AI gets some things right, but other things not so much. And unless you know the answer already, it is difficult to tell when the AI is right and when it is wrong.

Yes, not following good synchronous design practices could be a problem. I have said that many times already. But I think that AI is reaching some conclusions too fast.

In the first place, AFAIK the clocks aren't really asynchronous to each other. I don't know the details, but with a single oscillator one must be derived from the other. There is probably some skew between both clocks, and this requires being careful when transferring from one clock to the other. But this is not really sampling async signals. Most importantly, because the clocks are related, the compiler should be able to check for timing violations.

In the second place, I doubt very much that the main problem here is metastability. Metastability, even when transferring between completely async clocks, is usually an extremely infrequent event in modern logic. Maybe once a week? This doesn't mean that there couldn't be other synchronization issues, but not precisely metastability.
Delaying AS30 and DS30 by one CLK cycle acts as a rudimentary synchronizer, giving the signals time to stabilize before evaluation in the FSM. This reduces (but doesn't eliminate) metastability risk,
This is complete nonsense. And this is the main reason that I am replying here to the AI (so to speak :) ). There is no such thing as completely eliminating metastability risk, at least not in the strict sense. If there is some risk, the only thing you can do is to lengthen the TBF (time between failures). It is true, though, that you can increase the TBF to levels where it becomes irrelevant for most practical purposes.
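The point about time between failures can be illustrated with the commonly used synchronizer estimate, MTBF = exp(t_resolve / tau) / (T0 * f_clk * f_data). The sketch below uses that formula with made-up device constants: `tau`, `T0` and both frequencies are hypothetical and are not taken from any CPLD datasheet.

```python
import math

def synchronizer_mtbf_seconds(t_resolve_ns: float, tau_ns: float, t0_ns: float,
                              f_clk_hz: float, f_data_hz: float) -> float:
    """Mean time between metastability failures for one synchronizing flop.

    t_resolve_ns: time the flop output is given to settle before it is used
    tau_ns, t0_ns: device-dependent constants (hypothetical values below)
    """
    return math.exp(t_resolve_ns / tau_ns) / (t0_ns * 1e-9 * f_clk_hz * f_data_hz)

# Hypothetical constants: tau = 0.5 ns, T0 = 0.1 ns, 100 MHz clock, 8 MHz data rate.
short = synchronizer_mtbf_seconds(2.0, 0.5, 0.1, 100e6, 8e6)
longer = synchronizer_mtbf_seconds(12.0, 0.5, 0.1, 100e6, 8e6)  # one extra 10 ns cycle
print(f"MTBF with 2 ns to settle:  {short:.3g} s")
print(f"MTBF with 12 ns to settle: {longer:.3g} s")
```

Extra settling time grows the figure exponentially, but it never becomes infinite, which matches the remark that the risk can only be pushed out, not removed.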

I would say, as I always say when dealing with CPLD or FPGA code, a comprehensive timing analysis is extremely important. Unfortunately this is not so easy when working with a legacy CPLD. The compiler usually has limited timing analysis support in these cases.

Re: REV 3 - REV 5 - The beginning (ST536)

Post by exxos »

ijor wrote: Fri Jan 02, 2026 2:52 am ...
I would say, as I always say when dealing with CPLD or FPGA code, a comprehensive timing analysis is extremely important. Unfortunately this is not so easy when working with a legacy CPLD. The compiler usually has limited timing analysis support in these cases.
I frankly hate those chips.. I was considering converting to an FPGA a couple of weeks ago.. But I would have to start the project all over again.. And I just don't have the time or energy to do that anymore. I've been thinking every week, for literally weeks, about throwing in the towel and abandoning the whole thing, because of the sheer amount of time and issues it is all taking up. But if I do that it will probably be years before I get around to starting it up again, if at all.

I think the problem is that the SDRAM data is not ready when the CPU tries to latch it. It is probably compounded by my change from a 10ns to a 6ns PLD. The 10ns part likely skewed the timings enough to mask the original problem somewhat.

Slowing the cycle by 20ns works, as the CPU doesn't latch too early when the data isn't quite stable yet. But of course that is a significant shift for data hold on the SDRAM as well. It works.. though it slows down TTram access, because the CPU effectively misses 2 sampling edges. So now I just delay by 10ns. It misses the first edge and latches on the second. The result is that it is now stable with not much of a performance hit.
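The trade-off described here can be written down as a one-line calculation. This is only a sketch: the 10 ns spacing between CPU sampling edges is an assumption inferred from the numbers in this post, and `sampling_edges_missed` is a hypothetical helper name.

```python
import math

def sampling_edges_missed(delay_ns: float, edge_period_ns: float) -> int:
    """Number of sampling edges that pass while the cycle is held off."""
    return math.ceil(delay_ns / edge_period_ns)

# Assumed: one sampling opportunity every 10 ns (hypothetical).
for delay in (10.0, 20.0):
    print(f"{delay:g} ns delay -> {sampling_edges_missed(delay, 10.0)} edge(s) missed")
```

Each missed edge costs one sampling period per access, so the smallest delay that still lands the latch on stable data gives the smallest speed penalty.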

That is probably why the AI treats it as metastability, because it's not really wrong about latching on an edge which is changing. But the thing is, that was the conclusion of a very long conversation where I was using the AI as my "rubber duck" in all this, to come up with ideas on things to look for.. I probably would never have considered a lot of stuff if it wasn't for AI suggestions. Whether it's talking total BS or not, it has greatly helped with a lot of issues, and indeed it talked about SDRAM timings a lot, where a lot of things hadn't been considered.

So I think people are generally being too harsh towards AI.. I'd say the conversation has gone on for probably 100's of pages now. Would I get that sort of help from a real person? Instant replies for literally months, swapping and testing stuff out? Simple answer: no. Right or wrong, AI has been a very valuable tool in working through all the issues. No person would commit to such an investigation; generally I am always on my own in that respect. AI was the only option I had, so I took it and made use of it. If people want to hate on me for that, go ahead, I don't care :)

I do wrestle with what AI says a lot of the time as well. I'm thinking: is AI wrong, or am I misunderstanding something? It could be talking BS 100% of the time, it really doesn't matter. What matters is that it took me down a path of investigation and solutions. The ST536 project will be forever better off because of it.

If I took everyone's "advice" in not using AI.. I would have given up weeks ago. I was out of ideas and things to try. AI got me over the finishing line. That's a win in my book.