
ST536 STE EDITION

All about the ST536 030 ST booster.
dml
Posts: 764
Joined: Wed Nov 15, 2017 10:11 pm

Re: ST536 STE EDITION

Post by dml »

> Using this understanding, I rewrote the copy program to use **word-by-word copying**,
> and it now reliably copies the full 512 KB ROM to TT-RAM and verifies correctly in one pass.

Reading between the lines, something seems off here. The GPT answer/fix only makes sense if there is some kind of unresolved problem - or if it fixed the code without telling you what was actually wrong with it.


The 68000 can read or write 16bit words or 32bit longwords at 2 or 4 byte alignment (but not 1-byte alignment - that causes an address error). The 68020 and above can manage 1-byte alignment - but it is handled differently: the external reads/writes are always aligned, with the 'byte fix' being internal masking.

Anyway software will happily use 2-byte alignment for 32bit reads/writes, without caring what the memory address is - whether it is STRam or somewhere higher up. There are no 32bit address alignment restrictions needing to be followed for any of it. Sure it helps with performance to keep things aligned on the later 32bit 680x0 chips but it is not a functional restriction.

So if the ROM-RAM copy is not working on word alignment but is working on long alignment the problem has to be one of these:

1) the copying code was wrong and wasn't copying all the data. e.g. maybe not copying the final word. this is an easy mistake when rounding-down sizes of word or longword sized copy operations. if copying exactly 1MB though, this *is* a long-aligned size so rounding down isn't an issue. if trying to copy 1MB minus 1 word though, that would be a size-rounding fault (you would lose 2 words not the expected 1, when copying longs).

Code: Select all

    ; --- Copy ROM to TT-RAM (long-word copy) ---
    move.l source_addr,a0
    move.l dest_addr,a1
    move.l total_bytes,d0
    lsr.l #2,d0                 ;  <------ potential rounding-down fault!
copy_loop:
    move.l (a0)+,(a1)+
    subq.l #1,d0
    bne copy_loop

This code is only correct so long as the size to copy is a multiple of 4 bytes. If the size is a multiple of 2 but not of 4, it will lose the final word. e.g. if 'total_bytes' is 2, it will divide by 4 and give you zero.

The fix is either to round up the divide so it copies *at least* the last longword (which may be 1 more word than intended but not missing any)

addq.l #(4-1),d0 ; (s+3)/4
lsr.l #2,d0

...or safer/better is to operate as words only
...or to operate as longs and catch the missing word, which has the same result but more efficient.....

Code: Select all

  
    move.l source_addr,a0
    move.l dest_addr,a1
    move.l total_bytes,d0
    move.w d0,d1 ; <---- catch the missing (odd word or byte) at the end
    lsr.l #2,d0 ; <---- now safe to round down the size without missing a word
    beq.s check_final_word ; <---- fewer than 4 bytes: skip the long loop (a zero count would wrap around)
copy_loop:
    move.l (a0)+,(a1)+
    subq.l #1,d0
    bne copy_loop
check_final_word:
    btst #1,d1 ; <---- detect the odd wordsize
    beq.s no_odd_word
copy_odd_word:
    move.w (a0)+,(a1)+ ; <---- copy the final word
no_odd_word:
    
(The same last step can be repeated/extended for a final odd byte by adding a btst #0 with a byte copy but we're not expecting odd-byte-sized image here)
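For comparison, the "operate as words only" option mentioned above could look roughly like the sketch below. It reuses the `source_addr`, `dest_addr` and `total_bytes` variables from the earlier listings and assumes the size is even. The `cw_` label names are made up for this example.

```asm
; Word-only copy (sketch). Assumes total_bytes is even.
    move.l  source_addr,a0
    move.l  dest_addr,a1
    move.l  total_bytes,d0
    lsr.l   #1,d0               ; bytes -> words, nothing lost for even sizes
    beq.s   cw_done             ; zero words: skip the loop entirely
cw_loop:
    move.w  (a0)+,(a1)+         ; one 16-bit transfer per iteration
    subq.l  #1,d0
    bne.s   cw_loop
cw_done:
```

It does twice as many iterations as the long copy, but there is no rounding case to get wrong, and it never issues a 32-bit bus access to either device.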


2) the ROM image can't run at an unaligned address (this is likely - I'm not sure why the ROM would need to be placed at a 2-byte address? when the ROM code was compiled it might have been statically compiled to a specific ORG which likely would have been aligned so the code may break if unaligned for reasons unrelated to your code). however if the failure is a read-back verify failure not a runtime failure this isn't the main issue.

3) the TTRam controller is not functioning correctly and not allowing a 32bit write (or read) to occur at word aligned addresses. if this is true, lots of software will break when running from or accessing TTRam because none of it will be expecting that restriction. GPT hints at this but does not tell you this would be a serious fault and will stop most software from working! But this should be easy to test.

Just copy a (long-aligned-size!) pattern of known words from an even-word ST source to an odd-word TT destination address using move.l, then verify back the destination using cmp.w as words so the verify is immune. If it doesn't match, you've got problems. Software will not be able to work with it.
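A minimal sketch of that test, under some loud assumptions: `TT_DEST` is a hypothetical address that must be free TT-RAM on your board, the code runs from ST-RAM, and the data cache is off so the verify reads really come from memory. All `al_` labels are invented for this example.

```asm
; Alignment test (sketch). Returns d0 = 0 if all words match, 1 on mismatch.
TT_DEST     equ     $04F00002       ; ASSUMPTION: free TT-RAM, word- but not long-aligned
N_LONGS     equ     4               ; 16 bytes, a long-aligned size

al_test:
    lea     al_buf,a0
    move.l  a0,d0
    addq.l  #3,d0
    and.b   #$FC,d0                 ; round up to a long-aligned ST-RAM address
    move.l  d0,a2                   ; a2 = aligned source

    move.l  a2,a0                   ; fill the source with known words 1..8
    moveq   #1,d1
    moveq   #N_LONGS*2-1,d0
al_fill:
    move.w  d1,(a0)+
    addq.w  #1,d1
    dbra    d0,al_fill

    move.l  a2,a0                   ; copy with 32-bit moves
    move.l  #TT_DEST,a1
    moveq   #N_LONGS-1,d0
al_copy:
    move.l  (a0)+,(a1)+             ; long writes at a 2-byte-aligned address
    dbra    d0,al_copy

    move.l  #TT_DEST,a1             ; verify as words against the known sequence
    moveq   #1,d1
    moveq   #N_LONGS*2-1,d0
al_verf:
    cmp.w   (a1)+,d1
    bne.s   al_fail
    addq.w  #1,d1
    dbra    d0,al_verf
    moveq   #0,d0                   ; every word matched
    rts
al_fail:
    moveq   #1,d0                   ; mismatch: -2(a1) is the failing word
    rts

    section bss
al_buf:
    ds.b    N_LONGS*4+4             ; pattern plus slack for the alignment round-up
```

Because every word in the pattern is different, swapped or duplicated words show up as a mismatch rather than passing by accident.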

Re: ST536 STE EDITION

Post by dml »

I think I missed a 4th case - which might be the correct one....

So the ROM is actually a 27C4096 16bit ROM on the accelerator board itself? Not on the ST bus but on the same controller as the TTRAM?

In that case

4) The 16bit ROM possibly being handled as a 32bit datapath device by the memory controller (which is also hinted in a roundabout way by some of the GPT results) and it is the 32bit misaligned reads which are failing from ROM space, not the 32bit misaligned writes to TTRAM.

So I guess if there are further problems, you would want to first figure out whether it is ROM misaligned read which is giving incorrect results or the TTRAM misaligned write. Or something else.
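One way to check the ROM read side on its own, as a rough sketch: read the same four bytes once as a misaligned longword and once as two words, then compare. `ROM_BASE` is the `$E00000` source address used elsewhere in this thread. The `rd_` labels are invented, and the data cache is assumed to be off.

```asm
; ROM misaligned-read check (sketch). Returns d0 = 0 if both reads agree.
ROM_BASE    equ     $E00000

rd_test:
    lea     ROM_BASE+2,a0           ; word-aligned, not long-aligned
    move.l  (a0),d1                 ; one 32-bit read at a 2-byte-aligned address
    move.w  (a0),d2                 ; the same bytes as two 16-bit reads
    swap    d2                      ; first word becomes the high half
    move.w  2(a0),d2                ; second word becomes the low half
    cmp.l   d1,d2
    bne.s   rd_fail
    moveq   #0,d0                   ; long read matches the word reads
    rts
rd_fail:
    moveq   #1,d0                   ; the misaligned long read from ROM is suspect
    rts
```

A single address only proves so much: if the two words there happen to be identical, a swap goes unnoticed, so looping this over a stretch of ROM would be more convincing. If this check passes, the TTRAM misaligned write becomes the next suspect.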


BTW the CDIS signal is cache-disable. When asserted it makes the processor behave as if there is no cache. This is similar to (but not exactly the same as) the cache-inhibit signal which IIRC allows reads/writes to independently bypass the cache while it is on. When CDIS is used, the cache contents remain as they are, don't change and are ignored by memory operations until the signal reverts.

I'm not sure why CDIS would affect your results unless there is a memory control fault which is being hidden (or caused) by the cache prefetching. That would suggest some kind of hardware fault though - maybe tying back to the 16bit ROM being mis-handled as a 32bit device.
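For software-only experiments, the caches can also be switched off through the 68030's CACR register instead of the CDIS pin. The sketch below is not a replacement for CDIS: unlike CDIS, it also invalidates the cache contents. It must run in supervisor mode (under TOS, for example via XBIOS `Supexec`), and the assembler has to be set to accept 68030 opcodes. The `caches_off` label is invented for this example.

```asm
; Disable and invalidate both 68030 caches (sketch, supervisor mode only).
; CACR bits used: CD (bit 11) clears the data cache, CI (bit 3) clears the
; instruction cache. Leaving ED (bit 8) and EI (bit 0) at zero disables both.
caches_off:
    move.l  #$00000808,d0           ; CD | CI, all enable bits zero
    movec   d0,cacr
    rts
```

Running the copy once with the caches off in software and once with CDIS may help separate a cache-related effect from a bus-timing one.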
exxos
Site Admin
Posts: 27085
Joined: Wed Aug 16, 2017 11:19 pm
Location: UK

Re: ST536 STE EDITION

Post by exxos »

@dml

This was the original code.. It wouldn't work with CDIS disabled.

Code: Select all

 
exxos_do_rom_copy:
/* --- Setup copy parameters ---*/
    move.l #$E00000,source_addr
    move.l #$4F00000,dest_addr
    move.l #$80000,total_bytes
    move.l source_addr,a0
    move.l dest_addr,a1
    move.l total_bytes,d0
    lsr.l #2,d0
    subq.l #1,d0
copy_verify_loop:
    cmp.l #$E80000,a0
    bhi addr_error
    cmp.l #$4F80000,a1
    bhi addr_error
    move.l (a0)+,d1
    move.l d1,(a1)
    move.l (a1)+,d2
    cmp.l d1,d2
    bne verify_fail /* skip ROM trigger if fail*/
    subq.l #1,d0
    bne copy_verify_loop
    lea copy_done_msg(pc),a3
    bsr print_string
/* trigger ROM redirect to RAM*/
    move.l #$4F80000,a0
    cmp.l #$4F80000,a0
    bhi trigger_error
    move.w (a0),d0
    nop
    nop
    nop
    nop
    nop
/* Verification OK*/
/*lea verify_ok_msg(pc),a3 */
/*bsr print_string */
    moveq #0,d0
    bra done

verify_fail:
    lea veri_fail_msg(pc),a3
    bsr print_string
    move.l #$4F00000,a2
    move.l total_bytes,d3
    lsr.l #2,d3
    sub.l d0,d3
    lsl.l #2,d3
    add.l d3,a2
    lea addr_msg(pc),a3
    bsr print_string
    move.l a2,d0
    bsr print_hex_long
    lea newline(pc),a3
    bsr print_string
    lea src_msg(pc),a3
    bsr print_string
    move.l d1,d0
    bsr print_hex_long
    lea newline(pc),a3
    bsr print_string
    lea dst_msg(pc),a3
    bsr print_string
    move.l d2,d0
    bsr print_hex_long
    lea newline(pc),a3
    bsr print_string
    moveq #2,d0
    bra done

trigger_error:
    lea trig_error_msg(pc),a3
    bsr print_string
    move.l #$4F80000,d0
    bsr print_hex_long
    lea newline(pc),a3
    bsr print_string
    moveq #3,d0
    bra done

addr_error:
    lea error_msg(pc),a3
    bsr print_string
    lea src_addr_msg(pc),a3
    bsr print_string
    move.l a0,d0
    bsr print_hex_long
    lea newline(pc),a3
    bsr print_string
    lea dst_addr_msg(pc),a3
    bsr print_string
    move.l a1,d0
    bsr print_hex_long
    lea newline(pc),a3
    bsr print_string
    moveq #1,d0
done:
    rts

/* --- Print long hex ---*/
print_hex_long:
    movem.l d0-d2/a0-a2,-(sp)
    lea hex_buffer(pc),a3
    move.l d0,d1
    moveq #7,d2
.hex_loop:
    rol.l #4,d1
    move.b d1,d0
    and.b #$0F,d0
    cmp.b #10,d0
    blt.s .digit
    add.b #'A'-10,d0
    bra.s .store
.digit:
    add.b #'0',d0
.store:
    move.b d0,(a0)+
    dbra d2,.hex_loop
    move.b #0,(a0)
    lea hex_buffer(pc),a3
    bsr print_string
    movem.l (sp)+,d0-d2/a0-a2
    rts

    text

no_tt_ram:
    lea ttram_error_msg(pc),a3
    bsr print_string
    moveq #4,d0     /* Custom error code for TT-RAM unavailable */
    bra done

In STOS I think I was trying to copy too large a block from ROM. When I reduced the copy size it worked. When I copied a larger block it trashed the system somehow.

But I don't know offhand what's above the ROM address, registers or nothing. Data-wise we don't care. But it corrupted the system somehow. Copying the correct size from ROM works fine.

I've not worked it out, but copying too large a ROM block may have wrapped around in the memory map and gone back to address zero, because I don't actually have the upper two address lines connected on the CPU. Plus I have a register in TT-RAM which sits above the ROM shadow, so if I copied too large a block that register could get corrupted, and if the ROM copy is bad then the machine will run a bad copy of ROM and of course will go nuts.

The SDRAM controller is from the TF536.. I would assume if there was some sort of problem that it would have been noticed by now considering thousands of the things have been built by now.

Re: ST536 STE EDITION

Post by exxos »

## 27C4096 ROM → TT-RAM Copy Routine Debug Timeline

This thread documents each test and observation during the debugging of the Atari ST assembly routine that copies and verifies the 27C4096 ROM into TT-RAM.
Each stage notes the issue found, test performed, and the outcome.

---

### 1. Initial Program Failure
Symptom:
Early version of the routine produced inconsistent verification — sometimes OK, sometimes corrupted data depending on whether the cache was enabled.

Tests Performed:
- Repeated copy and verify runs with `CDIS` (cache disabled) and `CENB` (cache enabled).
- Added debug prints before and after copy to confirm flow.
- Compared TT-RAM contents directly using monitor tools.

Finding:
With cache enabled, the copy and verify appeared correct.
When cache was disabled, verification failed — revealing that the CPU cache had been masking underlying bus or data-width mismatches.
The issue was that the ROM is 16-bit wide (27C4096), but the original copy used 32-bit `move.l` instructions, leading to swapped or duplicated words on the bus.

Fix:
Changed the copy loop to use `move.w` instead of `move.l`.
After this, the routine worked correctly regardless of cache state.

---

### 2. Verify Loop Logic Error
Symptom:
Verification always printed “successful” even when TT-RAM was deliberately corrupted.

Tests Performed:
- Injected wrong data into TT-RAM after copy.
- Stepped through `verify_loop` manually in Devpac.
- Checked whether `verify_fail` branch was ever taken.

Finding:
Assembler’s local label format (`.next_word`) caused wrong branch resolution in some TOS build systems.
The `bra verify_fail` never triggered correctly.

Fix:
Replaced all local labels (e.g. `.next_word`) with global-style labels (e.g. `next_word:`).
Verification failure branch now worked correctly.

---

### 3. Incorrect Error Reporting
Symptom:
When verification failed, “expected” and “actual” values printed were identical or off by one word.

Tests Performed:
- Added address output on failure.
- Compared printed data with memory monitor results.

Finding:
The verification loop post-incremented both pointers before entering `verify_fail`, so by the time the failure path ran, the pointers already referred to the next word and the reported address and data no longer lined up.

Fix:
After calculating the failing address, re-read both ROM and RAM words directly at that address before printing results.
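A minimal sketch of that re-read, assuming the word-based verify loop keeps the ROM pointer in `a0` and the TT-RAM pointer in `a1` (as the original long-word routine did) and that both have already been post-incremented. The `vf_fail` label is invented for this example.

```asm
; Failure path (sketch): step back to the word that actually failed.
vf_fail:
    subq.l  #2,a0                   ; undo the post-increment on the ROM pointer
    subq.l  #2,a1                   ; and on the TT-RAM pointer
    moveq   #0,d1
    moveq   #0,d2
    move.w  (a0),d1                 ; expected word from ROM, zero-extended
    move.w  (a1),d2                 ; actual word from TT-RAM, zero-extended
    move.l  a1,d0                   ; failing address, ready for print_hex_long
```

Zero-extending the words is consistent with values printed as `$0000FFFF` by a long hex print routine.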

---

### 4. Corruption Test Confirmation
Goal:
Ensure failure path and messages worked correctly.

Action:
Inserted a deliberate corruption routine:

Code: Select all

move.l #$4F00002,a3
cmp.l a1,a3
bne .no_corrupt
move.w #$FFFF,(a1)
.no_corrupt:

Result:
Output confirmed:

    Verification failed!
    Failing address: $04F00002
    Expected ROM data: $00000002
    Actual RAM data: $0000FFFF

✅ Verified that the error handling and printout were fully functional.

---

### 5. Port to TOS 2.06 Source (Gemini Desktop Build)
Symptom:
MADMAC assembler reported:

    && 8326 invalid instruction length
    && 8331 illegal relative address

Tests Performed:
- Converted Devpac comments from `;` to `/* ... */`.
- Removed local label dots (`.label`) and simplified all label references.
- Confirmed assembler accepted standard syntax.

Finding:
MADMAC is stricter than Devpac regarding relative labels and local syntax.
The invalid instruction errors were from label misreferences after `.label` removal.

Fix:
Standardized all label names, verified clean assembly under Gemini/TOS build.
Code compiled successfully.

---

### 6. Hardware Register Read ($04F80000)
Symptom:
A direct read from address `$04F80000` didn’t appear to trigger the expected hardware behaviour, even though poking it from STOS BASIC worked.

Tests Performed:
- Inserted NOPs around the read.
- Verified address decoding from hardware side.
- Moved read section to different positions in the routine.

Finding:
The read sequence was placed below the `wait_key:` label and never executed.
When moved before that branch, the read worked fine.

Fix:
Moved the read block above `wait_key:` and added a few `nop` instructions for stability.

---

### 7. Removing Wait-Key on Success
Goal:
Allow automatic exit after successful verification while keeping manual keypress pause on failure.

Fix:
Removed `bra wait_key` from the “verification OK” section and replaced with:

Code: Select all

clr.w -(sp)
trap #1

Result:
✅ On success, program exits automatically.
On failure, it pauses for keypress before returning.

---

### Final Outcome
After all debugging and validation:
- Works correctly with or without cache enabled.
- Handles 16-bit ROM width properly.
- Verification and failure output fully functional.
- Hardware register read and exit flow confirmed.
- Code builds cleanly inside the TOS 2.06 + Gemini build system.

Status: ✅ Stable and verified in hardware.

---

### Lessons Learned
- The 27C4096 EPROM is 16-bit wide — copying with `move.l` caused data swapping.
- Cache enabled masked bus issues; disabling cache exposed them.
- MADMAC rejects Devpac-style local labels and semicolon comments.
- Always re-read failing words in verify routines after post-increment.
- When adding hardware register reads, confirm code placement before exit branches.

Re: ST536 STE EDITION

Post by exxos »

If I get chance tonight I will re-run the long word copy in STOS to verify it actually works. I believe all the tests I did worked, but I did so much yesterday I cannot remember exactly everything that was done.

If I remember rightly it did work but I was not entirely convinced because it was not running as fast as assembly code and could be masking whatever problem was anyway.
Badwolf
Site sponsor
Posts: 2947
Joined: Tue Nov 19, 2019 12:09 pm

Re: ST536 STE EDITION

Post by Badwolf »

exxos wrote: Tue Oct 21, 2025 9:45 pm
> Oddly when CDIS is closed, TTram is like 50% faster :shrug:

Problem with the burst mode logic? Could that possibly be related to your longword copy problem?

BW
DFB1 Open source 50MHz 030 and TT-RAM accelerator for the Falcon
Smalliermouse ST-optimised USB mouse adapter based on SmallyMouse2
FrontBench The Frontier: Elite 2 intro as a benchmark

Re: ST536 STE EDITION

Post by dml »

exxos wrote: Wed Oct 22, 2025 10:24 am ### Lessons Learned
- The 27C4096 EPROM is 16-bit wide — copying with `move.l` caused data swapping.
- Cache enabled masked bus issues; disabling cache exposed them.
- MADMAC rejects Devpac-style local labels and semicolon comments.
- Always re-read failing words in verify routines after post-increment.
- When adding hardware register reads, confirm code placement before exit branches.

Ooof! That's quite a menu. But it adds up.

exxos wrote: Wed Oct 22, 2025 10:24 am
> MADMAC rejects Devpac-style local labels and semicolon comments.

Is there a reason for using MADMAC? Is anyone else using this currently on ST? (I mean, is it being maintained? I don't know)

RMAC is an alternative, more recent/maintained version of MADMAC, and it accepts Devpac syntax. There is also VASM - which again can accept Devpac syntax.

BTW I have noticed that with some assemblers, specifying the same local label twice WITHOUT separating them by a global label, can assemble to incorrect results. It uses the first definition and ignores the second (or, other way around - same bad).

...so something like this below is broken (of course) but will sometimes assemble without an error. Especially dangerous when copying/pasting code around.

Code: Select all

global_label:
  move.w #4-1,d0
.loop:
  nop
  dbra d0,.loop
  ;
  move.w #8-1,d0
.loop:
  nop
  dbra d0,.loop

But this will work:

Code: Select all

global_label:
  move.w #4-1,d0
.loop:
  nop
  dbra d0,.loop
  ;
global_label_2:
  move.w #8-1,d0
.loop:
  nop
  dbra d0,.loop

I think VASM had this problem - maybe an earlier version - but I forget the exact circumstances.

Not nice though if you aren't aware of it.

Re: ST536 STE EDITION

Post by exxos »

Badwolf wrote: Wed Oct 22, 2025 10:50 am
> Problem with the burst mode logic? Could that possibly be related to your longword copy problem?

I thought the same originally and disabled burst, but it made no odds.

Re: ST536 STE EDITION

Post by exxos »

dml wrote: Wed Oct 22, 2025 11:05 am
> Is there a reason for using MADMAC? Is anyone else using this currently on ST? (I mean, is it being maintained? I don't know)
>
> RMAC is an alternative, more recent/maintained version of MADMAC, and it accepts Devpac syntax. There is also VASM - which again can accept Devpac syntax.

It's the only thing which worked. I tried installing other stuff to try and compile EmuTOS etc. years ago. I tried to follow Vince's guide but nothing would install and it kept crashing. Nobody offered any help, so I gave up with it all and just stuck with what worked.

But this will work:

Code: Select all

global_label:
  move.w #4-1,d0
.loop:
  nop
  dbra d0,.loop
  ;
global_label_2:
  move.w #8-1,d0
.loop:
  nop
  dbra d0,.loop

Actually it probably wouldn't. Labels such as global_label_2 and global_label_1 cause the assembler to think it's the same label. It only seems to check the first few letters... I keep getting caught out by that :(

Re: ST536 STE EDITION

Post by dml »

exxos wrote: Wed Oct 22, 2025 12:29 pm
> Actually it probably wouldn't.. global_label_2 global_label_1 causes the compiler to think it's the same label. It only seems to check the first few letters... I keep getting caught out by that :(

There is an 8-char limit on symbol table entries in the final TOS executable.

However - that's a limitation of the TOS executable format (ignoring DRI extended symbols for a minute). It shouldn't be a limitation of the assembler itself when figuring out what addresses are intended. So at worst it should either assemble with an error or leave you with some duplicate looking symbols when debugging - while still working correctly.

Although, if it does at least assemble with an error that's not so bad. If it assembles without errors and just confuses the labels - that's bad.
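If the assembler in use really does compare only a leading prefix of each label (an assumption based on the observation quoted above, not something confirmed here), one defensive habit is to put the distinguishing part of the name at the front and drop local labels altogether. A sketch, with all label names invented for illustration:

```asm
; Labels kept unique within their first 8 characters (sketch).
fill_4:
    move.w  #4-1,d0
f4_loop:
    nop
    dbra    d0,f4_loop
    ;
fill_8:
    move.w  #8-1,d0
f8_loop:
    nop
    dbra    d0,f8_loop
```

This costs nothing on assemblers that handle long names properly, and the symbols also stay distinct in an 8-character symbol table when debugging.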