Topic: G5 qemu attempts.

Offline darthnVader

  • Platinum Member
  • *****
  • Posts: 679
  • New Member
Re: G5 qemu attempts.
« Reply #100 on: October 16, 2018, 05:45:07 PM »
How do I check if cr7 was greater than?

Do I read the entire 32 bit cr or is there some other way of knowing if it jumped to 0x20fb70?

Don't bother with them at all! We don't need to decompile this function - that's already done. Just record which instructions are executed and concentrate on everything that changes the general-purpose registers (r0...r31).

Is this useful, or are you asking for something else?

Code: [Select]
(gdb) stepi
0x000000000020fba4 in ?? ()
1: x/i $pc
=> 0x20fba4: add     r4,r4,r5
(gdb) p/x $r4
$4 = 0x3fee5048
(gdb) stepi
0x000000000020fba8 in ?? ()
1: x/i $pc
=> 0x20fba8: andi.   r0,r4,31
(gdb) p/x $r4
$5 = 0x3fee5100
(gdb) stepi
0x000000000020fbac in ?? ()
1: x/i $pc
=> 0x20fbac: bnel    0x20fc08
(gdb) stepi
0x000000000020fbb0 in ?? ()
1: x/i $pc
=> 0x20fbb0: bl      0x20fd08
(gdb) stepi
0x000000000020fd08 in ?? ()
1: x/i $pc
=> 0x20fd08: rlwinm. r11,r5,27,5,31
(gdb) p/x $r5
$6 = 0xb8
(gdb) stepi
0x000000000020fd0c in ?? ()
1: x/i $pc
=> 0x20fd0c: mtctr   r11
(gdb) stepi
0x000000000020fd10 in ?? ()
1: x/i $pc
=> 0x20fd10: beqlr   
(gdb) stepi
0x000000000020fd14 in ?? ()
1: x/i $pc
=> 0x20fd14: andi.   r0,r3,7
(gdb) p/x $r3
$7 = 0x3fee50f0
(gdb) stepi
0x000000000020fd18 in ?? ()
1: x/i $pc
=> 0x20fd18: beq     0x20fd78
(gdb) stepi
0x000000000020fd78 in ?? ()
1: x/i $pc
=> 0x20fd78: lfdu    f0,-32(r3)
(gdb) stepi
0x000000000020fd7c in ?? ()
1: x/i $pc
=> 0x20fd7c: addi    r4,r4,-32
(gdb) p/x $r4
$8 = 0x3fee5100
(gdb) stepi
0x000000000020fd80 in ?? ()
1: x/i $pc
=> 0x20fd80: lfd     f1,8(r3)
(gdb) stepi
0x000000000020fd84 in ?? ()
1: x/i $pc
=> 0x20fd84: lfd     f2,16(r3)
(gdb) stepi
0x000000000020fd88 in ?? ()
1: x/i $pc
=> 0x20fd88: lfd     f3,24(r3)
(gdb) stepi
0x000000000020fd8c in ?? ()
1: x/i $pc
=> 0x20fd8c: dcbz    0,r4
(gdb) stepi
0x000000000020fd90 in ?? ()
1: x/i $pc
=> 0x20fd90: stfd    f0,0(r4)
(gdb) stepi
0x000000000020fd94 in ?? ()
1: x/i $pc
=> 0x20fd94: stfd    f1,8(r4)
(gdb) stepi
0x000000000020fd98 in ?? ()
1: x/i $pc
=> 0x20fd98: stfd    f2,16(r4)
(gdb) stepi
0x000000000020fd9c in ?? ()
1: x/i $pc
=> 0x20fd9c: stfd    f3,24(r4)
(gdb) stepi
0x000000000020fda0 in ?? ()
1: x/i $pc
=> 0x20fda0: bdnz    0x20fd78
(gdb) stepi
0x000000000020fd78 in ?? ()
1: x/i $pc
=> 0x20fd78: lfdu    f0,-32(r3)
(gdb) stepi
0x000000000020fd7c in ?? ()
1: x/i $pc
=> 0x20fd7c: addi    r4,r4,-32
(gdb) p/x $r4
$9 = 0x3fee50e0
(gdb) stepi
0x000000000020fd80 in ?? ()
1: x/i $pc
=> 0x20fd80: lfd     f1,8(r3)
(gdb) stepi
0x000000000020fd84 in ?? ()
1: x/i $pc
=> 0x20fd84: lfd     f2,16(r3)
(gdb) stepi
0x000000000020fd88 in ?? ()
1: x/i $pc
=> 0x20fd88: lfd     f3,24(r3)
(gdb) stepi
0x000000000020fd8c in ?? ()
1: x/i $pc
=> 0x20fd8c: dcbz    0,r4
(gdb) stepi
0x000000000020fd90 in ?? ()
1: x/i $pc
=> 0x20fd90: stfd    f0,0(r4)
(gdb) stepi
0x000000000020fd94 in ?? ()
1: x/i $pc
=> 0x20fd94: stfd    f1,8(r4)
(gdb) stepi
0x000000000020fd98 in ?? ()
1: x/i $pc
=> 0x20fd98: stfd    f2,16(r4)
(gdb) stepi
0x000000000020fd9c in ?? ()
1: x/i $pc
=> 0x20fd9c: stfd    f3,24(r4)
(gdb) stepi
0x000000000020fda0 in ?? ()
1: x/i $pc
=> 0x20fda0: bdnz    0x20fd78
(gdb) stepi
0x000000000020fd78 in ?? ()
1: x/i $pc
=> 0x20fd78: lfdu    f0,-32(r3)
(gdb) stepi
0x000000000020fd7c in ?? ()
1: x/i $pc
=> 0x20fd7c: addi    r4,r4,-32
(gdb) p/x $r4
$10 = 0x3fee50c0
(gdb) stepi
0x000000000020fd80 in ?? ()
1: x/i $pc
=> 0x20fd80: lfd     f1,8(r3)
(gdb) stepi
0x000000000020fd84 in ?? ()
1: x/i $pc
=> 0x20fd84: lfd     f2,16(r3)
(gdb) stepi
0x000000000020fd88 in ?? ()
1: x/i $pc
=> 0x20fd88: lfd     f3,24(r3)
(gdb) stepi
0x000000000020fd8c in ?? ()
1: x/i $pc
=> 0x20fd8c: dcbz    0,r4
(gdb) stepi
0x000000000020fd90 in ?? ()
1: x/i $pc
=> 0x20fd90: stfd    f0,0(r4)
(gdb) stepi
0x000000000020fd94 in ?? ()
1: x/i $pc
=> 0x20fd94: stfd    f1,8(r4)
(gdb) stepi
0x000000000020fd98 in ?? ()
1: x/i $pc
=> 0x20fd98: stfd    f2,16(r4)
(gdb) stepi
0x000000000020fd9c in ?? ()
1: x/i $pc
=> 0x20fd9c: stfd    f3,24(r4)
(gdb) stepi
0x000000000020fda0 in ?? ()
1: x/i $pc
=> 0x20fda0: bdnz    0x20fd78
(gdb) stepi
0x000000000020fd78 in ?? ()
1: x/i $pc
=> 0x20fd78: lfdu    f0,-32(r3)
(gdb) stepi
0x000000000020fd7c in ?? ()
1: x/i $pc
=> 0x20fd7c: addi    r4,r4,-32
(gdb) p/x $r4
$11 = 0x3fee50a0
(gdb)

Offline powermax

  • Enthusiast Member
  • ***
  • Posts: 80
  • Hobbyist programmer
Re: G5 qemu attempts.
« Reply #101 on: October 16, 2018, 07:00:21 PM »
Code: [Select]
1: x/i $pc
=> 0x20fd08: rlwinm. r11,r5,27,5,31
(gdb) p/x $r5
$6 = 0xb8

Please note that 1: x/i $pc always shows the NEXT instruction, i.e. the one that has not been executed yet. In this case p/x $r5 prints the value of r5 BEFORE 0x20fd08 runs. You need to do stepi first to get past 0x20fd08, and only then p/x $r11 to see the result of rlwinm.
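To make the expected result concrete, here is a rough Python sketch of what rlwinm computes. It is not code from the Trampoline; the helper names (rotl32, ppc_mask, rlwinm) are made up for illustration, and the CR0 update done by the "." form is not modelled. With SH=27, MB=5, ME=31 the instruction amounts to a logical right shift by 5, i.e. the byte count in r5 divided by 32.

```python
MASK32 = 0xFFFFFFFF


def rotl32(value, shift):
    """Rotate a 32-bit value left by `shift` bits."""
    value &= MASK32
    shift %= 32
    return ((value << shift) | (value >> (32 - shift))) & MASK32


def ppc_mask(mb, me):
    """Ones from bit mb through bit me, PowerPC numbering (bit 0 is the MSB)."""
    high = MASK32 >> mb
    low = (MASK32 << (31 - me)) & MASK32
    # When mb > me the mask wraps around the end of the word.
    return (high & low) if mb <= me else (high | low)


def rlwinm(rs, sh, mb, me):
    """Rotate Left Word Immediate then AND with Mask (CR0 update not modelled)."""
    return rotl32(rs, sh) & ppc_mask(mb, me)


if __name__ == "__main__":
    # r5 = 0xb8 as printed in the trace; the result is what mtctr loads into CTR.
    print(hex(rlwinm(0xB8, 27, 5, 31)))
```

For r5 = 0xb8 this gives 0x5, so the loop counter loaded by mtctr should be 5.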

It's getting interesting here. I'd like to request the value of $r11 at 0x20fd10 on both G4 and G5.

Is it doable?

Offline darthnVader

  • Platinum Member
  • *****
  • Posts: 679
  • New Member
Re: G5 qemu attempts.
« Reply #102 on: October 16, 2018, 07:35:09 PM »
G4:

Code: [Select]
Breakpoint 1, 0x000000000020fd08 in ?? ()
(gdb) display/i $pc
1: x/i $pc
=> 0x20fd08: rlwinm. r11,r5,27,5,31
(gdb) p/x $r11
$1 = 0xd
(gdb) stepi
0x000000000020fd0c in ?? ()
1: x/i $pc
=> 0x20fd0c: mtctr   r11
(gdb) p/x $r11
$2 = 0x5
(gdb)

970:

Code: [Select]
Breakpoint 1, 0x000000000020fd08 in ?? ()
2: x/i $pc
=> 0x20fd08: rlwinm. r11,r5,27,5,31
1: x/i $pc
=> 0x20fd08: rlwinm. r11,r5,27,5,31
(gdb) display/i $pc
3: x/i $pc
=> 0x20fd08: rlwinm. r11,r5,27,5,31
(gdb) p/x $r11
$3 = 0xd
(gdb) stepi
0x000000000020fd0c in ?? ()
3: x/i $pc
=> 0x20fd0c: mtctr   r11
2: x/i $pc
=> 0x20fd0c: mtctr   r11
1: x/i $pc
=> 0x20fd0c: mtctr   r11
(gdb) p/x $r11
$4 = 0x5
(gdb)

Offline powermax

  • Enthusiast Member
  • ***
  • Posts: 80
  • Hobbyist programmer
Re: G5 qemu attempts.
« Reply #103 on: October 16, 2018, 11:21:42 PM »
970:

Code: [Select]
Breakpoint 1, 0x000000000020fd08 in ?? ()
2: x/i $pc
=> 0x20fd08: rlwinm. r11,r5,27,5,31
1: x/i $pc
=> 0x20fd08: rlwinm. r11,r5,27,5,31
(gdb) display/i $pc
3: x/i $pc
=> 0x20fd08: rlwinm. r11,r5,27,5,31
(gdb) p/x $r11
$3 = 0xd
(gdb) stepi
0x000000000020fd0c in ?? ()
3: x/i $pc
=> 0x20fd0c: mtctr   r11
2: x/i $pc
=> 0x20fd0c: mtctr   r11
1: x/i $pc
=> 0x20fd0c: mtctr   r11
(gdb) p/x $r11
$4 = 0x5
(gdb)

Looks correct. What do $r3 and $r4 contain at the same location on 970?

Offline darthnVader

  • Platinum Member
  • *****
  • Posts: 679
  • New Member
Re: G5 qemu attempts.
« Reply #104 on: October 17, 2018, 06:07:16 AM »
Code: [Select]
(gdb) p/x $r3
$5 = 0x3fee50f0
(gdb) p/x $r4
$6 = 0x3fee5100
(gdb)

Offline powermax

  • Enthusiast Member
  • ***
  • Posts: 80
  • Hobbyist programmer
Re: G5 qemu attempts.
« Reply #105 on: October 17, 2018, 08:24:44 AM »
Code: [Select]
(gdb) p/x $r3
$5 = 0x3fee50f0
(gdb) p/x $r4
$6 = 0x3fee5100
(gdb)

Both $r3 and $r4 contain correct values.

We have narrowed the search down to a single loop. Here is its full assembly:

Code: [Select]
0x20FD78     lfdu    fp0, -0x20(r3)
0x20FD7C     addi    r4, r4, -0x20
0x20FD80     lfd     fp1, 8(r3)
0x20FD84     lfd     fp2, 0x10(r3)
0x20FD88     lfd     fp3, 0x18(r3)
0x20FD8C     dcbz    r0, r4
0x20FD90     stfd    fp0, 0(r4)
0x20FD94     stfd    fp1, 8(r4)
0x20FD98     stfd    fp2, 0x10(r4)
0x20FD9C     stfd    fp3, 0x18(r4)
0x20FDA0     bdnz    0x20FD78
0x20FDA4     blr

This loop, driven by the bdnz instruction, will be executed 5 times. Each iteration first decrements both pointers by 32 (lfdu updates $r3, addi updates $r4) and then copies 32 bytes from the address in $r3 to the address in $r4. Don't be confused by the floating-point instructions lfd/stfd used here: they perform no FP calculations; they are used because they can read/write 8 bytes at once.
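As a rough illustration of the data movement only, the loop can be modelled in Python on a bytearray. This is a sketch under my own assumptions: copy_loop is a hypothetical name, addresses are offsets relative to 0x3fee5000, and dcbz is deliberately left out.

```python
def copy_loop(mem, r3, r4, ctr):
    """Model the lfdu/lfd/stfd/bdnz loop on bytearray `mem`; return final (r3, r4)."""
    while ctr > 0:
        r3 -= 32                         # lfdu f0,-32(r3): update r3, then load
        r4 -= 32                         # addi r4,r4,-32
        chunk = bytes(mem[r3:r3 + 32])   # f0..f3 hold four 8-byte doublewords
        mem[r4:r4 + 32] = chunk          # the four stfd stores
        ctr -= 1                         # bdnz: decrement CTR, loop while non-zero
    return r3, r4


if __name__ == "__main__":
    mem = bytearray(range(256))
    # Offsets of r3 = 0x3fee50f0 and r4 = 0x3fee5100 relative to 0x3fee5000.
    r3, r4 = copy_loop(mem, 0xF0, 0x100, 5)
    print(hex(r3), hex(r4))
```

Because all four loads of an iteration happen before its stores, and the destination sits above the source, walking downwards keeps the overlapping copy safe.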

dcbz zeros all bytes of the cache block containing the address in $r4. Everything previously written to this block is wiped, which is fine here because we're about to write new values into it.

A cache block on 32-bit PPC is 32 bytes long. On the 970, it's 128 bytes long. My bet is that the G5's dcbz zeros more bytes than the corresponding dcbz on a 32-bit CPU. That would explain why everything at address 0x3fee5000 gets wiped.

The question is how to catch this bug. I assume that the memory corruption happens at the cache block boundary, i.e. when the address in $r4 drops below 0x3fee5080. This condition is reached in the 5th loop iteration (0x3fee5100 - 32 bytes * 5 = 0x3fee5060).

To test that, I'd set a breakpoint at 0x20FD90 (right after dcbz) and monitor memory changes. Expected values of $r4 are:

1st iteration: 0x3fee50e0
2nd iteration: 0x3fee50c0
3rd iteration: 0x3fee50a0
4th iteration: 0x3fee5080
---- it's the cache block boundary ----
5th iteration: 0x3fee5060
---- end of loop ----

When our breakpoint is reached for the 5th time, r4 should contain 0x3fee5060 and dcbz is expected to have zeroed the whole 128-byte cache block 0x3fee5000...0x3fee507f(!)

You can verify that by dumping the memory block at 0x3fee5000 in the 5th iteration.
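The prediction can be checked on paper with a small Python model. This is only a sketch: dcbz and trace are hypothetical helpers, the stfd stores are ignored, and the initial word 0x40000000 at 0x3fee5000 is just a stand-in for "some non-zero data". The only thing modelled is which cache block each dcbz clears for a given block size.

```python
BASE = 0x3FEE5000


def dcbz(mem, addr, block_size):
    """Zero the whole cache block containing `addr` (block_size is a power of two)."""
    start = addr & ~(block_size - 1)
    for a in range(start, start + block_size):
        mem[a - BASE] = 0
    return start, start + block_size


def trace(block_size, iterations=5):
    """Return [(r4, word at BASE)] as it would look at the 0x20FD90 breakpoint."""
    mem = bytearray(0x100)
    mem[0:4] = (0x40000000).to_bytes(4, "big")  # stand-in for non-zero data
    r4 = 0x3FEE5100
    seen = []
    for _ in range(iterations):
        r4 -= 32                                 # addi r4,r4,-32
        dcbz(mem, r4, block_size)
        seen.append((r4, int.from_bytes(mem[0:4], "big")))
    return seen


if __name__ == "__main__":
    for size in (32, 128):
        print(size, [(hex(r4), hex(word)) for r4, word in trace(size)])
```

With a 32-byte block the word at 0x3fee5000 survives all five iterations; with a 128-byte block it is cleared in the 5th, when r4 = 0x3fee5060.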

Here is the same procedure as a GDB session:

Code: [Select]
(gdb) break *0x203ce8
(gdb) cont
(gdb) display/i $pc
(gdb) break *0x20FD90
(gdb) cont
... we should stop after the 1st dcbz here
(gdb) p/x $r4 (should be 0x3fee50e0)
(gdb) x/8xw 0x3fee5000 (should contain non-zero values)
(gdb) cont (execute 2nd dcbz)
(gdb) cont (execute 3rd dcbz)
(gdb) cont (execute 4th dcbz)
(gdb) p/x $r4 (should be 0x3fee5080)
(gdb) x/8xw 0x3fee5000 (should contain non-zero values)
(gdb) cont (execute 5th dcbz)
(gdb) p/x $r4 (should be 0x3fee5060)
(gdb) x/8xw 0x3fee5000 (will supposedly contain all zeroes)

Could you verify that?

Sorry for the long post. I hope you can follow me...

Offline darthnVader

  • Platinum Member
  • *****
  • Posts: 679
  • New Member
Re: G5 qemu attempts.
« Reply #106 on: October 17, 2018, 09:10:46 AM »
Looks like everything you said is correct:

Code: [Select]
(gdb) break *0x203ce8
Breakpoint 1 at 0x203ce8
(gdb) cont
The program is not being run.
(gdb) target remote localhost:1234
Remote debugging using localhost:1234
0x00000000fff00100 in ?? ()
(gdb) cont
Continuing.

Breakpoint 1, 0x0000000000203ce8 in ?? ()
(gdb)  display/i $pc
1: x/i $pc
=> 0x203ce8: bl      0x20fe70
(gdb) break *0x20FD90
Breakpoint 2 at 0x20fd90
(gdb) cont
Continuing.

Breakpoint 2, 0x000000000020fd90 in ?? ()
1: x/i $pc
=> 0x20fd90: stfd    f0,0(r4)
(gdb) p/x $r4
$1 = 0x3fee50e0
(gdb)  x/8xw 0x3fee5000
0x3fee5000: 0x40000000 0x40000000 0x00000000 0x00000000
0x3fee5010: 0x00000000 0x00000000 0x00000000 0x00000000
(gdb) cont
Continuing.

Breakpoint 2, 0x000000000020fd90 in ?? ()
1: x/i $pc
=> 0x20fd90: stfd    f0,0(r4)
(gdb) cont
Continuing.

Breakpoint 2, 0x000000000020fd90 in ?? ()
1: x/i $pc
=> 0x20fd90: stfd    f0,0(r4)
(gdb) cont
Continuing.

Breakpoint 2, 0x000000000020fd90 in ?? ()
1: x/i $pc
=> 0x20fd90: stfd    f0,0(r4)
(gdb) p/x $r4
$2 = 0x3fee5080
(gdb) x/8xw 0x3fee5000
0x3fee5000: 0x40000000 0x40000000 0x00000000 0x00000000
0x3fee5010: 0x00000000 0x00000000 0x00000000 0x00000000
(gdb) cont
Continuing.

Breakpoint 2, 0x000000000020fd90 in ?? ()
1: x/i $pc
=> 0x20fd90: stfd    f0,0(r4)
(gdb) p/x $r4
$3 = 0x3fee5060
(gdb) x/8xw 0x3fee5000
0x3fee5000: 0x00000000 0x00000000 0x00000000 0x00000000
0x3fee5010: 0x00000000 0x00000000 0x00000000 0x00000000
(gdb)


Offline darthnVader

  • Platinum Member
  • *****
  • Posts: 679
  • New Member
Re: G5 qemu attempts.
« Reply #107 on: October 17, 2018, 09:16:32 AM »
Here is what it looks like for the G4 CPU:

Code: [Select]
(gdb) target remote localhost:1234
Remote debugging using localhost:1234
0x00000000fff00100 in ?? ()
(gdb) cont
Continuing.
Hardware watchpoint 3: *0x3fee5000

Old value = 0
New value = 1073741824
0x000000000020271c in ?? ()
(gdb) cont
Continuing.

Breakpoint 1, 0x00000000002052f0 in ?? ()
(gdb) cont
Continuing.

Breakpoint 2, 0x0000000000205a24 in ?? ()
(gdb) cont
Continuing.
Hardware watchpoint 3: *0x3fee5000

Old value = 1073741824
New value = 1073725440
0x0000000000f105a0 in ?? ()
(gdb) cont
Continuing.

Are we jumping off into nowhere on the G5?

That's why I asked whether we were jumping off into nowhere: the new value should be 1073725440, not zero.

Offline powermax

  • Enthusiast Member
  • ***
  • Posts: 80
  • Hobbyist programmer
Re: G5 qemu attempts.
« Reply #108 on: October 17, 2018, 09:21:04 AM »
Looks like everything you said is correct:

Yes, my expectations have been confirmed. The bug is in the processor-specific bcopy() code and is caused by the incompatibility between the dcbz implementations in 32-bit and 64-bit PowerPC CPUs.

Could you verify the same piece of code on G4, just for the sake of completeness?

Offline powermax

  • Enthusiast Member
  • ***
  • Posts: 80
  • Hobbyist programmer
Re: G5 qemu attempts.
« Reply #109 on: October 17, 2018, 09:28:00 AM »
Code: [Select]
Old value = 1073741824
New value = 1073725440
0x0000000000f105a0 in ?? ()
(gdb) cont
Continuing.

Are we jumping off into nowhere on the G5?

That's why I asked whether we were jumping off into nowhere: the new value should be 1073725440, not zero.

Well, the code should jump to 0xf105a0, not to 1073725440 (0x3fffc000). The latter is the amount of physical memory after exclusion of the ROM-in-RAM portion.

But on the 970, we'll get here but crash due to an exception instead...
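GDB prints the watchpoint values in decimal, which hides what they are. A tiny Python sketch of the conversion (the variable and function names are mine, not from any tool):

```python
OLD_VALUE = 1073741824   # "Old value" reported by the watchpoint
NEW_VALUE = 1073725440   # "New value" reported by the watchpoint


def as_hex(value):
    """Format a value the way the addresses in this thread are written."""
    return "0x%08x" % value


if __name__ == "__main__":
    print(as_hex(OLD_VALUE))              # 0x40000000
    print(as_hex(NEW_VALUE))              # 0x3fffc000
    print(as_hex(OLD_VALUE - NEW_VALUE))  # 0x00004000
```

So the word at 0x3fee5000 goes from 0x40000000 to 0x3fffc000 on the G4, a difference of 0x4000 bytes.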

Offline darthnVader

  • Platinum Member
  • *****
  • Posts: 679
  • New Member
Re: G5 qemu attempts.
« Reply #110 on: October 17, 2018, 09:39:09 AM »
Looks like everything you said is correct:

Yes, my expectations have been confirmed. The bug is in the processor-specific bcopy() code and is caused by the incompatibility between the dcbz implementations in 32-bit and 64-bit PowerPC CPUs.

Could you verify the same piece of code on G4, just for the sake of completeness?

On the G4 everything looks the same, but the last dump isn't all zeros, so you're on the right track:

Code: [Select]
(gdb) target remote localhost:1234
Remote debugging using localhost:1234
0x00000000fff00100 in ?? ()
(gdb) cont
Continuing.

Breakpoint 1, 0x0000000000203ce8 in ?? ()
(gdb) display/i $pc
1: x/i $pc
=> 0x203ce8: bl      0x20fe70
(gdb)  break *0x20FD90
Breakpoint 2 at 0x20fd90
(gdb) cont
Continuing.

Breakpoint 2, 0x000000000020fd90 in ?? ()
1: x/i $pc
=> 0x20fd90: stfd    f0,0(r4)
(gdb)  p/x $r4
$1 = 0x3fee50e0
(gdb) x/8xw 0x3fee5000
0x3fee5000: 0x40000000 0x40000000 0x00000000 0x00000000
0x3fee5010: 0x00000000 0x00000000 0x00000000 0x00000000
(gdb) cont
Continuing.

Breakpoint 2, 0x000000000020fd90 in ?? ()
1: x/i $pc
=> 0x20fd90: stfd    f0,0(r4)
(gdb) cont
Continuing.

Breakpoint 2, 0x000000000020fd90 in ?? ()
1: x/i $pc
=> 0x20fd90: stfd    f0,0(r4)
(gdb) cont
Continuing.

Breakpoint 2, 0x000000000020fd90 in ?? ()
1: x/i $pc
=> 0x20fd90: stfd    f0,0(r4)
(gdb)  p/x $r4
$2 = 0x3fee5080
(gdb) x/8xw 0x3fee5000
0x3fee5000: 0x40000000 0x40000000 0x00000000 0x00000000
0x3fee5010: 0x00000000 0x00000000 0x00000000 0x00000000
(gdb) cont
Continuing.

Breakpoint 2, 0x000000000020fd90 in ?? ()
1: x/i $pc
=> 0x20fd90: stfd    f0,0(r4)
(gdb)  p/x $r4
$3 = 0x3fee5060
(gdb) x/8xw 0x3fee5000
0x3fee5000: 0x40000000 0x40000000 0x00000000 0x00000000
0x3fee5010: 0x00000000 0x00000000 0x00000000 0x00000000
(gdb)


Offline powermax

  • Enthusiast Member
  • ***
  • Posts: 80
  • Hobbyist programmer
Re: G5 qemu attempts.
« Reply #111 on: October 17, 2018, 12:50:23 PM »
On the G4 everything looks the same, but the last dump isn't all zeros, so you're on the right track...

Yes.

Let's summarize what we have found out in this thread:

  • An attempt to boot Mac OS 9.2.1 on a Mac99 machine with the unsupported 970 (G5) CPU in QEMU fails early in the Trampoline. A memory overlap error is reported.
  • We found out that the real cause of the memory overlap error is that the NKSystemInfo structure suddenly becomes filled with zeros, so the memory relocation cannot operate correctly.
  • After a long debugging session involving a lot of low-level details, the failing code was located and analyzed. It is part of the bcopy() function, whose purpose is to quickly move data from one memory location to another.

    bcopy is written in assembly. It uses the low-level dcbz instruction, which clears a cache block to zero. Unfortunately, dcbz works differently on 32-bit and 64-bit implementations: because the cache block is larger on PPC64, some important data in memory gets corrupted.

    It's possible to fix this problem by replacing the Trampoline's legacy bcopy with Darwin's implementation, which supports the latest CPUs, see https://opensource.apple.com/source/Libc/Libc-262/ppc/gen/bcopy.s.auto.html

For the purpose of bug localization, a debugging toolchain based on QEMU+GDB has been successfully set up. Kudos to darthnVader!

This toolchain has proven indispensable for spotting problems with low-level OS code when running on unsupported hardware. Although the Trampoline offers a simple debugging mode that displays helpful messages in the OpenFirmware console, that is often not enough to understand what exactly goes wrong without invoking a low-level debugger.

Unfortunately, such flexible debugging instrumentation normally requires a running OS and, consequently, isn't available during OS boot on real PPC hardware. Fortunately, the above-mentioned toolchain makes it easy to overcome this limitation. QEMU+GDB will therefore remain our only debugging option until we get the lowest OS level up and running.

Offline darthnVader

  • Platinum Member
  • *****
  • Posts: 679
  • New Member
Re: G5 qemu attempts.
« Reply #112 on: October 17, 2018, 03:37:57 PM »


    It's possible to fix this problem by replacing legacy Trampoline's bcopy with Darwin's implementation that supports latest CPUs, see https://opensource.apple.com/source/Libc/Libc-262/ppc/gen/bcopy.s.auto.html



    Interesting find, do you think this version of bcopy is used in bootx?

    Somewhere I have the code for BootX I used to build a version that sends some debug info to the screen, that helped me with finding a problem with KVM on a Powerbook booting OS X in Qemu. I'll have to see if I can find bcopy in the BootX sources.

    Offline powermax

    • Enthusiast Member
    • ***
    • Posts: 80
    • Hobbyist programmer
    Re: G5 qemu attempts.
    « Reply #113 on: October 17, 2018, 03:46:20 PM »
    Interesting find, do you think this version of bcopy is used in bootx?

    Somewhere I have the code for BootX I used to build a version that sends some debug info to the screen, that helped me with finding a problem with KVM on a Powerbook booting OS X in Qemu. I'll have to see if I can find bcopy in the BootX sources.

    To my knowledge, BootX uses a CPU-independent implementation of memcpy written in C. bcopy() is just a wrapper for memcpy that changes the order of parameters (see bootx/libclite.subproj/mem.c):

    Code: [Select]
    void bcopy(const void *src, void *dst, size_t len)
    {
        memcpy(dst, src, len);
    }

    Offline Daniel

    • Gold Member
    • *****
    • Posts: 300
    • Programmer, Hacker, Thinker
    Re: G5 qemu attempts.
    « Reply #114 on: October 17, 2018, 04:48:32 PM »
    Until Open Firmware is quiesced, the "enter" client interface call can be used to drop to the command prompt. You can then inspect registers and do other stuff from OF. Maybe you could tack a little code onto the end of the Trampoline so that you can patch code to call it.

    I don't know if the code section is read-only by default, but you could undo that protection with a simple mmu call.

    Offline ELN

    • Gold Member
    • *****
    • Posts: 295
    • new to the forums
    Re: G5 qemu attempts.
    « Reply #115 on: October 17, 2018, 04:50:02 PM »
    Time for a binary patch maybe? :D

    Offline powermax

    • Enthusiast Member
    • ***
    • Posts: 80
    • Hobbyist programmer
    Re: G5 qemu attempts.
    « Reply #116 on: October 18, 2018, 03:37:01 AM »
    Time for a binary patch maybe? :D

    bcopy() definitely must be patched in order to run on a G5. The question is: what code should replace it?

    Currently, we have two options:

    • Asm bcopy that supports the latest G4/G5 CPUs.
      Pros: it's blazing fast because it uses various low-level tricks and optimizations
      Cons: processor-dependent code, requires an assembler, several corner cases, hard to debug, needs intensive testing
    • Generic C implementation consisting of a simple loop, à la the memcpy used in BootX
      Pros: easy to write and test, processor-independent, no machine code required
      Cons: probably not very fast, but it's hard to tell without speed measurements

    Each optimization must be well balanced. The Trampoline calls bcopy() very often for small memory chunks (< 256 bytes). In this case, I doubt that users will be able to notice any speed difference between the generic and highly optimized versions.

    When copying large blocks (4MB, 290KB etc.) there can be a noticeable difference, but the question is how much? Is it worth spending hours and hours writing and testing highly optimized machine code in order to gain a tiny speed improvement?
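To give an idea of how little logic the generic option needs, here is the byte loop sketched in Python. This is an illustration only, not the BootX or Trampoline code, and the real thing would be written in C. The bytearray stands in for memory and the (src, dst, length) argument order follows bcopy. The one subtlety is choosing the copy direction so that overlapping regions stay intact.

```python
def bcopy(mem, src, dst, length):
    """Copy `length` bytes inside bytearray `mem` from offset src to dst, overlap-safe."""
    if dst == src or length <= 0:
        return
    if dst < src:
        # Destination lies below the source: copy forwards.
        for i in range(length):
            mem[dst + i] = mem[src + i]
    else:
        # Destination lies above the source: copy backwards so that
        # bytes not yet read are never overwritten.
        for i in range(length - 1, -1, -1):
            mem[dst + i] = mem[src + i]


if __name__ == "__main__":
    mem = bytearray(range(32))
    bcopy(mem, 0, 8, 16)   # overlapping move upwards by 8 bytes
    print(mem[8:24].hex())
```

No cache instructions are involved, so the cache block size of the CPU does not matter.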

    Offline darthnVader

    • Platinum Member
    • *****
    • Posts: 679
    • New Member
    Re: G5 qemu attempts.
    « Reply #117 on: October 18, 2018, 04:38:09 AM »
    Time for a binary patch maybe? :D

    bcopy() definitely must be patched in order to run on a G5. The question is: what code should replace it?

    Currently, we have two options:

    • Asm bcopy that supports the latest G4/G5 CPUs.
      Pros: it's blazing fast because it uses various low-level tricks and optimizations
      Cons: processor-dependent code, requires an assembler, several corner cases, hard to debug, needs intensive testing
    • Generic C implementation consisting of a simple loop, à la the memcpy used in BootX
      Pros: easy to write and test, processor-independent, no machine code required
      Cons: probably not very fast, but it's hard to tell without speed measurements

    Each optimization must be well balanced. The Trampoline calls bcopy() very often for small memory chunks (< 256 bytes). In this case, I doubt that users will be able to notice any speed difference between the generic and highly optimized versions.

    When copying large blocks (4MB, 290KB etc.) there can be a noticeable difference, but the question is how much? Is it worth spending hours and hours writing and testing highly optimized machine code in order to gain a tiny speed improvement?

    I don't really think speed is that much of an issue; this code will likely only ever need to run on a G5 or an emulated G5. Most people have fast x86 CPUs, as they are fairly cheap, and the slowest G5 runs at 1.6 GHz.

    There is the edge case that someone will run the code on 64-bit ARM, emulating a G5, so option 1 would be a little better there. I just highly doubt running QEMU on a phone or tablet to emulate a desktop OS is ever going to be that useful.

    My vote is 2.

    Offline darthnVader

    • Platinum Member
    • *****
    • Posts: 679
    • New Member
    Re: G5 qemu attempts.
    « Reply #118 on: October 18, 2018, 05:05:08 AM »
    On a side note, I wonder how hard it would be to port GDB-Server to run in Open Firmware.

    Likely not easy, though theoretically possible.

    It may be necessary if we ever have a go at getting the Trampoline to run on 970fx+ Power Macs.

    Also, there is a PowerPC notebook project. Last I checked, I think they were planning to use Coreboot as firmware, and Coreboot supports OpenBIOS as a payload. So if the project ever comes to fruition I'll try to get my hands on one, and maybe we can see if we can get the Trampoline to run on that too.

    Offline powermax

    • Enthusiast Member
    • ***
    • Posts: 80
    • Hobbyist programmer
    Re: G5 qemu attempts.
    « Reply #119 on: October 18, 2018, 05:12:38 AM »
    On a side note, I wonder how hard it would be to port GDB-Server to run in Open Firmware.

    I'm afraid it would have to be rewritten from scratch. OF is Forth; GDB is C and requires an OS and C libraries to work...

    Also, there is a PowerPC notebook project [...]

    Never heard anything about that. Any description?