Mac OS 9 Lives! (Classic Mac OS Forum)

Classic Mac OS Software (Discussions on Applications) => Application Development & Programming in the Classic Mac OS => Topic started by: powermax on August 24, 2017, 08:57:14 AM

Title: Decompiling Of PPC ASM
Post by: powermax on August 24, 2017, 08:57:14 AM
I'm starting to think it would be worth writing a decompiler.

That's probably off-topic but, nevertheless, I'd like to comment on that.

Building a decompiler is an enormously hard task. If you wonder why, please consider reading https://www.hex-rays.com/products/ida/support/ppt/decompilers_and_beyond_white_paper.pdf by the author of Hex-Rays, the only working x86 decompiler today.

There are several other projects aiming for the same goal (RecStudio, Hopper Disassembler), but they are mostly non-retargetable (they support only x86), inflexible, non-extensible and lack interactivity.

The bad news is that none of the existing decompilers can process PowerPC binaries.

I therefore wrote my own basic decompiler in Python that keeps manual work to a minimum. It is capable of performing data-flow and control-flow analysis, identifying common compiler idioms (function prologs/epilogs, switch statements, div instructions etc.) and generating basic C-like pseudocode consisting of reconstructed expressions inside basic blocks.

It helps to collapse three PPC instructions into one statement on average - so it's much more than just a disassembler. It's a great time saver! You still have to use your brain and hands though. Anyway, it's a good start.
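
To make the "three instructions into one statement" idea concrete, here is a minimal sketch of forward substitution inside a single basic block. It is not the actual decompiler code: the tuple-based instruction format and the names `collapse_block` and `fmt` are assumptions made up for this illustration, and only `lis`, `ori` and `stw` are handled.

```python
def fmt(value):
    """Render a known constant as hex, leave symbolic expressions alone."""
    return f"{value:#x}" if isinstance(value, int) else value


def collapse_block(instructions):
    """Forward-substitute register values within one basic block.

    `instructions` is a hypothetical list of tuples such as
    ("lis", "r3", 0x1234). Returns a list of C-like statements.
    """
    regs = {}        # register name -> int constant or expression string
    statements = []

    def val(reg):
        # An unknown register stays symbolic under its own name.
        return regs.get(reg, reg)

    for op, *args in instructions:
        if op == "lis":                      # rD = imm << 16
            rd, imm = args
            regs[rd] = (imm << 16) & 0xFFFFFFFF
        elif op == "ori":                    # rA = rS | imm
            ra, rs, imm = args
            src = val(rs)
            if isinstance(src, int):
                regs[ra] = src | imm
            else:
                regs[ra] = f"({src} | {imm:#x})"
        elif op == "stw":                    # *(rA + d) = rS
            rs, disp, ra = args
            statements.append(
                f"*(uint32_t *)({fmt(val(ra))} + {disp}) = {fmt(val(rs))};"
            )
        else:
            raise ValueError(f"unhandled instruction: {op}")
    return statements


block = [
    ("lis", "r3", 0x1234),
    ("ori", "r3", "r3", 0x5678),
    ("stw", "r3", 8, "r1"),
]
print(collapse_block(block))
```

Three instructions turn into the single statement `*(uint32_t *)(r1 + 8) = 0x12345678;`. A real tool additionally has to track which registers are still live at the end of the block, which this sketch ignores.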

Moreover, it's possible to attach Python hooks to the processing, so several heuristics can be implemented really easily. For example, the Trampoline calls Open Firmware (OF) with a variable number of arguments. Fully automatic detection of such stuff in the decompiler is almost impossible but, fortunately, you can write a function that will be called back for each "call" instruction and determine the number of inputs (2nd parameter) and outputs (3rd parameter) dynamically based on the service (1st parameter).

I'm working on cleaning up this Python beast for you to play with.
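
A rough sketch of what such a call hook could look like follows. Everything here is hypothetical: the `HookRegistry` class, the dictionary describing a call site, and the `call_of` target name are invented for illustration and are not the API of the tool described above.

```python
class HookRegistry:
    """Hypothetical registry of callbacks run for every "call" instruction."""

    def __init__(self):
        self._call_hooks = []

    def on_call(self, func):
        # Usable as a decorator; returns the function unchanged.
        self._call_hooks.append(func)
        return func

    def resolve_signature(self, call):
        # The first hook that recognizes the call site wins.
        for hook in self._call_hooks:
            result = hook(call)
            if result is not None:
                return result
        return None


hooks = HookRegistry()


@hooks.on_call
def of_service_hook(call):
    """Read the argument counts straight from the call's own parameters."""
    if call["target"] != "call_of":      # assumed name of the OF entry stub
        return None
    if len(call["args"]) < 3:
        return None
    service, n_in, n_out = call["args"][:3]
    # Only trust the counts if data-flow analysis resolved them to constants.
    if not (isinstance(n_in, int) and isinstance(n_out, int)):
        return None
    return {"service": service, "inputs": n_in, "outputs": n_out}


site = {"target": "call_of", "args": ["getprop", 4, 1]}
print(hooks.resolve_signature(site))
```

Returning `None` for anything the hook does not recognize lets the decompiler fall back to its default handling for ordinary calls.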
Title: Re: Decompiling Of PPC ASM
Post by: nanopico on August 24, 2017, 09:29:53 AM
Off topic before, but now it's on topic.

Difficult, yes, very much so. I should probably keep my mouth shut more often. I'm potentially taking on too much, and too much enormous stuff at that.

And I think it sounds like your Python work is a good start towards that. I don't think a full-fledged decompiler is what I was thinking of. It sounds more like we could just add some more stuff to what you have.

On the Mac there is also the whole Pascal versus C issue. Anything that decompiles to either language would have to handle the calling conventions of both, including where they cross over in the same binary. My take now is:
Decompiler = super awesome and would be super cool. Doable, but in no way realistic, and it would take longer than the manual method currently being used. For this specific work the ROI would not be good in any way, shape, or form.

I don't think any decompiler would be portable. And I don't think they should be. I feel too much architecture-specific stuff would be missed.

Thanks for pointing that all out. Despite knowing this, I sometimes need to be reminded of these things.

That white paper is interesting. Will definitely be giving it more than a quick glance this evening.
Title: Re: Decompiling Of PPC ASM
Post by: Naiw on August 29, 2017, 04:54:58 PM


Quote from: powermax on August 24, 2017, 08:57:14 AM
There are several other projects aiming for the same goal (RecStudio, Hopper Disassembler), but they are mostly non-retargetable (do support only x86), inflexible, non-extensible and lack interactivity.
Bad thing, none of the existing decompilers can process PowerPC binaries.

Hopper's decompiler works just fine, and no, it's certainly not x86 only. I've used it on ARM, and I think I tested it with some PowerPC code some months ago as well. However, it does not support the PEF file format.
There is an SDK for Hopper somewhere. I can only find the Linux SDK on the site at the moment, unless it has been taken down or the site renders selectively based on the browser's OS.


Arm disassembly
(https://preview.ibb.co/nQAbe5/arm.png) (https://ibb.co/gNVL6k)

Arm decompilation
(https://preview.ibb.co/jagtRk/armdecomp.png) (https://ibb.co/b792K5)

---

Edit: Just tested. PowerPC decompilation is not supported, only disassembly.
Title: Re: Decompiling Of PPC ASM
Post by: nanopico on August 30, 2017, 05:43:03 AM
There is a pretty good disassembler for OS 9 that I have used before, and I have attached it here.
It also handles some 68K emulator conversion stuff. It runs in OS 9, so I like that part.
Title: Re: Decompiling Of PPC ASM
Post by: Daniel on January 26, 2018, 06:21:44 PM
You should be aware that PPCDisassemble2.0 messes up some of the instructions that are used in the Trampoline. The Trampoline is guaranteed to crash if you use that disassembler's output to build it. There will be about 20 (I don't recall the exact number) errors in the code.
Title: Re: Decompiling Of PPC ASM
Post by: nanopico on January 26, 2018, 06:32:07 PM
Do you have a specific spot? Just curious, so I can keep an eye out for it when I work on stuff, as I do use that disassembler.
I know other disassembly tools cause issues too, and it mostly seems to depend on which PPC ISA you use. There are a few revisions/variants out there. The Trampoline appears to do some sort of instruction emulation for some 604 PPC instructions. I haven't found it, but looking at the strings in the Trampoline, there are messages relating to it.
Title: Re: Decompiling Of PPC ASM
Post by: Daniel on January 26, 2018, 06:43:41 PM
A certain instruction (don't remember which) always had the source and destination registers swapped. It also messed up some of the condition register manipulation instructions.
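
For anyone wondering how a disassembler ends up swapping registers: I still don't know which instruction it was, but as a general illustration, PowerPC does not keep the destination in the same bit field for every instruction. `addi rD,rA,SIMM` has the destination in bits 6-10, while `ori rA,rS,UIMM` has the source there and the destination in bits 11-15. A minimal sketch of a decoder for just these two opcodes (the function name `decode_d_form` is made up for this example) shows where the operand order has to flip:

```python
def decode_d_form(word):
    """Decode a 32-bit PowerPC word for two D-form opcodes only."""
    opcode = word >> 26
    field_6_10 = (word >> 21) & 0x1F    # rD for addi, rS for ori
    field_11_15 = (word >> 16) & 0x1F   # rA for both
    imm = word & 0xFFFF

    if opcode == 14:                    # addi rD,rA,SIMM
        simm = imm - 0x10000 if imm & 0x8000 else imm
        if field_11_15 == 0:            # rA = 0 means literal zero: li rD,SIMM
            return f"li r{field_6_10},{simm}"
        return f"addi r{field_6_10},r{field_11_15},{simm}"
    if opcode == 24:                    # ori rA,rS,UIMM: destination printed first
        return f"ori r{field_11_15},r{field_6_10},{imm:#x}"
    # Anything else is emitted as raw data, like the dc.l statements.
    return f"dc.l {word:#010x}"


print(decode_d_form(0x38640001))
print(decode_d_form(0x60830001))
```

A decoder that printed both opcodes with the bits 6-10 field first would get `ori` backwards while `addi` still looked right.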

Here's how I figured out there was a problem:
I had a working XCOFF version of the Trampoline, but all the code was dc.l statements. I had disassembled the code with PPCDisassemble, but that version kept crashing. I ended up using dumpXCOFF (an MPW tool) on both files. I then used Compare (another MPW tool) to find all the places where the instructions were different. There were 2 or 3 instructions that were consistently broken. I copied the good values over to the code file. That time it didn't crash.

I hope you don't have those kinds of problems, but the method described above is probably your best bet if they do occur.
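
If you don't have MPW handy, the same comparison can be approximated outside of it. The sketch below is a stand-in for the dumpXCOFF plus Compare workflow, not a reproduction of those tools: it assumes you have already extracted the raw code bytes from both builds, and the name `diff_words` is just for this example.

```python
import struct


def diff_words(good, suspect, base=0):
    """Compare two code images as big-endian 32-bit words.

    Returns a list of (address, good_word, suspect_word) for every mismatch.
    """
    if len(good) != len(suspect):
        raise ValueError("images differ in length; align them first")
    if len(good) % 4:
        raise ValueError("image length is not a multiple of 4 bytes")

    count = len(good) // 4
    layout = f">{count}I"               # PowerPC code is big-endian
    good_words = struct.unpack(layout, good)
    suspect_words = struct.unpack(layout, suspect)

    return [
        (base + index * 4, a, b)
        for index, (a, b) in enumerate(zip(good_words, suspect_words))
        if a != b
    ]


good_image = bytes.fromhex("6083000138640001")
bad_image = bytes.fromhex("6064000138640001")
for address, a, b in diff_words(good_image, bad_image):
    print(f"{address:#06x}: good {a:08x}  rebuilt {b:08x}")
```

Each reported word can then be checked by hand against the disassembly and the good value copied back, as described above.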
Title: Re: Decompiling Of PPC ASM
Post by: nanopico on January 26, 2018, 06:55:43 PM
Cools.  Will definitely verify this.  Thanks.  I hadn't tried recompiling yet.  I was just trying to break things apart to map it out.  Sounds like it wouldn't have been an issue until I try to build it. 
Title: Re: Decompiling Of PPC ASM
Post by: franklin_m on March 21, 2019, 12:55:27 PM
This is free and works on PPC PEF files!!

https://www.ghidra-sre.org/

IDA / Hex-Rays also works, but it costs like $2600.
Title: Re: Decompiling Of PPC ASM
Post by: mePy2 on April 23, 2019, 04:03:11 PM
Hi guys,

What's going on with this topic?
I would like to dive into a Mac OS program, a PEF file. I first tried debugging it with gdb and LaunchCFMApp. It works, but I do not know how to use gdb, so it's useless at the moment.

I would like to disassemble and decompile this PPC executable. What can you suggest?
Title: Re: Decompiling Of PPC ASM
Post by: Daniel on April 23, 2019, 05:15:24 PM
There are a few options, but none of them are all that good.

That's all the options I can think of. My tests were a while ago, so I don't know where to find most of this stuff.
Title: Re: Decompiling Of PPC ASM
Post by: OS923 on April 24, 2019, 03:49:41 AM
MacNosy has a weird interface, but it works for 68K and PPC code resources. It can't see PPC code in a data fork. (As a workaround, copy the data fork into a resource.) If you want to do better than MacNosy, you have a lot of work ahead of you.
Title: Re: Decompiling Of PPC ASM
Post by: IIO on April 24, 2019, 12:46:29 PM
Quote from: franklin_m on March 21, 2019, 12:55:27 PM
This is free and works on PPC PEF files!!
https://www.ghidra-sre.org/
IDA / Hex-Rays also works, but it costs like $2600.

Hopper Disassembler was once available for PPC, but I have no idea what kind of stuff from Classic it can work with.
Title: Re: Decompiling Of PPC ASM
Post by: mePy2 on April 26, 2019, 12:20:16 AM
Thank you guys, thank you Daniel.
I'll try the programs you suggested.

Best
Title: Re: Decompiling Of PPC ASM
Post by: mePy2 on May 08, 2019, 01:05:24 AM
Hi guys, I would like to learn more about the subject. Would you like to get together somewhere (a GitHub/GitLab repo) to talk and work on some projects together?
I would really, really like that.
Title: Re: Decompiling Of PPC ASM
Post by: Daniel on May 08, 2019, 04:44:06 PM
Most of the communication is done on either these forums or the cdg5 mailing list: https://lists.ucc.gu.uwa.edu.au/mailman/listinfo/cdg5

I suppose if you wanted to talk about disassembly specifically, you could open up an issue on my disassembler repo just for chat. It's not really conventional, but it can be done: https://github.com/DBJ314/dePEF-and-disasm

What kinds of projects are you thinking of? Reverse-engineering core Mac systems? Developing useful hacking tools? Creating Mac applications? There are so many interesting things that can be done...
Title: Re: Decompiling Of PPC ASM
Post by: mePy2 on May 12, 2019, 03:25:40 AM
Hi,

Thank you, I'm going to subscribe to the mailing list and open an issue in your repo.
My idea is just to gain some practical experience with Mac OS 9 and PowerPC asm. I would start by writing a simple C/C++ and assembly program for it.