
Author Topic: Now we have 13x faster MD5 checksums  (Read 2204 times)

Jubadub

  • 256 MB
  • *****
  • Posts: 431
  • New Member
Now we have 13x faster MD5 checksums
« on: July 03, 2024, 11:45:47 AM »

Someone incredibly awesome just made and released this for Mac OS. It calculates MD5 checksums, which is extremely useful for checking file integrity, especially after a download:
http://macintoshgarden.org/apps/md5classic

The only other program we could use until now, AFAIK, was Checksum 1.3: http://macintoshgarden.org/apps/checksum-13

The main problem with Checksum 1.3 is that it is a pure 68k app, whereas md5classic 1.0 is both PowerPC native and 68k native. It runs on every commercially released version of Mac OS, too (!), from System 1 to Mac OS 9.2.2.

This stuff is useful all the time: sometimes downloads break, and we ought to check that they went fine. For big files, using Checksum 1.3 is a HUGE pain, since it's almost 5x slower, but now it is no longer as big a deal. In fact, even for small files, Checksum 1.3 costs time and patience, because an annoying confirmation box appears every single time you close it. With md5classic, you can simply press Cmd+Q to quit without any extra step.
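
For anyone who wants to double-check a download on a modern machine before moving it over to the Mac, the same kind of integrity check can be sketched in a few lines of Python. This is only an illustration, not how md5classic works internally: the names md5_of_stream and matches_published_md5 and the chunk size are made up here, and only the standard hashlib module is assumed.

```python
import hashlib
import io


def md5_of_stream(stream, chunk_size=64 * 1024):
    """Return the hex MD5 digest of a binary stream, read in chunks."""
    digest = hashlib.md5()
    # Reading in chunks keeps memory use flat even for huge disk images.
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def matches_published_md5(stream, published_md5):
    """Compare a stream's MD5 with a checksum published next to a download."""
    return md5_of_stream(stream) == published_md5.strip().lower()


# Tiny in-memory stand-in for a downloaded file.
demo = io.BytesIO(b"abc")
print(md5_of_stream(demo))
```

For a real file, pass open(path, "rb") instead of the BytesIO object and compare the result with the checksum listed on the download page.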

Source code is available there, too, for those interested. It seems to be pure C, and includes a CodeWarrior Pro 6 project file. That is the last version of CW that can compile 68k apps out-of-the-box.
« Last Edit: July 13, 2024, 08:32:59 AM by IIO »
Logged

ssp3

  • 512 MB
  • *****
  • Posts: 875
Re: Now we have 5x faster MD5 checksums
« Reply #1 on: July 03, 2024, 08:53:54 PM »

<like>  :)
Logged
If you're not part of the solution, you're part of the problem.

Jubadub

  • 256 MB
  • *****
  • Posts: 431
  • New Member
Re: Now we have 5x faster MD5 checksums
« Reply #2 on: July 05, 2024, 12:53:47 PM »

The MD5 implementation got improved and now the title of this thread is wrong:

It's not 5x anymore. Now it is 6.5x faster! :)
Logged

Greystash

  • 128 MB
  • ****
  • Posts: 244
  • Tinkerer
    • Mac-Classic.com
Re: Now we have 6.5x faster MD5 checksums
« Reply #3 on: July 05, 2024, 03:11:16 PM »

Very handy thanks for sharing!
Logged

Jubadub

  • 256 MB
  • *****
  • Posts: 431
  • New Member
Re: Now we have 5x faster MD5 checksums
« Reply #4 on: July 12, 2024, 10:01:39 AM »

Quote
It's not 5x anymore. Now it is 6.5x faster! :)

Well, the tool further evolved in such a hardcore manner, that it got rewritten in pure, carefully-curated, highly-optimized 68k assembly, then the same for PPC assembly, resulting in one of the wildest FAT applications EVER as far as sheer expertise is concerned.

When the program started, it was almost 5x faster than the only solution we had on PPC.
Then it evolved, and became 6.5x faster.
THEN it evolved AGAIN, and it became TEN TIMES faster.

One would think that's the limit. But nope. It got further and further squeezed... And now it is almost 13x faster.

13 TIMES

That's how much faster we can verify the integrity of transferred data, such as Macintosh Garden downloads, in Mac OS 9.2.2 and earlier on PPC Macs now. What used to take 13 minutes to check before (say, perhaps something like a DVD image download) now takes ONLY ONE MINUTE. Just stupid fast. Sheer madness. I love it.

The hand-crafted PPC assembly is yet to appear in a future version, by the way, but it seems it will only be ever-so-slightly faster than the current version because, as it turns out, CodeWarrior Pro 6.3 actually compiles VERY efficient PowerPC code. Its PowerPC compiler seems to be far better than its 68k C compiler (and, no, THINK C was even worse at 68k compilation than CWPro 6, so 68k users can really rejoice in the beautiful hand-written 68k assembly this brilliant software contains).

Theoretically speaking, there is even more room for improvement: the existing MD5 code could be rewritten to take advantage of AltiVec in G4s (and G5s). Maybe there's potential to leverage GPUs as well, in particular for pre-G4 PPC Macs. And, finally, MD5 is defined in terms of little-endian 32-bit words, so implementations on big-endian CPUs like 68k and PPC have to byte-swap the input (all OSes seem to just byte-swap for big endian, we checked). Perhaps a big-endian formulation without any byte swapping or similar could be invented/discovered. Some hashing algorithms are defined in big endian, though, like SHA-1 and SHA-2, but not MD5 (yet).
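
To make the byte-swapping point concrete, here is a small Python sketch (not taken from md5classic) of how a 64-byte MD5 input block turns into sixteen 32-bit words. MD5 wants those words little-endian, so a big-endian CPU that loads them natively gets the bytes reversed and has to swap each word. The helper names are made up for illustration.

```python
import struct


def swap32(x):
    """Reverse the byte order of a 32-bit unsigned integer."""
    return (((x & 0x000000FF) << 24) | ((x & 0x0000FF00) << 8) |
            ((x & 0x00FF0000) >> 8) | ((x & 0xFF000000) >> 24))


def words_little_endian(block):
    """What MD5 expects: sixteen little-endian 32-bit words."""
    return list(struct.unpack("<16I", block))


def words_big_endian_native(block):
    """What a plain 32-bit load yields on a big-endian CPU (68k, PPC)."""
    return list(struct.unpack(">16I", block))


block = bytes(range(64))  # dummy 64-byte block: 00 01 02 ... 3F
native = words_big_endian_native(block)
swapped = [swap32(w) for w in native]

print(hex(native[0]))   # bytes 00 01 02 03 read big-endian
print(hex(swapped[0]))  # the same bytes in the order MD5 wants
print(swapped == words_little_endian(block))
```

That is sixteen swaps per 64-byte block, which is the overhead a swap-free formulation would avoid, assuming one can be found.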

All-in-all, the Garden thread is very educational in all sorts of ways, so that's a recommended read, as well.

New versions are planned, but they are mostly quality-of-life features, rather than further speed optimizations.

Note that, as of now, the source code repository hasn't been updated in a while, but I expect that to change with the release of the next version, hopefully.
Logged

IIO

  • Staff Member
  • 4096 MB
  • *******
  • Posts: 4596
  • just a number
Re: Now we have 5x faster MD5 checksums
« Reply #5 on: July 13, 2024, 08:35:49 AM »

Quote
that it got rewritten in pure, carefully-curated, highly-optimized 68k assembly, then the same for PPC assembly, resulting in one of the wildest FAT applications EVER as far as sheer expertise is concerned.

sounds a bit like cliff's ("OS923") attitude, maybe he is related to him?

Logged
insert arbitrary signature here

Jubadub

  • 256 MB
  • *****
  • Posts: 431
  • New Member
Re: Now we have 5x faster MD5 checksums
« Reply #6 on: July 13, 2024, 11:31:26 AM »

Quote
that it got rewritten in pure, carefully-curated, highly-optimized 68k assembly, then the same for PPC assembly, resulting in one of the wildest FAT applications EVER as far as sheer expertise is concerned.

sounds a bit like cliff's ("OS923") attitude, maybe he is related to him?

I am convinced those are 2 completely unrelated people. I don't find them similar, but it's true both are experts in their respective "things".

Also, the speed got slightly bumped up again, but that will probably be the last time: some minor PPC assembly hand-made tweaks pushed up the speed enough that now it is 13.5x faster. (Checksum 1.3: 147.0 seconds, md5classic 1.0b5: 10.9 seconds)
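
The 13.5x figure follows directly from the two timings quoted above. A quick sketch of the arithmetic in Python (the variable names are just for illustration, and the 13-minute case is the hypothetical DVD-image example from earlier in the thread):

```python
def speedup(old_seconds, new_seconds):
    """How many times faster the new run is compared to the old one."""
    return old_seconds / new_seconds


checksum_13 = 147.0      # seconds, Checksum 1.3
md5classic_b5 = 10.9     # seconds, md5classic 1.0b5

factor = speedup(checksum_13, md5classic_b5)
print(f"{factor:.1f}x faster")

# Scale the earlier "13 minutes" example by the same factor.
print(f"13 minutes would shrink to about {13 * 60 / factor:.0f} seconds")
```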

Since the speed bump this time was smaller, version 1.0b5 is only available on GitHub in the form of source code right now. But it's easy to compile, which I did: unpack the SIT-ed .mcp and .rsrc files, open the .mcp file in CodeWarrior Pro 6, select "PPC" or "FAT" from the target dropdown box, click the "Make" button, and voilà. Just click-click-click and done; anyone can do it without needing to know anything about programming or CodeWarrior.

Pushing the speed even further will require AltiVec and/or GPUs, or something entirely different like that. Figuring out a big-endian formulation of MD5 could potentially also speed something up. Other than that, I think this is it: the ultimate MD5 hashing implementation and program on PPC and 68k.

Now "sidd" (the developer) will give HFS+ support a go... This would enable us to break the 2 GB barrier for file checksumming directly from Mac OS for the first time since 1984, 40 years later. I hope he can make it happen.
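
For reference, a "2 GB barrier" is what you get when file sizes and offsets are stored as signed 32-bit integers, which is assumed here to be the limit in question. A quick Python check of the numbers (the image sizes are made-up examples):

```python
MAX_SIGNED_32 = 2**31 - 1  # largest value a signed 32-bit integer can hold


def fits_in_signed_32(size_in_bytes):
    """True if a file of this size can be addressed with signed 32-bit offsets."""
    return 0 <= size_in_bytes <= MAX_SIGNED_32


cd_image = 650 * 1024 * 1024   # hypothetical 650 MB CD image
dvd_image = 4_700_000_000      # hypothetical 4.7 GB DVD image

print(MAX_SIGNED_32)
print(fits_in_signed_32(cd_image), fits_in_signed_32(dvd_image))
```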
Logged

V.Yakob

  • 64 MB
  • ****
  • Posts: 96
  • Mac User
Re: Now we have 13x faster MD5 checksums
« Reply #7 on: July 17, 2024, 11:16:54 AM »

Just like you, I'm trying to build the application, but I'm failing.  :o

Am I missing some dependencies or settings? Do you know how to fix it?
Logged
PPC — PM 8100/80, PM 9600/300, PM G3 Minitower (Rev. C), PM G3 B&W (Rev. B), PM G4 Quicksilver (2002), PM G4 MDD (2003), PM G5 (Late 2005).
Intel — Mac mini (mid 2010), iMac 5k (2017), Mac mini (2018).
AppleSilicon — Mac mini (2020), Mac Studio M2 Max + Apple Studio Display.

ssp3

  • 512 MB
  • *****
  • Posts: 875
Re: Now we have 13x faster MD5 checksums
« Reply #8 on: July 17, 2024, 11:29:02 AM »

How about uploading an already-compiled application? ;)

(Or has the console jockeys' "everyone is on his own" nonsense infected the Mac user scene?)
Logged

V.Yakob

  • 64 MB
  • ****
  • Posts: 96
  • Mac User
Re: Now we have 13x faster MD5 checksums
« Reply #9 on: July 17, 2024, 11:58:23 AM »

So interesting!

I have never built applications for OS9 before, and I have never seen such projects.

But, I always step on the rake... ;D
Logged

Jubadub

  • 256 MB
  • *****
  • Posts: 431
  • New Member
Re: Now we have 13x faster MD5 checksums
« Reply #10 on: July 18, 2024, 07:46:49 AM »

Quote
How about uploading an already-compiled application? ;)

(Or has the console jockeys' "everyone is on his own" nonsense infected the Mac user scene?)

You're right, I even planned to earlier, but completely forgot to.

In any case, here it is, attached to this message. :)
Logged

Jubadub

  • 256 MB
  • *****
  • Posts: 431
  • New Member
Re: Now we have 13x faster MD5 checksums
« Reply #11 on: July 18, 2024, 07:53:02 AM »

Quote
Am I missing some dependencies or settings? Do you know how to fix it?
Hmm, it should have worked fine without having to tweak any settings. Judging by the error message there, it seems it cannot find the entry point for the program (the "main" function). I have had this error happen on personal projects when I poked around the settings so much that it could not find the entry point anymore, but this project is all pre-configured properly.

Did you use CW Pro 6.3, with all the updates? You can get it all from the Garden.
Logged

ssp3

  • 512 MB
  • *****
  • Posts: 875
Re: Now we have 13x faster MD5 checksums
« Reply #12 on: July 18, 2024, 08:38:12 AM »

Quote
In any case, here it is, attached to this message. :)

Thank you!  :)
Logged

V.Yakob

  • 64 MB
  • ****
  • Posts: 96
  • Mac User
Re: Now we have 13x faster MD5 checksums
« Reply #13 on: July 18, 2024, 09:23:41 AM »

Quote
Did you use CW Pro 6.3, with all the updates? You can get it all from the Garden.
Yes, I installed CW 6.0 (Tools and Ref), rebooted the Mac, installed the 6.2 update, then replaced MW C/C++PPC. I also installed MRJ 2.2.2 from the CW distribution.

First I tried it in a UTM virtual machine on Mac OS 9.2.1, then on the MDD with Mac OS 9.2.2. The same error every time.

And I found the "MSL RuntimePPC.lib" library, it is in the CW 6 directory. It's easy to find via Sherlock.
Logged

Jubadub

  • 256 MB
  • *****
  • Posts: 431
  • New Member
Re: Now we have 13x faster MD5 checksums
« Reply #14 on: July 18, 2024, 10:00:33 AM »

@V.Yakob When you downloaded the source code from GitHub, did you grab the full ZIP, or did you download the files separately?

In my case, I got the whole ZIP and used StuffIt Expander 7.0.3 to extract everything into a single folder. Then I extracted both md5classic.mcp.sit and md5classic.rsrc.sit, and made sure their contents were also in the same folder as everything else (and not, say, separated into their own folders). Finally, I opened md5classic.mcp by dropping it on the CodeWarrior IDE icon, selected PPC or FAT as the target in the top-left corner of the file list window, and clicked on "Make".

It should work just like that. Here's what my folder looks like:

[Screenshot: folder contents]

And my project window:

[Screenshot: CodeWarrior project window]

Hope this helps! I know you can do it. :)
Logged

V.Yakob

  • 64 MB
  • ****
  • Posts: 96
  • Mac User
Re: Now we have 13x faster MD5 checksums
« Reply #15 on: July 18, 2024, 10:54:21 AM »

Quote
@V.Yakob When you downloaded the source code from GitHub, did you grab the full ZIP, or did you download the files separately?

I used git clone on my Mac, which is what I usually do when I need to download a repository from GitHub.
But I'll try downloading the ZIP, thank you.

There may be a problem with the file forks. In your screenshot, I see that files with .c and .h extensions are recognized as CW documents, but mine just show a generic "sheet of paper" icon, like the .md file, for example.
Logged

V.Yakob

  • 64 MB
  • ****
  • Posts: 96
  • Mac User
Re: Now we have 13x faster MD5 checksums
« Reply #16 on: July 18, 2024, 11:22:58 AM »

@Jubadub Yes, that's how it turned out.
I downloaded the .zip and unpacked it on OS 9. I launched the project and it built immediately without problems.

Thank you!  8)

This is the first application I've built on OS9.

Logged

Jubadub

  • 256 MB
  • *****
  • Posts: 431
  • New Member
Re: Now we have 13.5x faster MD5 checksums
« Reply #17 on: July 18, 2024, 11:59:08 PM »

@V.Yakob Nice! Congrats! :)

It seems that the issue before, rather than anything fork-related, was file associations based on TYPE/CREATOR codes. Or just the "TYPE" part, regardless of the program association from the "CREATOR" part. I should have guessed that earlier: now that I think about it, I ran into this exact same issue when compiling SDL 1.2.15 (and 1.2.13) in MPW, except that MPW told me outright that the files were not of TYPE "TEXT", and that they should have been. After I set the TYPE accordingly, the compilation got further ahead.

It seems that if you unpack things in OS 9, either the OS or StuffIt Expander "maps" things to CW. I assume that will only work if you have at least some version of CW installed before unpacking, and if no other program is taking over ownership and "overruling" CW, such as MPW. Although I think it is likely if the files were mapped to MPW, CW probably would have had no issue with them, either. In fact, perhaps without both CW and MPW, unpacking things in OS 9 should AT LEAST cause the code-related files to be marked with the TYPE "TEXT", and MAYBE that would have sufficed.

Nonetheless, it is good to know how much TYPE/CREATOR matters here, and that we have to keep that in mind for compilation in general under the Mac OS. In fact... that might explain some of my headaches with Bochs before! (Among other things I learned since my attempts to compile it 2 years back.)

This means GitHub is really not the ideal way to host Mac OS source code. Well, in fact, nothing sent as plain files "over the wire" is, at least over HTTP, FTP etc., since those carry neither TYPE/CREATOR codes nor resource forks. So it's best if the WHOLE source tree were SIT-packed or similar (instead of ZIP), to spare us from any headache related to these things, and to keep the compilation experience the same for everyone, since this way we would have the EXACT same TYPE/CREATOR as the developer used. The Mac-OS-X-only type of ZIP would also work, but GitHub & co. don't use those, and such ZIP files can't be conveniently unpacked directly from Mac OS anyway (yet); they generally require Mac OS X somewhere instead, which for us is a massive no-no. Maybe AmendHub takes such things into account?

Anyway, great findings. :)
« Last Edit: July 19, 2024, 12:26:59 AM by Jubadub »
Logged

DrNo7

  • 64 MB
  • ****
  • Posts: 94
Re: Now we have 13x faster MD5 checksums
« Reply #18 on: July 19, 2024, 10:56:34 PM »

Quote
So it's best if the WHOLE source tree was SIT-packed or similar (instead of ZIP), to spare us from any headache related to these things, and to keep the compilation experience the same for everyone,

That sentence made me think about creating a release on GitHub and posting one SIT archive for the binary and one for the sources (like open-source projects do).

Another possibility would be to add a script to the repo that sets the creator/type of every file, in order to ensure a working clone (and to put instructions in the README.md on how to run it on a fresh clone).
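
A rough sketch of what such a script could look like, written in Python so it can run wherever the repo is cloned. It does not touch any files itself; it only prints MPW SetFile commands (SetFile takes -t for TYPE and -c for CREATOR) that would then be run in MPW on the classic Mac. Everything specific here is an assumption: the file names are hypothetical, 'CWIE' is assumed to be the CodeWarrior IDE creator code, and only .c and .h files are treated as source text.

```python
from pathlib import PurePosixPath

# Assumption: these extensions are the plain-text source files in the repo.
TEXT_EXTENSIONS = {".c", ".h"}


def setfile_commands(paths, file_type="TEXT", creator="CWIE"):
    """Build MPW SetFile command lines for the source files in `paths`.

    'CWIE' is assumed to be the CodeWarrior IDE creator code; swap in
    another creator if the files should open in a different editor.
    """
    commands = []
    for path in paths:
        if PurePosixPath(path).suffix.lower() in TEXT_EXTENSIONS:
            commands.append(f"SetFile -t {file_type} -c {creator} '{path}'")
    return commands


# Hypothetical file list; a real script would take it from the repo.
files = ["main.c", "md5.h", "README.md"]
for line in setfile_commands(files):
    print(line)
```

Resource forks are a separate matter: a plain git clone does not carry them, so files like the .mcp and .rsrc would presumably still need to ship SIT-packed, as they do now.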
Logged
Ti 1 GHz / 1 GB / FW SSD / Airport Extreme PCMCIA (triple boot)
Alu 12 1.5GHz / 1.5 GB / 256 GB mSata SSD (dual boot for now)

Jubadub

  • 256 MB
  • *****
  • Posts: 431
  • New Member
Re: Now we have 13x faster MD5 checksums
« Reply #19 on: July 20, 2024, 02:58:14 AM »

@DrNo7 "Like" :)

Incidentally, version 1.1 is fresh out of the oven! It adds a number of options people requested, and it's pretty sweet. I left a review of it on the Garden page.

I also made sure there was no performance regression of any kind: it completes MD5 hashes just as quickly as before, meaning it's as fast as the 1.0"b5" version I attached here earlier.

HFS+ support (meaning support for files bigger than 2 GB) is still underway, though, and doesn't work with version 1.1 either. Maybe in a future release?
Logged