Author Topic: A better BlockMove  (Read 2086 times)

Offline OS923

« on: May 25, 2021, 07:28:32 AM »
I benchmarked all the BlockMove variants against every combination of the naive copy (buffers 2-, 4- or 8-byte aligned or misaligned, copying 1, 2, 4 or 8 bytes at a time). BlockMoveData was about 2 times faster than the naive copy, even for blocks as small as 16 bytes. With a block size of 15, however, it was suddenly 2 times slower, and with smaller blocks it gets worse, up to 10 times slower. So now I use a macro instead of calling BlockMoveData directly: if the block size is less than 16, it uses the naive copy, otherwise BlockMoveData.
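For anyone who wants to try a similar measurement off the Mac, here is a rough portable sketch of the idea. It is not the code I timed: std::memmove stands in for BlockMoveData (an assumption), and naive_copy, block_copy, hybrid_copy, time_copy and kSmallBlockLimit are hypothetical names made up for this example. The absolute numbers will depend on the compiler and machine.

```cpp
#include <chrono>
#include <cstddef>
#include <cstring>

// Byte-at-a-time copy: the simplest form of the "naive copy".
inline void naive_copy(const unsigned char* src, unsigned char* dst, std::size_t len)
{
    const unsigned char* const end = src + len;
    while (src < end) *dst++ = *src++;
}

// Assumption: std::memmove plays the role of ::BlockMoveData here.
inline void block_copy(const unsigned char* src, unsigned char* dst, std::size_t len)
{
    std::memmove(dst, src, len);
}

// Hypothetical threshold, taken from the 16-byte figure in the post.
const std::size_t kSmallBlockLimit = 16;

// Naive copy below the threshold, block move at or above it.
inline void hybrid_copy(const unsigned char* src, unsigned char* dst, std::size_t len)
{
    if (len < kSmallBlockLimit) naive_copy(src, dst, len);
    else                        block_copy(src, dst, len);
}

typedef void (*CopyFn)(const unsigned char*, unsigned char*, std::size_t);

// Total nanoseconds for `reps` copies of `len` bytes (len must be <= 64).
// The call goes through a function pointer, so both candidates pay the
// same call overhead.
inline long long time_copy(CopyFn fn, std::size_t len, long reps)
{
    static unsigned char src[64], dst[64];
    volatile unsigned char sink = 0;   // keeps the copy from being optimized away
    const auto t0 = std::chrono::steady_clock::now();
    for (long i = 0; i < reps; ++i) {
        src[0] = static_cast<unsigned char>(i);
        fn(src, dst, len);
        sink = dst[0];
    }
    const auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(t1 - t0).count();
}
```

Calling time_copy(naive_copy, n, reps) and time_copy(block_copy, n, reps) for n from 1 to 64 should show where the crossover sits on a given machine; it need not be 16 everywhere.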

Many programs probably suffer from slow copies. BlockMoveData is called in 111 places in PowerPlant and in 17 places in MSL, usually on small blocks. Even worse, if you don't define an assignment operator or copy constructor for a class, the compiler generates a default one that uses BlockMoveData.
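One possible way around the default copy, as a minimal sketch: write the copy constructor and assignment operator yourself so that each member is copied with a plain assignment. SmallRec is a hypothetical 8-byte record invented for this example, and whether this really avoids the block-move call depends on the compiler and its settings, so it should be measured rather than assumed.

```cpp
#include <cstdint>

// Hypothetical small record (8 bytes of data), for illustration only.
struct SmallRec {
    std::int32_t id;
    std::int16_t h;
    std::int16_t v;

    SmallRec() : id(0), h(0), v(0) {}
    SmallRec(std::int32_t i, std::int16_t hh, std::int16_t vv) : id(i), h(hh), v(vv) {}

    // Explicit copy constructor: member-by-member, no block-move call in the source.
    SmallRec(const SmallRec& o) : id(o.id), h(o.h), v(o.v) {}

    // Explicit assignment operator, same idea.
    SmallRec& operator=(const SmallRec& o)
    {
        id = o.id;
        h = o.h;
        v = o.v;
        return *this;
    }
};
```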

Code:
#include <Files.h>
#include <iostream>

#define PZ2(x) reinterpret_cast<SInt16*>(x)
#define PCZ2(x) reinterpret_cast<const SInt16*>(x)
#define PZ4(x) reinterpret_cast<SInt32*>(x)
#define PCZ4(x) reinterpret_cast<const SInt32*>(x)
#define PZ8(x) reinterpret_cast<SInt64*>(x)
#define PCZ8(x) reinterpret_cast<const SInt64*>(x)

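// BMD: Toolbox block move. BMN: naive byte-by-byte copy.
// Arguments are parenthesized and the bodies wrapped in do/while (0) so each
// macro behaves like a single statement, including inside if/else.
#define BMD(src,dst,len) ::BlockMoveData((src),(dst),(len))
#define BMN(src,dst,len) do {UInt8 *dd=(dst);const UInt8 *ss=(src);const UInt8* const tt=ss+(len);while (ss<tt) *dd++=*ss++;} while (0)
// BM: one store for 1, 2, 4 or 8 bytes, naive copy below 16 bytes, BlockMoveData otherwise.
#define BM(src,dst,len) do { \
    if ((len)==1) *(dst)=*(src); \
    else if ((len)==2) *PZ2(dst)=*PCZ2(src); \
    else if ((len)==4) *PZ4(dst)=*PCZ4(src); \
    else if ((len)==8) *PZ8(dst)=*PCZ8(src); \
    else if ((len)<16) BMN(src,dst,len); \
    else BMD(src,dst,len); \
} while (0)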

int main()
{
    using namespace std;

    cout << "Hello World, this is CodeWarrior!" << endl;

    FSSpec a;
    FSSpec b;
    // The length argument is an assumption: sizeof(FSSpec) copies the whole record.
    BM(reinterpret_cast<const UInt8 *>(&a),
       reinterpret_cast<UInt8 *>(&b),
       sizeof(FSSpec));

    return 0;
}