Mac OS 9 Lives! (Classic Mac OS Forum)
Classic Mac OS Software (Discussions on Applications) => Application Development & Programming in the Classic Mac OS => Topic started by: OS923 on June 08, 2018, 06:53:39 AM

What's the fastest algorithm to convert a binary number into a decimal number?
For SInt32 or smaller numbers I use NumToString from NumberFormatting. Otherwise I use the div/mod method.
It should be possible with comparisons and subtractions. But in which order?
For example.
if n >= 2000000000 then
  Append "2"
  n -= 2000000000
elsif n >= 1000000000 then
  Append "1"
  n -= 1000000000
else
  and so on.

or set up an array somewhere which already contains all required multiples of 2.

There is no such thing as the fastest way; it depends completely on the architecture.
But I would guess something like this is probably among the fastest on most CPUs...
#include <stdio.h>

int main()
{
    unsigned long value = 0;
    char binary_string[] = "11111010000";
    unsigned long length = sizeof(binary_string) - 1; /* drop the terminating NUL */
    unsigned long mask[] = {0, 0xffffffff};
    for (int s = length; s > 0; s--) {
        int bit = s - 1; /* we're zero-indexed, adjust */
        /* Walk the string from the MSB end; subtracting the ASCII value
           of '0' (48) maps the character onto our bitmask table. */
        value += (1 << bit) & mask[binary_string[length - bit - 1] - 48];
    }
    printf("Decimal value for %s is %lu\n", binary_string, value);
    return 0;
}
Now, this particular example is written around a binary number in string format... because I misread the original post...
Also note that the loop runs backwards intentionally: while compilers may sometimes figure this out on their own, they don't always, and pretty much all processors have loop conditions that are significantly faster when they compare against zero.
PowerPC, for example, even has a special counter register designed for this purpose (CTR; the loop turns into a single bdnz, i.e. decrement CTR and branch if not zero), which is much faster than compare-and-jump looping (cmpwi / bne).

Actually, are you talking binary here or BCD? Because after reading your other post, I strongly suspect you're really asking how to most efficiently encode/decode BCD-encoded values?

When I say binary number then I mean something of the type SInt8, SInt16, SInt32, SInt64, UInt8, UInt16, UInt32, or UInt64.
So I'm looking for the fastest solution for:
void Convert(SInt8 n,StringPtr s);
void Convert(SInt16 n,StringPtr s);
void Convert(SInt32 n,StringPtr s);
void Convert(SInt64 n,StringPtr s);
void Convert(UInt8 n,StringPtr s);
void Convert(UInt16 n,StringPtr s);
void Convert(UInt32 n,StringPtr s);
void Convert(UInt64 n,StringPtr s);

Well, I don't know what Convert(type n, StringPtr s); does; it's not a standard call...
So I guess you're talking about parsing values then?

I tried a method without division or multiplication, but it was a bit too slow.
I found a better solution: I treat numbers as base 100000 instead of base 10. This divides the number of divisions and multiplications by 5. It requires a table of 500000 bytes.
This is now part of OS 9.3.

I believe this should be slightly faster...
unsigned int conv2(char *s)
{
    register unsigned int n = 0;
    do  /* assumes a non-empty digit string */
    {
        /* n * 10 == n * 8 + n * 2, and the low nibble of an
           ASCII digit is its numeric value. */
        n = (n << 3) + (n << 1) + (*s & 0xf);
    } while (*++s);
    return n;
}

I'll try it, but in my experience, bit shifts and bit fields are slow.

Bit shifts are only slow on interpreters.
On physical hardware they're always the fastest, or as fast as the alternatives.
In reality they're often faster, since all other operations are implemented as shifts and combinational bitwise logic inside a CPU.
For example, masking the character with 0xF instead of subtracting 48 might not be faster on all machines, but there shouldn't be a single instance where it's slower.
It doesn't set any condition bits, and on some it might even fuse into a shift-with-mask instruction (when used in this context).

Regarding bit operations vs. arithmetic operations:
Now, this isn't entirely representative since it runs on an Arduino, but it's an example of how much of a difference replacing totally unnecessary arithmetic with bit-fiddling can make.
The first example is an implementation of bit angle modulation (somewhat similar to PWM) that I used to drive 30 RGB diodes out of a single (software) UART on an Arduino; it's then fed through shift registers, and the BAM allowed me to control the brightness of each individual LED.
So here's the first version, naively implemented using just pure (fixed-point) arithmetic:
http://www.youtube.com/watch?v=zaTh_KaMZ1Q
When I had this working, I started optimising it and rewrote it to use bit arithmetic only. The result was the following, which is the exact same program:
http://www.youtube.com/watch?v=whKo9zpBqLE
I think it demonstrates quite clearly how much of a difference this makes, and this was just low-hanging fruit.

that's not fair. You have raised the voltage in video II from 12V to 450V.

Hush don't reveal the secret sauce 😜

Your logic is correct for processors that can do only one thing at a time.
The problem with your bit shifts is that the processor has to wait for each calculation to complete before it can start the next one.
Your algorithm with bit shifts does essentially this: n=(((1*10)+2)*10+3)*10+4
If I write it like this: n=1*1000+2*100+3*10+4, then the processor can do several calculations at the same time, because the order doesn't matter.
Both algorithms do 3 multiplications and 3 additions, but the second is faster.
Signed integers are faster too: n=1*1000L+2*100L+3*10L+4
And the "optimized" version is slower.
The first test that I measured calculates the longest value 100000 times.
The second test calculates the longest value 100000 times and the shortest value 100000 times.

Yes, your theory would make sense if shifts were not a one-cycle operation. You are right, however, that a multiplication can be hidden by pipelining on some CPUs, as it could theoretically be performed in parallel just as you describe.
May I ask what CPU you are targeting in this particular example?
Or even better, would you mind sharing your benchmark setup? I find this quite interesting, considering I benchmarked the algorithm on 3 architectures (x86, MIPS, ARM) before I confidently submitted my post here.

Power Macintosh G4 1.25 DP (MDD) = 1.25 GHz PowerPC 7455 (G4)x2