Mac OS 9 Lives

Classic Mac OS Software => Business Software & Application Suites => Topic started by: OS923 on April 17, 2016, 02:26:09 AM

Title: Remake of OSX's Dictionary
Post by: OS923 on April 17, 2016, 02:26:09 AM
I can read OSX dictionaries into a REALbasic program. This works fine on Windows, but on OS 9 the weird pronunciation characters are replaced with question marks.

Then I read this:
Quote
Because OS 9 can't draw Unicode directly; we have to convert to the encoding associated with the font you're using. But our code that does this conversion is clever enough to short-cut the process when your string contains only ASCII characters, and you're converting from one ASCII superset to another (e.g., UTF-8 to MacRoman). That's why it's faster when there are no non-ASCII characters in your string.
Do I have to install a special Unicode font and script?
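
The short-cut described in the quote can be sketched roughly like this. It is a minimal illustration in Python, not REALbasic's actual code; the function name `to_mac_roman` is hypothetical, and it assumes that unmappable characters are replaced with "?" (which matches what shows up on OS 9).

```python
def to_mac_roman(text: str) -> bytes:
    """Convert text to MacRoman bytes; unmappable characters become b'?'."""
    if text.isascii():
        # Fast path: pure ASCII is byte-for-byte identical in UTF-8 and
        # MacRoman, so no per-character table lookup is needed.
        return text.encode("ascii")
    # Slow path: every character is looked up in the MacRoman table.
    # Characters without a MacRoman equivalent (for example IPA
    # pronunciation symbols) are replaced with a question mark.
    return text.encode("mac_roman", errors="replace")


print(to_mac_roman("dictionary"))   # ASCII only, takes the fast path
print(to_mac_roman("caf\u00e9"))    # e-acute exists in MacRoman
print(to_mac_roman("\u0259"))       # schwa does not, so it is replaced
```

This only shows why the question marks appear; it says nothing about whether a Unicode font or script would avoid the conversion.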

Title: Re: Remake of OSX's Dictionary
Post by: OS923 on April 25, 2016, 07:31:34 AM
I made a test page with all the different Unicode characters that are used in the American dictionary.
Then I tried the code2000 font on Windows.
Some characters were missing.

I didn't find such a font for OS 9.

Is it possible to convert OSX fonts to OS 9 fonts?
Title: Re: Remake of OSX's Dictionary
Post by: MacTron on April 25, 2016, 09:23:56 AM
Quote
Is it possible to convert OSX fonts to OS 9 fonts?
If the fonts are in TrueType format, they can work on Mac OS 9, but you'll probably need a converter.

BTW: A long time ago I worked on a small project that used Unicode text. I used the WASTE text engine to deal with the UTF-8 and UTF-16 encodings.
ATSUI and WorldScript II can also be used.
Title: Re: Remake of OSX's Dictionary
Post by: OS923 on April 28, 2016, 07:13:56 AM
I'm going to try the Google Noto fonts.
Title: Re: Remake of OSX's Dictionary
Post by: OS923 on May 03, 2016, 03:22:16 AM
REALbasic 5.5.5 has unicode support, but some characters are drawn as 2 characters, like a letter followed by an accent.

I had 2 possible solutions:
I consider doing them both.
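
One possible cause of a letter being drawn separately from its accent is that the text is stored in decomposed form (base letter plus combining mark). That is an assumption; the thread does not confirm how the dictionary data is encoded. If it is the case, composing the text first would look roughly like this in Python (the helper name `compose` is hypothetical):

```python
import unicodedata


def compose(text: str) -> str:
    """Return text in NFC form, merging base letters with combining marks
    wherever Unicode defines a precomposed character."""
    return unicodedata.normalize("NFC", text)


decomposed = "e\u0301"           # 'e' followed by COMBINING ACUTE ACCENT
composed = compose(decomposed)   # single precomposed character U+00E9

print(len(decomposed), len(composed))
```

Not every letter and accent combination has a precomposed form, so some sequences would still stay as two code points after this step.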
Title: Re: Remake of OSX's Dictionary
Post by: OS923 on May 03, 2016, 03:24:50 AM
I converted the dictionary to a 7-bit ASCII Scheme expression without backslash.
Then I converted it to a binary format that contains the dictionary, thesaurus and index.
It's half the size of the original.
Title: Re: Remake of OSX's Dictionary
Post by: OS923 on May 03, 2016, 03:27:30 AM
The versions for Windows and Linux will be able to draw all characters as text.
Title: Re: Remake of OSX's Dictionary
Post by: MacTron on May 03, 2016, 12:27:59 PM
Quote
REALbasic 5.5.5 has unicode support, but some characters are drawn as 2 characters, like a letter followed by an accent.

but it seems that it can't deal well with UTF-16 (2 bytes per char) ...
Title: Re: Remake of OSX's Dictionary
Post by: OS923 on June 01, 2016, 03:27:25 AM
Is there a plugin for REALbasic which can show the Unicode correctly?
Title: Re: Remake of OSX's Dictionary
Post by: OS923 on June 07, 2016, 06:03:03 AM
I fragmalyzed REALbasic 5.5.5.
It doesn't use Text Services, which is necessary for Unicode support.
Thus OSX's Dictionary can't be remade with REALbasic 5.5.5.
Title: Re: Remake of OSX's Dictionary
Post by: OS923 on June 15, 2016, 06:08:23 AM
I remade it with REALbasic 5.5.5.
On OS9 the weird characters are replaced with ? but I don't care.
It's still very useful.
It looks OK on Windows.
Now I'm adding custom style with preferences.
Title: Re: Remake of OSX's Dictionary
Post by: InsectorX on June 16, 2016, 10:47:26 PM
Sounds cool, I'd love a copy whenever it's "ready".  even with some question marks, it sounds cool

I'm impressed
Title: Re: Remake of OSX's Dictionary
Post by: OS923 on June 17, 2016, 10:33:33 AM
See attachment for preview.
Title: Re: Remake of OSX's Dictionary
Post by: OS923 on June 18, 2016, 02:34:52 AM
There are still problems with the style on Windows.
Title: Re: Remake of OSX's Dictionary
Post by: OS923 on June 23, 2016, 05:28:25 AM
It's finished.
http://shareware.gangstalkingwiki.com/Dictionary98_manual.htm
Title: Re: Remake of OSX's Dictionary
Post by: DieHard on June 23, 2016, 09:39:33 AM
That is really nifty to have this available on OS 9.. I will definitely download, thanks for all the effort :)
Title: Re: Remake of OSX's Dictionary
Post by: OS923 on January 26, 2019, 01:40:11 AM
I rewrote this program in C++. See attachment.
Title: Re: Remake of OSX's Dictionary
Post by: MacTron on January 26, 2019, 04:08:46 AM
Thank you for sharing your work.
But unfortunately I couldn't try it because it closes immediately after I open it, without even showing a window ...
I booted with a brand new standard 9.2.2 System Folder on a G4 MDD and the result is the same.
Title: Re: Remake of OSX's Dictionary
Post by: OS923 on January 28, 2019, 03:03:14 AM
You need to have 300 MB of free memory and I think that it requires that you install the Unicode support (which you didn't because your system is "brand new"). (I never tried it without Unicode support.) If I remember correctly, on the install CD there's an "Extras" folder with a "Unicode" folder. This contains 3 files and if you drag this to the system folder then it installs 4 things, because the script file contains also another thing. You also need to install the "Noto Sans" font (which is included in the download). It doesn't work without the font.
Title: Re: Remake of OSX's Dictionary
Post by: IIO on January 28, 2019, 04:41:57 AM
sitx :P
Title: Re: Remake of OSX's Dictionary
Post by: MacTron on January 28, 2019, 08:04:42 AM
Quote
You also need to install the "Noto Sans" font (which is included in the download). It doesn't work without the font.
That was the issue. Once I had installed the "Noto Sans" font, the app worked.
Title: Re: Remake of OSX's Dictionary
Post by: ELN on January 28, 2019, 07:56:51 PM
No chance of bundling the font in the app’s resource fork? This is a very cool tool!
Title: Re: Remake of OSX's Dictionary
Post by: OS923 on January 29, 2019, 02:42:02 AM
If you install the Unicode support then you can type Unicode in the windows.

Unfortunately printing will crash. I tried to print in WorldText and it crashed too. Is there a program/printer combination that doesn't crash when trying to print Unicode?
Title: Re: Remake of OSX's Dictionary
Post by: WillyWonka on February 01, 2019, 02:54:01 AM
Does this app accept custom dictionaries?

The reference dictionary in Spanish is the Real Academia de la Lengua, which I use under OS X as it was converted to Dictionary.app format using DictUnifier.

The bundled Spanish dictionary in dictionary.app is well... poor.
Title: Re: Remake of OSX's Dictionary
Post by: OS923 on February 01, 2019, 05:22:55 AM
Quote
Does this app accept custom dictionaries?
No, I convert the XML to a binary format that contains Unicode instead of UTF-8.

If you send me a download link for this Real Academia de la Lengua converted to Dictionary.app format then I'll look into it. Does this contain also a thesaurus? The program was written for dictionary + thesaurus combination.
Title: Re: Remake of OSX's Dictionary
Post by: OS923 on February 07, 2019, 04:39:21 AM
I improved the program with error messages.
Title: Re: Remake of OSX's Dictionary
Post by: OS923 on February 11, 2019, 04:06:01 AM
There's a flaw in the multilingual text engine. You have to move the file position to 0 before you call TXNSave if you change the data fork (after Save As). I'll correct this in the next version.
Title: Re: Remake of OSX's Dictionary
Post by: OS923 on April 09, 2019, 06:19:56 AM
I rewrote the Dictionary program. The difficult stuff is part of the OS 9.3 SDK as DictionaryLib. The rest is open source and in English, less than 2000 lines of code. Then you can make dictionary programs for other languages.
Title: Re: Remake of OSX's Dictionary
Post by: OS923 on May 29, 2019, 09:42:54 AM
I reverted to the previous version because this was easier to maintain.
Title: Re: Remake of OSX's Dictionary
Post by: OS923 on May 29, 2019, 09:44:06 AM
See the attachment for the new version. It uses 200 MB of memory instead of 300 MB, it's 20% faster and it can export as HTML.
Title: Re: Remake of OSX's Dictionary
Post by: OS923 on June 04, 2019, 06:35:40 AM
The maximum text length was miscalculated. The explanation for "break", "get", "go" and "run" could not be displayed. But now I have it right with a verification procedure.
Title: Re: Remake of OSX's Dictionary
Post by: OS923 on June 06, 2019, 02:21:13 AM
I removed the code for opening Textension documents because this program isn't meant to be a Unicode editor. I removed the AppleScript support from PowerPlant. I added a Quit handler. I simplified the menus.
Title: Re: Remake of OSX's Dictionary
Post by: OS923 on January 06, 2020, 09:39:49 AM
I improved it.

It uses 20 MB more but it starts up more than twice as fast.

I changed the sorting order. It now sorts case-insensitively and diacritic-insensitively, and characters like space, "-" and "'" are ignored if the rest is equal.
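
A sort key along these lines could be sketched as follows. This is Python rather than the program's C++, and it reflects one reading of the rule: space, "-" and "'" are left out of the main comparison, and the original spelling only decides between words that are otherwise equal. The names `IGNORED`, `strip_diacritics` and `sort_key` are hypothetical.

```python
import unicodedata

# Characters left out of the primary comparison (taken from the post).
IGNORED = {" ", "-", "'"}


def strip_diacritics(text: str) -> str:
    """Decompose the text and drop the combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def sort_key(word: str):
    """Primary key ignores case, diacritics and the IGNORED characters;
    the original word is used only as a tie-breaker."""
    folded = strip_diacritics(word).casefold()
    primary = "".join(ch for ch in folded if ch not in IGNORED)
    return (primary, word)


words = ["co-op", "Coop", "coop", "caf\u00e9", "cafe", "ice cream", "icecap"]
print(sorted(words, key=sort_key))
```

With this key, "ice cream" sorts after "icecap" because the space is skipped, and "Coop", "co-op" and "coop" end up next to each other.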

I removed the progress dialog.

I found 2 words that are not displayed correctly: Novotný and Üsküdar.
Title: Re: Remake of OSX's Dictionary
Post by: IIO on January 07, 2020, 04:25:44 PM
"Novotný" is one of the most important words of all.

i use it in every second sentence i write, Novotný, and sometimes even Novotný twice in one.
Title: Re: Remake of OSX's Dictionary
Post by: OS923 on January 10, 2020, 03:57:59 AM
26.600.000 results for Novotný.
24.500.000 results for Üsküdar.
Novotný has won.
Title: Re: Remake of OSX's Dictionary
Post by: IIO on January 11, 2020, 07:40:14 PM
from germany (google.com):

Novotný About 13.100.000 results (0,56 seconds)
 
Üsküdar About 23.900.000 results (0,49 seconds)

here Üsküdar wins by far. plus he is faster!


Title: Re: Remake of OSX's Dictionary
Post by: IIO on January 11, 2020, 07:42:25 PM

direct comparison between Üsküdar and Novotný (measured on an atari ST, headless, blitter off)

(https://upload.wikimedia.org/wikipedia/commons/thumb/7/71/Maiden%27s_Tower%2C_Istanbul.jpg/1024px-Maiden%27s_Tower%2C_Istanbul.jpg)

(https://upload.wikimedia.org/wikipedia/commons/8/88/Anton%C3%ADn_Novotn%C3%BD_1968.jpg)
Title: Re: Remake of OSX's Dictionary
Post by: IIO on January 11, 2020, 07:43:44 PM
Novotný seems to be larger, that's interesting.

i wonder if he runs under emulation?
Title: Re: Remake of OSX's Dictionary
Post by: OS923 on January 13, 2020, 06:27:55 AM
Surprisingly, Novotný is written correctly using the Geneva font, but incorrectly using the Noto font. Noto means no tofu, tofu being the rectangle that is drawn for missing characters. Noto shows tofu for ý, but Geneva doesn't.
Title: Re: Remake of OSX's Dictionary
Post by: IIO on January 13, 2020, 12:10:46 PM
that is what i was wondering: is that the only word with the character ý?

and why would ý be broken in a regular truetype in OS9 but not in OSX?
Title: Re: Remake of OSX's Dictionary
Post by: OS923 on April 13, 2021, 03:38:00 AM
I'm rewriting the program to make it work with all XML dictionaries in the same file format.
I do some optimizations, such as replacing
Code: [Select]
"<o:x>" with "<x>", "<italic>" with "<i>", "<x> " with " <x>", " )" with ") " and so on.
I also found 6 XML syntax errors and three 4-byte UTF-8 characters.
You can do this conversion, but I recommend that you leave that up to me.
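
For illustration, the kind of clean-up described above could be sketched like this in Python. The replacement pairs are the ones listed in the post; the function names are hypothetical, the real converter is not shown in this thread, and the "and so on" part (for example closing tags) is not covered here.

```python
# Replacement pairs from the post, applied in this order.
REPLACEMENTS = [
    ("<o:x>", "<x>"),
    ("<italic>", "<i>"),
    ("<x> ", " <x>"),
    (" )", ") "),
]


def simplify(xml_text: str) -> str:
    """Apply each textual replacement in turn."""
    for old, new in REPLACEMENTS:
        xml_text = xml_text.replace(old, new)
    return xml_text


def four_byte_characters(text: str):
    """Return (position, character) for every character outside the Basic
    Multilingual Plane, i.e. those that need 4 bytes in UTF-8."""
    return [(i, ch) for i, ch in enumerate(text) if ord(ch) > 0xFFFF]


print(simplify("<o:x> see <italic>run"))
print(four_byte_characters("a\U0001F600b"))
```

The order of the pairs matters: "<o:x>" is rewritten to "<x>" first, so the later "<x> " rule also applies to it.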

Where can I find those dictionaries?
Title: Re: Remake of OSX's Dictionary
Post by: OS923 on April 20, 2021, 07:08:38 AM
I abandoned the idea of stuffing everything in one binary file. Now I do it like the original program: there are 2 XML files and an index for each XML file. The program rebuilds the indexes if they are deleted (unlike the original program). It works 40 times faster than my previous program. It uses 11 times less memory. It displays the dictionary entries in the first column and the thesaurus entries in the second column. Novotny and Uskudar are still displayed incorrectly.
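
The "rebuild the index if it is deleted" idea can be sketched as below. This is a Python illustration under loud assumptions: the entry markup (`<entry title="...">`, one entry per line), the `.idx` file name and the JSON index format are all hypothetical and are not the program's real file format.

```python
import json
import os
import re

# Hypothetical markup: one entry per line, starting with <entry title="...">.
ENTRY_TITLE = re.compile(rb'<entry title="([^"]*)"')


def build_index(xml_path: str) -> dict:
    """Map each entry title to the byte offset of the line where it starts."""
    index = {}
    offset = 0
    with open(xml_path, "rb") as f:
        for line in f:
            match = ENTRY_TITLE.match(line)
            if match:
                index[match.group(1).decode("utf-8")] = offset
            offset += len(line)
    return index


def load_index(xml_path: str) -> dict:
    """Load the index next to the XML file, rebuilding it if it is missing."""
    index_path = xml_path + ".idx"  # hypothetical naming scheme
    if os.path.exists(index_path):
        with open(index_path, "r", encoding="utf-8") as f:
            return json.load(f)
    index = build_index(xml_path)
    with open(index_path, "w", encoding="utf-8") as f:
        json.dump(index, f)
    return index


def read_entry(xml_path: str, index: dict, title: str) -> str:
    """Seek straight to one entry instead of loading the whole file."""
    with open(xml_path, "rb") as f:
        f.seek(index[title])
        return f.readline().decode("utf-8")
```

Keeping only offsets in memory and seeking into the XML on demand is one plausible reason such a layout needs far less memory than loading one big binary file, but that is a guess about the design, not a description of it.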
Title: Re: Remake of OSX's Dictionary
Post by: OS923 on April 23, 2021, 07:12:59 AM
Here's the preview of the new version. I find that it looks better than the previous version. It displays tables and lists. If you copy the pictures to the Output folder, then it should also display the pictures. It uses style sheets that you may change. The conversion from XML to HTML is open source.
Title: Re: Remake of OSX's Dictionary
Post by: robespierre on August 08, 2021, 05:20:38 AM
Quote
that is what i was wondering: is that the only word with the character ý?

and why would ý be broken in a regular truetype in OS9 but not in OSX?
Because that character is not included in the MacRoman (8-bit) encoding, unlike most other accented Latin letters. As a consequence there is no way to map its code to a glyph in traditional (OS 9 and below) TrueType. OSX uses Unicode to map to glyphs, so it has no difficulty.
See https://developer.apple.com/fonts/TrueType-Reference-Manual/RM06/Chap6cmap.html
A 'cmap' subtable with platform ID 1 (Macintosh, Script Manager codes) in format 0 (8-bit codes) was used for most OS 9 fonts.
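
A quick way to check which characters have a MacRoman code at all is to try encoding them. The sketch below uses Python's built-in `mac_roman` codec; the helper name `in_mac_roman` is hypothetical.

```python
def in_mac_roman(ch: str) -> bool:
    """Return True if the character has a code in the MacRoman encoding."""
    try:
        ch.encode("mac_roman")
        return True
    except UnicodeEncodeError:
        return False


# e-acute, u-umlaut, y-umlaut, y-acute
for ch in "\u00e9\u00fc\u00ff\u00fd":
    print(ch, in_mac_roman(ch))
```

This reports ý as the only one of the four without a MacRoman code. Note that both Ü and ü do have MacRoman codes, so this check alone does not explain the problem with Üsküdar.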

It's also not possible to type it using the normal keyboard layout, even in OSX: you need to use an "Extended" key layout.