Author Topic: Remake of OSX's Dictionary  (Read 21218 times)

Offline OS923

  • Platinum Member
  • *****
  • Posts: 888
Re: Remake of OSX's Dictionary
« Reply #40 on: April 13, 2021, 03:38:00 AM »
I'm rewriting the program to make it work with all XML dictionaries in the same file format.
I also do some optimizations, for example I replace
Code: [Select]
"<o:x>" with "<x>", "<italic>" with "<i>", "<x> " with " <x>", " )" with ") " and so on.
I also found six XML syntax errors and three 4-byte UTF characters.
You can do this conversion, but I recommend that you leave that up to me.
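
A minimal sketch of what such a cleanup pass might look like, assuming plain string replacement is enough. The replacement pairs are the ones quoted above; the function names `normalize` and `four_byte_chars` are hypothetical and not taken from the actual program. The second helper assumes "4-byte" refers to characters above U+FFFF, which need four bytes in UTF-8.

```python
# Replacement pairs quoted in the post; order matters, because "<o:x>"
# must become "<x>" before the "<x> " rule can apply to it.
REPLACEMENTS = [
    ("<o:x>", "<x>"),
    ("<italic>", "<i>"),
    ("<x> ", " <x>"),
    (" )", ") "),
]


def normalize(text):
    """Apply each replacement pair in order (hypothetical helper)."""
    for old, new in REPLACEMENTS:
        text = text.replace(old, new)
    return text


def four_byte_chars(text):
    """Return the characters above U+FFFF, i.e. those needing 4 bytes in UTF-8."""
    return [ch for ch in text if ord(ch) > 0xFFFF]
```

Only the four pairs listed above are applied here; closing tags such as "</italic>" would need pairs of their own.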

Where can I find those dictionaries?

Offline OS923

« Reply #41 on: April 20, 2021, 07:08:38 AM »
I abandoned the idea of stuffing everything into one binary file. Now I do it like the original program: there are 2 XML files and an index for each XML file. Unlike the original program, this one rebuilds the indexes if they are deleted. It runs 40 times faster than my previous program and uses 11 times less memory. It displays the dictionary entries in the first column and the thesaurus entries in the second column. Novotny and Uskudar are still displayed incorrectly.
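
The "XML file plus rebuildable index" layout could be sketched roughly as follows. This is only an illustration under stated assumptions: the entry markup `<entry title="...">`, the `.idx` file name, the JSON index format, and all function names are hypothetical and not taken from the actual program.

```python
import json
import os
import re

# Hypothetical entry marker: assumes every entry starts on its own line as
# <entry title="...">. The real dictionary files may use different markup.
ENTRY_RE = re.compile(rb'<entry title="([^"]*)"')


def build_index(xml_path):
    """Map each entry title to the byte offset of the line it starts on."""
    index = {}
    offset = 0
    with open(xml_path, "rb") as f:
        for line in f:
            match = ENTRY_RE.search(line)
            if match:
                index[match.group(1).decode("utf-8")] = offset
            offset += len(line)
    return index


def load_or_rebuild_index(xml_path):
    """Load the sidecar index, rebuilding it when the file is missing."""
    index_path = xml_path + ".idx"  # hypothetical naming scheme
    if os.path.exists(index_path):
        with open(index_path, "r", encoding="utf-8") as f:
            return json.load(f)
    index = build_index(xml_path)
    with open(index_path, "w", encoding="utf-8") as f:
        json.dump(index, f)
    return index


def read_entry_line(xml_path, offset):
    """Seek straight to an entry's first line instead of parsing the whole file."""
    with open(xml_path, "rb") as f:
        f.seek(offset)
        return f.readline().decode("utf-8")
```

Keeping only offsets in the index means the XML itself is never loaded whole, which is one plausible way to get the kind of speed and memory savings described, though the real program may work differently.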

Offline OS923

« Reply #42 on: April 23, 2021, 07:12:59 AM »
Here's the preview of the new version. I find that it looks better than the previous version. It displays tables and lists. If you copy the pictures to the Output folder, then it should also display the pictures. It uses style sheets that you may change. The conversion from XML to HTML is open source.

Offline robespierre

  • Veteran Member
  • ****
  • Posts: 123
  • malfrat des logiciels
Re: Remake of OSX's Dictionary
« Reply #43 on: August 08, 2021, 05:20:38 AM »
that is what i was wondering: is that the only word with the character ý?

and why would ý be broken in a regular truetype in OS9 but not in OSX?
Because that character is not included in the MacRoman (8-bit) encoding, unlike most other accented Latin letters. As a consequence, there is no way to map its code to a glyph in traditional (OS 9 and below) TrueType. OSX uses Unicode to map to glyphs, so it has no difficulty.
See https://developer.apple.com/fonts/TrueType-Reference-Manual/RM06/Chap6cmap.html
'cmap' subtable type 1 (ScriptManager) format 0 (8-bit codes) was used for most OS 9 fonts.
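
A quick way to see the gap, sketched with Python's built-in `mac_roman` codec standing in for the 8-bit code table. The helper name `in_mac_roman` is made up for this example, and the check only shows whether a character has an 8-bit MacRoman code, not how any particular font's 'cmap' is built.

```python
def in_mac_roman(ch):
    """True if the character has a code in the 8-bit MacRoman encoding."""
    try:
        ch.encode("mac_roman")
        return True
    except UnicodeEncodeError:
        return False


# Compare a few accented Latin letters, including the problem character.
for ch in "éüñÿý":
    print(ch, in_mac_roman(ch))
```

Under this check ý is the only one of the five without an 8-bit code, which matches the explanation above.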

It's also not possible to type it using the normal keyboard layout, even in OSX: you need to use an "Extended" keyboard layout.