
Author Topic: I linked Google's HTML5 parser  (Read 17746 times)

OS923

  • 512 MB
  • *****
  • Posts: 888
I linked Google's HTML5 parser
« on: December 01, 2021, 09:11:54 AM »

I linked Google's Gumbo parser for OS 9.
It parses around 15 MB/s.
The memory use is around 7 times the file length.
It does complete validation and builds a DOM tree.
Unfortunately, the error handling is done with assert. A debug build will show an error and stop, but a release build, where the asserts are compiled out, may simply crash, for example when it runs out of memory.
It needs better error handling before it is really usable, for example in a web spider program.
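To illustrate the difference, here is a minimal sketch of the two styles. It is not Gumbo's code: `ParseStatus`, `Buffer`, `copy_or_assert`, `copy_checked` and the allocator hooks are hypothetical names made up for this example. The assert version loses its check when `NDEBUG` is defined, while the status version reports the failure to the caller in every build.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical status codes for this sketch; not part of Gumbo.
enum class ParseStatus { Ok, OutOfMemory, EmptyInput };

struct Buffer {
    char* data = nullptr;
    std::size_t size = 0;
};

// Allocator hook, so an out-of-memory condition can be simulated.
using AllocFn = void* (*)(std::size_t);
void* default_alloc(std::size_t n) { return std::malloc(n); }
void* always_fail(std::size_t) { return nullptr; }  // simulates out of memory

// Assert style: the check disappears when NDEBUG is defined, so a
// release build would pass a null pointer straight to memcpy.
char* copy_or_assert(const char* src, std::size_t len) {
    char* p = static_cast<char*>(std::malloc(len + 1));
    assert(p != nullptr);
    std::memcpy(p, src, len);
    p[len] = '\0';
    return p;
}

// Status style: the failure is reported to the caller in every build.
ParseStatus copy_checked(const char* src, std::size_t len, Buffer& out,
                         AllocFn alloc = default_alloc) {
    if (src == nullptr || len == 0) return ParseStatus::EmptyInput;
    char* p = static_cast<char*>(alloc(len + 1));
    if (p == nullptr) return ParseStatus::OutOfMemory;
    std::memcpy(p, src, len);
    p[len] = '\0';
    out.data = p;
    out.size = len;
    return ParseStatus::Ok;
}

void release(Buffer& b) {
    std::free(b.data);
    b.data = nullptr;
    b.size = 0;
}
```

A web spider could then skip or retry a page when it gets `ParseStatus::OutOfMemory` instead of going down with it.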

OS923

Re: I linked Google's HTML5 parser
« Reply #1 on: February 22, 2022, 09:06:13 AM »

I linked Lexbor. It's about 6 times faster than Gumbo parser.
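As a rough back-of-the-envelope check, the figures quoted in this thread (about 15 MB/s and 7 times the file length for Gumbo, about 6 times faster for Lexbor) can be turned into estimates. This is only a sketch: the constants are the approximate numbers from the posts, and the function names are made up for the example.

```cpp
// Approximate figures quoted in the posts; treat them as rough measurements.
constexpr double kGumboMBPerSec = 15.0;  // Gumbo parsing speed
constexpr double kMemoryFactor = 7.0;    // Gumbo memory use / file length
constexpr double kLexborSpeedup = 6.0;   // Lexbor speed relative to Gumbo

// Estimated time in seconds to parse a file of file_mb megabytes.
double parse_seconds(double file_mb, double mb_per_sec) {
    return file_mb / mb_per_sec;
}

// Estimated memory in MB that Gumbo needs for a file of file_mb megabytes.
double gumbo_memory_mb(double file_mb) {
    return file_mb * kMemoryFactor;
}

// Lexbor speed implied by the "about 6 times faster" figure.
double lexbor_mb_per_sec() {
    return kGumboMBPerSec * kLexborSpeedup;
}
```

With these numbers a 2 MB page would need about 14 MB under Gumbo, and Lexbor would come out at roughly 90 MB/s.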

I'm trying to improve it with shorter identifiers and better includes. At the moment the headers have to be included in a particular order, and I want them to work in any order because I sort my includes alphabetically.

My goal is to convert HTML files to a binary format that can then be used easily in C++ programs. For example, you could use it to write your own browser or an HTML-simplifying proxy like the ones used on Palm handhelds, or you could simplify an HTML file to view it in iCab.
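One possible shape for such a binary format, purely as a sketch: the tree is flattened into fixed-size records that refer to each other by index, with all strings stored in one table. The post does not describe the actual layout, so `Node`, `Record`, `Flat`, `flatten` and `tag_of` are assumptions made up for this example, and `Node` just stands in for whatever tree the parser produces.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical in-memory tree, standing in for the parser's DOM.
struct Node {
    std::string tag;   // element name, or "#text" for a text node
    std::string text;  // text content (empty for elements)
    std::vector<Node> children;
};

constexpr std::uint32_t kNone = 0xFFFFFFFFu;  // "no such node" marker

// Fixed-size record: indices instead of pointers, so the arrays could be
// written to disk and read back without fixing up addresses.
struct Record {
    std::uint32_t tag_off;       // offset of the tag in the string table
    std::uint32_t text_off;      // offset of the text in the string table
    std::uint32_t first_child;   // record index, or kNone
    std::uint32_t next_sibling;  // record index, or kNone
};

struct Flat {
    std::vector<Record> records;
    std::string strings;  // NUL-terminated strings stored back to back
};

// Appends s to the string table and returns its offset.
std::uint32_t add_string(Flat& f, const std::string& s) {
    std::uint32_t off = static_cast<std::uint32_t>(f.strings.size());
    f.strings.append(s);
    f.strings.push_back('\0');
    return off;
}

// Writes n and its subtree in pre-order; returns the index of n's record.
std::uint32_t flatten(const Node& n, Flat& f) {
    std::uint32_t idx = static_cast<std::uint32_t>(f.records.size());
    f.records.push_back({add_string(f, n.tag), add_string(f, n.text), kNone, kNone});
    std::uint32_t prev = kNone;
    for (const Node& child : n.children) {
        std::uint32_t ci = flatten(child, f);
        // Index into the vector each time: it may have been reallocated.
        if (prev == kNone) f.records[idx].first_child = ci;
        else f.records[prev].next_sibling = ci;
        prev = ci;
    }
    return idx;
}

// Tag name of record i, read straight from the string table.
const char* tag_of(const Flat& f, std::uint32_t i) {
    return f.strings.c_str() + f.records[i].tag_off;
}
```

A reader program then only needs the two arrays and can walk the tree through `first_child` and `next_sibling` without running the parser again.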

OS923

Re: I linked Google's HTML5 parser
« Reply #2 on: March 02, 2022, 11:13:19 AM »

It's 305,000 lines of code, but everything is going as planned.

OS923

Re: I linked Google's HTML5 parser
« Reply #3 on: April 05, 2022, 08:09:15 AM »

Finished renaming. Now sorting.