Mac OS 9 Lives! (Classic Mac OS Forum)

Classic Mac OS Software (Discussions on Applications) => Application Development & Programming in the Classic Mac OS => Topic started by: Knezzen on January 07, 2019, 12:34:57 PM

Title: XML parsing in REALbasic 3.5.2
Post by: Knezzen on January 07, 2019, 12:34:57 PM
I'm thinking of writing my own XMPP/Jabber client in REALbasic 3.5.2 (so I can use it on my SE/30 as well), but from what I can find, that version of RB has no built-in way to parse XML. Does anyone here know of a plugin for RB 3.5.2 that adds XML parsing? Some kind of SSL plugin would be cool as well, but I'm not setting my hopes too high ;)
Title: Re: XML parsing in REALbasic 3.5.2
Post by: Knezzen on January 09, 2019, 07:21:15 AM
I found an XML parsing class for RB that is compatible with 3.5.2. Now I need to parse incoming XML from an open TCP socket. Does anyone know of any good example code I could take a look at?
Title: Re: XML parsing in REALbasic 3.5.2
Post by: OS923 on January 09, 2019, 07:50:17 AM
Is your XML
- a standalone document,
- a document with external references, or
- a list of XML expressions?
Title: Re: XML parsing in REALbasic 3.5.2
Post by: Knezzen on January 09, 2019, 10:51:01 AM
Quote from: OS923 on January 09, 2019, 07:50:17 AM
Is your XML
- a standalone document,
- a document with external references, or
- a list of XML expressions?

I need to send and receive XML over TCP to communicate with the XMPP (Jabber) server.
All communication is done in XML. See https://xmpp.org/rfcs/rfc3920.html

I'm using the XML class attached to this post if you want to take a look.
Title: Re: XML parsing in REALbasic 3.5.2
Post by: OS923 on January 12, 2019, 01:28:39 AM
I see that it works as a document that is a stream, and streams tend to be long, so I want to warn you: in REALbasic 5.5.5, parsing the XML is done in one instruction, after which you have an object tree in which every object contains a copy of the text that was parsed. That used so much memory that the XML document was limited to about 200 MB.

Interpreting that object tree is painful. It would be easier to derive a class MyXMLReader from XMLReader, place a dummy instruction like dim b as boolean in every event handler, and place a breakpoint at all those instructions. When you call Parse the debugger will stop in every event handler. That gives you the chance to look at the parameters. Then you can decide what your program will look like.

I found it fast and it does complete validation.

I'll look into your class for REALbasic 3.5.2.

If you have to wait until the complete XML file has been received before you can parse, then I recommend that you try this parsing in a separate program instead of an internet program. That will already be difficult enough to start with.

Alternatively, you can write your own parser in a C++ plugin. Then you don't have to wait for the complete document before you start parsing, and you control how much memory it uses.
Title: Re: XML parsing in REALbasic 3.5.2
Post by: Naiw on January 12, 2019, 09:08:24 AM
I don't have an answer as to which libraries exist.

But implementing a SAX parser shouldn't be extremely difficult even if you don't find anything preexisting. I didn't look at this particular XML format, so perhaps a DOM parser works better, but those are more difficult to write, often slower, and above all use a lot more memory.
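
To illustrate the SAX idea: the parser walks the text once and fires a callback for each start tag, end tag, and text run, without building a tree. The following is only a minimal C++ sketch, not a conforming XML parser. SaxHandlers and parseSax are hypothetical names, and attributes, comments, CDATA, processing instructions, and entities are deliberately not handled.

```cpp
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical callback set; the names are illustrative, not from any library.
struct SaxHandlers {
    std::function<void(const std::string&)> startElement;
    std::function<void(const std::string&)> endElement;
    std::function<void(const std::string&)> characters;
};

// Walks the input once and reports events as it goes; no tree is kept.
// Assumes '>' never appears inside an attribute value.
// Returns false if a tag is opened but never closed.
bool parseSax(const std::string& xml, const SaxHandlers& h) {
    std::size_t i = 0;
    while (i < xml.size()) {
        if (xml[i] == '<') {
            std::size_t close = xml.find('>', i);
            if (close == std::string::npos) return false;
            std::string tag = xml.substr(i + 1, close - i - 1);
            if (!tag.empty() && tag[0] == '/') {
                h.endElement(tag.substr(1));
            } else {
                bool selfClosing = !tag.empty() && tag.back() == '/';
                if (selfClosing) tag.pop_back();
                // The element name ends at the first whitespace;
                // whatever follows (the attributes) is skipped.
                std::string name = tag.substr(0, tag.find_first_of(" \t\r\n"));
                h.startElement(name);
                if (selfClosing) h.endElement(name);
            }
            i = close + 1;
        } else {
            // Text run: everything up to the next tag or the end of input.
            std::size_t next = xml.find('<', i);
            if (next == std::string::npos) next = xml.size();
            h.characters(xml.substr(i, next - i));
            i = next;
        }
    }
    return true;
}
```

A caller fills in the three callbacks and passes the text to parseSax. Memory use stays roughly constant because nothing is stored between events, which is the main advantage over a DOM-style tree.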
Title: Re: XML parsing in REALbasic 3.5.2
Post by: OS923 on January 15, 2019, 07:31:14 AM
I looked at the stuff in the attachment. It may solve your problem, but it's not a good solution.

I see that you're not interested in validation.

It can parse while you are receiving the XML. That's good.

The parser is not efficient. Look for example at the Parse function:

  posInSegment = 1
  while posInSegment <= len(xmlSegment)
    currentChar = asc(mid(xmlSegment, posInSegment, 1))

That's terrible.

xmlModule.Entity2Unicode does 444 string comparisons before you arrive at "diams".
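
A long chain of string comparisons like this can usually be replaced by a table lookup keyed on the entity name. The following is a minimal C++ sketch, assuming a hash map is available. entityToCodePoint is a hypothetical name, not the function from the attachment, and only a handful of entities are listed.

```cpp
#include <string>
#include <unordered_map>

// Returns the Unicode code point for a named entity, or 0 if the name
// is not in the table. Only a few entries are shown here; a full table
// would list every entity the parser is meant to support.
char32_t entityToCodePoint(const std::string& name) {
    static const std::unordered_map<std::string, char32_t> table = {
        {"amp", 0x26},  {"lt", 0x3C},   {"gt", 0x3E},
        {"quot", 0x22}, {"apos", 0x27}, {"nbsp", 0xA0},
        {"diams", 0x2666},
    };
    auto it = table.find(name);
    return it == table.end() ? 0 : it->second;
}
```

With a hash table, looking up "diams" costs about the same as looking up "amp", regardless of where the entry sits in the list.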
Title: Re: XML parsing in REALbasic 3.5.2
Post by: Knezzen on January 15, 2019, 12:25:03 PM
Quote from: OS923 on January 15, 2019, 07:31:14 AM
xmlModule.Entity2Unicode does 444 string comparisons before you arrive at "diams".

Damn, that's pretty terrible. So what to do?
Title: Re: XML parsing in REALbasic 3.5.2
Post by: OS923 on January 18, 2019, 08:53:08 AM
Did you ever write a parser?
Title: Re: XML parsing in REALbasic 3.5.2
Post by: OS923 on January 25, 2019, 05:24:23 AM
See the solution in the attachment. The first program shows how to parse while you receive data. The second program is a simple XML parser without validation. Every function is relatively short and you can easily adapt it. I find it too slow for a commercial application and it will be difficult to do this faster in this language. An alternative is that you write this code in C++ and then wrap it as a plugin for REALbasic.
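
The idea of parsing while data is still arriving can be sketched roughly as follows. This is not the code from the attachment, only a minimal C++ illustration. ChunkTokenizer is a hypothetical name, and the sketch assumes that tokens can be split on '<' and '>' alone, so a '>' inside an attribute value or a comment would confuse it.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical incremental tokenizer. feed() may be called with any
// fragment of the stream; tokens() holds every complete tag or text run
// seen so far. An incomplete tail stays buffered until more data arrives.
class ChunkTokenizer {
public:
    void feed(const std::string& chunk) {
        pending_ += chunk;
        std::size_t pos = 0;
        while (pos < pending_.size()) {
            if (pending_[pos] == '<') {
                std::size_t close = pending_.find('>', pos);
                if (close == std::string::npos) break;  // tag not complete yet
                tokens_.push_back(pending_.substr(pos, close - pos + 1));
                pos = close + 1;
            } else {
                std::size_t next = pending_.find('<', pos);
                if (next == std::string::npos) break;   // text may still continue
                tokens_.push_back(pending_.substr(pos, next - pos));
                pos = next;
            }
        }
        pending_.erase(0, pos);  // keep only the unconsumed tail
    }

    const std::vector<std::string>& tokens() const { return tokens_; }

private:
    std::string pending_;              // bytes received but not yet tokenized
    std::vector<std::string> tokens_;  // complete tags and text runs, in order
};
```

Each time the socket delivers data, the received bytes go to feed(). A tag that is split across two reads is emitted only once its closing '>' has arrived.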
Title: Re: XML parsing in REALbasic 3.5.2
Post by: OS923 on January 31, 2019, 02:58:17 AM
Someone else tried this already in 2002.

http://www.tempel.org/ftp/pub/REALbasic/outdated/TTsXMLParser.sit.bin

I haven't tried it yet.
Title: Re: XML parsing in REALbasic 3.5.2
Post by: OS923 on February 05, 2019, 06:58:05 AM
It doesn't validate either. Then my solution is better.
Title: Re: XML parsing in REALbasic 3.5.2
Post by: Knezzen on March 05, 2019, 10:34:10 AM
To resurrect this a bit. How do I wrap something like this as a Realbasic plug-in?
There are quite a few XMPP libraries out there and it would be wonderful if I could wrap one of them as a Realbasic plug-in.
Title: Re: XML parsing in REALbasic 3.5.2
Post by: OS923 on March 06, 2019, 08:26:51 AM
It's not so difficult. It's best to first write a C++ class XMPPReader so that you can do new XMPPReader. Then you need to define a struct for every method, event and so on. Then you make a struct with links to all these structs, add an initialization instruction, and you are ready to go. It works with the debugger.

If you've never seen an example it looks terrible, but the examples show that they made it easy for you. There are examples on the website of Thomas Tempelmann (www.tempel.org). I have a few examples for the plugin SDK from REALbasic 2005 (for REALbasic >= 5.0) in the OS 9.3 SDK. I'll upload them here.

What I found a bit tricky is the memory management. You have to use special instructions to get a lock on your data; otherwise the reference count is 0 and the data is deleted. When you delete XMPPReader, you have to release those locks. If you change member data, you have to release the old lock and set a new lock.
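
The lock handling described here can be modelled in plain C++ without the SDK. This is a sketch under assumptions: HostData, lockData, and unlockData are hypothetical stand-ins for the host's reference-counted values and its lock and unlock calls, not the real SDK names. Only the class name XMPPReader comes from the post.

```cpp
// Hypothetical stand-in for a reference-counted value owned by the host.
struct HostData {
    int refCount = 0;
};

// Hypothetical stand-ins for the host's lock and unlock instructions.
// A real host would free the data when the count drops to 0;
// this sketch only counts.
void lockData(HostData* d)   { if (d) ++d->refCount; }
void unlockData(HostData* d) { if (d) --d->refCount; }

class XMPPReader {
public:
    XMPPReader() = default;
    // Copying would release the same lock twice, so it is disabled.
    XMPPReader(const XMPPReader&) = delete;
    XMPPReader& operator=(const XMPPReader&) = delete;

    // Deleting the reader releases the lock it still holds.
    ~XMPPReader() { unlockData(server_); }

    // Changing member data: set the new lock, release the old one.
    // Locking first keeps the count above 0 if both are the same object.
    void setServer(HostData* d) {
        lockData(d);
        unlockData(server_);
        server_ = d;
    }

private:
    HostData* server_ = nullptr;  // held under a lock while stored
};
```

Taking the new lock before releasing the old one is a design choice in this sketch: if the same value is assigned twice, the count never passes through 0 in between.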
Title: Re: XML parsing in REALbasic 3.5.2
Post by: OS923 on March 06, 2019, 10:29:20 AM
Instead of searching for "REALbasic plugin source code", you can search for a word that you expect to be in that code, like REALmethodDefinition.
Title: Re: XML parsing in REALbasic 3.5.2
Post by: OS923 on March 08, 2019, 07:18:35 AM
Quote from: Knezzen on March 05, 2019, 10:34:10 AM
To resurrect this a bit. How do I wrap something like this as a Realbasic plug-in?
There are quite a few XMPP libraries out there and it would be wonderful if I could wrap one of them as a Realbasic plug-in.

See the example in the attachment.
Title: Re: XML parsing in REALbasic 3.5.2
Post by: OS923 on March 08, 2019, 07:20:36 AM
The plugin SDK doesn't give you access to all the functions. A plugin takes a "resolver" function as an argument. Given the name of a function, the resolver returns that function. It's comparable to resolving symbols with the Code Fragment Manager. If you look into the data fork of REALbasic, you can see these function names, but you don't know the prototypes. You would need the complete list of entry names; then you can do in a plugin everything that you can do in REALbasic.
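
The resolver mechanism can be sketched in plain C++ as follows. Everything here is hypothetical: the types GenericProc and Resolver, the entry name "hostAdd", and its prototype are made up for illustration and are not taken from the SDK. The point is that a name alone is not enough; the caller must also know the prototype to cast the result to.

```cpp
#include <cstring>

// Hypothetical types: a resolver maps an entry name to an untyped
// function pointer, and the caller casts it to the real prototype.
using GenericProc = void (*)();
using Resolver = GenericProc (*)(const char* entryName);

// Hypothetical host-side function that is exposed by name only.
static int hostAdd(int a, int b) { return a + b; }

// Hypothetical host-side resolver: returns nullptr for unknown names.
static GenericProc hostResolver(const char* entryName) {
    if (std::strcmp(entryName, "hostAdd") == 0)
        return reinterpret_cast<GenericProc>(&hostAdd);
    return nullptr;
}

// Plugin side: the call only works because the prototype is known.
// Returns -1 if the entry cannot be resolved.
int callAddThroughResolver(Resolver resolve, int a, int b) {
    using AddProc = int (*)(int, int);
    auto fn = reinterpret_cast<AddProc>(resolve("hostAdd"));
    return fn ? fn(a, b) : -1;
}

// True if the resolver knows the given entry name.
bool hasEntry(Resolver resolve, const char* entryName) {
    return resolve(entryName) != nullptr;
}
```

Casting to a wrong prototype would compile but misbehave at run time, which is why the names found in the data fork are not sufficient on their own.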
Title: Re: XML parsing in REALbasic 3.5.2
Post by: OS923 on October 11, 2019, 08:34:51 AM
I found a memory leak in "Partial input parser". Now all memory leaks are gone.