Topic: XML parsing in REALbasic 3.5.2

Offline Knezzen

  • Platinum Member
  • *****
  • Posts: 786
  • Pro Tools addict!
    • Macintosh Garden
XML parsing in REALbasic 3.5.2
« on: January 07, 2019, 12:34:57 PM »
I'm thinking of writing my own XMPP/Jabber client in REALbasic 3.5.2 (so I can use it on my SE/30 as well), but from what I can find, that version of RB has no built-in way to parse XML. Does anyone here know of a plugin for RB 3.5.2 that lets it parse XML? Some kind of SSL plugin would be cool as well, but I'm not setting my hopes too high ;)

Offline Knezzen

Re: XML parsing in REALbasic 3.5.2
« Reply #1 on: January 09, 2019, 07:21:15 AM »
I found an XML parsing class for RB that is compatible with 3.5.2. Now I need to parse incoming XML from an open TCP socket. Does anyone know of any good example code I could take a look at?

Offline OS923

  • Gold Member
  • *****
  • Posts: 404
Re: XML parsing in REALbasic 3.5.2
« Reply #2 on: January 09, 2019, 07:50:17 AM »
Is your XML
- a standalone document,
- a document with external references, or
- a list of XML expressions?

Offline Knezzen

Re: XML parsing in REALbasic 3.5.2
« Reply #3 on: January 09, 2019, 10:51:01 AM »
Quote from: OS923 on January 09, 2019, 07:50:17 AM
Is your XML
- a standalone document,
- a document with external references
- a list of XML expressions.

I need to send and receive XML over TCP to communicate with the XMPP (Jabber) server.
All communication is done with XML. See https://xmpp.org/rfcs/rfc3920.html

I'm using the XML class attached to this post if you want to take a look.

Offline OS923

Re: XML parsing in REALbasic 3.5.2
« Reply #4 on: January 12, 2019, 01:28:39 AM »
I see that it works as a document that is a stream, and streams tend to be long, so I want to warn you. In REALbasic 5.5.5, parsing the XML is done in one instruction, and then you have an object tree in which every object contains a copy of the text that was parsed. That used so much memory that the XML document was limited to about 200 MB.

Interpreting that object tree is painful. It would be easier to derive a class MyXMLReader from XMLReader, place a dummy instruction like dim b as boolean in every event handler, and set a breakpoint on each of those instructions. When you call Parse, the debugger will stop in every event handler. That gives you the chance to look at the parameters. Then you can decide what your program will look like.

I found XMLReader fast, and it does complete validation.

I'll look into your class for REALbasic 3.5.2.

If you have to wait until the complete XML file has been received before you can parse, then I recommend that you try this parsing in a separate program instead of an internet program. That will already be difficult enough to start with.

Alternatively, you can write your own parser as a C++ plugin. Then you don't have to wait until the complete document has been received before you start parsing, and you control how much memory it uses.

Offline Naiw

  • Consistant Contributor
  • ***
  • Posts: 115
  • new to the forums
Re: XML parsing in REALbasic 3.5.2
« Reply #5 on: January 12, 2019, 09:08:24 AM »
I don't know which libraries are available.

But implementing a SAX parser shouldn't be extremely difficult, even if you don't find anything preexisting. I haven't looked at this particular XML format, so perhaps a DOM parser works better, but those are more difficult to write, often slower, and above all use a lot more memory.
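To give a rough idea of what a SAX-style parser for a socket stream involves, here is a minimal sketch in C++ (chosen because a C++ plugin is suggested elsewhere in this thread). The names XmlEvent and PushTokenizer are hypothetical. It only recognizes start tags, end tags and text, skips declarations and comments naively, and ignores attributes, entities, CDATA and validation, so treat it as an illustration of the "feed chunks, get events" idea rather than a usable XMPP parser.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// One parse event, in the spirit of SAX callbacks.
struct XmlEvent {
    enum Kind { Start, End, Text } kind;
    std::string value;  // element name, or character data for Text
};

// Hypothetical push tokenizer: feed() may be called with arbitrary chunks,
// for example whatever a socket's data-available handler delivers.
class PushTokenizer {
public:
    // Appends a chunk and returns the events that are now complete.
    std::vector<XmlEvent> feed(const std::string& chunk) {
        std::vector<XmlEvent> out;
        buffer_ += chunk;
        while (!buffer_.empty()) {
            if (buffer_[0] == '<') {
                std::size_t gt = buffer_.find('>');
                if (gt == std::string::npos) break;  // tag incomplete: wait for more data
                std::string tag = buffer_.substr(1, gt - 1);
                buffer_.erase(0, gt + 1);
                emitTag(tag, out);
            } else {
                std::size_t lt = buffer_.find('<');
                if (lt == std::string::npos) break;  // text may continue in the next chunk
                out.push_back({XmlEvent::Text, buffer_.substr(0, lt)});
                buffer_.erase(0, lt);
            }
        }
        return out;
    }

private:
    static void emitTag(const std::string& tag, std::vector<XmlEvent>& out) {
        // Declarations, comments and empty tags are simply skipped in this sketch.
        if (tag.empty() || tag[0] == '?' || tag[0] == '!') return;
        if (tag[0] == '/') {
            out.push_back({XmlEvent::End, tag.substr(1)});
            return;
        }
        bool selfClosing = tag.back() == '/';
        std::string body = selfClosing ? tag.substr(0, tag.size() - 1) : tag;
        // The element name ends at the first whitespace; attributes are ignored.
        std::string name = body.substr(0, body.find_first_of(" \t\r\n"));
        out.push_back({XmlEvent::Start, name});
        if (selfClosing) out.push_back({XmlEvent::End, name});
    }

    std::string buffer_;  // bytes received but not yet turned into events
};
```

The only state kept between calls is the unparsed tail of the buffer, so memory use stays proportional to the largest incomplete tag or text run instead of the whole document.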

Offline OS923

Re: XML parsing in REALbasic 3.5.2
« Reply #6 on: January 15, 2019, 07:31:14 AM »
I looked at the stuff in the attachment. It may solve your problem, but it's not a good solution.

I see that you're not interested in validation.

It can parse while you are receiving the XML. That's good.

The parser is not efficient. Look for example at the Parse function:

  posInSegment = 1
  while posInSegment <= len(xmlSegment)
    currentChar = asc(mid(xmlSegment, posInSegment, 1))

That's terrible.
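The post doesn't spell out the objection. One plausible reading (an assumption) is that the loop extracts a fresh one-character string with mid() for every position and only then converts it with asc(). For comparison, this is roughly what the same scan looks like in C++, where each byte is read in place. The function countTagOpeners is a made-up example, not something from the attached class.

```cpp
#include <cstddef>
#include <string>

// Counts '<' characters by reading each byte in place,
// without building a one-character substring per position.
std::size_t countTagOpeners(const std::string& xmlSegment) {
    std::size_t count = 0;
    const std::size_t length = xmlSegment.size();
    for (std::size_t pos = 0; pos < length; ++pos) {
        unsigned char currentChar = static_cast<unsigned char>(xmlSegment[pos]);
        if (currentChar == '<') ++count;
    }
    return count;
}
```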

xmlModule.Entity2Unicode does 444 string comparisons before you arrive at "diams".
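A common way to avoid a long chain of string comparisons is a hash table keyed by entity name, so every lookup costs about the same regardless of where the name sits in the list. A minimal C++ sketch follows. The function name entityToCodePoint is hypothetical and the table holds only a handful of entities for illustration. It is not a rewrite of xmlModule.Entity2Unicode.

```cpp
#include <string>
#include <unordered_map>

// Returns the Unicode code point for a named entity, or 0 if the name is unknown.
char32_t entityToCodePoint(const std::string& name) {
    // Built once on first use; a real table would list every supported entity.
    static const std::unordered_map<std::string, char32_t> table = {
        {"amp", 0x26},   {"lt", 0x3C},     {"gt", 0x3E},
        {"quot", 0x22},  {"apos", 0x27},   {"nbsp", 0xA0},
        {"spades", 0x2660}, {"clubs", 0x2663},
        {"hearts", 0x2665}, {"diams", 0x2666},
    };
    auto it = table.find(name);
    return it == table.end() ? 0 : it->second;
}
```

In a language without a built-in hash table, a sorted array with binary search gives a similar effect: about nine comparisons instead of hundreds for a list of that size.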

Offline Knezzen

Re: XML parsing in REALbasic 3.5.2
« Reply #7 on: January 15, 2019, 12:25:03 PM »
Quote from: OS923 on January 15, 2019, 07:31:14 AM
xmlModule.Entity2Unicode does 444 string comparisons before you arrive at "diams".

Damn, that's pretty terrible. So what to do?

Offline OS923

Re: XML parsing in REALbasic 3.5.2
« Reply #8 on: January 18, 2019, 08:53:08 AM »
Did you ever write a parser?

Offline OS923

Re: XML parsing in REALbasic 3.5.2
« Reply #9 on: January 25, 2019, 05:24:23 AM »
See the solution in the attachment. The first program shows how to parse while you receive data. The second program is a simple XML parser without validation. Every function is relatively short and you can easily adapt it. I find it too slow for a commercial application and it will be difficult to do this faster in this language. An alternative is that you write this code in C++ and then wrap it as a plugin for REALbasic.

Offline OS923

Re: XML parsing in REALbasic 3.5.2
« Reply #10 on: January 31, 2019, 02:58:17 AM »
Someone else tried this already in 2002.

http://www.tempel.org/ftp/pub/REALbasic/outdated/TTsXMLParser.sit.bin

I didn't try this yet.

Offline OS923

Re: XML parsing in REALbasic 3.5.2
« Reply #11 on: February 05, 2019, 06:58:05 AM »
It doesn't validate either. Then my solution is better.

Offline Knezzen

Re: XML parsing in REALbasic 3.5.2
« Reply #12 on: March 05, 2019, 10:34:10 AM »
To resurrect this a bit: how do I wrap something like this as a Realbasic plug-in?
There are quite a few XMPP libraries out there and it would be wonderful if I could wrap one of them as a Realbasic plug-in.

Offline OS923

Re: XML parsing in REALbasic 3.5.2
« Reply #13 on: March 06, 2019, 08:26:51 AM »
It's not so difficult. It's best to first write a C++ class XMPPReader so that you can do new XMPPReader. Then you define a struct for every method, event and so on, and one struct with links to all these structs. Add an initialization instruction and you are ready to go. It works with the debugger. If you've never seen an example it looks terrible, but the examples show that they made it easy for you.

There are examples on the website of Thomas Tempelmann (www.tempel.org). I have a few examples for the plugin SDK from REALbasic 2005 (for REALbasic >= 5.0) in the OS 9.3 SDK. I'll upload them here.

What I found a bit tricky is the memory management. You have to use special instructions to get a lock on your data; otherwise the reference count is 0 and it's deleted. When you delete the XMPPReader, you have to release those locks. If you change member data, you have to release the old lock and set a new lock.
« Last Edit: March 06, 2019, 08:38:13 AM by OS923 »
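The overall shape described here (one definition struct per method, one struct tying them together, a single initialization call, and lock/unlock discipline for host-owned data) can be sketched as follows. Everything in this block is a hypothetical stand-in written so the example compiles on its own: HostString, lockString, unlockString, MethodDef, ClassDef and pluginInit are not the real plugin SDK declarations, which live in the SDK headers and differ in detail.

```cpp
#include <string>

// --- Hypothetical stand-ins; NOT the real plugin SDK declarations ---
struct HostString { std::string text; int refCount; };  // host-owned, reference-counted
inline void lockString(HostString* s)   { if (s) ++s->refCount; }
inline void unlockString(HostString* s) { if (s && --s->refCount == 0) delete s; }

struct MethodDef {                 // one per exposed method
    const char* declaration;       // how the method would look from BASIC code
    void (*entry)(void* instance, HostString* arg);
};
struct ClassDef {                  // links the per-member tables together
    const char* name;
    const MethodDef* methods;
    int methodCount;
};

// --- The C++ class being wrapped ---
class XMPPReader {
public:
    XMPPReader() : server_(nullptr) {}
    ~XMPPReader() { unlockString(server_); }  // release the lock on destruction
    XMPPReader(const XMPPReader&) = delete;
    XMPPReader& operator=(const XMPPReader&) = delete;

    void setServer(HostString* s) {
        lockString(s);          // take the new lock first (safe if s == server_)
        unlockString(server_);  // then release the old one
        server_ = s;
    }
    const HostString* server() const { return server_; }

private:
    HostString* server_;  // kept alive only because we hold a lock on it
};

// One entry function per method; the host would call it with the instance.
static void setServerEntry(void* instance, HostString* arg) {
    static_cast<XMPPReader*>(instance)->setServer(arg);
}

static const MethodDef kMethods[] = {
    { "SetServer(name As String)", setServerEntry },
};
static const ClassDef kXMPPReaderClass = { "XMPPReader", kMethods, 1 };

// Stand-in for the single initialization call that hands the table to the host.
const ClassDef* pluginInit() { return &kXMPPReaderClass; }
```

Taking the new lock before releasing the old one matters when the same string is assigned twice: releasing first could drop the count to zero and free the data that is about to be stored.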

Offline OS923

Re: XML parsing in REALbasic 3.5.2
« Reply #14 on: March 06, 2019, 10:29:20 AM »
Instead of searching "REALbasic plugin source code" you can choose a word that you expect to be in that code, like REALmethodDefinition.

Offline OS923

Re: XML parsing in REALbasic 3.5.2
« Reply #15 on: March 08, 2019, 07:18:35 AM »
Quote from: Knezzen on March 05, 2019, 10:34:10 AM
To resurrect this a bit. How do I wrap something like this as a Realbasic plug-in?
There are quite a few XMPP libraries out there and it would be wonderful if I could wrap one of them as a Realbasic plug-in.

See the example in the attachment.

Offline OS923

Re: XML parsing in REALbasic 3.5.2
« Reply #16 on: March 08, 2019, 07:20:36 AM »
The plugin SDK doesn't give you access to all the functions. A plugin takes a "resolver" function as an argument. Given a function's name, the resolver returns that function. It's comparable to resolving symbols with the Code Fragment Manager. When you look into the data fork of REALbasic you can see these function names, but you don't know the prototypes. If you have the complete list of entry names, you can do in a plugin everything that you can do in REALbasic.
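The resolver idea can be illustrated with a small self-contained C++ sketch. The entry name "StringLength", the fake host resolver and all typedefs below are invented for the example and do not come from REALbasic. The point is only that the resolver supplies an address, while the plugin has to supply the correct prototype itself.

```cpp
#include <cstring>

// Hypothetical resolver shape: entry name in, untyped function pointer out.
typedef void (*GenericFn)();
typedef GenericFn (*Resolver)(const char* entryName);

// Stand-in host function and resolver so the sketch runs on its own.
static int hostStringLength(const char* s) {
    return static_cast<int>(std::strlen(s));
}
static GenericFn fakeHostResolver(const char* entryName) {
    if (std::strcmp(entryName, "StringLength") == 0)
        return reinterpret_cast<GenericFn>(&hostStringLength);
    return nullptr;  // unknown entry name
}

// The plugin must know this prototype from documentation or headers;
// the resolver gives no type information.
typedef int (*StringLengthFn)(const char*);

int callStringLength(Resolver resolve, const char* text) {
    GenericFn raw = resolve("StringLength");
    if (raw == nullptr) return -1;  // name not exported by the host
    StringLengthFn fn = reinterpret_cast<StringLengthFn>(raw);
    return fn(text);
}
```

Casting to the wrong prototype compiles without complaint and fails at run time, which is why knowing the entry names alone is not enough in practice.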