Squeak
XML: Parsers
Last updated at 9:15 pm UTC on 3 November 2006
YAXO (the combination of YAX and Exobox) will be included in Squeak 3.3. (http://squeaklet.com/Yax/index.html)
(3/3/2002 umejava)

(This table is not and never was sorted by any objective criteria:-)

It could be something that I'm doing wrong, but the Exobox change set appears to be corrupted. It uncompresses, but the resulting text contains some unwanted spaces and perhaps some repeated character sequences. –BillSchwab

Nope. I had the same thing. Can't seem to get a decent changeset out of it. Wim

XML Parser | Implementor | Download / More Information | Comments
exobox XML parser | exobox | XML.cs.gz. The real name of the file is "XML.2.cs.gz", but the "2" seems to be used by the Swiki versioning system. | It's not a port of another parser, and is therefore probably a bit more Squeakish than the Camp Smalltalk version, but it's a complete well-formedness parser minus UTF-8 support. The file seems to be broken. Download from XML UI Spec Builder instead, or extract it from the latest stable Squeak: http://swiki.squeakfoundation.org/stablesqueak (1/20/2002 umejava)
CampSmalltalk-XmlParser | Cincom, Camp Smalltalk, ported to Squeak by Bijan Parsia | Squeak code, Camp Smalltalk | The extensions of the filenames do not make sense: remove the ".gz" and simply file in. (Fixed 10/7/200; BJP) Also note that this is not up to date with VW 5i.2. I have to talk to Roger (who back-ports it to VW 3.0) and see what's new. Also see http://www.mars.dti.ne.jp/~umejava/smalltalk/soapOpera/index.html. On that site, a repackaged VWXML is available. (1/20/2002 umejava)
"cleanroom" XML Framework | Michael Rueger | I was not able to upload it here due to a Swiki problem... but fortunately there is a mailing list archive: XML Framework, and there is also Yax. -ak | I did not have a single problem filing it in and executing the examples. -ak
BhrXmlParser | David R Harris, ported by Helge Horch | BhrXmlParser | "a partial XML parser with a SAX-like event-driven interface"
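
For readers unfamiliar with the "SAX-like event-driven interface" mentioned for BhrXmlParser, here is a minimal sketch of the general idea. It uses Python's standard xml.sax module rather than any of the Squeak parsers listed above, so it says nothing about their actual APIs; the names EventLogger and parse_events are made up for illustration. The parser calls handler methods as it meets each tag or run of text, and no document tree is built.

```python
import xml.sax


class EventLogger(xml.sax.ContentHandler):
    """Hypothetical handler: records parse events instead of building a tree."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        # Called once for every opening tag, with its attributes.
        self.events.append(("start", name, dict(attrs.items())))

    def endElement(self, name):
        # Called once for every closing tag.
        self.events.append(("end", name))

    def characters(self, content):
        # Called for character data; whitespace-only runs are skipped here.
        text = content.strip()
        if text:
            self.events.append(("text", text))


def parse_events(xml_text):
    """Parse a string and return the list of events the handler saw."""
    handler = EventLogger()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.events


if __name__ == "__main__":
    for event in parse_events('<doc id="1"><item>hello</item></doc>'):
        print(event)
```

A DOM-style interface would instead hand back the whole tree after parsing; the event style trades that convenience for lower memory use and a much smaller parser core.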

There was a Camp Smalltalk project to port the Cincom XML parser (which a lot of people like) to other versions of Smalltalk. I know they got it running on VisualAge, and I think there were people working on Squeak. The page describing the project is at http://wiki.cs.uiuc.edu/CampSmalltalk/XmlParser+XSL+and+DOM+Level+2 but it doesn't say much about the status. Contact the leaders. The Cincom XML parser was made open source. -Ralph Johnson

  1. There are two partial XML parsers for Squeak floating about, with partial DOMish support. There's also the indev XML parser and DOM support for VisualWorks, which folks have talked about porting. These may be of interest.
  2. There's DOMish support in Scamper (for HTML). Integration with Scamper would be very nice.
  3. Have you considered Groves rather than/in addition to DOM support? Groves seem more Smalltalky/Squeaky (to the degree I understand them). There was just a release of a Python Groves implementation that looks rather nice, for reference.





Email me and I can send you the partial parsers :) (I'm going to peek at yours.)
Good. :)
Hard core? Really? I have this Groves advocate (Ken MacLeod, who wrote the Perl groves implementation) who keeps saying that groves is easy and DOM a major pain. From our discussions, it sounds like they'd be especially easy in Smalltalk. They also can (supposedly) pretty easily emulate DOM style. I'll confess that I have trouble understanding them :) I like the idea of grove plans, though. I'll investigate some more. (I looked at the indev parser and felt sick too, but I think that was partially because DOM just looks hideous to me ;).)
New! I was just thinking that talk of a validating parser, in general, is misleading. After all, XML is a meta-language. DTDs are like grammars (ok, they ARE grammars :)). Validation is syntax checking. Etc. So, I thought, why not make it more or less explicit and use a T-Gen style interface/system to have an XML parser generator? Since we just got a port of T-Gen :) (Of course, I'll bet that there are some nasty DTDs that aren't too easy to generate parsers for given normal automatic parsing techniques :( Oh well.)
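
To make the "DTDs are grammars, validation is syntax checking" point concrete, here is a minimal sketch in Python. It is not T-Gen and not any existing parser generator. It assumes a tiny subset of DTD content models (element names, `,` sequences, `|` choices, parentheses, and the `?`, `*`, `+` occurrence marks; no #PCDATA, no mixed content) and compiles one into a regular expression over the sequence of child element names. The function names are hypothetical.

```python
import re

# Simplified element-name pattern; real XML names allow more characters.
NAME = re.compile(r"[A-Za-z_][\w.\-]*")


def compile_content_model(model):
    """Translate a content model such as "(title, author+, chapter*)"
    into a regex matching a space-terminated sequence of child names."""
    pattern = []
    pos = 0
    while pos < len(model):
        ch = model[pos]
        if ch.isspace() or ch == ",":
            pos += 1  # a sequence is plain concatenation in a regex
        elif ch in "()|?*+":
            # Grouping, choice, and occurrence marks map directly to regex.
            pattern.append("(?:" if ch == "(" else ch)
            pos += 1
        else:
            match = NAME.match(model, pos)
            if match is None:
                raise ValueError(f"unexpected character {ch!r} at {pos}")
            # Each name is followed by a space so "a" cannot match "ab".
            pattern.append("(?:" + re.escape(match.group()) + " )")
            pos = match.end()
    return re.compile("".join(pattern))


def is_valid(children, compiled):
    """Check a list of child element names against a compiled model."""
    sequence = "".join(name + " " for name in children)
    return compiled.fullmatch(sequence) is not None


if __name__ == "__main__":
    book = compile_content_model("(title, author+, chapter*)")
    print(is_valid(["title", "author", "chapter", "chapter"], book))
    print(is_valid(["author", "title"], book))
```

This only covers the children of a single element. A full generator would also have to handle attributes, entities, and mixed content, which is where the "nasty DTDs" would likely start to bite.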



I would recommend that you seriously check out the XML work being done by the Python community. That work would need to be "Squeakified," but they have already worked out a number of key issues and it well might save time to stand on their broad shoulders.


OK. I can't currently read Python so this will take some time. The actual XML stuff does not seem hard to do. It has been designed (with the exception of the DOM) to be easy. Anything in another language will still have to be recoded, and I think that Expat is a good basis for now because it takes encoding into consideration.
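
As a small illustration of what "takes encoding into consideration" means in practice, the sketch below uses Python's standard binding to Expat (xml.parsers.expat). It feeds the parser raw bytes in two different encodings and lets Expat decode them from the XML declaration. The helper name collect_text and the sample documents are made up for this example; nothing here is Squeak code.

```python
import xml.parsers.expat


def collect_text(xml_bytes):
    """Parse raw bytes with Expat and return all character data as one str."""
    chunks = []
    parser = xml.parsers.expat.ParserCreate()
    # Expat may deliver text in several pieces, so collect and join them.
    parser.CharacterDataHandler = chunks.append
    parser.Parse(xml_bytes, True)  # True marks this as the final chunk
    return "".join(chunks)


# The same logical document, serialized in two encodings ("\u00eb" is e-diaeresis).
latin1_doc = '<?xml version="1.0" encoding="ISO-8859-1"?><name>Zo\u00eb</name>'.encode("iso-8859-1")
utf8_doc = '<?xml version="1.0" encoding="UTF-8"?><name>Zo\u00eb</name>'.encode("utf-8")

if __name__ == "__main__":
    # The byte sequences differ, but the decoded text is the same.
    print(latin1_doc != utf8_doc)
    print(collect_text(latin1_doc) == collect_text(utf8_doc))
```

A parser without encoding support would have to be told the encoding out of band, or would mangle the non-ASCII character in one of the two inputs.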




Some older work by others


See work by InDelv (http://www.indelv.com/); see email:
Date: Wed, 22 Sep 1999 17:32:46 -0700
From: Duane Maxwell
Subject: Re: Markup Language (SGML/XML) Parsing/Processing?

He has an example of a simple XML parser.
See message # 6305 in the eGroups archive (http://www.egroups.com/group/squeak/6305.html).