LinuxQuestions.org

manolakis 03-11-2009 08:47 AM

XML Tokenizer
 
Hi there,

Does anybody know where I can find a fast Java XML tokenizer (not a parser)?

Thank you.

theNbomr 03-11-2009 09:26 AM

Can you tell us how you distinguish between tokenizers and parsers in the context of XML? SAX-based parsers tend to break up an XML document according to functional elements, although I'm not sure if that corresponds to your concept of a token.
--- rod.

manolakis 03-11-2009 09:36 AM

Hi there,

Sorry for not being clear. I am looking for a program that accepts an XML file and returns the document's tags, split into separate tokens. It would also be preferable if the tokenizer performed some syntactic validation. I don't know whether you are familiar with Flex; I am looking for something like that for XML, in Java.
I hope this is clearer.

Thank you.

theNbomr 03-11-2009 11:58 AM

Sure, I've used flex numerous times before. It is not a tokenizer, but a program that generates tokenizers. As I understand it, you need it to generate a tokenizer that is Java source code, as opposed to its usual C/C++ source code generation. Sorry that I don't know of any such program.
Still, it sounds like what you are after is what a SAX-based parser does (it checks well-formedness and can optionally validate). In case you are not already familiar with the idiom, SAX parsers let you supply callbacks, which the parser calls as it encounters particular XML constructs: start tags (with their attributes), end tags, character data, etc. Each callback is passed the data associated with that event, for processing according to your needs.
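
For illustration, here is a minimal sketch of that callback idiom, assuming the SAX implementation bundled with the JDK (javax.xml.parsers and org.xml.sax). The class name TagLister and the method listTags are made up for this example; it simply records each start and end tag in document order, which may or may not be close enough to the "tokens" you have in mind.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: lists the tags of an XML document using SAX callbacks.
public class TagLister {

    // Returns the start and end tags in document order, e.g. "<a>", "</a>".
    // Attributes are ignored here; they arrive via the attrs parameter.
    public static List<String> listTags(String xml) throws Exception {
        final List<String> tags = new ArrayList<>();

        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attrs) {
                tags.add("<" + qName + ">");
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                tags.add("</" + qName + ">");
            }
        };

        // The default factory gives a non-validating parser; it still
        // rejects documents that are not well-formed by throwing an exception.
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return tags;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<doc><item id=\"1\">text</item><item/></doc>";
        for (String tag : listTags(xml)) {
            System.out.println(tag);
        }
    }
}
```

Note that an empty element such as `<item/>` is reported as a start tag followed immediately by an end tag, so a SAX handler cannot tell it apart from `<item></item>`.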

Perhaps someone else knows of something more closely matching what you are looking for.

--- rod.

