LinuxQuestions.org
03-11-2009, 08:47 AM   #1
manolakis
Member
 
Registered: Nov 2006
Distribution: xubuntu
Posts: 464

Rep: Reputation: 37
XML Tokenizer


Hi there,

Does anybody know where I can find a fast Java XML tokenizer (not a parser)?

Thank you.
 
03-11-2009, 09:26 AM   #2
theNbomr
LQ 5k Club
 
Registered: Aug 2005
Distribution: OpenSuse, Fedora, Redhat, Debian
Posts: 5,399
Blog Entries: 2

Rep: Reputation: 908
Can you tell us how you distinguish between tokenizers and parsers in the context of XML? SAX-based parsers tend to break up an XML document according to functional elements, although I'm not sure if that corresponds to your concept of a token.
--- rod.
 
03-11-2009, 09:36 AM   #3
manolakis
Member
 
Registered: Nov 2006
Distribution: xubuntu
Posts: 464

Original Poster
Rep: Reputation: 37
Hi there,

Sorry for not being clear. I am looking for a program that accepts an XML file and returns the tags of the document, split into separate tokens. It would also be preferable if the tokenizer performed some syntactic validation. I do not know whether you are familiar with Flex. I am looking for a program like that for XML, in Java.
I hope this is clearer.

Thank you.
 
03-11-2009, 11:58 AM   #4
theNbomr
LQ 5k Club
 
Registered: Aug 2005
Distribution: OpenSuse, Fedora, Redhat, Debian
Posts: 5,399
Blog Entries: 2

Rep: Reputation: 908
Sure, I've used flex numerous times before. It is not a tokenizer, but a program that generates tokenizers. As I understand it, you need it to generate a tokenizer that is Java source code, as opposed to its usual C/C++ source code generation. Sorry that I don't know of any such program.
Still, it sounds like what you are after is what a SAX-based parser does (which includes validation). In case you are not already familiar with the idiom, a SAX parser lets you register callbacks, which it then calls as it encounters each piece of the XML document. Those pieces could be start tags, end tags, attributes, CDATA, etc. Your callbacks are passed the data associated with each event, for processing according to your needs.
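
For illustration, here is a minimal sketch of that callback idiom using the SAX API that ships with the JDK (`javax.xml.parsers` and `org.xml.sax`). The class name `SaxTokenSketch`, the method `tokenize`, and the `START`/`ATTR`/`TEXT`/`END` labels are made up for this example; they are not part of any library. It simply records each callback as a string, which may be close to the "list of tags" being asked for. Note that, as far as I know, a default SAX parser only checks well-formedness; validating against a DTD would need extra setup such as `SAXParserFactory.setValidating(true)`.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class: turns SAX callbacks into a flat list of "tokens".
public class SaxTokenSketch {

    /** Parses the XML text and returns one string per SAX event seen. */
    public static List<String> tokenize(String xml) throws Exception {
        final List<String> tokens = new ArrayList<>();

        // The handler holds the callbacks; the parser invokes them in document order.
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attrs) {
                tokens.add("START " + qName);
                for (int i = 0; i < attrs.getLength(); i++) {
                    tokens.add("ATTR " + attrs.getQName(i) + "=" + attrs.getValue(i));
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                tokens.add("END " + qName);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // A parser may deliver long text in several chunks; this sketch
                // records each non-blank chunk as its own token.
                String text = new String(ch, start, length).trim();
                if (!text.isEmpty()) {
                    tokens.add("TEXT " + text);
                }
            }
        };

        // Malformed input makes parse() throw a SAXException (well-formedness check).
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return tokens;
    }

    public static void main(String[] args) throws Exception {
        for (String token : tokenize("<a x=\"1\"><b>hi</b></a>")) {
            System.out.println(token);
        }
    }
}
```

To read from a file instead of a string, `SAXParser.parse` also has an overload that takes a `java.io.File`. Because the parser streams through the document and never builds a tree, this style tends to be fast and light on memory, which seems to fit the original request.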

Perhaps someone else knows of something more closely matching what you are looking for.

--- rod.
 
  

