/** @mainpage

<h1> TinyXML-2 </h1>

TinyXML is a simple, small, efficient, C++ XML parser that can be
easily integrated into other programs.

<h2> What it does. </h2>

In brief, TinyXML parses an XML document, and builds from that a
Document Object Model (DOM) that can be read, modified, and saved.

XML stands for "eXtensible Markup Language." It is a general purpose,
human and machine readable markup language for describing arbitrary data.
All those random file formats created to store application data can
all be replaced with XML. One parser for everything.

http://en.wikipedia.org/wiki/XML

There are different ways to access and interact with XML data.
TinyXML-2 uses a Document Object Model (DOM), meaning the XML data is parsed
into C++ objects that can be browsed and manipulated, and then
written to disk or another output stream. You can also construct an XML document
from scratch with C++ objects and write this to disk or another output
stream. You can even use TinyXML-2 to stream XML programmatically from
code without creating a document first.

TinyXML-2 is designed to be easy and fast to learn. It is one header and
one cpp file. Simply add these to your project and off you go.
There is an example file - xmltest.cpp - to get you started.

TinyXML-2 is released under the ZLib license,
so you can use it in open source or commercial code. The details
of the license are at the top of every source file.

TinyXML-2 attempts to be a flexible parser, but with truly correct and
compliant XML output. TinyXML-2 should compile on any reasonably C++
compliant system. It does not rely on exceptions, RTTI, or the STL.

<h2> What it doesn't do. </h2>

TinyXML doesn't parse or use DTDs (Document Type Definitions) or XSLs
(eXtensible Stylesheet Language). There are other parsers out there
that are much more fully featured. But they are also much bigger, take
longer to set up in your project, have a steeper learning curve, and
often have a more restrictive license. If you are working with browsers
or have more complete XML needs, TinyXML-2 is not the parser for you.

<h2> TinyXML-1 vs. TinyXML-2 </h2>

<h2> Tutorials. </h2>

For the impatient, here is a tutorial to get you going. It is a great way to get started,
but it is worth your time to read this (very short) manual completely.

- @subpage tutorial0

<h2> Code Status. </h2>

TinyXML is mature, tested code. It is very stable. If you find
bugs, please file a bug report on the SourceForge web site
(www.sourceforge.net/projects/tinyxml). We'll get them straightened
out as soon as possible.

There are some areas of improvement; please check SourceForge if you are
interested in working on TinyXML.

<h2> Related Projects </h2>

TinyXML projects you may find useful! (Descriptions provided by the projects.)

<ul>
<li> <b>TinyXPath</b> (http://tinyxpath.sourceforge.net). TinyXPath is a small footprint
     XPath syntax decoder, written in C++.</li>
<li> <b>TinyXML++</b> (http://code.google.com/p/ticpp/). TinyXML++ is a completely new
     interface to TinyXML that uses MANY of the C++ strengths. Templates,
     exceptions, and much better error handling.</li>
</ul>

<h2> Features </h2>

<h3> Using STL </h3>

TinyXML can be compiled to use or not use STL. When using STL, TinyXML
uses the std::string class, and fully supports std::istream, std::ostream,
operator<<, and operator>>. Many API methods have both 'const char*' and
'const std::string&' forms.

When STL support is compiled out, no STL files are included whatsoever. All
the string classes are implemented by TinyXML itself. API methods
all use the 'const char*' form for input.

Use the compile time #define:

    TIXML_USE_STL

to compile one version or the other. This can be passed to the compiler,
or set as the first line of "tinyxml.h".

Note: If compiling the test code in Linux, setting the environment
variable TINYXML_USE_STL=YES/NO will control STL compilation. In the
Windows project file, STL and non STL targets are provided. In your project,
it's probably easiest to add the line "#define TIXML_USE_STL" as the first
line of tinyxml.h.

<h3> UTF-8 </h3>

TinyXML supports UTF-8, allowing you to manipulate XML files in any language. TinyXML
also supports "legacy mode" - the encoding used before UTF-8 support and
probably best described as "extended ascii".

Normally, TinyXML will try to detect the correct encoding and use it. However,
by setting the value of TIXML_DEFAULT_ENCODING in the header file, TinyXML
can be forced to always use one encoding.

TinyXML will assume Legacy Mode until one of the following occurs:
<ol>
	<li> If the non-standard but common "UTF-8 lead bytes" (0xef 0xbb 0xbf)
		 begin the file or data stream, TinyXML will read it as UTF-8. </li>
	<li> If the declaration tag is read, and it has an encoding="UTF-8", then
		 TinyXML will read it as UTF-8. </li>
	<li> If the declaration tag is read, and it has no encoding specified, then TinyXML will
		 read it as UTF-8. </li>
	<li> If the declaration tag is read, and it has an encoding="something else", then TinyXML
		 will read it as Legacy Mode. In legacy mode, TinyXML will work as it did before. It's
		 not clear what that mode does exactly, but old content should keep working.</li>
	<li> Until one of the above criteria is met, TinyXML runs in Legacy Mode.</li>
</ol>
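
The detection rules above can be sketched as a small standalone function. This is an
illustration only, not TinyXML's code: the names Encoding and detectEncoding are
hypothetical, the declaration is located with a plain substring search, and comparing the
encoding value case-insensitively against "UTF-8" is an assumption.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical names; the logic mirrors the rules listed in the text above.
enum class Encoding { Legacy, Utf8 };

// Decide how a document should be read, looking only at its leading bytes.
Encoding detectEncoding(const std::string& data) {
    // Rule 1: the UTF-8 lead bytes (0xEF 0xBB 0xBF) begin the data.
    if (data.compare(0, 3, "\xEF\xBB\xBF") == 0) {
        return Encoding::Utf8;
    }

    // Rule 5: no complete declaration has been read, so stay in Legacy Mode.
    const std::size_t start = data.find("<?xml");
    if (start == std::string::npos) return Encoding::Legacy;
    const std::size_t end = data.find('>', start);
    if (end == std::string::npos) return Encoding::Legacy;
    const std::string decl = data.substr(start, end - start);

    // Rule 3: a declaration without an encoding is read as UTF-8.
    const std::size_t key = decl.find("encoding");
    if (key == std::string::npos) return Encoding::Utf8;

    // Extract the quoted value that follows the word "encoding".
    const std::size_t open = decl.find_first_of("\"'", key);
    if (open == std::string::npos) return Encoding::Legacy;
    const std::size_t close = decl.find(decl[open], open + 1);
    if (close == std::string::npos) return Encoding::Legacy;
    std::string value = decl.substr(open + 1, close - open - 1);
    for (char& c : value) {
        c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
    }

    // Rule 2: encoding="UTF-8". Rule 4: anything else means Legacy Mode.
    return value == "UTF-8" ? Encoding::Utf8 : Encoding::Legacy;
}
```

Calling detectEncoding on the first few hundred bytes of a file is enough in this sketch,
since both the lead bytes and the declaration sit at the start of a document.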

What happens if the encoding is incorrectly set or detected? TinyXML will try
to read and pass through text seen as improperly encoded. You may get some strange results or
mangled characters. You may want to force TinyXML to the correct mode.

You may force TinyXML to Legacy Mode by using LoadFile( TIXML_ENCODING_LEGACY ) or
LoadFile( filename, TIXML_ENCODING_LEGACY ). You may force it to use legacy mode all
the time by setting TIXML_DEFAULT_ENCODING = TIXML_ENCODING_LEGACY. Likewise, you may
force it to TIXML_ENCODING_UTF8 with the same technique.

For English users, using English XML, UTF-8 is the same as low-ASCII. You
don't need to be aware of UTF-8 or change your code in any way. You can think
of UTF-8 as a "superset" of ASCII.

UTF-8 is not a double byte format - but it is a standard encoding of Unicode!
TinyXML does not use or directly support wchar, TCHAR, or Microsoft's _UNICODE at this time.
It is common to see the term "Unicode" improperly refer to UTF-16, a wide byte encoding
of Unicode. This is a source of confusion.

For "high-ascii" languages - everything not English, pretty much - TinyXML can
handle all languages, at the same time, as long as the XML is encoded
in UTF-8. That can be a little tricky: older programs and operating systems
tend to use the "default" or "traditional" code page. Many apps (and almost all
modern ones) can output UTF-8, but older or stubborn (or just broken) ones
still output text in the default code page.

For example, Japanese systems traditionally use SHIFT-JIS encoding.
Text encoded as SHIFT-JIS cannot be read by TinyXML.
A good text editor can import SHIFT-JIS and then save as UTF-8.

The <a href="http://skew.org/xml/tutorial/">Skew.org link</a> does a great
job covering the encoding issue.

The test file "utf8test.xml" is an XML file containing English, Spanish, Russian,
and Simplified Chinese. (Hopefully they are translated correctly). The file
"utf8test.gif" is a screen capture of the XML file, rendered in IE. Note that
if you don't have the correct fonts (Simplified Chinese or Russian) on your
system, you won't see output that matches the GIF file even if you can parse
it correctly. Also note that (at least on my Windows machine) console output
is in a Western code page, so that Print() or printf() cannot correctly display
the file. This is not a bug in TinyXML - just an OS issue. No data is lost or
destroyed by TinyXML. The console just doesn't render UTF-8.

<h3> Entities </h3>

TinyXML recognizes the pre-defined "character entities", meaning special
characters. Namely:

@verbatim
	&amp;	&
	&lt;	<
	&gt;	>
	&quot;	"
	&apos;	'
@endverbatim

These are recognized when the XML document is read, and translated to their
UTF-8 equivalents. For instance, text with the XML of:

@verbatim
	Far &amp; Away
@endverbatim

will have the Value() of "Far & Away" when queried from the TiXmlText object,
and will be written back to the XML stream/file as an ampersand. Older versions
of TinyXML "preserved" character entities, but the newer versions will translate
them into characters.

Additionally, any character can be specified by its Unicode code point:
the syntaxes "&#xA0;" and "&#160;" both refer to the non-breaking space character.
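
As a rough illustration of the translation described above, the following standalone
sketch decodes the five pre-defined entities and numeric character references into
UTF-8. It is not taken from TinyXML: decodeEntities and appendUtf8 are hypothetical
names, and passing unrecognized or malformed references through unchanged is an
assumption of this sketch.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <string>

// Append one Unicode code point to 'out', encoded as UTF-8.
void appendUtf8(std::string& out, std::uint32_t cp) {
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Replace &amp; &lt; &gt; &quot; &apos; and &#NNN; / &#xHHH; with their characters.
// Anything that is not a recognized reference is copied through unchanged.
std::string decodeEntities(const std::string& in) {
    static const struct { const char* name; char value; } kNamed[] = {
        {"amp", '&'}, {"lt", '<'}, {"gt", '>'}, {"quot", '"'}, {"apos", '\''}};

    std::string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const std::size_t semi =
            (in[i] == '&') ? in.find(';', i) : std::string::npos;
        if (semi == std::string::npos) {
            out += in[i++];
            continue;
        }
        const std::string body = in.substr(i + 1, semi - i - 1);
        bool decoded = false;
        if (body.size() > 1 && body[0] == '#') {
            // Numeric reference: decimal, or hexadecimal when prefixed with 'x'.
            const bool hex = (body[1] == 'x');
            const std::string digits = body.substr(hex ? 2 : 1);
            char* endp = nullptr;
            const unsigned long cp =
                std::strtoul(digits.c_str(), &endp, hex ? 16 : 10);
            if (!digits.empty() && *endp == '\0' && cp <= 0x10FFFF) {
                appendUtf8(out, static_cast<std::uint32_t>(cp));
                decoded = true;
            }
        } else {
            for (const auto& e : kNamed) {
                if (body == e.name) {
                    out += e.value;
                    decoded = true;
                    break;
                }
            }
        }
        if (decoded) {
            i = semi + 1;
        } else {
            out += in[i++];
        }
    }
    return out;
}
```

With this sketch, "Far &amp; Away" decodes to "Far & Away", and both "&#xA0;" and
"&#160;" decode to the two bytes 0xC2 0xA0, the UTF-8 form of the non-breaking space.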

<h3> Printing </h3>

TinyXML can print output in several different ways that all have strengths and limitations.

- Print( FILE* ). Output to a std-C stream, which includes all C files as well as stdout.
    - "Pretty prints", but you don't have control over printing options.
    - The output is streamed directly to the FILE object, so there is no memory overhead
      in the TinyXML code.
    - Used by Print() and SaveFile().

- operator<<. Output to a C++ stream.
    - Integrates with standard C++ iostreams.
    - Outputs in "network printing" mode without line breaks. Good for network transmission
      and moving XML between C++ objects, but hard for a human to read.

- TiXmlPrinter. Output to a std::string or memory buffer.
    - API is less concise.
    - Future printing options will be put here.
    - Printing may change slightly in future versions as it is refined and expanded.

<h3> Streams </h3>

With TIXML_USE_STL on, TinyXML supports C++ streams (operator<< and operator>>) as well
as C (FILE*) streams. There are some differences that you may need to be aware of.

C style output:
- based on FILE*
- the Print() and SaveFile() methods

Generates formatted output, with plenty of white space, intended to be as
human-readable as possible. These methods are very fast, and tolerant of ill formed
XML documents. For example, an XML document that contains 2 root elements
and 2 declarations will still print.

C style input:
- based on FILE*
- the Parse() and LoadFile() methods

A fast, tolerant read. Use whenever you don't need the C++ streams.

C++ style output:
- based on std::ostream
- operator<<

Generates condensed output, intended for network transmission rather than
readability. Depending on your system's implementation of the ostream class,
this may be somewhat slower. (Or may not.) Not tolerant of ill formed XML:
a document should contain exactly one root element. Additional root level
elements will not be streamed out.

C++ style input:
- based on std::istream
- operator>>

Reads XML from a stream, making it useful for network transmission. The tricky
part is knowing when the XML document is complete, since there will almost
certainly be other data in the stream. TinyXML will assume the XML data is
complete after it reads the root element. Put another way, documents that
are ill-constructed with more than one root element will not read correctly.
Also note that operator>> is somewhat slower than Parse, due to both
implementation of the STL and limitations of TinyXML.

<h3> White space </h3>

The world simply does not agree on whether white space should be kept, or condensed.
For example, pretend the '_' is a space, and look at "Hello____world". HTML, and
at least some XML parsers, will interpret this as "Hello_world". They condense white
space. Some XML parsers do not, and will leave it as "Hello____world". (Remember
to keep pretending the _ is a space.) Others suggest that __Hello___world__ should become
Hello___world.

It's an issue that hasn't been resolved to my satisfaction. TinyXML supports the
first 2 approaches. Call TiXmlBase::SetCondenseWhiteSpace( bool ) to set the desired behavior.
The default is to condense white space.

If you change the default, you should call TiXmlBase::SetCondenseWhiteSpace( bool )
before making any calls to parse XML data, and I don't recommend changing it after
it has been set.
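
To make the three interpretations concrete, here is a small standalone sketch of the
first (condense runs) and third (trim the ends only) approaches; the second approach is
simply the unmodified string. The function names are hypothetical and this is not
TinyXML's implementation. In particular, how TinyXML treats leading and trailing white
space when condensing is not specified here, so the sketch only collapses runs.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Approach 1: collapse every run of white space into a single space.
std::string condenseWhiteSpace(const std::string& in) {
    std::string out;
    bool inRun = false;  // true while we are inside a run of white space
    for (char ch : in) {
        if (std::isspace(static_cast<unsigned char>(ch))) {
            if (!inRun) out += ' ';
            inRun = true;
        } else {
            out += ch;
            inRun = false;
        }
    }
    return out;
}

// Approach 3: strip leading and trailing white space, keep interior runs as they are.
std::string trimWhiteSpace(const std::string& in) {
    const char* const kSpace = " \t\r\n";
    const std::size_t first = in.find_first_not_of(kSpace);
    if (first == std::string::npos) return std::string();
    const std::size_t last = in.find_last_not_of(kSpace);
    return in.substr(first, last - first + 1);
}
```

For the example in the text, condenseWhiteSpace turns "Hello    world" into
"Hello world", while trimWhiteSpace turns "  Hello   world  " into "Hello   world".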

<h3> Handles </h3>

When browsing an XML document in a robust way, it is important to check
for null returns from method calls. An error safe implementation can
generate a lot of code like:

@verbatim
TiXmlElement* root = document.FirstChildElement( "Document" );
if ( root )
{
	TiXmlElement* element = root->FirstChildElement( "Element" );
	if ( element )
	{
		TiXmlElement* child = element->FirstChildElement( "Child" );
		if ( child )
		{
			TiXmlElement* child2 = child->NextSiblingElement( "Child" );
			if ( child2 )
			{
				// Finally do something useful.
			}
		}
	}
}
@endverbatim

Handles have been introduced to clean this up. Using the TiXmlHandle class,
the previous code reduces to:

@verbatim
TiXmlHandle docHandle( &document );
TiXmlElement* child2 = docHandle.FirstChild( "Document" ).FirstChild( "Element" ).Child( "Child", 1 ).ToElement();
if ( child2 )
{
	// do something useful
}
@endverbatim

This is much easier to deal with. See TiXmlHandle for more information.
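
The idea behind a handle can be shown with a toy example that does not use TinyXML at
all. The Node and Handle types below are hypothetical stand-ins, not the TiXmlHandle
implementation: the handle wraps a pointer that may be null, and every lookup on a null
handle yields another null handle, so a chain of calls needs only one check at the end.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy tree node standing in for a DOM element (hypothetical, for illustration).
struct Node {
    std::string name;
    std::vector<Node> children;

    // Returns the index-th child called 'childName', or nullptr if there is none.
    const Node* child(const std::string& childName, std::size_t index = 0) const {
        for (const Node& c : children) {
            if (c.name == childName && index-- == 0) return &c;
        }
        return nullptr;
    }
};

// Wraps a possibly-null node pointer so that lookups can be chained safely.
class Handle {
public:
    explicit Handle(const Node* node) : node_(node) {}

    // A lookup on a null handle does nothing and returns another null handle.
    Handle child(const std::string& name, std::size_t index = 0) const {
        return Handle(node_ ? node_->child(name, index) : nullptr);
    }

    // The single place where the caller checks for null.
    const Node* get() const { return node_; }

private:
    const Node* node_;
};
```

Given a tree with a "Document" node containing an "Element" node with two "Child" nodes,
Handle(&root).child("Document").child("Element").child("Child", 1).get() returns the
second child, and the same chain with a misspelled name returns a null pointer instead
of crashing.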

<h3> Row and Column tracking </h3>

Being able to track nodes and attributes back to their origin location
in source files can be very important for some applications. Additionally,
knowing where parsing errors occurred in the original source can be very
time saving.

TinyXML tracks the row and column origin of all nodes and attributes
in a text file. The TiXmlBase::Row() and TiXmlBase::Column() methods return
the origin of the node in the source text. The tab size used for column
counting can be configured with TiXmlDocument::SetTabSize().
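
The role of the tab size is easiest to see in a worked sketch. The standalone function
below is an assumption-laden illustration, not TinyXML's code: Location and locate are
hypothetical names, rows and columns are taken to be 1-based, a tab is assumed to advance
to the next tab stop, and columns are counted in bytes rather than in UTF-8 characters.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Position in the source text; both fields are 1-based in this sketch.
struct Location {
    int row;
    int column;
};

// Compute the row and column of the byte at 'offset' within 'text'.
Location locate(const std::string& text, std::size_t offset, int tabSize) {
    assert(tabSize > 0 && offset <= text.size());
    Location loc{1, 1};
    for (std::size_t i = 0; i < offset; ++i) {
        const char c = text[i];
        if (c == '\n') {
            // A newline starts the next row at column 1.
            ++loc.row;
            loc.column = 1;
        } else if (c == '\t') {
            // A tab jumps to the column just after the next tab stop.
            loc.column = ((loc.column - 1) / tabSize + 1) * tabSize + 1;
        } else {
            ++loc.column;
        }
    }
    return loc;
}
```

For the text "<a>\n\t<b/>\n</a>" and a tab size of 4, the element b starts at row 2,
column 5 under these assumptions; with a tab size of 8 it would be reported at column 9,
which is why the tab size needs to match the editor used to view the file.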

<h2> Using and Installing </h2>

To Compile and Run xmltest:

A Linux Makefile and a Windows Visual C++ .dsw file are provided.
Simply compile and run. It will write the file demotest.xml to your
disk and generate output on the screen. It also tests walking the
DOM by printing out the number of nodes found using different
techniques.

The Linux makefile is very generic and runs on many systems - it
is currently tested on mingw and
MacOSX. You do not need to run 'make depend'. The dependencies have been
hard coded.

<h3>Windows project file for VC6</h3>
<ul>
<li>tinyxml:		tinyxml library, non-STL </li>
<li>tinyxmlSTL:		tinyxml library, STL </li>
<li>tinyXmlTest:	test app, non-STL </li>
<li>tinyXmlTestSTL: test app, STL </li>
</ul>

<h3>Makefile</h3>
At the top of the makefile you can set:

PROFILE, DEBUG, and TINYXML_USE_STL. Details (such as they are) are in
the makefile.

In the tinyxml directory, type "make clean" then "make". The executable
file 'xmltest' will be created.

<h3>To Use in an Application:</h3>

Add tinyxml.cpp, tinyxml.h, tinyxmlerror.cpp, tinyxmlparser.cpp, tinystr.cpp, and tinystr.h to your
project or make file. That's it! It should compile on any reasonably
compliant C++ system. You do not need to enable exceptions or
RTTI for TinyXML.

<h2> How TinyXML works. </h2>

An example is probably the best way to go. Take:
@verbatim
	<?xml version="1.0" standalone="no" ?>
	<!-- Our to do list data -->
	<ToDo>
		<Item priority="1"> Go to the <bold>Toy store!</bold></Item>
		<Item priority="2"> Do bills</Item>
	</ToDo>
@endverbatim

It's not much of a To Do list, but it will do. To read this file
(say "demo.xml") you would create a document, and parse it in:
@verbatim
	TiXmlDocument doc( "demo.xml" );
	doc.LoadFile();
@endverbatim

And it's ready to go. Now let's look at some lines and how they
relate to the DOM.

@verbatim
<?xml version="1.0" standalone="no" ?>
@endverbatim

	The first line is a declaration, and gets turned into the
	TiXmlDeclaration class. It will be the first child of the
	document node.

	This is the only directive/special tag parsed by TinyXML.
	Generally directive tags are stored in TiXmlUnknown so the
	commands won't be lost when it is saved back to disk.

@verbatim
<!-- Our to do list data -->
@endverbatim

	A comment. Will become a TiXmlComment object.

@verbatim
<ToDo>
@endverbatim

	The "ToDo" tag defines a TiXmlElement object. This one does not have
	any attributes, but does contain 2 other elements.

@verbatim
<Item priority="1">
@endverbatim

	Creates another TiXmlElement which is a child of the "ToDo" element.
	This element has 1 attribute, with the name "priority" and the value
	"1".

@verbatim
Go to the
@endverbatim

	A TiXmlText. This is a leaf node and cannot contain other nodes.
	It is a child of the "Item" TiXmlElement.

@verbatim
<bold>
@endverbatim

	Another TiXmlElement, this one a child of the "Item" element.

Etc.

Looking at the entire object tree, you end up with:
@verbatim
TiXmlDocument                   "demo.xml"
    TiXmlDeclaration            "version='1.0'" "standalone=no"
    TiXmlComment                " Our to do list data"
    TiXmlElement                "ToDo"
        TiXmlElement            "Item" Attributes: priority=1
            TiXmlText           "Go to the "
            TiXmlElement        "bold"
                TiXmlText       "Toy store!"
        TiXmlElement            "Item" Attributes: priority=2
            TiXmlText           "Do bills"
@endverbatim

<h2> Documentation </h2>

The documentation is built with Doxygen, using the 'dox'
configuration file.

<h2> License </h2>

TinyXML is released under the zlib license:

This software is provided 'as-is', without any express or implied
warranty. In no event will the authors be held liable for any
damages arising from the use of this software.

Permission is granted to anyone to use this software for any
purpose, including commercial applications, and to alter it and
redistribute it freely, subject to the following restrictions:

1. The origin of this software must not be misrepresented; you must
not claim that you wrote the original software. If you use this
software in a product, an acknowledgment in the product documentation
would be appreciated but is not required.

2. Altered source versions must be plainly marked as such, and
must not be misrepresented as being the original software.

3. This notice may not be removed or altered from any source
distribution.

<h2> References </h2>

The World Wide Web Consortium is the definitive standard body for
XML, and their web pages contain huge amounts of information.

The definitive spec: <a href="http://www.w3.org/TR/2004/REC-xml-20040204/">
http://www.w3.org/TR/2004/REC-xml-20040204/</a>

I also recommend "XML Pocket Reference" by Robert Eckstein and published by
O'Reilly...the book that got the whole thing started.

<h2> Contributors, Contacts, and a Brief History </h2>

Thanks very much to everyone who sends suggestions, bugs, ideas, and
encouragement. It all helps, and makes this project fun. A special thanks
to the contributors on the web pages that keep it lively.

So many people have sent in bugs and ideas that, rather than list them here,
we try to give credit where it is due in the "changes.txt" file.

TinyXML was originally written by Lee Thomason. (Lee is often the "I" still
found in the documentation.) Lee reviews changes and releases new versions,
with the help of Yves Berquin, Andrew Ellerton, and the tinyXml community.

We appreciate your suggestions, and would love to know if you
use TinyXML. Hopefully you will enjoy it and find it useful.
Please post questions, comments, file bugs, or contact us at:

www.sourceforge.net/projects/tinyxml

Lee Thomason, Yves Berquin, Andrew Ellerton
*/