We are pleased to introduce a new development version. This
release brings the XMLTable input filter, the SQL transformation and a
small improvement in the filesystem input filter:
* *XMLTable input*: This tool does a job similar to the xmltable
function known from SQL. It uses the XPath language for selecting
parts of the input XML: one XPath expression points to the record
nodes and one or more XPath expressions point to attribute
nodes/values relative to the particular record node. Thus it is able
to produce one or more relations from an arbitrary XML input. The
input is parsed at once and converted to a DOM in memory, i.e. there
is no streaming. Processing huge XML files therefore requires
appropriate amounts of RAM. On the other hand, our expressions can
access the whole XML document and pick values not only from the
currently processed record node. This tool uses the Libxml2 library
(XML parser and XPath processor).
* *SQL transformation*: SQL is one of the most powerful languages for
processing relational data and the most widespread one. Now it can
be used even on-the-fly in shell pipelines – without having any
database server running. It is useful for filtering records, JOINing
several relations together, and doing aggregations or computations. By
default everything is done in memory, but with the --file parameter
we can use a temporary file and with --keep-file we can make it not
so temporary. This tool uses the SQLite library.
* *File system input*: A new optional attribute has been added:
--file content. It allows getting the text content of a file
(currently only in the UTF-8 encoding), which lets us use the file
system as a simple database.
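
The idea behind the first two items can be sketched outside of the tools
themselves. The following Python snippet is only an illustration, not the
actual implementation and not the tools' CLI: it uses Python's standard
xml.etree.ElementTree and sqlite3 modules instead of Libxml2 and the SQLite
C library, and the sample XML, the element names and the table name "book"
are all made up for this example. It parses the whole document into memory,
selects record nodes with one expression, reads attributes relative to each
record (plus one value from elsewhere in the document), and then aggregates
the resulting relation with SQL in an in-memory database.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample input; element and attribute names are made up.
XML = """<library>
  <owner>Alice</owner>
  <book year="1999"><title>First</title><pages>100</pages></book>
  <book year="2005"><title>Second</title><pages>250</pages></book>
  <book year="2005"><title>Third</title><pages>50</pages></book>
</library>"""


def xml_to_rows(xml_text):
    """Turn the XML into a list of (owner, year, title, pages) tuples."""
    # The whole document is parsed into memory at once (no streaming).
    root = ET.fromstring(xml_text)
    # A value taken from outside the record nodes; this is possible only
    # because the whole tree is available.
    owner = root.findtext("owner")
    rows = []
    # One expression selects the record nodes...
    for record in root.findall("./book"):
        # ...and further lookups are evaluated relative to each record.
        rows.append((
            owner,
            int(record.get("year")),
            record.findtext("title"),
            int(record.findtext("pages")),
        ))
    return rows


def pages_per_year(rows):
    """Aggregate the relation with SQL, without any database server."""
    conn = sqlite3.connect(":memory:")  # everything stays in memory
    try:
        conn.execute(
            "CREATE TABLE book "
            "(owner TEXT, year INTEGER, title TEXT, pages INTEGER)"
        )
        conn.executemany("INSERT INTO book VALUES (?, ?, ?, ?)", rows)
        cur = conn.execute(
            "SELECT year, count(*), sum(pages) "
            "FROM book GROUP BY year ORDER BY year"
        )
        return cur.fetchall()
    finally:
        conn.close()


rows = xml_to_rows(XML)
result = pages_per_year(rows)
for line in result:
    print(line)
```

Passing a file name instead of ":memory:" to sqlite3.connect() would keep
the data on disk, which is roughly the trade-off the --file parameter is
described as offering.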
See the examples
Please note that this is still a development release and thus the API
(libraries, CLI arguments, formats) might and will change. Any
suggestions, ideas and bug reports are welcome on this mailing list.