[Yanel-dev] New XMLDB repository
Andreas Wuest
awuest at student.ethz.ch
Tue Feb 13 01:34:00 CET 2007
Hi
On 12.2.2007 at 22:52, Michael Wechner wrote:
> Andreas Wuest wrote:
>
>> Hi
>>
>> On 12.2.2007 at 21:02, Michael Wechner wrote:
>>
>>> I am not sure, because XML can also contain binary data (using
>>> CDATA). This is also because one should use application/xml and not
>>> text/xml
>>
>>
>> Just for the record: CDATA cannot contain binary data.
>
>
> What I mean by binary data is images, etc. Sorry for perhaps using
> imprecise technical language here, but one often has reserved XML
> characters within such data, and hence it makes sense to embed it within
> CDATA.
Yes, a binary-to-text encoding may produce a "<", for example (uuencode
can; the standard Base64 alphabet cannot), which would only go
unpunished inside a CDATA section. Nevertheless, the actual byte
values of the "binary" data must still match the byte values (or
multi-byte sequences, with regard to UTF-8, UTF-16, etc.) allowed
by the employed character set.
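
To illustrate the point about allowed character values, here is a minimal Python sketch. The element name "image" and the sample bytes are made up for illustration. It only assumes the standard library's expat-based parser, which rejects control characters even inside CDATA, whereas the Base64 form of the same kind of data parses and decodes back to the original bytes.

```python
import base64
import xml.etree.ElementTree as ET

# Arbitrary "binary" bytes, chosen for illustration only.
raw = bytes([0x00, 0x01, 0x3C, 0x26, 0xFF])


def parses(xml_text):
    """Return True if xml_text is accepted as well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Control characters are not legal XML characters, CDATA or not.
raw_in_cdata = "<image><![CDATA[\x01\x02]]></image>"

# The Base64 form consists of ordinary characters only.
encoded = base64.b64encode(raw).decode("ascii")
b64_in_cdata = f"<image><![CDATA[{encoded}]]></image>"

raw_ok = parses(raw_in_cdata)
b64_ok = parses(b64_in_cdata)

# The payload is read back as text and only then decoded to bytes.
decoded = base64.b64decode(ET.fromstring(b64_in_cdata).text)
```

CDATA only switches off markup recognition; it does not widen the set of characters the document may contain.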
So, technically, such a document is still a text document, and should
also be treated as such (because when persisting it to a file and then
re-reading it, the charset has to be taken into account, otherwise not
only the text but also your Base64 encoded image will look different
after decoding!). Furthermore, there is generally a reason why you would
wrap an image inside an XML document. One of them is to provide
meta-data alongside the image, which you'd like to query. And as far as
XML databases are concerned, for querying you'd need a text document.
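
The charset remark can be sketched as follows. This is an illustration under assumptions, not anything taken from the Yanel code: the document content, the "title" attribute and the choice of UTF-16 versus Latin-1 are all invented for the example.

```python
import base64

# Hypothetical document: some meta-data plus a Base64 payload in CDATA.
payload = base64.b64encode(b"\x89PNG\r\n").decode("ascii")
doc = f'<image title="Zürich"><![CDATA[{payload}]]></image>'

# Persist the document as UTF-16 (Python prepends a byte order mark).
stored = doc.encode("utf-16")

# Re-reading with the right charset restores the document exactly.
right = stored.decode("utf-16")

# Re-reading with the wrong charset garbles the whole document:
# NUL characters end up between all characters, payload included.
wrong = stored.decode("latin-1")

payload_survives_right = payload in right
payload_survives_wrong = payload in wrong
```

With the wrong charset, neither the title nor the Base64 string can be recovered as written, which is why the stored bytes have to be treated as encoded text.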
Anyway, the point of all this is that if you have an XML document, you
can and should treat it as character data. It simply can't be binary data.
So when Josias proposed to simply treat every XML document as text, he
was indeed correct.
--
Kind regards,
Andi