[Yanel-dev] New XMLDB repository
Michael Wechner
michael.wechner at wyona.com
Mon Feb 12 21:02:41 CET 2007
Josias Thöny wrote:
> Andreas Wuest wrote:
>
>> Hi Josias
>>
>> On 12.2.2007 15:55 Uhr, Josias Thöny wrote:
>>
>>> Andreas Wuest wrote:
>>>
>>>> Hi
>>>>
>>>> I've finished and checked in a basic implementation of the XMLDB
>>>> repository, based on the XML:DB API.
>>>
>>>
>>> Cool :)
>>>
>>>>
>>>> Unfortunately, Yarep is documented really badly, so I couldn't find
>>>> out what the exact contracts for the various methods are. For
>>>> example, should getSize() or delete() throw a repository exception
>>>> if the resource does not exist, or return 0 or false, etc.
>>>>
>>>> I've extensively documented the XMLDBStorage class, so you can see
>>>> what it does at first glance.
>>>>
>>>> The Reader/Writer and InputStream/OutputStream are implemented
>>>> using aggregation. Don't know if it would be more desirable to
>>>> e.g. subclass StringReader and override the close() method instead.
>>>>
>>>> Also, there are some other API related problems: Yanel always seems
>>>> to call getInputStream to directly read from the repo. Now, this is
>>>> all fine and dandy on a file based repo, but the XML database
>>>> stores XML documents as character data, and returns them as
>>>> strings. In other words, in order for the InputStream to work,
>>>> we have to convert the string to bytes, which, of course, involves
>>>> character encoding. I just use UTF-8 to de- and encode, but if you
>>>> really want to read an XML resource, the getReader method should be
>>>> used.
>>>>
>>>> The same goes for writing, but with some additional complication.
>>>> You should NEVER use getOutputStream to write an XML document.
>>>> getOutputStream creates a binary resource in the database. Use
>>>> getWriter instead to write character data, which creates an XML
>>>> resource.
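
To make the encoding issue on the reading side concrete, here is a minimal sketch. It assumes only that the XML database hands the document back as a String; the class and method names (XmlContentAccess, asInputStream, asReader) are hypothetical and not part of Yarep or the XMLDBStorage class.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.Reader;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not the actual XMLDBStorage code.
class XmlContentAccess {

    // Byte-oriented access: the characters have to be encoded first.
    // UTF-8 is a choice made here; the database itself stores characters.
    static InputStream asInputStream(String xml) {
        return new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
    }

    // Character-oriented access: no encoding step is involved at all.
    static Reader asReader(String xml) {
        return new StringReader(xml);
    }

    public static void main(String[] args) throws Exception {
        String xml = "<title>Th\u00f6ny</title>";
        // ByteArrayInputStream.available() reports the remaining byte count.
        System.out.println("chars: " + xml.length());
        System.out.println("bytes: " + asInputStream(xml).available());
    }
}
```

The 20-character document becomes 21 bytes because the umlaut takes two bytes in UTF-8, which is exactly the kind of difference a caller of getInputStream has to be aware of.
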
>>>
>>>
>>> Well, I didn't realize that some repository implementations might
>>> handle binary data differently than text data. But I guess it makes
>>> sense.
>>> So probably we should change yanel to use the reader/writer methods
>>> for text data, and add reader/writer methods to the node-based api,
>>> too.
>>> Would that help?
>>
>>
>> That would help for sure. Although I don't know how Yanel can find
>> out which method to call for reading, because it does not know in the
>> first place if a requested resource is character-based or binary.
>
>
> Yeah, I had some doubts about that also.
> Maybe we could simply say that a FileResource is always treated as
> binary, and an XMLResource is always text. Would that be too simple?
I am not sure, because XML can also contain binary data (using CDATA).
This is also why one should use application/xml and not text/xml.
>
>
>>
>> One possible way would be for the repository implementation to guide
>> Yanel, because the repository should generally know what type of
>> resource is being requested (at least, XMLDB knows, we may see other
>> back-ends in the future which do not even know this, though). If
>> Yanel uses getInputStream(), and the repo decides that this is not a
>> binary resource, it could throw an exception, and Yanel would then
>> try getReader(), or vice versa. We could also introduce a flag on
>> those two methods, e.g. forceRead, which would prevent the repo impl
>> from throwing if the resource to be read is of the wrong type, but
>> read anyway.
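
The try-then-fall-back idea with a forceRead flag could look roughly like the sketch below. Everything in it is an assumption made for illustration: TypeGuardedRepo, WrongTypeException, readBytes and readText are hypothetical stand-ins for getInputStream() and getReader(), and an in-memory map replaces the real back-end.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory repository, not the Yarep API.
class TypeGuardedRepo {

    // Hypothetical exception meaning "wrong access method for this resource".
    static class WrongTypeException extends Exception {
        WrongTypeException(String message) { super(message); }
    }

    private final Map<String, String> text = new HashMap<>();
    private final Map<String, byte[]> binary = new HashMap<>();

    void putText(String path, String content) { text.put(path, content); }
    void putBinary(String path, byte[] content) { binary.put(path, content); }

    // Stand-in for getInputStream(): refuses character data unless forced.
    byte[] readBytes(String path, boolean forceRead) throws WrongTypeException {
        if (binary.containsKey(path)) return binary.get(path);
        if (text.containsKey(path)) {
            if (!forceRead) throw new WrongTypeException(path + " is character data");
            return text.get(path).getBytes(StandardCharsets.UTF_8);
        }
        throw new IllegalArgumentException("No such resource: " + path);
    }

    // Stand-in for getReader(): refuses binary data unless forced.
    String readText(String path, boolean forceRead) throws WrongTypeException {
        if (text.containsKey(path)) return text.get(path);
        if (binary.containsKey(path)) {
            if (!forceRead) throw new WrongTypeException(path + " is binary data");
            return new String(binary.get(path), StandardCharsets.UTF_8);
        }
        throw new IllegalArgumentException("No such resource: " + path);
    }

    // Caller side: try byte access first, then fall back to character access.
    static String accessUsed(TypeGuardedRepo repo, String path) {
        try {
            repo.readBytes(path, false);
            return "bytes";
        } catch (WrongTypeException e) {
            try {
                repo.readText(path, false);
                return "text";
            } catch (WrongTypeException unexpected) {
                throw new IllegalStateException(unexpected);
            }
        }
    }
}
```

The cost of this design is that the normal path for every XML resource goes through a thrown exception, which is worth keeping in mind when comparing it with other options.
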
>>
>
> If we say that the repo "knows" about the type of a resource, it could
> provide a method isBinary() or something like that, so yanel could
> know which method to call (getReader/getInputStream). I normally
> prefer to "ask first" instead of handling an error.
> When someone calls a reading method which does not match the type, a
> best-effort conversion could be applied.
> I'm not entirely sure though how the repo would know the type
> (text/binary). Should it assume that it's binary when it was written
> by getOutputStream, and text otherwise?
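
A minimal sketch of the "ask first" variant, assuming the repository simply remembers which write method created a resource. AskFirstRepo and its methods are hypothetical names; writeBytes and writeText stand in for getOutputStream() and getWriter(), and the UTF-8 decoding in readText is one possible best-effort conversion, not a settled contract.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory repository, not the Yarep API.
class AskFirstRepo {
    private final Map<String, byte[]> content = new HashMap<>();
    private final Map<String, Boolean> binaryFlag = new HashMap<>();

    // Stand-in for getOutputStream(): marks the resource as binary.
    void writeBytes(String path, byte[] data) {
        content.put(path, data.clone());
        binaryFlag.put(path, Boolean.TRUE);
    }

    // Stand-in for getWriter(): marks the resource as text.
    void writeText(String path, String data) {
        content.put(path, data.getBytes(StandardCharsets.UTF_8));
        binaryFlag.put(path, Boolean.FALSE);
    }

    // The type is whatever the last write declared it to be.
    boolean isBinary(String path) {
        Boolean flag = binaryFlag.get(path);
        if (flag == null) throw new IllegalArgumentException("No such resource: " + path);
        return flag;
    }

    byte[] readBytes(String path) {
        isBinary(path); // existence check only
        return content.get(path).clone();
    }

    // Best-effort conversion: binary content is decoded as UTF-8.
    String readText(String path) {
        isBinary(path); // existence check only
        return new String(content.get(path), StandardCharsets.UTF_8);
    }

    // Caller side: ask for the type, then pick the matching read method.
    static String describe(AskFirstRepo repo, String path) {
        return repo.isBinary(path)
                ? repo.readBytes(path).length + " bytes"
                : repo.readText(path);
    }
}
```
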
From my gut I think Yanel should not have to care what kind of data
it's piping through, but maybe my gut is telling me something wrong ;-)
How is JCR handling this?
Cheers
Michi
> WDYT?
>
> josias
>
>> For writing, there should basically be no problem, since Yanel can
>> decide based on the MIME-type if it is going to write a
>> character-based or a binary resource.
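
For the writing side, a decision based on the MIME type might be sketched as follows. The rule used here (text/*, application/xml and any type ending in +xml count as character-based) is an assumption chosen for illustration, and MimeTypeRouting is a hypothetical name rather than anything in Yanel.

```java
import java.util.Locale;

// Hypothetical helper, not part of Yanel.
class MimeTypeRouting {

    // Returns true if a resource of this MIME type would be written
    // through a Writer, false if it would go through an OutputStream.
    static boolean isCharacterBased(String mimeType) {
        String type = mimeType.toLowerCase(Locale.ROOT);
        // Drop parameters such as "; charset=UTF-8".
        int params = type.indexOf(';');
        if (params >= 0) {
            type = type.substring(0, params);
        }
        type = type.trim();
        return type.startsWith("text/")
                || type.equals("application/xml")
                || type.endsWith("+xml");
    }

    public static void main(String[] args) {
        System.out.println(isCharacterBased("application/xml"));
        System.out.println(isCharacterBased("image/png"));
    }
}
```
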
>>
>
>
--
Michael Wechner
Wyona - Open Source Content Management - Apache Lenya
http://www.wyona.com http://lenya.apache.org
michael.wechner at wyona.com michi at apache.org
+41 44 272 91 61