[Yanel-dev] New XMLDB repository
Josias Thöny
josias.thoeny at wyona.com
Mon Feb 12 16:55:27 CET 2007
Andreas Wuest wrote:
> Hi Josias
>
> On 12.2.2007 15:55 Uhr, Josias Thöny wrote:
>
>> Andreas Wuest wrote:
>>> Hi
>>>
>>> I've finished and checked in a basic implementation of the XMLDB
>>> repository, based on the XML:DB API.
>>
>> Cool :)
>>
>>>
>>> Unfortunately, Yarep is documented really badly, so I couldn't find out
>>> what the exact contracts for the various methods are. For example,
>>> should getSize() or delete() throw a repository exception if the
>>> resource does not exist, or return 0 or false, etc.
>>>
>>> I've extensively documented the XMLDBStorage class, so you can see
>>> what it does at first glance.
>>>
>>> The Reader/Writer and InputStream/OutputStream are implemented using
>>> aggregation. Don't know if it would be more desirable to e.g.
>>> subclass StringReader and override the close() method instead.
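
To make the aggregation variant concrete for the Writer side, here is a minimal sketch, assuming the document is handed to the database when the writer is closed. CommitOnCloseWriter and the store callback are hypothetical names for illustration; they are not taken from XMLDBStorage.

```java
import java.io.StringWriter;
import java.io.Writer;
import java.util.function.Consumer;

// Aggregation: the class wraps a StringWriter instead of extending it.
// All names here are illustrative assumptions, not Yarep or XML:DB API.
class CommitOnCloseWriter extends Writer {
    private final StringWriter buffer = new StringWriter();
    private final Consumer<String> store; // hypothetical callback that saves the document
    private boolean closed;

    CommitOnCloseWriter(Consumer<String> store) {
        this.store = store;
    }

    @Override
    public void write(char[] cbuf, int off, int len) {
        // Collect characters in memory until close() is called.
        buffer.write(cbuf, off, len);
    }

    @Override
    public void flush() {
        // Nothing to do: content is only handed over on close().
    }

    @Override
    public void close() {
        // Hand the complete document to the store exactly once.
        if (!closed) {
            closed = true;
            store.accept(buffer.toString());
        }
    }
}
```

Subclassing StringWriter and overriding close() would be shorter, but it would also expose StringWriter's own methods such as getBuffer() to callers, which the wrapper avoids.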
>>>
>>> Also, there are some other API related problems: Yanel always seems
>>> to call getInputStream to directly read from the repo. Now, this is
>>> all fine and dandy on a file based repo, but the XML database stores
>>> XML documents as character data, and returns them as strings. In
>>> other words, in order for the InputStream to work, we have to
>>> convert the string to bytes, which, of course, involves character
>>> encoding. I just use UTF-8 to de- and encode, but if you really want
>>> to read an XML resource, the getReader method should be used.
>>>
>>> The same goes for writing, but with some additional complication. You
>>> should NEVER use getOutputStream to write an XML document.
>>> getOutputStream creates a binary resource in the database. Use
>>> getWriter instead to write character data, which creates an XML
>>> resource.
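
The encoding issue on the read side can be sketched roughly as follows. StringBackedResource and readAll are made-up names, and UTF-8 is simply the assumption stated above; the real XMLDBStorage may look quite different.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for a resource whose backend returns XML as a String.
class StringBackedResource {
    private final String xml;

    StringBackedResource(String xml) {
        this.xml = xml;
    }

    // Character access: no encoding step is involved.
    Reader getReader() {
        return new StringReader(xml);
    }

    // Byte access: the string has to be encoded first (UTF-8 assumed here).
    InputStream getInputStream() {
        return new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
    }

    // What a caller of getInputStream() has to do to get text back:
    // decode the bytes with a charset it must somehow know.
    static String readAll(InputStream in, Charset charset) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buf.write(chunk, 0, n);
            }
            return new String(buf.toByteArray(), charset);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A caller that decodes the stream with anything other than the charset used by the repository gets garbled non-ASCII characters, which is why getReader is the safer choice for XML.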
>>
>> Well, I didn't realize that some repository implementations might
>> handle binary data differently than text data. But I guess it makes
>> sense.
>> So probably we should change Yanel to use the reader/writer methods
>> for text data, and add reader/writer methods to the node-based API, too.
>> Would that help?
>
> That would help for sure. Although I don't know how Yanel can find out
> which method to call for reading, because it does not know in the first
> place if a requested resource is character-based or binary.
Yeah, I had some doubts about that also.
Maybe we could simply say that a FileResource is always treated as
binary, and an XMLResource is always text. Would that be too simple?
>
> One possible way would be for the repository implementation to guide
> Yanel, because the repository should generally know what type of
> resource is being requested (at least XMLDB knows; we may see other
> back-ends in the future which do not even know that, though). If
> Yanel uses getInputStream(), and the repo decides that this is not a
> binary resource, it could throw an exception, and Yanel would then try
> getReader(), or vice versa. We could also introduce a flag on those two
> methods, e.g. forceRead, which would prevent the repo impl from throwing
> if the resource to be read is of the wrong type, but read anyway.
>
If we say that the repo "knows" about the type of a resource, it could
provide a method isBinary() or something like that, so Yanel could know
which method to call (getReader/getInputStream). I normally prefer to
"ask first" instead of handling an error.
When someone calls a reading method which does not match the type, a
best-effort conversion could be applied.
I'm not entirely sure though how the repo would know the type
(text/binary). Should it assume that it's binary when it was written by
getOutputStream, and text otherwise?
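
A rough sketch of the "ask first" idea, assuming the type is simply remembered from how the resource was written. SketchNode and SketchCaller are hypothetical names, isBinary() is only the method proposed here and not an existing Yarep method, and UTF-8 is assumed for the best-effort conversion.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical node that remembers whether it was written as bytes or as text.
class SketchNode {
    private final byte[] bytes; // non-null if written through an OutputStream
    private final String text;  // non-null if written through a Writer

    private SketchNode(byte[] bytes, String text) {
        this.bytes = bytes;
        this.text = text;
    }

    static SketchNode writtenViaOutputStream(byte[] data) {
        return new SketchNode(data.clone(), null);
    }

    static SketchNode writtenViaWriter(String data) {
        return new SketchNode(null, data);
    }

    // Proposed method: binary if and only if it was written as bytes.
    boolean isBinary() {
        return bytes != null;
    }

    // Best-effort conversion when the caller picks the non-matching method.
    InputStream getInputStream() {
        byte[] data = isBinary() ? bytes : text.getBytes(StandardCharsets.UTF_8);
        return new ByteArrayInputStream(data);
    }

    Reader getReader() {
        String data = isBinary() ? new String(bytes, StandardCharsets.UTF_8) : text;
        return new StringReader(data);
    }
}

class SketchCaller {
    // Ask first, then name the matching read method instead of catching an error.
    static String chooseMethod(SketchNode node) {
        return node.isBinary() ? "getInputStream" : "getReader";
    }

    // Small helper that reads a Reader fully into a String.
    static String slurp(Reader reader) {
        try {
            StringBuilder sb = new StringBuilder();
            int c;
            while ((c = reader.read()) != -1) {
                sb.append((char) c);
            }
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```
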
WDYT?
josias
> For writing, there should basically be no problem, since Yanel can
> decide based on the MIME-type if it is going to write a character-based
> or a binary resource.
>
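
For the writing side, the MIME-type decision could look something like this. The rule used (text/*, application/xml and anything ending in +xml count as character data) is only an assumed heuristic, and MimeTypeSketch is a made-up name.

```java
import java.util.Locale;

// Hypothetical helper: decides which write method to use from the MIME type.
class MimeTypeSketch {

    // Assumed heuristic for "character-based" content.
    static boolean isCharacterData(String mimeType) {
        if (mimeType == null) {
            return false; // unknown type: treat as binary
        }
        // Drop parameters such as "; charset=UTF-8" and normalise case.
        String base = mimeType.split(";", 2)[0].trim().toLowerCase(Locale.ROOT);
        return base.startsWith("text/")
                || base.equals("application/xml")
                || base.endsWith("+xml");
    }

    // Returns the name of the repository method the caller would use.
    static String writeMethodFor(String mimeType) {
        return isCharacterData(mimeType) ? "getWriter" : "getOutputStream";
    }
}
```

With this rule, "application/xhtml+xml; charset=UTF-8" would go through getWriter and "image/png" through getOutputStream.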