-
-
Notifications
You must be signed in to change notification settings - Fork 590
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Cache Overhaul for OpenSeadragon (reviewed). #2407
base: master
Are you sure you want to change the base?
Conversation
Follow up on #2329 (comment):
or b) if we suppose the data was not sent out in an incompatible format:
or c):
Simplified! Almost no changes, except:
I think it is obvious from the previous example, that we can modify the data regardless of what other plugins do, if we use |
I was thinking about the asynchronous data processing, and IMHO we would transitively make everything async. Insead, we could adopt Tile data access could opt for callback when the data is ready. Tile drawing would inspect whether the tile data (wrapped as
Example:
Advantage: flexible. Disadvantage: more complicated, more overhead. For example, we need to provide helper functions for making sure multiple data sources are ready at once, etc... possibly we could behind the scenes wrap everything in Promises and execute everything in the right order when data is available. E.g.:
Another approach is to split the data operation into two phases: before and during Test were OK on my localhost, but failed here. I noticed that sometimes the navigator did not refresh and I had to call |
This is awesome, and a lot to absorb! I am going to be busy for a bit upcoming, so I'm not going to be able to get to this right away, but I will! Meanwhile, I'm hoping @pearcetm can I help as a sounding board, since he's got the other major patch in the works, and there's probably some touch points between them. Thank you, @Aiosa, for tackling this! |
I moved on with the async decision once thinking this through several times. The cache object follows the 'maybe' type, meaning the cache might be loaded ( Furthermore, tile has new
The downside is the missing support for |
I also added default timeout for QUnit of 60s, debugging is a drag if a test does not time out - you cannot rerun it in a separate window, since it shows rerun link only when failed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm back! Still not entirely sure how to approach this PR. I'll need to read through it again. Here are a few random comments, though.
@Aiosa What do you consider left to be done here? Are there open questions we should discuss?
We'll want a new documentation page on how this system works and how to use it. That would be helpful for me in reviewing the code as well.
Thank you for taking this project on!
src/datatypeconvertor.js
Outdated
* Unique identifier (unlike toString.call(x)) to be guessed | ||
* from the data value | ||
* | ||
* @function uniqueType |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The name here in the function documentation doesn't agree with the name of the actual function.
* @param {Object} eventArgs - Event-specific data. | ||
* @return {OpenSeadragon.Promise|undefined} - Promise resolved upon the event completion. | ||
*/ | ||
raiseEventAwaiting: function ( eventName, eventArgs ) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like an elegant solution to adding async events!
@@ -2886,6 +2886,30 @@ function OpenSeadragon( options ){ | |||
} | |||
} | |||
|
|||
/** | |||
* Promise proxy in OpenSeadragon, can be removed once IE11 support is dropped |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IE11 support is already dropped as of #2300!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that's an old comment from a piece of code I had implemented locally, and forgot to remove. Since we don't have polyfils I thought this might be a good error-cacher, if someone has old version of some JS framework/browser. I can remove it...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's remove it. I suppose it could be amusing to have some sort of "you're on an old browser" detection, but if we add that I think it should be explicit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm I've found out why I left it there:
332:58 error Promise.resolve() is not supported in op_mini all compat/compat
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So it's just Opera Mini? We don't explicitly support that browser, but I suppose if it's easy we might as well. It would be good to add an explicit comment that that's why this is here, then.
src/tiledimage.js
Outdated
return data; | ||
}, | ||
getCompletionCallback: function () { | ||
$.console.error("[tile-loaded] getCompletionCallback is not supported: it is compulsory to handle the event with async functions if applicable."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't a deprecation; this is complete removal. Would it be possible to deprecate it (have it still work but produce an error) for a few releases before we remove it, or is that not going to work because of how things are changing?
I'm concerned about the new version of OSD breaking a bunch of plugins. Of course some of the popular plugins that would be affected (like https://github.com/usnistgov/OpenSeadragonFiltering) aren't in active development, so if we are removing this API eventually, we may need to fork them and fix them ourselves. Regardless, no doubt people's apps use this API and it would be nice to give them a grace period if possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this was impossible (in a nice way) since handling of this was fundamentally different from how it works now. Since its behavior was also erroneous in certain use cases, I decided for removal. I will try to look into it once again, maybe introduce switch flag that would enable the new feature and later turn that flag on by default.
As for plugins, I wanted to fork the filtering plugin and fix it to move from all the old 'dirty' ways (e.g. context2D prop). Doing so would also allow me to analyze the new system performance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool. It would be nice to have such a flag for a release or two if possible, for people's own projects. For plugins I agree it would be great to fork the filtering plugin and bring it up to date if you're up for it!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I remembered it wrongly, the support for it can be left there, I just did not like it since the async way is much cleaner code, and not buggy. But I added it (locally for now). Also:
Even in this popular plugin it's used incorrectly. If you get a callback, you should call it, otherwise the pipeline will never finish. I mean right, they probably reset the whole tiled image and want to prevent processing things that were loaded before the reset, but I am not sure what happens with the unfinished call... 'leaks?' hanging pieces of memory, closures, or will it get freed, dereferenced? Since I am not sure I guess nobody really is.
BTW It also seems the zombie cache will be much of use here:
https://github.com/Aiosa/OpenSeadragonFiltering/blob/f1e350c08e8bf050fdc84b89e9b04ad041c861b9/openseadragon-filtering.js#L195
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great! And yeah, I agree it needs to be moved away from. I just want us to do it more gradually so people have a chance to update with warning :)
Well, the docs, trying out how it performs/behaves (e.g. re-implementing the Filtering plugin) and possibly adding better cache handling/ordering using smarter file structures for cache expiration (instead of iterating the whole cache array every time) and see how that performs (the cache system already implements a heap so why not using it). |
Cool, sounds like a good list! |
…dragon into cache-overhaul-reviewed
I am facing a (probably final) design issue I would like to resolve. The problem is that as of now, the conversion automatically throws away references to the old data in its cache. This might be problemtic with user-interventions, and cache re-use. It all boils down to the Suppose that we fetch na image. We use canvas so that tile cache data is rewritten with
at worst, This brings me to another related problem: we have two types:
which would fire image <-> canvas conversions for each tile per frame. But that's probably just the cost of such flexibility. My point was that even Finally, asynchronous type conversion within a synchronous event ( Solutions / IdeasIntentionally drop support for To cache on the data type level seems to me like unecessary overhead. Keeping always the very first data type cached explicitly, running the tile data initialization from the beginning, and somehow solve cache write collision differently (otherwise, the result of the whole execution is pointless). Not raising the event when the cache was already initialized (my choice). But the user might expect the event being called for all tiles, since not all tiles sharing the data share also their position (though IMHO, then if this is issue they should have different cache key). Checking that no type conversion is done in This overhaul does not introduce these issues - they've always been there. But being forced to do manual conversion made a developer relize the consequences. Having simple API might make people write super slow rendering ( |
… called. Fix bugs (zombie was disabled on item replace, fix zombie cache system by separating to its own cache array). Fix CacheRecord destructor & dijkstra. Deduce cache only from originalCacheKey. Force explicit type declaration with types on users.
… Return deprecated support for getCompletionCallback. Turn on zombie cache if sources replaced & equal.
if (newIndex !== -1) { | ||
queueItem.options.index = newIndex; | ||
} | ||
_this.world.removeItem(queueItem.options.replaceItem); | ||
if (!replaced._zombieCache && replaced.source.equals(queueItem.tileSource)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is nice feature that we allow implicit usage of zombies in case the sources equal & tile is being replaced. Zombies used automatically when suitable. The problem is equals
implementation accross all sources. For now I just check base URLs but query params might vary a lot. Maybe try generating tile URL and copare these URLS? But how do we know particular tile exists? Might not know even the equals
function itself..
I don't think I fully understand the scenario you're describing, but I do have a couple of quick impressions:
As for the bigger question, would it be solved by allowing the system to save intermediate steps? I agree that it probably shouldn't by default, but maybe that could be an option? Maybe by default you always go back to the original when modifying? |
The advantage would be users being able to use various types - handiness. And it might as well introduce these re-conversion issues. I don't know which one is better. In the end, the same code gets called anyway (either automatically or manually), the difference is that conversion process is asynchronous.
Yes, after thinking about it, I think the ebst option is to change the tile API a bit to not to change the tile cache immediatelly, but only when user explicitly tells it to. E.g. This will help since renderer will happily draw old data untill new data arrives. We might as well either force users to return drawer-compytible type (or ensure this automatically) so that when user finishes, the conversion to something drawer-compatible happens ASAP to not to hinder the animation. Caching immediate steps on data-type level would be a bit harder to implement. For now, I am trying to design support for the most basic advanced scenario: tile will get cache + original cache key (these equal by default), and first change to the cache data will ensure the original data is preserved. Then, if a tile is loaded to the system and cache exists (zombie or re-use), depending on cache it uses it can
When users decide not to keep the old data (now the default option) and overwrite the cache itself (one per tile), then it makes no sense to re-execute When users say 'keep the original data when calling
Thinking of mandelbrot, a valid option could be also that tile has no cache objects ( This seems to me like workflow that works. |
I'm still only partially following; this cache stuff is confusing to me! Anyway, what you've described sounds sensible. Regarding the cache key naming, is that something that would be a convention and the developer would have to know what to do, or is it something that we enforce somehow? If it's a convention, what are the failure modes when they don't do it properly? How do we educate them to do it right?
We're talking about context2d vs. canvas, right? It's the same data type, just different parts. We should just use context2d and if people need the canvas they can use |
These cache naming things would be convention, but convention that is handled internally, they have two cache key props they can use. When they don't properly use it, the only thing that happens is that some tile might receive different cache object than they would expect it to and vice versa. Purely from the system point of view, as if they did not properly provide 'properly unique cache key getter'. So we educate them by using the cache API through the tile instance only. The only problem is that they could accidentaly use the same cache key as we automatically derive - there is a very low chance, and we as well might use some URL-unfriendly characters to ensure this does not happen. Thanks for the last point! Indeed, canvas type is abiguous as it's context defines its behavior. Than it's clear. |
Okay, sounds reasonable! |
… in the rendering process, remove 'canvas' type support, many bugfixes and new tests.
…eaders test: did not finish because we don't call tile-loaded on cached tiles by default.
I also have the new version of the filtering plugin. There are some improvements in the plugin thanks to the overhaul:
The downside is that first time load takes slightly longer to display fully loaded viewport, since async data processing means delayed execution, and some animations (+other logics) is executed instead where we would prefer doing more data processing. One thing that could help here is to reduce the amount of re-draw requests (each main cache update has to explicitly request redraw since asnyc). It should have faster refresh of the content, since auto zombies are used when replacing equal tiledImage on the same index, it was working for me but now I checked console and there are still download jobs fired on refresh, so I guess my latest changes broke something 🗡️ And finally, I am still arriving at chicken-egg problem with plugin data pipelining, I will have to think about it a bit more. I had to add |
// reloaded, so user code should get re-executed. IMHO, | ||
// with tile propagation the best would be throw away old tile | ||
// and start anew | ||
callTileLoadedWithCachedData: true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should the headers change either update the tile keys or clear cache as well, since we don't update the cache key and thus potentially have incorrect state? I think the only state-consistent way of doing this is by unloading all the tiles this image owns, and let the system pull them anew. Whether the headers are reflected on the key later is again up to the system (overriden method for cache generation?).
Great that you're updating the filtering plugin! This will be good to have, but also it's a good test case. When you say it takes a little longer at startup, how much longer are we talking about? It does seem like when a plugin asks for the version of the data prior to their changes, it should include the changes from the previous plugins. That said, how do we determine what the plugin order is? And of course it's not just plugins... It could be the app itself. It does seem like some way to "stack" effects in a stable manner would be good. |
With cached tiles in browser, the synchronous loading happens instantly. With the async processing you can actually see the tiles jumping from low to high resolution, a few hundred milliseconds. But that also might be because each tile has to say 'I got new data viewport needs refresh', because it gets eventually ready in async context. Meanwhile, animation frames fire and each time it probably finds out we need viewport update, instead of waiting a bit to process more tiles at once. So I hope being smarter about when we say 'viewport needs update' will help. Or playing more with how drawers are 'greedy' (having them render in anync too to counter-balkanve the penslization). I need the WebGL PR merged for this to test properly. It modifies the drawer structure.
The I was talking mainly about the tile cache access. Even if the plugin order is given (which is not possible for async/callback based plugin interaction at all in current version, see the caman usage in the Filtering plugin), I still need to resolve how to pipeline the data correctly. Let's say plugin A does basic thing: gets data (makes a copy) and stores modified data (replaces the old data). It makes two caches: one for drawing, one with the original data saved. We have two methods: getData and getOriginalData. Now, original data is created only first time, subsequent calls to setData will overwrite just the rendering item. Now, if plugin B does exactly the same, they either use getData which will correctly work with the modified data obtained in any previous step, even regardless of the data type. But, if some plugin wanted to access the original data (Filtering plugin notices we changed filters, it re-executes the filters), it ignores the output of any other plugin that might come before since only first plugin to modify data will store non-modified copy as the original. So we kind of need three caches (original, data in previous step, data now; just how swapping works), and the middle step can get deleted after finish. Or simply change the behavior of Edit: I noticed the HTML drawer does not fire |
The code should be merged. There are some minor issues and tests still do not fully pass - sometimes there are few weird errors, and it seems CI got stuck too. Also there are quite some TODOs which need to be resolved (particularly with the new async behavior to finetune animation), but from my point of view this is almost ready. @pearcetm could you verify this works on your end? Btw: I tried to also start cleaning up code where I made more changes (use
|
Reorganizing the files sounds good, and that looks like a good organization. We should do that in a separate PR after this, to avoid confusion in this one. |
@Aiosa I see you added a filtering plugin demo page. Awesome! It seems currently broken, to me at least. Is that what you're seeing too? |
For me it works both for the drawers. The only issue is that
I did not have time to fix this yet, but I was happy it is already operable - regardless of what drawer you use. |
…neric drawerswitcher utility for demos.
Added for demos implementation of generic drawer type switcher, we should add it to most demos to allow for flexible manual testing. |
I have a question: how (and what purpose) is that of I mean even now with OSDv4 I render zip archives thanks to the 'advanced data pipeline'. This PR makes it possible to render virtually anything. Thoughts? EDIT: Right now I realized there is quite an issue with the current design since webgl creates texture on the 'original data reference' which never gets freed, so I am trying now a design where drawers by default keep their own internal simplified cache inside cache items for drawing... |
…e' for drawer data, optional to use.
It looks like The thing I like about it is the idea that it helps with cross-browser compatibility. The formats that we say we support are the ones we know are supported on all of the browsers OSD is expected to run on. For instance, it's not uncommon for people to want to load .tiff files directly into OSD, but only Safari supports them directly (at least according to https://developer.mozilla.org/en-US/docs/Web/Media/Formats/Image_types). This means if they tested only on Safari (for some kooky reason) they would end up thinking that it works. With the cache overhaul, it seems like maybe this feature is even more important, since we are now going to be supporting a lot more possible formats. I imagine it should tell the developer whether the format they are using is either natively supported by browsers or there is some sort of converter installed for it. |
… was not called on internal cache. Polish code. Improve filtering demo.
Okay, I managed to get some free time and fix a few things. I still have to measure the performance but otherwise it should work. There is no ability to say 'we support format XY' with the convertor class, so developers need to
To me this looks confusing. Format != data type looks to me like window != viewport coordinates, people will confuse it all the time. I would like to unify this somehow; in the end this function is used in just two sources (dzi and iiif), only the second one really uses it, dzi just throws an error. The rest of the sources just ignore it. Moreover format declaration cannot be fully trusted anyway. IMHO it would be better to just check that given data can be converted, and if the conversion fails just print an error that lists several reasons why this could be the case. If you test only on Safari you still won't see any errors anyway whether we check formats or not. If you forget to add some format that is supposed to work, you get false alarms. So the format support could be just the IIIF class thing... We could just add event that fires when a conversion that was supposed to work fails and users can do whatever they want with that. The current implementation of |
@Aiosa I haven't had time to dive into this extensively or do much testing, but I hope to do so over the next couple of weeks. In your mind is the filtering demo the best place to start? Have you been able to address the issues you noted above?
|
I managed to mostly fix these issues, however, I discovered few more. I tried to implement sharing of cache between multiple viewers. I did not succeed. Not only you sometimes cannot afford to share the cache completely (WebGL cannot share loaded texture between contexts) which I was able to solve more or less, but the overal tile fetching strategy is that tile record is created once it has data downloaded, which is too late: already two jobs per tile were requested at that point. So I realized it maybe might be much better idea to work with TiledImage positioning within one OSD instance. Harder to adjust plugins (e.g. annotation) to this separation though. Concerning the new cache API, there are still easy to misuse scenarios, for example my implementation of the filtering plugin was doing this
and I did not realize that my implementation of data restoration was in fact making the data item shared between two caches, and when one cache was freed so was the other -> black screen. But this happens only when you add some filters, then remove all of them and then try to again add new filters. But forcing users to copy on read will have its cost too. Then there is another issue when the user calls tile invalidation too often (e.g. range input drag), my implementation of the plugin sometimes does not update all tiles to the latest state. It will still take quite some work to get this right. |
Okay I thought it is fixed but the webgl drawer still sometimes shows black screen on filter adjustments, I already ten times thought that it works already, then I clean up console log stuff and it stops working. I tried to put request redraw everywhere I could think of, still sometimes it does not properly redraw viewport... I will see Edit: finally fixed. A stupid typo. |
…roduce managed mock getters for tests.
This is gnarly stuff! Thank you for working through all of it and keeping us updated ❤️ |
@iangilman I would like your opinion on
To me it seems like it tries to solve the single issue where image type cannot be supported due to formats, and also to add ability for IIF to specify a desired image format. The former can just check the image conversion routine and alert on error. Maybe even raise an event...
The latter would better be in the IIIF class anyway, since it is a specific protocol feature. |
Yeah, it may be time to let go of That said, with the new data conversion infrastructure, aren't we expanding our knowledge of what is supported? For instance, if we're loading vectors or bytestreams, either we have modules installed that can convert those into pixels to be drawn to the screen or we don't, right? So it's not like the existence of those new formats makes it harder for us to know what can be loaded, right? Anyway, as far as I can tell, the only advantage When things fail on the current browser, though, providing meaningful error messages would be good! |
I am sorry I did not progress much here, but lately I was very busy. I will try to return to this in a few weeks when all the stuff calms down a bit :) |
@Aiosa Thank you for the update! |
Merge for the recent changes. I am still unable to get stuff smoothly working. I replaced Since I was doing an async pipeline I somehow assumed that most calls should just 'work' and execute as soon as possible with the only limit to keep the cache state consistent. But cache consitency != screen consistency. Instead, we should restrict execution flow as much as possible to allow only meaningful routes:
And we can both support async & sync mechanisms just by simply swithing between |
Second attempt to redesign OSD cache. Follows up on #2329.
Areas to discuss are marked with
FIXME
comment - please, let's discuss these issues.Why do we need this?
Extending abilities of OpenSeadragon require more flexible system to work with data. This overhaul fixes many issues, bugs and API inconsistencies.
What is solved
getContext2D
,destroy
forTiledImage
- hacky way for imagetilesource to work in older versions of OSD, the tilesource has been rewrittentile.context2D
- hacky way of rendering, storing evergrowing amount of data in the browser memoryWhat needs to be solved (apart from FIXME)
missing support for shared cache between viewerscontext2D
which was a performant, but memory-expensive choice, one should increase cache size instead of hardcoding data