[TSDS Data stream] It is possible to delete the write index of a TSDS data stream #108722
Pinging @elastic/es-data-management (Team:Data Management)
This is an interesting situation. For example, TSDS don't really have a "write index", because documents are routed to the backing index to which their `@timestamp` belongs. We probably still want to fix this behavior and disallow deleting the latest generation index, but we should be clear in our docs that deleting any index can still generate …
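To make the routing point above concrete, here is a toy model (not the actual Elasticsearch implementation; the `route` helper and the hard-coded time ranges are illustrative assumptions): each TSDS backing index owns a half-open `[start, end)` time range, and a document goes to whichever index covers its `@timestamp`, not necessarily to the latest generation.

```python
from datetime import datetime, timezone

# Hypothetical, simplified model of TSDS routing: each backing index
# covers a half-open [start, end) time range, and a document is routed
# to the backing index whose range contains its @timestamp.
def route(backing_indices, timestamp):
    """backing_indices: list of (name, start, end) tuples."""
    for name, start, end in backing_indices:
        if start <= timestamp < end:
            return name
    raise ValueError("no backing index accepts this @timestamp")

indices = [
    ("xxx-000001", datetime(2024, 5, 1, tzinfo=timezone.utc),
                   datetime(2024, 5, 2, tzinfo=timezone.utc)),
    ("xxx-000002", datetime(2024, 5, 2, tzinfo=timezone.utc),
                   datetime(2024, 5, 3, tzinfo=timezone.utc)),
]

# A document timestamped inside the first range still lands in the
# older index, even though xxx-000002 is the latest generation.
doc_ts = datetime(2024, 5, 1, 12, tzinfo=timezone.utc)
print(route(indices, doc_ts))  # -> xxx-000001
```

This is why "write index" is a slippery term for TSDS: which index receives a write depends on the document, not on a single designated index.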
That's a good enough reason to prevent it IMHO... unless there's an easy way for the user to recreate it
I think that for both classic and time-series data streams, it is already not possible to delete the latest generation index. However, for TSDS, (correct me if I'm wrong) it is possible to delete the previous backing indexes that are still writable (i.e. up to the configured …)
@dakrone you are right. I will rephrase the ticket to not use the write index terminology. What about this: the user cannot delete the backing indices whose timeframe includes "now", where this "now" should align with the "now" used when determining the timeframes of the indices. The idea behind this is that since we expect TSDS to accept current data, we should protect the user from accidentally deleting what we expect to be the most-written index. Thoughts?
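The proposed rule above can be sketched as a small guard (a hypothetical illustration, not Elasticsearch code; `can_delete` and its arguments are assumed names): deletion is rejected exactly when the index's time range includes "now".

```python
from datetime import datetime, timezone

# Hypothetical guard sketching the proposed rule: refuse to delete a
# backing index whose [start, end) time range includes "now", using the
# same "now" that index-time routing would use.
def can_delete(index_start, index_end, now=None):
    now = now or datetime.now(timezone.utc)
    covers_now = index_start <= now < index_end
    return not covers_now

now = datetime(2024, 5, 2, 12, tzinfo=timezone.utc)
# An index still covering "now": deletion should be rejected.
print(can_delete(datetime(2024, 5, 2, tzinfo=timezone.utc),
                 datetime(2024, 5, 3, tzinfo=timezone.utc), now))  # False
# A fully-past index: deletion is allowed.
print(can_delete(datetime(2024, 5, 1, tzinfo=timezone.utc),
                 datetime(2024, 5, 2, tzinfo=timezone.utc), now))  # True
```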
Why not? If a user were to configure a 7d look back time, with a 3-day retention, would we want to prevent them from doing that?
I agree about preventing deletion of the most-written index, as I think it would lead to a poor user experience. I don't know yet whether we should protect all writeable indices from deletion, given that with a maximum 7d lookback that could be a very large number of indices (for high-volume indices rolling over every hour, for instance).
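A quick back-of-envelope calculation makes the "very large number of indices" concern concrete (the 7-day look-back and hourly rollover figures are the assumed numbers from the comment above, not measured values):

```python
# With a 7-day look-back window and a rollover every hour, every backing
# index created in the last 7 days could still be inside the writable
# window, so a blanket "protect all writable indices" rule would shield
# up to 7 * 24 = 168 indices from deletion.
lookback_hours = 7 * 24
rollover_interval_hours = 1
writable_indices = lookback_hours // rollover_interval_hours
print(writable_indices)  # -> 168
```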
The thinking with DS (whether classic or TS) is that deletions (should) occur by means of retention settings (either via ILM or DS lifecycle). Even though it "could" make sense for the user to manually delete the N oldest indexes (whether writable or not) in order to free storage (or for whatever other reason), they could achieve the same result by adjusting their retention settings. However, deleting an index from the middle of the array of backing indices makes much less sense, and even more so if it is writable. Say you have …
Interestingly though, all backing indices in a data stream are writeable (TSDS or not), and a user may or may not be performing writes/updates/deletes to these backing indices (we tell users that need to do updates to do it this way). I think it's worth us (the team, I mean) discussing whether we want to allow "donut hole" indices, as you mentioned, and how we would do this technically. For instance, it's still perfectly valid in your scenario for a user to delete …
I like your "donut hole" metaphor :-)
Circling back to this and looking at the documentation for Data stream lifecycle, in step 3 I can read that the write index that's been rolled over is automatically tail merged. I'm curious to know whether this tail-merging process happens only once after rollover, or whether there is some kind of write-detection mechanism that will rerun the tail merge after "some" write operations. The thinking being that since those indexes are not supposed to be written to anymore, they are tail-merged for optimization's sake, but if they are written to again after that merge process, they are potentially back in a sub-optimal state.
It happens only once, then the index has an internal flag set to avoid it being rerun in the future.
Looking at the code, it appears that we do not wait for it to exit the write "window" before force merging it, so older documents could be written during this time. I'll open an issue for us to change this. |
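The behaviour described in the last two comments, plus the proposed fix, can be sketched as follows (a hypothetical illustration of the intended logic, not the actual Elasticsearch code; `should_force_merge` and the dict layout are assumed names): tail-merge only after the write window has closed, and only once, tracked by an internal flag.

```python
from datetime import datetime, timezone

# Hypothetical sketch of the proposed behaviour: tail-merge a
# rolled-over backing index only once (run-once flag), and only after
# its write window has closed, so that late writes cannot land in an
# already-optimized index.
def should_force_merge(index, now=None):
    now = now or datetime.now(timezone.utc)
    if index["force_merged"]:          # internal run-once flag
        return False
    return now >= index["end_time"]    # write window has closed

idx = {"end_time": datetime(2024, 5, 2, tzinfo=timezone.utc),
       "force_merged": False}

before = datetime(2024, 5, 1, 23, tzinfo=timezone.utc)
after = datetime(2024, 5, 2, 1, tzinfo=timezone.utc)
print(should_force_merge(idx, before))  # False: still writable
print(should_force_merge(idx, after))   # True: safe to tail-merge now
idx["force_merged"] = True
print(should_force_merge(idx, after))   # False: already merged once
```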
I opened #109030 |
Elasticsearch Version
8.13
Installed Plugins
No response
Java Version
bundled
OS Version
not relevant
Problem Description
Expected behaviour
As a user, I should not be able to delete the write index of a data stream, so that I can always write to it.
Current behaviour
In the case of a TSDS data stream, there is a period right after a rollover during which the user can still write to the just-rolled-over index. However, it is currently possible for a user to delete this write index, because it is not the latest one, and then all writes will fail until the newer index becomes the write index.
Steps to Reproduce
1. Create a TSDS data stream.
2. Execute a rollover. Now we have two indices: [`xxx-000001`, `xxx-000002`].
3. Try to index again. The following document will end up in the first index, `xxx-000001`.
4. Try to delete `xxx-000001`.
5. Try to index again. This time indexing the document fails because the correct write index has been deleted.
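The reproduction steps can be mirrored in a toy simulation (not real Elasticsearch API calls; `TsdsSim` and its methods are illustrative names): after a rollover the older index still owns the current time window, so deleting it makes indexing of current documents fail.

```python
from datetime import datetime, timezone

# Toy simulation of the reported failure: after rollover, the older
# backing index still covers "now", so deleting it leaves a hole and
# indexing a current document fails.
class TsdsSim:
    def __init__(self):
        self.indices = {}  # name -> (start, end)

    def add_index(self, name, start, end):
        self.indices[name] = (start, end)

    def delete_index(self, name):
        del self.indices[name]

    def index_doc(self, timestamp):
        for name, (start, end) in self.indices.items():
            if start <= timestamp < end:
                return name
        raise RuntimeError("no backing index covers this @timestamp")

ds = TsdsSim()
ds.add_index("xxx-000001", datetime(2024, 5, 1, tzinfo=timezone.utc),
                           datetime(2024, 5, 2, tzinfo=timezone.utc))
# Rollover creates a new generation covering a future time window.
ds.add_index("xxx-000002", datetime(2024, 5, 2, tzinfo=timezone.utc),
                           datetime(2024, 5, 3, tzinfo=timezone.utc))

now = datetime(2024, 5, 1, 12, tzinfo=timezone.utc)
print(ds.index_doc(now))      # -> xxx-000001 (older index still receives current writes)
ds.delete_index("xxx-000001")
try:
    ds.index_doc(now)
except RuntimeError as e:
    print("indexing failed:", e)
```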
Logs (if relevant)
No response