Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve the Trash data model #42845

Open
wants to merge 53 commits into
base: master
Choose a base branch
from

Conversation

johnswanson
Copy link
Contributor

@johnswanson johnswanson commented May 17, 2024

Ok, so I had a realization at the PERFECT time, immediately after the RC cutoff. Great job, brain!

Here's the realization. For the Trash, we need to keep track of two things:

  • where the item actually is located in the hierarchy, and

  • what collection we should look at to see what permissions apply to the item.

For example, a Card might be in the Trash, but we need to look at Collection 1234 to see that a user has permission to Write that card.

My implementation of this was to add a column,
trashed_from_collection_id, so that we could move a Card or a Dashboard to a new collection_id, but keep track of the permissions we actually needed to check.

So:

  • collection_id was where the item was located in the collection hierarchy, and

  • trashed_from_collection_id was where we needed to look to check permissions.

Today I had the realization that it's much, much more important to get PERMISSIONS right than to get collection hierarchy right. Like if we mess up and show something as in the Trash when it's not in the Trash, or show something in the wrong Collection - that's not great, sure. But if we mess up and show a Card when we shouldn't, or show a Dashboard when we shouldn't, that's Super Duper Bad.

So the problem with my initial implementation was that we needed to change everywhere that checked permissions, to make sure they checked BOTH trashed_from_collection_id and collection_id as appropriate.

So... there's a much better solution. Instead of adding a column to represent the permissions that we should apply to the dashboard or card, add a column to represent the location in the hierarchy that should apply to the dashboard or the card.

We can simplify further: the only time we want to display something in a different place in the hierarchy than usual is when it was put directly into the trash. If you trash a dashboard as a part of a collection, then we should display it in that collection just like normal.

So, we can do the following:

  • add a trashed_directly column to Cards and Dashboards, representing whether they should be displayed in the Trash instead of their actual parent collection

  • use the collection_id column of Cards and Dashboards without modification to represent permissions.

There's one main downside of this approach. If you trash a dashboard, and then delete the collection that the dashboard was originally in, what do we do with dashboard.collection_id?

  • we have to change it, because it's a foreign key

  • we can't set it to null, because that represents the root collection

In this initial implementation, I've just cascaded the delete: if you delete a dashboard and then delete a collection, the dashboard will be deleted. This is not ideal. I'm not totally sure what we should do in this situation.

Fixes #43494

@metabase-bot metabase-bot bot added the .Team/AdminWebapp Admin and Webapp team label May 17, 2024
Copy link

replay-io bot commented May 17, 2024

Status Complete ↗︎
Commit f2e0d87
Results
⚠️ 4 Flaky
2611 Passed

@johnswanson johnswanson force-pushed the investigate-better-perms-for-trash branch 5 times, most recently from d3ae5b7 to 41a9209 Compare May 17, 2024 22:31
@johnswanson johnswanson added the backport Automatically create PR on current release branch on merge label May 20, 2024
@johnswanson johnswanson changed the title WIP: Improve the Trash Improve the Trash data model May 20, 2024
@johnswanson johnswanson force-pushed the investigate-better-perms-for-trash branch 4 times, most recently from b4b73c6 to 47d1b11 Compare May 23, 2024 18:45
(format "%%/%s/" (str parent-id)))]
(format "%%/%s/" (str parent-id)))
collection-ids (if (= collection-ids :all)
(t2/select-pks-set :model/Collection :archived (or
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There was an existing bug here. When looking for collection children, we didn't care at all about :archived. So with collections like /A/B/C where B is archived, we'd show A as having no effective children.

I'm a little concerned about performance here and still thinking about whether we can help that.

@johnswanson johnswanson force-pushed the investigate-better-perms-for-trash branch 11 times, most recently from e5d26c9 to b73bb64 Compare May 31, 2024 16:11
@johnswanson johnswanson force-pushed the investigate-better-perms-for-trash branch from 165e51d to 060aaf1 Compare June 3, 2024 13:50
johnswanson and others added 29 commits June 10, 2024 11:20
Previously named `move-on-archive-or-unarchive`, which was no longer
accurate since we're not moving anything!
The Trash exists as a real collection so that we can list items in and
such without needing special cases. But we don't want anything to
_actually_ get moved to it.
Co-authored-by: Noah Moss <32746338+noahmoss@users.noreply.github.com>
Co-authored-by: Noah Moss <32746338+noahmoss@users.noreply.github.com>
Also consolidated it into a single function. It takes the
trash-collection-id as an argument to prevent a circular dependency.
In the product, we want to move to "trashed" as the descriptor for
things that are currently-known-as-archived.

It'd be nice to switch to that on the backend as well, but that'll
require quite a lot of churn on the frontend as we'd need to change the
API (or have an awkward translation layer, where we call it `trashed` on
the backend, turn it into `archived` when sending data out via the API,
and then the FE turns around and called it "trashed" again).

For now, let's just be consistent on the backend and call it `archived`
everywhere. So:

- `archived_directly` instead of `trashed_directly`, and
- `archive_operation_id` instead of `trash_operation_id`
for `collection/permissions-set->visible-collection-ids`:

- `:include-archived` => `:include-archived-items`

- `:include-trash?` => `:include-trash-collection`
This used to be a possible return value for
`permissions-set->visible-collection-ids` indicating that the user could
view all collections. Now that function always returns a set of
collection IDs (possibly including "root" for the root collection) so we
don't need to deal with that anymore.
Dispatch on the model of each item (but still keep them in batches for
efficiency.)
It's slightly different (TTL instead of exactly per-request) but that
should be fine.
@johnswanson johnswanson force-pushed the investigate-better-perms-for-trash branch from c16930f to f2e0d87 Compare June 10, 2024 18:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport Automatically create PR on current release branch on merge .Team/AdminWebapp Admin and Webapp team
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Improve the Trash data model
4 participants