How We Made Sharing Easy at Macro

Access management is one of the hardest systems we’ve built at Macro - and the one we’ve rewritten the most. As you read the post you’ll see how we stopped treating sharing as its own feature and let it fall out of something people were already doing.

Entities in Macro

Macro is a unified system for all of your work. We combine the functionality of products like Linear, Notion, Slack, Google Drive and Superhuman into one space where you can reference anything anywhere. Despite the wide array of features Macro supports, every kind of item a user can create (an entity) must follow the same fundamental model for sharing.

Some of these entities are:

Documents
Projects
AI Chats
Emails

Each entity type may have a few access behaviors of its own, but overall they all share the same fundamental sharing model.

One of the biggest benefits of having all of this in one workspace is a single permissioning system. We want access to work the same way everywhere so that agents, unified search, and humans can all reach everything seamlessly.

Macro has had 2 main “share” models. The initial model was fairly standard and mirrored how something like Google Drive and other cloud drives allow you to share items.

Items can be made publicly accessible.
Items can be shared with individual users or your entire organization.
If a project (folder) is shared with you, then all items within that project are also shared with you.

These requirements led to our first entity access management system that I’ll refer to as “simple sharing”.

To support simple sharing we set up a UserItemAccess table that would record all the entities that were shared and who/what they were shared with.

Let’s say a user shares doc_1 with user_1 and they share proj_1 with their entire organization (org_1), which contains user_2 and user_3. The UserItemAccess table would contain the following rows.

entity_id	entity_type	user_id	access_level
`doc_1`	document	`user_1`	view
`proj_1`	project	`user_2`	view
`proj_1`	project	`user_3`	view

You’ll notice that proj_1 has more rows than you may have expected. That’s because we made UserItemAccess purely user based: sharing with an organization wrote one row per member. This meant that any time a user was added to or removed from an organization, we had to fetch every item shared with that organization and grant or revoke each one for that user. This was less than ideal, as it led to a potentially large and complex query whenever a user joined or left an organization. However, the number of items shared across an organization wasn’t that large, and users were not often removed from their organizations, so at the time there wasn’t much downside to keeping it this way.

You may ask why we didn’t support an organization level share in the UserItemAccess table. Originally we didn’t have organization level sharing at all, and when we went to add it, it got shoe-horned into the existing table the easiest way we could think of. At this point in Macro, speed was paramount: getting features out so that we could feel how they worked and tweak as needed was more important than debating the perfect solution for weeks or months. This ended up being a good thing, as we have since dropped organizations entirely from Macro in favor of a new team system.

To find all items a user had access to, we looked up their user id in the UserItemAccess table and, for every project in the result, recursively drilled down into that project to collect all of its sub-items. By all accounts this seemed fine at first and gave us all the functionality we wanted from our initial sharing setup.
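The read path described above can be sketched in a few lines of Python. This is only an illustration under assumptions, not Macro's actual query: the `children` mapping stands in for however the project tree is stored, and doc_2, proj_2 and doc_3 are hypothetical items added to give proj_1 some contents.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UserItemAccess:
    entity_id: str
    entity_type: str
    user_id: str
    access_level: str


def accessible_items(user_id, rows, children):
    """Return every entity id the user can reach, expanding shared projects."""
    seen = set()
    # Start from the rows that name this user directly.
    stack = [r.entity_id for r in rows if r.user_id == user_id]
    while stack:
        entity_id = stack.pop()
        if entity_id in seen:
            continue
        seen.add(entity_id)
        # Projects fan out to their direct children; leaf items have no entry.
        stack.extend(children.get(entity_id, []))
    return seen


rows = [
    UserItemAccess("doc_1", "document", "user_1", "view"),
    UserItemAccess("proj_1", "project", "user_2", "view"),
    UserItemAccess("proj_1", "project", "user_3", "view"),
]
# Hypothetical project contents (not from the post).
children = {"proj_1": ["doc_2", "proj_2"], "proj_2": ["doc_3"]}

user_2_items = accessible_items("user_2", rows, children)
```

The cost of this walk grows with the size of every shared project tree, and it is paid on every read.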

The Eureka Moment

As time went on and channels (group messaging) were released, we frequently found ourselves sending entities to one another through a channel only to be hit with the “I don’t have access to this item” response.

After living with it for a while, we noticed a pattern: we would share an item and then immediately send its link in a channel. We thought, why don’t we just base sharing on sending things in channels rather than giving it its own system?

This extremely common occurrence (and annoyance) led to the new channel based sharing approach. This approach boils down to a simple thought:

“If I send an item to someone in a channel, they should get access to that item” (with the caveat that you yourself have access to the item).

We were comfortable with this because, at the end of the day, if I’m given access to an item in any cloud storage provider, there is nothing stopping me from downloading that item and sharing it with others as I please. At least with this new approach the owner of the item can track where it’s being shared and revoke access when needed.

Getting Into the Technicals

So now when an item was shared within a channel, we would grab all users within that channel and insert UserItemAccess rows for them with a new shared_with_channel_id column that was used to track where the share came from.

This added more overhead to adding/removing users from a channel as we now had to add/remove items for them accordingly.
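To make that overhead concrete, here is a minimal sketch of the per-user fan-out, assuming rows are plain dictionaries and channel membership is a set per channel. The function names are hypothetical, and the check that the sender has access to the item is left out.

```python
def share_in_channel(rows, members, entity_id, entity_type, channel_id, access_level):
    """Insert one UserItemAccess-style row per current channel member."""
    for user_id in sorted(members[channel_id]):
        rows.append({
            "entity_id": entity_id,
            "entity_type": entity_type,
            "user_id": user_id,
            "access_level": access_level,
            # Records which channel the share came from.
            "shared_with_channel_id": channel_id,
        })


def remove_from_channel(rows, members, channel_id, user_id):
    """Removing a member also has to revoke every row that channel gave them."""
    members[channel_id].discard(user_id)
    rows[:] = [
        r for r in rows
        if not (r["user_id"] == user_id and r["shared_with_channel_id"] == channel_id)
    ]


members = {"channel_1": {"user_1", "user_2"}}
rows = []
share_in_channel(rows, members, "doc_1", "document", "channel_1", "view")
remove_from_channel(rows, members, "channel_1", "user_2")
```

Every membership change has to touch the access rows as well, and adding a member needs the mirror-image step of inserting a row for each item already shared in the channel.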

The Problem: This Was Slow

As the number of items per user grew with the addition of new entity types, we noticed that the recursive query was extremely slow, especially on a cache miss: for some power users it took over 10 seconds to fetch “user accessible items”.

This was unacceptable to us, so I began designing a new system to handle it better.

When I looked into the actual cause, I found that the recursion was extremely slow for these power users because they often had large nested projects coupled with many “root” level items (items not within a project).

The query planner was not able to perform an efficient query over all these items leading to a large amount of time being taken in the recursion step.

Furthermore, granting and revoking item access whenever a user was added to or removed from a channel or organization was a side effect we always had to remember, which was mentally taxing and left a lot of room for error.

The proposed solution simplifies how we grant item access: it requires only 1 record per share per item.

I went for a flattened design that required no recursion at the cost of becoming more “write heavy”. To be clear, the recursion doesn’t disappear entirely - we still walk the project tree and enumerate items when granting access. We’ve simply moved that cost from read time to write time, which happens far less often and is much more tolerant of latency. Being write heavy was a good choice for us because the vast majority of items are not deeply nested within projects. It also lets us move to an event based system later, should our writes start taking a while to propagate, so we can keep all user operations fast.

CREATE TABLE entity_access
(
	entity_id UUID NOT NULL, -- document_id, project_id, chat_id, email_thread_id
	entity_type EntityType NOT NULL, -- document, project, chat, email
	source_id TEXT NOT NULL, -- channel_id, team_id or macro_user_id if creator
	source_type TEXT NOT NULL, -- channel, team, user if creator
	access_level AccessLevel NOT NULL, -- the access level granted to the source for a given entity
	granted_from_project_id UUID -- if a project was shared, items in the project will be tracked with this column
);

How does the entity_access system fix the previous iterations’ shortcomings?

Instead of creating records per user, we create them per source for channels and teams. This means removing a user from a channel or a team is as simple as deleting their membership from that channel or team. NO MORE SIDE EFFECTS.

The source of the entity_access can either be a channel, team or user (for item creators).

The granted_from_project_id is what allows us to know which project was actually shared to create the entity_access record. This means if that project is deleted or moved it becomes a lot easier to update all required entity_access records.

Adding an item to a project

In this example we will have the following project structure:

A/ -- owner 1
-- B/ -- owner 2
---- C/
------ Add item here

Let’s say we are adding an item to project_c.

Programmatically, we need to walk up the tree to get all parent project ids including project_c. This will give us an array of [project_a, project_b, project_c]. Next, get all channel/team source entries and access_levels for those projects.

source_id	source_type	access_level	granted_from_project_id
`channel_1`	channel	view	`project_a`
`team_1`	team	comment	`project_b`
`team_2`	team	edit	`project_c`

In this example:

project_a was shared with channel_1, which means channel_1 should get view access
project_b was shared with team_1, which means team_1 should get comment access
project_c was shared with team_2, which means team_2 should get edit access

With that information we need to insert the following records:

entity_id	entity_type	source_id	source_type	access_level	granted_from_project_id
`<entity_id>`	`<entity_type>`	`channel_1`	channel	view	`project_a`
`<entity_id>`	`<entity_type>`	`team_1`	team	comment	`project_b`
`<entity_id>`	`<entity_type>`	`team_2`	team	edit	`project_c`
`<entity_id>`	`<entity_type>`	`<user_id>`	user	owner
`<entity_id>`	`<entity_type>`	`owner1`	user	owner	`project_a`
`<entity_id>`	`<entity_type>`	`owner2`	user	owner	`project_b`

Alongside the channel/team sources, we also fetch each parent project’s owner (owner1, owner2) and insert an owner row for them so they retain access to items added anywhere beneath their projects. The <user_id> row with no granted_from_project_id is the creator of the new item itself.

Granted, there is more work upfront to ensure we are inserting correct access for a new item in a project, but our workload is very read-heavy and we should be prioritizing that over insertion speed.
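The steps above can be sketched as follows. This is an illustration under stated assumptions rather than the production code: `parent_of`, `shares` and `owners` are hypothetical in-memory stand-ins for the project tree, the direct channel/team shares on each project, and the project owners, and doc_9 and user_7 are made-up ids for the new item and its creator.

```python
def project_chain(project_id, parent_of):
    """Walk up from project_id to the root; returns root first, inclusive."""
    chain = []
    current = project_id
    while current is not None:
        chain.append(current)
        current = parent_of.get(current)
    return chain[::-1]


def rows_for_new_item(entity_id, entity_type, creator_id, project_id,
                      parent_of, shares, owners):
    """Build the entity_access rows for an item added to project_id."""
    chain = project_chain(project_id, parent_of)
    rows = []
    # One row per channel/team share found on any project in the chain.
    for pid in chain:
        for source_id, source_type, level in shares.get(pid, []):
            rows.append((entity_id, entity_type, source_id, source_type, level, pid))
    # The creator owns the item directly, so granted_from_project_id is None.
    rows.append((entity_id, entity_type, creator_id, "user", "owner", None))
    # Owners of parent projects keep access to everything beneath them.
    for pid in chain:
        if pid in owners:
            rows.append((entity_id, entity_type, owners[pid], "user", "owner", pid))
    return rows


parent_of = {"project_c": "project_b", "project_b": "project_a"}
shares = {
    "project_a": [("channel_1", "channel", "view")],
    "project_b": [("team_1", "team", "comment")],
    "project_c": [("team_2", "team", "edit")],
}
owners = {"project_a": "owner1", "project_b": "owner2"}

new_rows = rows_for_new_item(
    "doc_9", "document", "user_7", "project_c", parent_of, shares, owners
)
```

With this input, `new_rows` holds the same six rows as the table above, with doc_9, document and user_7 filling in the placeholders.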

Moving a sub-project

In this example we will have the following projects:

A/
--B/
----C/
------ document_a
------ document_b

X/
--Y/
----Z/

In the example we are going to move project C from project B into project Z.

We will need to walk up the project tree to get all parent project ids (project_a, project_b). Notably this excludes project_c as we don’t need to change any of its existing permissions. project_c can still be shared with the team/channel like it used to be with no changes required.

Now we need to get all items in project_c (document_a, document_b).

Next, we DELETE FROM entity_access WHERE entity_id = ANY([document_a, document_b, project_c]) AND granted_from_project_id = ANY([project_a, project_b]). This removes the implicit permissions that these entities inherited from the old parent projects while project_c was inside them.

Now, we perform the steps in Adding an item to a project for every item in project_c (including project_c itself), this time using the new parent chain (project_x, project_y, project_z).
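Put together, the move can be sketched like this. It is a simplified illustration: rows are dictionaries without entity_type, `contents` is assumed to already list every item nested under the moved project, `shares` holds each project's direct grants (owner rows would follow the same pattern), and team_9 is a hypothetical team that project_x was shared with.

```python
def parents_of(project_id, parent_of):
    """Ancestors of project_id, nearest first, excluding project_id itself."""
    chain = []
    current = parent_of.get(project_id)
    while current is not None:
        chain.append(current)
        current = parent_of.get(current)
    return chain


def grant(entity_id, source_id, source_type, level, from_project):
    return {"entity_id": entity_id, "source_id": source_id,
            "source_type": source_type, "access_level": level,
            "granted_from_project_id": from_project}


def move_project(rows, project_id, new_parent_id, parent_of, contents, shares):
    moved = [project_id, *contents[project_id]]
    old_parents = set(parents_of(project_id, parent_of))
    # Step 1: drop grants inherited from the old parent chain.
    rows[:] = [
        r for r in rows
        if not (r["entity_id"] in moved
                and r["granted_from_project_id"] in old_parents)
    ]
    # Step 2: re-parent the project.
    parent_of[project_id] = new_parent_id
    # Step 3: inherit grants from every project in the new parent chain.
    for pid in reversed(parents_of(project_id, parent_of)):
        for source_id, source_type, level in shares.get(pid, []):
            for entity_id in moved:
                rows.append(grant(entity_id, source_id, source_type, level, pid))


parent_of = {"project_b": "project_a", "project_c": "project_b",
             "project_y": "project_x", "project_z": "project_y"}
contents = {"project_c": ["document_a", "document_b"]}
shares = {"project_a": [("channel_1", "channel", "view")],
          "project_x": [("team_9", "team", "edit")]}
rows = [
    grant("project_c", "channel_1", "channel", "view", "project_a"),
    grant("document_a", "channel_1", "channel", "view", "project_a"),
    grant("document_b", "channel_1", "channel", "view", "project_a"),
    grant("document_a", "team_2", "team", "edit", "project_c"),
]
move_project(rows, "project_c", "project_z", parent_of, contents, shares)
```

The grant that came from project_c itself (team_2) is untouched, the channel_1 rows inherited from project_a are gone, and all three moved entities pick up team_9 from project_x.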

Deleting a project

When a project is deleted, all of its items are deleted as well. The process is the same as it was with UserItemAccess, with a slightly different query: delete from entity_access where granted_from_project_id = x OR entity_id = x, with x being the id of the deleted project.
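As a small in-memory illustration of that filter (rows are simplified dictionaries, and the function name is hypothetical):

```python
def delete_project_rows(rows, project_id):
    """Keep only rows that are neither on the project nor inherited from it."""
    return [
        r for r in rows
        if r["entity_id"] != project_id
        and r["granted_from_project_id"] != project_id
    ]


rows = [
    {"entity_id": "project_c", "source_id": "team_2", "granted_from_project_id": None},
    {"entity_id": "document_a", "source_id": "team_2", "granted_from_project_id": "project_c"},
    {"entity_id": "document_x", "source_id": "team_2", "granted_from_project_id": None},
]
remaining = delete_project_rows(rows, "project_c")
```

Only the unrelated document_x row survives. Rows that the deleted items hold through other sources would be cleaned up when those items themselves are deleted.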

How do we access items

Now, instead of performing the recursive tree walk for projects to get all accessible items, we can simply get the user’s channels and teams (easily cacheable queries) and then perform 1 query on entity_access with source_id being ANY([channels, teams, user_id]).

Because a user can reach the same entity through multiple sources (for example, a channel that grants view and a team that grants edit), this query can return more than one row for a single entity. When that happens, the highest access level wins - so in that example the user ends up with edit.
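A minimal sketch of that lookup and the "highest access level wins" rule, in memory rather than as a SQL query. The ordering view < comment < edit < owner is an assumption based on the levels that appear in this post, and the function and variable names are hypothetical.

```python
# Assumed ordering of access levels, lowest to highest.
ACCESS_RANK = {"view": 0, "comment": 1, "edit": 2, "owner": 3}


def accessible_entities(user_id, channel_ids, team_ids, entity_access):
    """Map each reachable entity id to the highest access level found."""
    sources = {user_id, *channel_ids, *team_ids}
    best = {}
    for row in entity_access:
        if row["source_id"] not in sources:
            continue
        level = row["access_level"]
        current = best.get(row["entity_id"])
        if current is None or ACCESS_RANK[level] > ACCESS_RANK[current]:
            best[row["entity_id"]] = level
    return best


entity_access = [
    {"entity_id": "doc_1", "source_id": "channel_1", "access_level": "view"},
    {"entity_id": "doc_1", "source_id": "team_1", "access_level": "edit"},
    {"entity_id": "doc_2", "source_id": "channel_2", "access_level": "comment"},
]

as_member_of_both = accessible_entities("user_1", ["channel_1"], ["team_1"], entity_access)
after_leaving_team = accessible_entities("user_1", ["channel_1"], [], entity_access)
```

In the second call the user has left team_1 and their access to doc_1 drops to view without a single entity_access row being touched, which is the side-effect-free membership change described earlier.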

Conclusion

Sharing is now as easy as sending someone a message.

We can now quickly retrieve all items a user has access to.

Flattening entity access into a single table reshaped the system around the access pattern that actually mattered to us:

Reads got fast. The recursive tree walk became a single indexed lookup, taking our worst-case power user from 10+ seconds on a cache miss down to ~600ms.
Membership changes lost their side effects. Because grants are stored per source rather than per user, adding or removing someone from a channel or team is just adding or removing a member - no fan-out across every shared item.

None of this came for free. The recursion didn’t disappear, it moved to write time: granting access still walks the project tree and enumerates items. We were comfortable with that trade because our workload is overwhelmingly read-heavy, most items aren’t deeply nested, and if write propagation ever becomes a bottleneck the flattened design lets us push that work into an event-based pipeline without touching the read path.