Authorize data access at database level

Question

From the question title you might guess what is this about. I'll try to describe what I currently have and what I want to archive.

Suppose an application that handles four entities: User , Team , Repository and Document . The relationships between those entities are:

Each user belong to zero or more teams .
Each document belong to one repository
An user may own zero or more repositories
Each repository can be created as public or private
The content of a public repository is visible to all users who share a team with the repository's owner.
A private repository is only visible to it's owner.

Accessing documents of an user is not a problem, those are all documents stored in repositories that he owns. But the thing get complicated because what I really need is all documents visible to an user, this is all it's documents plus those documents other people made public and share a team with him.

Currently I'm enforcing this authorization mechanism in the Data Access Layer. This implies fetching all documents and do some filtering following rules above. I'm aware that this implementation is not scalable and I wonder if I can improve my database model by moving the authorization logic to the database. This way the filtering will be done by the DB engine and only requested entities will be returned to the client code.

This question is not tied to an specific implementation, but I'll tag it for the specific tools I'm using. Maybe it can be useful for someone's answer.

Answer 1

First let me explain why using entity framework (or another ORM tool) is more elegant than using stored procedures.

Stored Procedures are evil . That's why. As the link explains in detail, stored procedures tend to grow as a second BL and are therefore difficult to maintain. A simple task as renaming a column will become a big task when this column is used in multiple stored procedures. When you use a ORM tool, visual studio will do most of the work for you.

That said brings me to the second advantage of entity framework. You can compose your query by using your favorite .net language. Entity framework will not execute your query directly. You control when the query will be executed as you can read here . When doing this entity framework will compile your Linq statements to a complete tsql statement and run this against the database. So there is absolutely no need to fetch all data and loop through each record.

Tip: Move your cursor over the variable name and ef will give you a preview of the TSQL statement it will compile.

So how should your Linq query look like? I composed a test database based on your description and made an entity framework (ef6) model of it which looks like:

EF模型

This Linq query will do what you want, at least as I understood your question correctly.

private IEnumerable<Document> GetDocumentsOfUser(Guid userId)
    {
        using (var db = new DocumentRepositoryEntities())
        {
            // Get owned repositories by the user
            var ownedRepositories = db.Repositories
                          .Where(r => r.Owner.UserId == userId);

            // Get all users of teams the user belongs to
            var userInOtherTeams =
                 db.Users.Where(u => u.UserId == userId)
                        .SelectMany(u => u.Teams)
                        .SelectMany(t => t.Users);

            // Get the public repositories owned by the teammembers
            var repositoriesOwnedByTeamMembers =
                userInOtherTeams.Where(u => u.Repositories.Any())
                                .SelectMany(u => u.Repositories)
                                .Where(r => !r.Private);

            // Combine (union) the 2 lists of repositories
            var allRepositories = ownedRepositories.Concat(
                                  repositoriesOwnedByTeamMembers);

            // Get all the documents from the selected repositories
            return allRepositories.SelectMany(r => r.Documents)
                                  .Distinct()
                                  .ToArray(); //query will be composed here!
        }
    }

Note that the linq statement will be compiled to a TSQL select statement when the call to .ToArray() is made.

Answer 2

Based on your description, the goal is to find all of the repositories that the user currently has access to, then retrieve the documents from each of those repositories.

If this were my implementation, I would add a stored procedure to the database that accepts the current user's ID, then gathers the list of accessible repositories into a local table variable, then select from the documents table where the repository for the document is in the list of accessible repositories.

DECLARE
  @Teams TABLE (TeamID UNIQUEIDENTIFIER NOT NULL PRIMARY KEY (TeamID))

DECLARE
  @Repositories TABLE (RepositoryID UNIQUEIDENTIFIER NOT NULL PRIMARY KEY (RepositoryID))

  /* Get the list of teams the user is a member of */
INSERT INTO @Teams
SELECT Teams.TeamID
  FROM Teams INNER JOIN TeamUsers ON Teams.ID = TeamUsers.TeamID
 WHERE TeamUsers.UserID = @UserID

  /* Get the list of repositories the user shares a team member with */
INSERT INTO @Repositories
SELECT RepositoryID
  FROM Repositories
 WHERE OwnerID = @UserID 
    OR (OwnerID IN (SELECT DISTINCT TeamUsers.UserID
                     FROM TeamUsers INNER JOIN @Teams ON TeamUsers.TeamID = @Teams.TeamID)
       AND IsShared = 1)

  /* Finally, retrieve the documents in the specified repositories */
SELECT Documents.* 
  FROM Documents INNER JOIN @Repositories ON Documents.RepositoryID = @Repositories.RepositoryID

Answer 3

While the answer competent_tech suggests is valid, and good if your need is a one-off, what you would ideally want to do is implement your authorization requirements in a dedicated layer, in an externalized fashion. Reasons to do this include:

easier to maintain a decoupled architecture
you can update your authorization without touching your application and/or database
you do not need SQL / stored procedure knowledge
you can report more easily on what authorization is applied where: this is important if you have auditors breathing down your neck.

To achieve externalized authorization (see here for a Gartner report on the topic), you need to consider attribute-based access control (ABAC - see here for a report on ABAC by NIST ) and the eXtensible Access Control Markup Language (XACML - more info here ) as a means to implement ABAC.

If you follow the ABAC approach you get:

a clean, decoupled architecture with the notion of
- an enforcement point or interceptor that will sit between your application and your database (in the case of ABAC applied to databases)
- an authorization decision engine that reaches decisions and will produce a filter statement (a WHERE clause in the case of a SQL database) that the enforcement point will append to the original SQL statement
a policy-based and attribute-based authorization model whereby you can write authorization requirements in easy-to-understand statements instead of procedures, PL-SQL or other SQL artefacts. Examples include:
- *a user can edit a document they own
- a user can view documents if the user's team == the document's team
- a user can view documents of another team if and only if the documents are marked as public
- a user with the role editor can edit documents that belong to their team if and only if the document state is draft*

In the above examples, the user type, the resource type (document), the action (view, edit), the document's team, the user's team, and the document's visibility (private or public) are all examples of attributes. Attributes are the lifeline, the building blocks of ABAC.

ABAC can easily help you implement your authorization requirements from the simplest ones to the more advanced ones (such as can be found in export regulations, compliance regulations, or other business rules).

One neat benefit of this approach is that it is not specific to databases. You can apply the same principle and policies to home-grown apps, APIs, web services, and more. That's what I call the any-depth architecture / approach to externalized authorization. The following diagram summarizes it well:

任意深度的架构和外部授权

The PDP is your centralized authorization engine.

Authorize data access at database level

Question

3 answers

solution1
3 ACCPTED 2014-10-13 22:35:28

solution2
1 2014-10-06 20:28:59

solution3
1 2014-10-07 20:53:34

Authorize data access at database level

Question

3 answers

solution1 3 ACCPTED 2014-10-13 22:35:28

solution2 1 2014-10-06 20:28:59

solution3 1 2014-10-07 20:53:34

solution1
3 ACCPTED 2014-10-13 22:35:28

solution2
1 2014-10-06 20:28:59

solution3
1 2014-10-07 20:53:34