Batch Entity Loading

Optimize multi-entity loading with fetchBatch for high-throughput authorization.

The Problem

Without batch loading, checking 100 documents requires 100 database queries:
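
In code, the naive approach issues one query per ID (a sketch; db.findDocument and toCedar are the same helpers used in the fetcher below, docIds is a hypothetical Set[String] of the documents being checked, and an ExecutionContext is assumed in scope):

import scala.concurrent.Future

// One round trip per document: N IDs means N queries
val loaded: Future[List[Option[Entities.Document]]] =
  Future.traverse(docIds.toList)(id => db.findDocument(id).map(_.map(toCedar)))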

For 100 documents: ~500ms without batch vs ~5ms with batch (100x faster).

Implementing fetchBatch

Override fetchBatch in your EntityFetcher:

import cats.Applicative
import scala.concurrent.{ExecutionContext, Future}

class DocumentFetcher(db: Database)(using ec: ExecutionContext)
    extends EntityFetcher[Future, Entities.Document, String] {

  def fetch(id: String): Future[Option[Entities.Document]] =
    db.findDocument(id).map(_.map(toCedar))

  override def fetchBatch(ids: Set[String])(using Applicative[Future]): Future[Map[String, Entities.Document]] =
    db.run(documents.filter(_.id.inSet(ids)).result).map { docs =>
      docs.map(d => d.id -> toCedar(d)).toMap
    }

  private def toCedar(doc: DomainDocument): Entities.Document = ...
}

Return Type

fetchBatch returns F[Map[Id, A]]:

  • Keys are the entity IDs
  • Values are the loaded entities
  • Missing IDs are simply not included in the map

// Request IDs: Set("doc-1", "doc-2", "doc-3")
// If doc-2 doesn't exist:
// Result: Map("doc-1" -> Document(...), "doc-3" -> Document(...))

Database-Specific Examples

Slick (PostgreSQL/MySQL)

override def fetchBatch(ids: Set[String])(using Applicative[Future]) =
  db.run(documents.filter(_.id.inSet(ids)).result).map { docs =>
    docs.map(d => d.id -> toCedar(d)).toMap
  }

Doobie (PostgreSQL)

import cats.effect.unsafe.implicits.global // IORuntime needed by unsafeToFuture
import cats.syntax.all.*                   // provides toNel
import doobie.*
import doobie.implicits.*
import doobie.postgres.implicits.*

override def fetchBatch(ids: Set[String])(using Applicative[Future]) =
  // Fragments.in needs a NonEmptyList, so handle the empty-set case explicitly
  ids.toList.toNel.fold(Future.successful(Map.empty[String, Entities.Document])) { nel =>
    val query =
      fr"SELECT id, folder_id, name, owner FROM documents WHERE" ++
        Fragments.in(fr"id", nel)

    query.query[DomainDocument]
      .to[List]
      .transact(xa) // xa: Transactor[IO], assumed to be in scope
      .map(docs => docs.map(d => d.id -> toCedar(d)).toMap)
      .unsafeToFuture()
  }

Skunk (PostgreSQL)

import cats.effect.unsafe.implicits.global // IORuntime needed by unsafeToFuture
import skunk.*
import skunk.codec.all.*
import skunk.implicits.*

override def fetchBatch(ids: Set[String])(using Applicative[Future]) = {
  val idList = ids.toList // assumed non-empty; guard before querying if it can be empty

  // text.list expands to one placeholder per ID; `document` is the row decoder
  // and `session: Session[IO]` is assumed to be in scope
  val query = sql"""
    SELECT id, folder_id, name, owner
    FROM documents
    WHERE id IN (${text.list(idList.length)})
  """.query(document)

  session.prepare(query)
    .flatMap(_.stream(idList, 64).compile.toList)
    .map(docs => docs.map(d => d.id -> toCedar(d)).toMap)
    .unsafeToFuture()
}

Factory Method

Use the factory for simple cases:

val fetcher = EntityFetcher.withBatch[Future, Entities.Document](
  f = id => db.findDocument(id).map(_.map(toCedar)),
  batch = ids => db.findDocuments(ids).map(_.map(d => d.id -> toCedar(d)).toMap)
)

When fetchBatch is Called

EntityStore calls fetchBatch when:

  1. Batch authorization - runner.filterAllowed(items) or runner.batchIsAllowed(checks)
  2. Loading entities for a request - When multiple entities of the same type are needed
  3. Deferred resolution - When loading parent chains for multiple resources

If fetchBatch is not overridden, the default implementation calls fetch for each ID individually.
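
Conceptually, that fallback behaves roughly like the sketch below (an illustration written inside DocumentFetcher, not the library's actual code; it uses cats traverse syntax and the class's ExecutionContext):

import cats.Applicative
import cats.syntax.all.*

// Roughly what the non-overridden fetchBatch does: one fetch per ID,
// keeping only the IDs that were actually found.
def fetchBatchDefault(ids: Set[String])(using Applicative[Future]): Future[Map[String, Entities.Document]] =
  ids.toList
    .traverse(id => fetch(id).map(id -> _))
    .map(_.collect { case (id, Some(doc)) => id -> doc }.toMap)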

Chunking Large Batches

For very large batches, consider chunking:

override def fetchBatch(ids: Set[String])(using Applicative[Future]) = {
  val chunks = ids.grouped(1000).toList // max 1000 IDs per query

  Future.sequence(chunks.map { chunk =>
    db.run(documents.filter(_.id.inSet(chunk)).result)
  }).map { results =>
    results.flatten.map(d => d.id -> toCedar(d)).toMap
  }
}

This avoids database query size limits while still being much faster than individual queries.

Parallel Loading Across Types

EntityStore loads the different entity types a request needs in parallel, rather than one type after another.
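
For Future-based fetchers, the effect is similar to starting each per-type batch load before sequencing the results. A rough illustration (documentFetcher, folderFetcher, docIds and folderIds are hypothetical names, not part of the library):

import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global
import cats.instances.future.* // Applicative[Future] for the fetchBatch calls

// Both batch loads start immediately, so the two queries run concurrently.
val docsF    = documentFetcher.fetchBatch(docIds)
val foldersF = folderFetcher.fetchBatch(folderIds)

val combined = for
  docs    <- docsF
  folders <- foldersF
yield (docs, folders)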

Performance Tips

  1. Always implement fetchBatch - Even with caching, cold starts benefit from batch loading

  2. Add database indexes - Ensure your id columns are indexed for IN queries (see the Slick sketch after this list)

  3. Monitor query performance - Large IN clauses can be slow without proper indexes

  4. Consider query limits - Some databases limit IN clause size (e.g., Oracle: 1000)

  5. Use appropriate batch sizes - Balance between fewer queries and query complexity
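
For the Slick setup used above, a secondary index can be declared directly on the table definition. This is a sketch: the table and column names are assumed to match the earlier queries, and an id primary key is already indexed by the database:

import slick.jdbc.PostgresProfile.api.*

// Sketch of a Slick table definition with the lookup columns indexed.
class Documents(tag: Tag) extends Table[DomainDocument](tag, "documents") {
  def id       = column[String]("id", O.PrimaryKey) // primary keys are indexed automatically
  def folderId = column[String]("folder_id")
  def name     = column[String]("name")
  def owner    = column[String]("owner")

  // Secondary index for a non-key column you batch-filter on
  def ownerIdx = index("idx_documents_owner", owner)

  def * = (id, folderId, name, owner).mapTo[DomainDocument]
}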