Connection Pooling
Overview
ldbc-connector is a library designed for Cats Effect that provides high-performance and safe database connection pooling. Unlike traditional JVM thread-based pooling (such as HikariCP), it is designed to fully utilize Cats Effect's fiber-based concurrency model.
Limitations with Scala Native 0.4.x
The current Scala Native 0.4.x only supports single-threaded execution. Since the connection pooling functionality in ldbc-connector is designed with multi-threading in mind, when using it with Scala Native 0.4.x, the following issues may occur:
- Concurrent connection management does not work correctly
- Background tasks (HouseKeeper, AdaptivePoolSizer, etc.) may not execute as expected
- Concurrent connections are effectively limited to 1
- Deadlocks or unexpected behavior may occur
Scala Native 0.5.x is planned to support multi-threading, but until then, using connection pooling with Scala Native is not recommended. Instead, we recommend creating and using a new connection for each database operation.
Architecture Overview
The ldbc-connector pooling system consists of the following main components:
1. PooledDataSource
The core component of the pool that manages the entire connection lifecycle.
Main responsibilities:
- Coordinating connection acquisition and release
- Managing pool size
- Collecting metrics
- Coordinating background tasks
2. ConcurrentBag
A high-performance concurrent data structure inspired by HikariCP's ConcurrentBag, but optimized for Cats Effect fibers rather than JVM threads.
Features:
- Lock-free operations
- Direct handoff between fibers
- Efficient wait queue management
- Atomic state management using Ref[F]
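The combination of an idle list, a wait queue, and direct handoff can be sketched as a pure state model. This is a minimal illustration, not ldbc-connector's actual data structure: `BagState`, `Borrow`, `Handoff`, and the integer waiter ids are all hypothetical names. In a Cats Effect implementation, functions of this shape would typically be passed to `Ref#modify`, and each waiter would hold a `Deferred` that the releasing fiber completes.

```scala
import scala.collection.immutable.Queue

// Hypothetical model: idle items plus a FIFO queue of waiter ids.
final case class BagState[A](idle: List[A], waiters: Queue[Int])

enum Borrow[+A]:
  case Got(item: A)            // an idle item was available
  case Enqueued(waiterId: Int) // nothing idle: the caller joins the wait queue

enum Handoff:
  case ToWaiter(waiterId: Int) // the item goes straight to the oldest waiter
  case ToIdle                  // nobody is waiting: the item becomes idle

object BagState:
  // Take an idle item if there is one, otherwise register the caller as a waiter.
  def borrow[A](s: BagState[A], waiterId: Int): (BagState[A], Borrow[A]) =
    s.idle match
      case head :: tail => (s.copy(idle = tail), Borrow.Got(head))
      case Nil          => (s.copy(waiters = s.waiters.enqueue(waiterId)), Borrow.Enqueued(waiterId))

  // Return an item: hand it to the oldest waiter, or put it back on the idle list.
  def giveBack[A](s: BagState[A], item: A): (BagState[A], Handoff) =
    s.waiters.dequeueOption match
      case Some((waiter, rest)) => (s.copy(waiters = rest), Handoff.ToWaiter(waiter))
      case None                 => (s.copy(idle = item :: s.idle), Handoff.ToIdle)
```

Because each transition is a pure function from the old state to a new state and a result, it can be applied atomically with compare-and-set retries instead of locks.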
3. CircuitBreaker
A critical component for preventing the "Thundering Herd" problem when the database is down.
CircuitBreaker Details
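State Transition Diagram
[Diagram: Closed -> Open when the failure threshold is reached; Open -> Half-Open after the reset timeout elapses; Half-Open -> Closed on success, or back to Open on failure]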
Purpose of CircuitBreaker
The CircuitBreaker pattern is implemented to solve the following problems:
- Thundering Herd Problem Prevention
  - Prevents situations where a large number of clients attempt to reconnect simultaneously after the database becomes temporarily unavailable
  - This avoids further increasing the load on an already struggling database
- Fail Fast
  - When the database is unresponsive, new connection attempts fail immediately
  - This prevents clients from waiting for long timeouts
- Gradual Recovery
  - Carefully tests whether the service has recovered through the Half-Open state
  - Exponential backoff gradually increases retry intervals for repeated failures
Implementation Details
trait CircuitBreaker[F[_]]:
  def protect[A](action: F[A]): F[A]
  def state: F[CircuitBreaker.State]
  def reset: F[Unit]
Configuration parameters:
- maxFailures: Number of failures before transitioning to Open state (default: 5)
- resetTimeout: Time before transitioning from Open to Half-Open state (default: 60 seconds)
- exponentialBackoffFactor: Timeout increase factor on failure (default: 2.0)
- maxResetTimeout: Maximum reset timeout (default: 5 minutes)
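To see how these parameters interact, the sketch below computes the Open-state timeout after a number of failed recovery attempts. The formula (multiply by the factor on each failed attempt, cap at the maximum) is an assumption about how such a backoff is usually defined; `openTimeout` is a hypothetical helper, not part of the ldbc-connector API.

```scala
import scala.concurrent.duration.*

// Hypothetical helper. Assumed formula: resetTimeout * factor^n, capped at maxResetTimeout.
def openTimeout(
  failedProbes: Int,
  resetTimeout: FiniteDuration = 60.seconds,
  exponentialBackoffFactor: Double = 2.0,
  maxResetTimeout: FiniteDuration = 5.minutes
): FiniteDuration =
  val scaledMillis = resetTimeout.toMillis * math.pow(exponentialBackoffFactor, failedProbes.toDouble)
  math.min(scaledMillis, maxResetTimeout.toMillis.toDouble).toLong.millis

@main def showBackoff(): Unit =
  // With the defaults listed above: 60s, 120s, 240s, then capped at 300s.
  (0 to 4).foreach(n => println(s"after $n failed probes: ${openTimeout(n).toSeconds}s"))
```

Under this assumption the cap is reached on the third failed probe, so a database that stays down is retried at most once every five minutes.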
Operation Flow
- Closed State
  - All requests are processed normally
  - Failures are counted, and the state transitions to Open when the threshold is reached
- Open State
  - All requests fail immediately (fail fast)
  - After the reset timeout elapses, attempts to transition to Half-Open state
- Half-Open State
  - Allows a single test request
  - On success: Returns to Closed state
  - On failure: Returns to Open state with exponentially increased timeout
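The flow above can be modeled as a small pure state machine. This is a sketch under stated assumptions, not the library's implementation: the `State` fields, `Config`, and the `Breaker` functions are hypothetical, time is passed in explicitly as a duration since an arbitrary start, and the failure counter is assumed to reset on any success.

```scala
import scala.concurrent.duration.*

// Hypothetical state model; field names are illustrative.
enum State:
  case Closed(failures: Int)
  case Open(retryAt: FiniteDuration, timeout: FiniteDuration)
  case HalfOpen(lastTimeout: FiniteDuration)

final case class Config(
  maxFailures: Int = 5,
  resetTimeout: FiniteDuration = 60.seconds,
  exponentialBackoffFactor: Double = 2.0,
  maxResetTimeout: FiniteDuration = 5.minutes
)

object Breaker:
  import State.*

  // Should a request arriving at `now` be allowed to run?
  def admit(s: State, now: FiniteDuration): (State, Boolean) = s match
    case Closed(_)                                => (s, true)
    case Open(retryAt, timeout) if now >= retryAt => (HalfOpen(timeout), true) // the single test request
    case Open(_, _)                               => (s, false)                // fail fast
    case HalfOpen(_)                              => (s, false)                // a test request is already in flight

  // Assumption: any success returns to Closed with a fresh failure count.
  def onSuccess(s: State): State = Closed(0)

  def onFailure(s: State, now: FiniteDuration, cfg: Config): State = s match
    case Closed(n) if n + 1 >= cfg.maxFailures =>
      Open(now + cfg.resetTimeout, cfg.resetTimeout)
    case Closed(n) => Closed(n + 1)
    case HalfOpen(last) =>
      // Back to Open with an exponentially increased, capped timeout.
      val nextMillis =
        math.min(last.toMillis * cfg.exponentialBackoffFactor, cfg.maxResetTimeout.toMillis.toDouble).toLong
      Open(now + nextMillis.millis, nextMillis.millis)
    case open: Open => open
```

A `protect`-style wrapper would call `admit` before running the action and `onSuccess` or `onFailure` afterwards, keeping the state in an atomic reference.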
JVM Threads vs Cats Effect Fibers
Differences and Characteristics of Concurrency Models
JVM Thread Characteristics (HikariCP, etc.)
Advantages:
- OS-level support: Preemptive scheduling
- Existing ecosystem: Supported by many libraries and tools
- Debug tools: Mature profiling and monitoring tools
- Simple execution model: Easy-to-understand execution flow
Limitations:
- Memory usage: 1-2MB per thread
- Context switching: High cost at kernel level
- Scalability: Practical limit of thousands of threads
- Blocking: Threads actually block
Cats Effect Fiber Characteristics (ldbc-connector)
Advantages:
- Memory efficiency: ~150 bytes per fiber
- Lightweight context switching: User-space switching
- High scalability: Millions of fibers possible
- Semantic blocking: Worker threads are released
Limitations:
- Cooperative scheduling: Requires explicit yield points
- Learning curve: Requires functional programming knowledge
- Ecosystem: Limited supporting libraries
- Debug complexity: Difficult to trace asynchronous execution flow
Impact on Pooling Design
The characteristics of each concurrency model lead to different pooling implementation approaches:
Thread-based Pool (HikariCP style)
- Pool size: Requires careful configuration due to OS resource constraints
- Wait strategy: Blocking wait, utilizing dedicated thread pools
- Use cases: CPU-intensive tasks, integration with legacy systems
- Operations: Easy management with existing monitoring tools
Fiber-based Pool (ldbc-connector)
- Pool size: Larger pool sizes possible
- Wait strategy: Non-blocking wait, efficient resource sharing
- Use cases: I/O-intensive tasks, environments requiring high concurrency
- Operations: Requires Cats Effect-compatible monitoring and management
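The non-blocking wait strategy can be illustrated with a small Cats Effect program in which 1,000 fibers contend for 10 permits. This is only an analogy: a `Semaphore` stands in for pool capacity, `fakeQuery` is a hypothetical placeholder for a database call, and nothing here is ldbc-connector code. The scala-cli directives at the top are one way to run it and assume Cats Effect 3.5.4.

```scala
//> using scala 3.3.3
//> using dep org.typelevel::cats-effect:3.5.4
import cats.effect.{IO, IOApp}
import cats.effect.std.Semaphore
import cats.syntax.all.*
import scala.concurrent.duration.*

object SemanticBlockingDemo extends IOApp.Simple:
  // Hypothetical stand-in for a query: holds one permit for 10 ms.
  // A fiber waiting for a permit is suspended; no worker thread is parked.
  def fakeQuery(permits: Semaphore[IO], id: Int): IO[Int] =
    permits.permit.use(_ => IO.sleep(10.millis).as(id))

  val run: IO[Unit] =
    for
      permits <- Semaphore[IO](10)                                    // "pool" of 10
      results <- (1 to 1000).toList.parTraverse(fakeQuery(permits, _)) // 1000 concurrent fibers
      _       <- IO.println(s"completed ${results.size} queries")
    yield ()
```

A thread-based version of the same program would need either 1,000 threads or a queue in front of a smaller thread pool. Here the waiting fibers cost only heap memory while they are suspended.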
Comparison with HikariCP
Similarities
- Performance-focused design
- Lock-free data structures like ConcurrentBag
- Connection management using proxy pattern
- Automatic pool size adjustment
Differences
Feature | HikariCP | ldbc-connector |
---|---|---|
Concurrency Model | JVM Threads | Cats Effect Fibers |
Blocking Handling | Blocks threads | Semantic blocking |
Scalability | Limited by thread count | Virtually unlimited fibers |
CircuitBreaker | Requires external library | Built-in |
Error Handling | Exception-based | Functional |
Resource Management | try-with-resources | Resource |
Usage Scenarios and Selection Criteria
When ldbc-connector is suitable:
- High Concurrency Environments
  - Thousands of concurrent connection requests
  - Microservice architectures
  - Reactive applications
- I/O-Bound Workloads
  - Long-running queries
  - Simultaneous access to multiple databases
  - Asynchronous processing pipelines
- Cats Effect Ecosystem
  - Already using Cats Effect
  - Adopting functional programming approach
  - Type safety-focused environments
When thread-based pools like HikariCP are suitable:
- Integration with Existing Systems
  - Legacy applications
  - Traditional frameworks like Spring Framework
  - Compatibility with JDBC-compliant tools
- Operational Considerations
  - Utilizing existing monitoring and management tools
  - Leveraging existing team knowledge
  - Proven, stable implementation
- Simple Concurrency Requirements
  - Moderate concurrent connection counts
  - Predictable workloads
  - CPU-intensive processing
Background Tasks
ldbc-connector runs multiple background tasks to maintain pool health:
HouseKeeper
- Removes expired connections
- Handles idle timeout
- Maintains minimum connection count
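One way to read these three duties together is as a single maintenance pass over the idle connections. The sketch below is an assumption about how such a pass could be ordered, not the HouseKeeper's actual logic; `Entry`, `Plan`, and `housekeep` are hypothetical names, and timestamps are durations since an arbitrary start.

```scala
import scala.concurrent.duration.*

// Hypothetical bookkeeping record for an idle connection.
final case class Entry(id: Int, createdAt: FiniteDuration, lastUsedAt: FiniteDuration)

// Result of one pass: what to close, what to keep, how many to open.
final case class Plan(evict: List[Entry], keep: List[Entry], toCreate: Int)

def housekeep(
  idleEntries: List[Entry],
  inUse: Int,
  now: FiniteDuration,
  maxLifetime: FiniteDuration,
  idleTimeout: FiniteDuration,
  minConnections: Int
): Plan =
  // 1. Connections past their maximum lifetime are always evicted.
  val (expired, alive) = idleEntries.partition(e => now - e.createdAt >= maxLifetime)
  // 2. Idle-timeout eviction only trims the surplus above minConnections,
  //    starting with the connections that have been idle the longest.
  val surplus = math.max(0, alive.size + inUse - minConnections)
  val idleTooLong = alive
    .filter(e => now - e.lastUsedAt >= idleTimeout)
    .sortBy(_.lastUsedAt)
    .take(surplus)
  val keep = alive.filterNot(e => idleTooLong.contains(e))
  // 3. Top the pool back up to the minimum if eviction went below it.
  val toCreate = math.max(0, minConnections - (keep.size + inUse))
  Plan(expired ++ idleTooLong, keep, toCreate)
```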
AdaptivePoolSizer
- Dynamically adjusts pool size based on utilization metrics
- Scales up and down based on load
- Uses cooldown periods to stabilize sizing decisions
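A utilization-driven sizing rule with a cooldown might look like the following. Every threshold here (80% and 30% utilization, 25% growth steps, a 2-minute cooldown) is a hypothetical value chosen for illustration; the AdaptivePoolSizer's real rules and constants may differ.

```scala
import scala.concurrent.duration.*

// Hypothetical snapshot of pool metrics at decision time.
final case class Snapshot(total: Int, inUse: Int, waiting: Int)

def targetSize(
  s: Snapshot,
  minSize: Int,
  maxSize: Int,
  sinceLastResize: FiniteDuration,
  cooldown: FiniteDuration = 2.minutes
): Int =
  val utilization = if s.total == 0 then 1.0 else s.inUse.toDouble / s.total
  val proposed =
    if sinceLastResize < cooldown then s.total            // cooling down: hold the current size
    else if utilization > 0.8 || s.waiting > 0 then
      s.total + math.max(1, s.total / 4)                   // scale up by about 25%
    else if utilization < 0.3 then s.total - 1             // scale down gently
    else s.total
  // Never leave the configured bounds.
  proposed.max(minSize).min(maxSize)
```

Scaling up faster than scaling down, combined with the cooldown, keeps the pool from oscillating when load hovers near a threshold.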
KeepaliveExecutor
- Periodically validates idle connections
- Maintains connections and prevents timeouts
Configuration Examples
Basic Usage
import cats.effect.IO
import ldbc.connector.*
import scala.concurrent.duration.*

// Pool configuration
val config = MySQLConfig.default
  .setHost("localhost")
  .setPort(3306)
  .setUser("myuser")
  .setPassword("mypassword")
  .setDatabase("mydb")
  // Pool size settings
  .setMinConnections(5)                  // Minimum connections (default: 5)
  .setMaxConnections(20)                 // Maximum connections (default: 10)
  // Timeout settings
  .setConnectionTimeout(30.seconds)      // Connection acquisition timeout (default: 30s)
  .setIdleTimeout(10.minutes)            // Idle timeout (default: 10min)
  .setMaxLifetime(30.minutes)            // Maximum lifetime (default: 30min)
  .setValidationTimeout(5.seconds)       // Validation timeout (default: 5s)
  // Validation & Health checks
  .setAliveBypassWindow(500.millis)      // Skip validation if used recently (default: 500ms)
  .setKeepaliveTime(2.minutes)           // Idle validation interval (default: 2min)
  .setConnectionTestQuery("SELECT 1")    // Custom test query (optional)
  // Leak detection
  .setLeakDetectionThreshold(2.minutes)  // Connection leak detection (default: none)
  // Maintenance
  .setMaintenanceInterval(30.seconds)    // Background cleanup interval (default: 30s)
  // Adaptive sizing
  .setAdaptiveSizing(true)               // Dynamic pool size adjustment (default: false)
  .setAdaptiveInterval(1.minute)         // Adaptive sizing check interval (default: 1min)

// Create pooled datasource
val poolResource = MySQLDataSource.pooling[IO](config)

// Use the pool
poolResource.use { pool =>
  pool.getConnection.use { conn =>
    // Use connection
    for
      stmt   <- conn.createStatement()
      rs     <- stmt.executeQuery("SELECT 1")
      _      <- rs.next()
      result <- rs.getInt(1)
    yield result
  }
}
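The alive bypass window in the configuration above can be read as a simple rule: a connection that was used very recently is assumed to still be healthy and is handed out without a validation round trip. The helper below is a hypothetical illustration of that rule, not ldbc-connector's code; timestamps are durations since an arbitrary start.

```scala
import scala.concurrent.duration.*

// Hypothetical helper: validate only if the connection has been unused
// for longer than the bypass window.
def needsValidation(
  lastUsedAt: FiniteDuration,
  now: FiniteDuration,
  aliveBypassWindow: FiniteDuration = 500.millis
): Boolean =
  now - lastUsedAt > aliveBypassWindow
```

Under this reading, a larger window saves validation queries on a busy pool but raises the chance of handing out a connection that has just gone stale.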
Pool with Metrics Tracking
import cats.effect.Resource
import ldbc.connector.pool.*

val metricsResource = for
  tracker <- Resource.eval(PoolMetricsTracker.inMemory[IO])
  pool    <- MySQLDataSource.pooling[IO](
               config,
               metricsTracker = Some(tracker)
             )
yield (pool, tracker)

metricsResource.use { case (pool, tracker) =>
  // Use pool and monitor metrics
  for
    _       <- pool.getConnection.use(conn => /* use connection */ IO.unit)
    metrics <- tracker.getMetrics
    _       <- IO.println(s"Total acquisitions: ${metrics.totalAcquisitions}")
    _       <- IO.println(s"Average acquisition time: ${metrics.acquisitionTime}")
  yield ()
}
Pool with Lifecycle Hooks
case class SessionContext(userId: String, startTime: Long)

// Runs when a connection is acquired; the returned context is passed to the after hook
val beforeHook: Connection[IO] => IO[SessionContext] = conn =>
  for
    _ <- conn.createStatement().flatMap(_.executeUpdate("SET SESSION sql_mode = 'STRICT_ALL_TABLES'"))
    startTime = System.currentTimeMillis
  yield SessionContext("user123", startTime)

// Runs when the connection is released
val afterHook: (SessionContext, Connection[IO]) => IO[Unit] = (ctx, conn) =>
  IO.println(s"Connection used by ${ctx.userId} for ${System.currentTimeMillis - ctx.startTime}ms")

val poolWithHooks = MySQLDataSource.poolingWithBeforeAfter[IO, SessionContext](
  config = config,
  before = Some(beforeHook),
  after  = Some(afterHook)
)
CircuitBreaker Configuration
The pool configures its CircuitBreaker internally. The current implementation uses the following values:
- maxFailures: 5 (threshold for transitioning to Open state)
- resetTimeout: 30 seconds (time before transitioning to Half-Open state)
- exponentialBackoffFactor: 2.0 (backoff factor)
- maxResetTimeout: 5 minutes (maximum reset timeout)
Benchmark Results
This section compares the performance of ldbc-connector and HikariCP at different thread counts. The benchmark measures performance when executing SELECT statements concurrently.
Test Environment
- Benchmark content: Concurrent execution of SELECT statements
- Test targets: ldbc-connector vs HikariCP
- Thread counts: 1, 2, 4, 8, 16
Result Graphs
[Figures: benchmark result graphs for thread counts 1, 2, 4, 8, and 16]
Analysis of Benchmark Results
The following trends can be observed from these benchmark results:
- Low Concurrency Environment (1-2 threads)
  - The two implementations perform similarly
  - The difference in overhead is small for simple workloads
- Medium Concurrency Environment (4-8 threads)
  - As concurrency increases, the characteristics of each implementation begin to emerge
  - The impact of the fiber-based lightweight concurrency model can be observed
- High Concurrency Environment (16 threads)
  - The behavioral differences between the two implementations become clear under high concurrency
  - Resource efficiency and scalability characteristics become prominent
Performance Characteristics Discussion
The benchmark results reflect the inherent characteristics of each approach:
ldbc-connector (Fiber-based)
- Efficient resource utilization through lightweight concurrency primitives
- CPU usage optimization through semantic blocking
- Scalability under high concurrency
HikariCP (Thread-based)
- Stable performance from mature implementation
- Fairness through OS-level scheduling
- High compatibility with existing JVM toolchains
Recommendations by Use Case
Recommendations based on benchmark results for different use cases:
- When low to moderate concurrency is required
  - Both implementations provide sufficient performance
  - Choose based on existing infrastructure and team experience
- When high concurrency is required
  - Consider application characteristics (I/O-centric vs CPU-centric)
  - Also consider operational requirements (monitoring, debugging, troubleshooting)
- For dynamic workloads
  - Consider utilizing adaptive pool sizing
  - CircuitBreaker behavior during failures is also an important selection criterion
It's important to note that benchmark results are measurements under specific conditions, and actual application performance is influenced by many factors including workload characteristics, database configuration, and network environment. Testing with actual workloads is recommended before production deployment.
Summary
The ldbc-connector pooling system is built on Cats Effect's fiber-based concurrency model, and its built-in CircuitBreaker improves resilience during database failures.
Fiber-based and thread-based approaches each have strengths and weaknesses. Choose between them by weighing application requirements, existing infrastructure, team skills, and operational considerations.
ldbc-connector is best suited to environments that require high concurrency or that already use the Cats Effect ecosystem.