
@vyommani
Contributor

Bulk policy evaluation holds read lock for entire batch, causing writer starvation and delayed policy updates

What changes were proposed in this pull request?

When delta sync is disabled (deltaEnabled=false), policy evaluations now use a lock-free snapshot instead of holding locks during the evaluation loop. The existing locked path remains unchanged when delta sync is enabled.
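The difference between the two paths can be sketched as below. This is only an illustration of the general technique, not the code in this patch: the class `SnapshotEvaluatorSketch`, its methods, and the use of `Predicate<String>` as a stand-in for a policy are all hypothetical, and the sketch assumes that policy updates replace an immutable list wholesale rather than mutating it in place.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Predicate;

// Hypothetical illustration; none of these names are taken from Apache Ranger.
class SnapshotEvaluatorSketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    // Assumption: the policy list is immutable and swapped as a whole on update.
    private volatile List<Predicate<String>> policies = Collections.emptyList();
    private final boolean deltaEnabled;

    SnapshotEvaluatorSketch(boolean deltaEnabled) {
        this.deltaEnabled = deltaEnabled;
    }

    // Writer: publishes a new immutable list under the write lock.
    void replacePolicies(List<Predicate<String>> newPolicies) {
        lock.writeLock().lock();
        try {
            policies = Collections.unmodifiableList(new ArrayList<>(newPolicies));
        } finally {
            lock.writeLock().unlock();
        }
    }

    List<Boolean> evaluateBatch(List<String> requests) {
        if (deltaEnabled) {
            // Locked path: the read lock is held for the whole batch,
            // so a waiting writer is blocked until the loop finishes.
            lock.readLock().lock();
            try {
                return evaluateAll(policies, requests);
            } finally {
                lock.readLock().unlock();
            }
        }
        // Snapshot path: one volatile read, then no lock during the loop.
        List<Predicate<String>> snapshot = policies;
        return evaluateAll(snapshot, requests);
    }

    // A request is allowed if any policy matches it.
    private static List<Boolean> evaluateAll(List<Predicate<String>> ps, List<String> requests) {
        List<Boolean> results = new ArrayList<>(requests.size());
        for (String request : requests) {
            boolean allowed = false;
            for (Predicate<String> p : ps) {
                if (p.test(request)) {
                    allowed = true;
                    break;
                }
            }
            results.add(allowed);
        }
        return results;
    }

    public static void main(String[] args) {
        SnapshotEvaluatorSketch evaluator = new SnapshotEvaluatorSketch(false);
        Predicate<String> publicOnly = r -> r.startsWith("/public");
        evaluator.replacePolicies(Collections.singletonList(publicOnly));
        System.out.println(evaluator.evaluateBatch(Arrays.asList("/public/a", "/private/b")));
    }
}
```

In the snapshot path a batch keeps evaluating against the list it captured even if an update arrives midway, so a writer never waits for a long batch to finish. The trade-off is that an in-flight batch may complete against the previous policy set.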

How was this patch tested?

mvn clean install completes cleanly, and a benchmark run shows improved throughput.

@vyommani
Contributor Author

Below is the output of the benchmark that I ran.

=== BULK EVALUATION PERFORMANCE BENCHMARK ===

Configuration:
Policies: 50,000
Threads: 24
Batches per thread: 30,000
CPU cores: 12

Scenario: Lock-free Snapshot
Configuration: deltaEnabled=false, inPlaceUpdates=false
isPolicyEngineMutable=false


Batch Size | Total Requests | Duration (s) | Throughput | vs Baseline | Memory | P95 Latency

       1 |         720,000 |        0.349 |    2,061,681 r/s |            - |    41.1 MB |      0.01 ms
      10 |       7,200,000 |        2.214 |    3,252,033 r/s |            - |    33.2 MB |      0.32 ms
     100 |      72,000,000 |       19.145 |    3,760,747 r/s |            - |    33.7 MB |      1.34 ms
   1,000 |     720,000,000 |      191.461 |    3,760,563 r/s |            - |    42.2 MB |      9.14 ms
   5,000 |   3,600,000,000 |      962.866 |    3,738,838 r/s |            - |    33.0 MB |     42.23 ms
  10,000 |   7,200,000,000 |     2003.881 |    3,593,028 r/s |            - |    33.0 MB |     87.71 ms

Scenario: Legacy Locked
Configuration: deltaEnabled=true, inPlaceUpdates=true
isPolicyEngineMutable=true


Batch Size | Total Requests | Duration (s) | Throughput | vs Baseline | Memory | P95 Latency

       1 |         720,000 |        0.483 |    1,489,744 r/s |        0.72x |    33.4 MB |      0.02 ms
      10 |       7,200,000 |        2.389 |    3,014,398 r/s |        0.93x |    33.2 MB |      0.37 ms
     100 |      72,000,000 |       20.235 |    3,558,228 r/s |        0.95x |    33.2 MB |      1.50 ms
   1,000 |     720,000,000 |      201.901 |    3,566,102 r/s |        0.95x |    33.0 MB |      9.85 ms
   5,000 |   3,600,000,000 |     1018.552 |    3,534,429 r/s |        0.95x |    33.0 MB |     45.05 ms
  10,000 |   7,200,000,000 |     2100.223 |    3,428,207 r/s |        0.95x |    33.0 MB |     91.15 ms

========================================================================================================================
Legend: = 2x or greater speedup | r/s = requests per second

=== SCALABILITY TEST: Performance vs Policy Count ===
Configuration: batch=1000, threads=24, iterations=10000, runs=5

Mode | Policy Count | Avg Duration (s) | Avg Throughput | StdDev % | vs Locked

Snapshot | 1,000 | 2.793 | 3,597,980 r/s | 6.63% | -
Locked | 1,000 | 2.907 | 3,454,039 r/s | 6.00% | -
→ Speedup | 1,000 | - | - | - | 1.04x

Snapshot | 10,000 | 2.873 | 3,506,523 r/s | 8.35% | -
Locked | 10,000 | 2.903 | 3,455,737 r/s | 5.48% | -
→ Speedup | 10,000 | - | - | - | 1.01x

Snapshot | 50,000 | 2.740 | 3,653,690 r/s | 3.30% | -
Locked | 50,000 | 2.884 | 3,478,907 r/s | 5.40% | -
→ Speedup | 50,000 | - | - | - | 1.05x

Snapshot | 100,000 | 3.094 | 3,249,175 r/s | 7.74% | -
Locked | 100,000 | 2.948 | 3,406,937 r/s | 6.52% | -
→ Speedup | 100,000 | - | - | - | 0.95x


=== CORRECTNESS TEST: Snapshot vs Locked Evaluation ===
All 500 requests produced identical results

  • Allowed: 0
  • Denied: 500

=== CONCURRENCY STRESS TEST: Policy Updates During Evaluation ===
Starting 9 evaluation threads + 1 update thread...
All threads completed successfully
No race conditions or crashes detected

@vyommani
Contributor Author

Summary – Apache Ranger Bulk‑Policy Evaluation Improvement

What we changed
• When delta sync is disabled (deltaEnabled=false) we now evaluate policies from a lock-free snapshot instead of holding the read lock for the entire evaluation loop.
• When delta sync is enabled the existing “legacy” locked path remains unchanged.

Benchmark environment
• 50 000 policies, 24 threads, 30 000 batches per thread, 12 CPU cores.

Key results

Scenario | Throughput (r/s) @ 1 k batch | Throughput (r/s) @ 10 k batch | Memory | P95 Latency

Lock-free Snapshot (delta = false) | ~3.76 M | ~3.59 M | 33-42 MB | 9-88 ms
Legacy Locked (delta = true) | ~3.57 M | ~3.43 M | ~33 MB | 10-91 ms

• Speedup: roughly 5-38 % higher throughput with the snapshot. The biggest gain (about 1.38x) is at batch size 1; from batch size 100 upward it settles at about 5-6 %.
• Scalability: With up to 50 k policies the snapshot is 1-5 % faster; at 100 k policies it is about 5 % slower. These differences are comparable to the reported run-to-run standard deviation (3-8 %).
• Correctness: all 500 requests produced identical decisions in both modes (all 500 were denied).
• Concurrency: 9 evaluation threads + 1 update thread ran without crashes or race conditions.
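The throughput gain per batch size can be recomputed directly from the two benchmark tables above. The short sketch below does that arithmetic; the class and method names (`SpeedupSketch`, `percentGain`) are hypothetical, and the throughput values are copied from the "Lock-free Snapshot" and "Legacy Locked" tables.

```java
// Hypothetical helper for re-deriving the speedup figures from the tables above.
class SpeedupSketch {
    // Percentage by which the snapshot throughput exceeds the locked throughput.
    static double percentGain(double snapshotRps, double lockedRps) {
        return (snapshotRps / lockedRps - 1.0) * 100.0;
    }

    public static void main(String[] args) {
        int[] batchSizes = {1, 10, 100, 1_000, 5_000, 10_000};
        // Throughput in requests per second, copied from the benchmark output.
        double[] snapshot = {2_061_681, 3_252_033, 3_760_747, 3_760_563, 3_738_838, 3_593_028};
        double[] locked = {1_489_744, 3_014_398, 3_558_228, 3_566_102, 3_534_429, 3_428_207};

        for (int i = 0; i < batchSizes.length; i++) {
            System.out.printf("batch %6d: %+.1f%%%n",
                    batchSizes[i], percentGain(snapshot[i], locked[i]));
        }
    }
}
```

The "vs Baseline" column in the Legacy Locked table is the inverse ratio (locked throughput divided by snapshot throughput), which is why it reads 0.72x at batch size 1.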
