
Hi community! 🚀

Over the last few days, I've been facing a read/write performance issue when using AWS S3 as the storage solution for ImmuDB. When I ran a load test on my app that put serious pressure on the ImmuDB database, I saw very high response times that seem to be caused by the ImmuDB-S3 integration. To confirm this, I temporarily replaced the S3 storage with AWS EFS, and everything worked like a charm.

Also, I want to mention that I have a VPC endpoint (gateway) for S3, so communication between the Kubernetes cluster and S3 should stay inside the Amazon network rather than going over the Internet.

Below are a few result times from my tests:

S3 vs EFS for 102800 requests: 00:22:13 vs 00:18:22 -> 4 minutes difference
S3 vs EFS for 261836 requests: 01:15:14 vs 00:46:52 -> 30 minutes difference
S3 vs EFS for 351448 requests: 01:54:14 vs 01:03:52 -> 51 minutes difference

What do you think? Is this a known issue?
Thank you!

Environment:
Amazon EKS 1.23
ImmuDB (deployed with Helm, single replica)
S3 storage (AWS VPC endpoint configured)

Replies: 2 comments · 2 replies

@byo

Hi @andreimerfu, we've observed some performance issues when the S3 backend is used once the database reaches a certain size threshold, so the S3 backend still needs some work.

One possibility is that accessing the BTree is not optimal when it is stored on the S3 server: reading even a small portion of that data can end up downloading a pretty large (e.g. 1MB) chunk from S3. Can you describe your workload a bit more? Important parameters here would be whether you are using SQL or KV, the number of entries, the amount of data, and whether this is more of a read or a write workload (or what the access patterns are).

As an alternative, we've also observed that using Amazon EBS volumes with a standard filesystem like ext4 on top works fine for immudb (and Kubernetes integrates nicely with such volumes).

@andreimerfu

Hi @byo ! 👋🏻

A few details about the workload:

  • key-value
  • no data was present in the database before the test started
  • 100 threads (users) with 1000 loops and a ramp-up period of 1 second
  • 8 requests are sent to immudb, alternating reads and writes, so the read/write load is balanced (roughly sketched below)
  • after the test the database contains 300013 transactions
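
Roughly, each iteration does something like the following (a simplified Go sketch using the immudb Go client's session API; the keys and values here are placeholders, not the ones from the real test):

```go
package main

import (
	"context"
	"fmt"
	"log"

	immudb "github.com/codenotary/immudb/pkg/client"
)

func main() {
	ctx := context.Background()

	// Defaults point at localhost:3322; adjust for the in-cluster service.
	client := immudb.NewClient()
	if err := client.OpenSession(ctx, []byte("immudb"), []byte("immudb"), "defaultdb"); err != nil {
		log.Fatal(err)
	}
	defer client.CloseSession(ctx)

	// One iteration: alternate writes and reads against placeholder keys.
	for i := 0; i < 4; i++ {
		key := []byte(fmt.Sprintf("bench:key:%d", i))

		if _, err := client.Set(ctx, key, []byte("some-value")); err != nil {
			log.Fatal(err)
		}
		if _, err := client.Get(ctx, key); err != nil {
			log.Fatal(err)
		}
	}
}
```
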
@byo

Thanks for the info, I'll prepare a local benchmark with similar characteristics. Two more questions: how many KV entries per transaction do you write? And how are the keys generated - what's the average length, and are they random or something more structured?

@andreimerfu

Hi @byo ! Thank you for looking into this!

  • We're using a maximum of 10 keys per transaction;
  • When the keys are created, we're using the execAll method with preconditions, in order to check whether the keys already exist in the database (see the sketch below);
  • The keys are composed of a predefined prefix, a UUID, and a suffix. Thus, an example key format is: <prefix>:UUID:<suffix>
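
Here's a simplified sketch of how one of those writes looks (assuming the immudb Go SDK's ExecAll request and its key-must-not-exist precondition helper; the prefix, suffix, UUID and value below are placeholders):

```go
package main

import (
	"context"
	"log"

	"github.com/codenotary/immudb/pkg/api/schema"
	immudb "github.com/codenotary/immudb/pkg/client"
)

func main() {
	ctx := context.Background()

	// Connect with default credentials; adjust for the real deployment.
	client := immudb.NewClient()
	if err := client.OpenSession(ctx, []byte("immudb"), []byte("immudb"), "defaultdb"); err != nil {
		log.Fatal(err)
	}
	defer client.CloseSession(ctx)

	// Illustrative key following the <prefix>:UUID:<suffix> format.
	key := []byte("myprefix:3f8a1c2e-1b7d-4e0a-9a41-000000000000:mysuffix")

	// One execAll transaction; in our workload it carries up to 10 such KV
	// operations, each guarded by a "key must not exist" precondition.
	req := &schema.ExecAllRequest{
		Operations: []*schema.Op{
			{Operation: &schema.Op_Kv{Kv: &schema.KeyValue{Key: key, Value: []byte("payload")}}},
		},
		Preconditions: []*schema.Precondition{
			schema.PreconditionKeyMustNotExist(key),
		},
	}

	if _, err := client.ExecAll(ctx, req); err != nil {
		log.Fatal(err)
	}
}
```
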