
Hi community! 🚀

Over the last few days, I've been facing a read/write performance issue when using AWS S3 as the storage solution for ImmuDB. When I ran a load test on my app that put serious pressure on the ImmuDB database, I saw very high response times that seem to be caused by the ImmuDB-S3 integration. To confirm this, I temporarily replaced the S3 storage with AWS EFS, and everything worked like a charm.

Also, I want to mention that I have a VPC endpoint (gateway) for S3, so communication between the Kubernetes cluster and S3 should stay inside the Amazon network rather than going over the Internet.

Below are a few result times from my tests:

S3 vs EFS for 102800 requests: 00:22:13 vs 00:18:22 -> 4 minutes difference
S3 vs EFS for 261836 requests: 01:15:14 vs 00:46:52 -> 30 minutes difference
S3 vs EFS for 351448 requests: 01:54:14 vs 01:03:52 -> 51 minutes difference

What do you think? Is this a known issue?
Thank you!

Environment:
Amazon EKS 1.23
ImmuDB (deployed with Helm, single replica)
S3 storage (AWS VPC endpoint configured)

Replies: 2 comments · 2 replies

@byo

Hi @andreimerfu, we've observed some performance issues when the S3 backend is used once the database reaches a certain size threshold, so the S3 backend still needs some work.

One possibility is that accessing the BTree is not optimal when it is stored on the S3 server: reading even a small portion of that data can end up downloading a pretty large (e.g. 1MB) chunk from S3. Can you describe your workload a bit more? Important parameters here would be whether you are using SQL or KV, the number of entries, the amount of data, and whether this is more of a read or a write workload (or what the access patterns are).

As an alternative, we've also observed that using Amazon EBS volumes with a standard filesystem like ext4 on top works fine for immudb (and Kubernetes integrates nicely with such volumes).

@andreimerfu

Hi @byo ! 👋🏻

A few details about the workload:

  • key-value
  • no data was present in the database before the test started
  • 100 threads (users) with 1000 loops and a ramp-up period of 1 second
  • 8 requests are sent to immudb, alternating reads and writes, so the read/write load is balanced (roughly sketched below)
  • after the test the database contains 300013 transactions
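
Roughly, each iteration does something like the following (a simplified Go sketch using the immudb Go client's session API; the keys and values here are placeholders, not the ones from the real test):

```go
package main

import (
	"context"
	"fmt"
	"log"

	immudb "github.com/codenotary/immudb/pkg/client"
)

func main() {
	ctx := context.Background()

	// Defaults point at localhost:3322; adjust for the in-cluster service.
	client := immudb.NewClient()
	if err := client.OpenSession(ctx, []byte("immudb"), []byte("immudb"), "defaultdb"); err != nil {
		log.Fatal(err)
	}
	defer client.CloseSession(ctx)

	// One iteration: alternate writes and reads against placeholder keys.
	for i := 0; i < 4; i++ {
		key := []byte(fmt.Sprintf("bench:key:%d", i))

		if _, err := client.Set(ctx, key, []byte("some-value")); err != nil {
			log.Fatal(err)
		}
		if _, err := client.Get(ctx, key); err != nil {
			log.Fatal(err)
		}
	}
}
```
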
@byo

Thanks for the info, I'll prepare a local benchmark with similar characteristics. Two more questions: how many KV entries per transaction do you write? And how are the keys generated - what's the average length, and are they random or something more structured?

@andreimerfu

Hi @byo ! Thank you for looking into this!

  • We're using a maximum of 10 keys per transaction;
  • When the keys are created, we're using the execAll method with preconditions, in order to check whether the keys already exist in the database (see the sketch below);
  • The keys are composed of a predefined prefix, a UUID, and a suffix. Thus, an example key format is: <prefix>:UUID:<suffix>
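
Here's a simplified sketch of how one of those writes looks (assuming the immudb Go SDK's ExecAll request and its key-must-not-exist precondition helper; the prefix, suffix, UUID and value below are placeholders):

```go
package main

import (
	"context"
	"log"

	"github.com/codenotary/immudb/pkg/api/schema"
	immudb "github.com/codenotary/immudb/pkg/client"
)

func main() {
	ctx := context.Background()

	// Connect with default credentials; adjust for the real deployment.
	client := immudb.NewClient()
	if err := client.OpenSession(ctx, []byte("immudb"), []byte("immudb"), "defaultdb"); err != nil {
		log.Fatal(err)
	}
	defer client.CloseSession(ctx)

	// Illustrative key following the <prefix>:UUID:<suffix> format.
	key := []byte("myprefix:3f8a1c2e-1b7d-4e0a-9a41-000000000000:mysuffix")

	// One execAll transaction; in our workload it carries up to 10 such KV
	// operations, each guarded by a "key must not exist" precondition.
	req := &schema.ExecAllRequest{
		Operations: []*schema.Op{
			{Operation: &schema.Op_Kv{Kv: &schema.KeyValue{Key: key, Value: []byte("payload")}}},
		},
		Preconditions: []*schema.Precondition{
			schema.PreconditionKeyMustNotExist(key),
		},
	}

	if _, err := client.ExecAll(ctx, req); err != nil {
		log.Fatal(err)
	}
}
```
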