EmberGraph efficiently queries patterns on a directed multigraph with label and attribute filtering, and supports Neo4j and SQLite3 as the storage layer.
The project is still under development; however, it has already been shown to be a correct implementation.
The test dataset is LDBC-SNB-SF0.1, and the examples used for testing all come from LDBC-SNB Business Intelligence (BI 1 ~ 20). You can find the actual query statements in ./resources/cypher.
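
To make the problem statement concrete, here is a minimal sketch of what matching a one-edge pattern with label and attribute filtering means on a directed multigraph. The `Vertex`, `Edge`, and `MultiDiGraph` types, the `match_edge_pattern` function, and the sample data are all hypothetical and are not EmberGraph's actual API or storage model; a real engine would use indexes instead of the linear scan shown here.

```rust
use std::collections::HashMap;

/// Hypothetical vertex: one label plus integer-valued attributes.
#[derive(Debug, Clone)]
struct Vertex {
    label: &'static str,
    attrs: HashMap<&'static str, i64>,
}

/// Hypothetical directed edge. Parallel edges between the same pair of
/// vertices are allowed, which is what makes the graph a multigraph.
#[derive(Debug, Clone)]
struct Edge {
    src: usize,
    dst: usize,
    label: &'static str,
}

struct MultiDiGraph {
    vertices: Vec<Vertex>,
    edges: Vec<Edge>,
}

impl MultiDiGraph {
    /// Return every `(src, dst)` pair matching the pattern
    /// `(a:src_label)-[:edge_label]->(b:dst_label)` for which `src_filter`
    /// accepts the source vertex. This is a naive linear scan over all edges.
    fn match_edge_pattern<F>(
        &self,
        src_label: &str,
        edge_label: &str,
        dst_label: &str,
        src_filter: F,
    ) -> Vec<(usize, usize)>
    where
        F: Fn(&Vertex) -> bool,
    {
        self.edges
            .iter()
            .filter(|e| e.label == edge_label)
            .filter(|e| {
                self.vertices[e.src].label == src_label
                    && self.vertices[e.dst].label == dst_label
            })
            .filter(|e| src_filter(&self.vertices[e.src]))
            .map(|e| (e.src, e.dst))
            .collect()
    }
}

/// Small helper to build a vertex from a label and attribute pairs.
fn vertex(label: &'static str, attrs: &[(&'static str, i64)]) -> Vertex {
    Vertex { label, attrs: attrs.iter().copied().collect() }
}

/// Made-up sample data: two persons and one post, with two parallel
/// edges (LIKES and CREATED) from person 0 to the post.
fn sample_graph() -> MultiDiGraph {
    MultiDiGraph {
        vertices: vec![
            vertex("Person", &[("age", 30)]),
            vertex("Person", &[("age", 17)]),
            vertex("Post", &[("length", 100)]),
        ],
        edges: vec![
            Edge { src: 0, dst: 2, label: "LIKES" },
            Edge { src: 0, dst: 2, label: "CREATED" },
            Edge { src: 1, dst: 2, label: "LIKES" },
            Edge { src: 0, dst: 1, label: "KNOWS" },
        ],
    }
}

fn main() {
    let g = sample_graph();
    // (a:Person)-[:LIKES]->(b:Post) WHERE a.age >= 18
    let hits = g.match_edge_pattern("Person", "LIKES", "Post", |v| {
        v.attrs.get("age").copied().unwrap_or(0) >= 18
    });
    println!("{:?}", hits); // prints [(0, 2)]
}
```

Only person 0 passes the age filter, so the single match is the LIKES edge from vertex 0 to vertex 2; the parallel CREATED edge is excluded by the edge label.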
EmberGraph is only one step away from being formally published:
- Optimize the matching order generation algorithm (most likely with iterative dynamic programming, just as Neo4j has done since version 2.2)
Even though we're still working on this last step, switching to a better matching-order search strategy, the performance of this pre-release version is approximately equivalent to that of Neo4j. (Oh, even faster in some cases!)
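
For readers unfamiliar with the term, a matching order is the sequence in which the vertices of the query pattern are bound during the search. The sketch below is a simple greedy heuristic, given only as an illustration: it is not the iterative dynamic programming approach mentioned above and not EmberGraph's actual algorithm. The function name `greedy_matching_order` and the cardinality estimates in `main` are hypothetical.

```rust
/// Greedy, connectivity-aware matching order (illustrative only).
///
/// `estimates[v]` is an assumed candidate count for pattern vertex `v`
/// (for example, derived from per-label statistics). `pattern_edges` lists
/// the pattern's edges; direction is ignored when choosing the order.
fn greedy_matching_order(estimates: &[u64], pattern_edges: &[(usize, usize)]) -> Vec<usize> {
    let n = estimates.len();
    // Build an undirected adjacency list of the pattern.
    let mut adj: Vec<Vec<usize>> = vec![Vec::new(); n];
    for &(a, b) in pattern_edges {
        adj[a].push(b);
        adj[b].push(a);
    }

    let mut visited = vec![false; n];
    let mut order = Vec::with_capacity(n);
    while order.len() < n {
        // Unvisited vertices adjacent to something already ordered.
        let frontier: Vec<usize> = (0..n)
            .filter(|&v| !visited[v] && adj[v].iter().any(|&u| visited[u]))
            .collect();
        // At the start (or for a disconnected pattern) the frontier is
        // empty, so fall back to all unvisited vertices.
        let candidates: Vec<usize> = if frontier.is_empty() {
            (0..n).filter(|&v| !visited[v]).collect()
        } else {
            frontier
        };
        // Pick the cheapest candidate; ties are broken by vertex index.
        let next = candidates
            .into_iter()
            .min_by_key(|&v| (estimates[v], v))
            .expect("there is at least one unvisited vertex");
        visited[next] = true;
        order.push(next);
    }
    order
}

fn main() {
    // Hypothetical pattern: 0 = Person, 1 = Post, 2 = Tag, 3 = Forum.
    // The estimates are made-up numbers, not measured statistics.
    let estimates = [1000, 5000, 10, 100];
    let edges = [(0, 1), (1, 2), (3, 1)];
    println!("{:?}", greedy_matching_order(&estimates, &edges)); // prints [2, 1, 3, 0]
}
```

The heuristic starts from the most selective vertex (Tag) and then only extends to vertices connected to the part already matched, which avoids Cartesian products between unrelated candidates. A cost-based planner would instead compare whole orders under a cost model.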
Install git-lfs first so that the project can be cloned correctly:
git clone https://github.com/DrEden33773/EmberGraph
cd EmberGraph
git lfs pull
cargo build # This will automatically install uv if needed, and then initialize `./ember-graph-import`
Then, you should manually perform the following initialization steps:
- Transform the original dataset into a Neo4j-import-friendly one (using ./ember-graph-import/raw_data_formatter.py).
- Generate basic label-based statistics (using ./ember-graph-import/statistics_generator.py).
- Import the Neo4j-import-friendly dataset into Neo4j or SQLite3 (using ./ember-graph-import/neo4j_admin_import.py or ./ember-graph-import/sqlite_import.py).
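
As a rough illustration of what "label-based statistics" can mean, the sketch below counts how many items carry each label. It is an assumption about the general idea only and does not reproduce what statistics_generator.py actually computes or how it reads the dataset; the function name `label_counts` and the input labels are hypothetical.

```rust
use std::collections::BTreeMap;

/// Count occurrences of each label (illustrative only).
/// A `BTreeMap` keeps the labels sorted, which makes the output stable.
fn label_counts<'a>(labels: impl IntoIterator<Item = &'a str>) -> BTreeMap<&'a str, usize> {
    let mut counts = BTreeMap::new();
    for label in labels {
        *counts.entry(label).or_insert(0) += 1;
    }
    counts
}

fn main() {
    // Made-up vertex labels standing in for a real dataset.
    let vertex_labels = ["Person", "Post", "Person", "Tag"];
    for (label, count) in label_counts(vertex_labels) {
        println!("{label}: {count}");
    }
}
```

Counts like these are the kind of input a query planner can use to estimate how many candidates a pattern vertex with a given label will have.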
All of the Python scripts mentioned above can be executed with the following commands:
cd ./ember-graph-import
source ./.venv/bin/activate
uv run <SCRIPT-NAME>
Now, if you want to, you can run the following command to check the query results of bi_1 to bi_20:
cargo run --example bi_<x> # (where x in [1..=20])
Yes, you might have guessed: building in release mode gives the highest performance.
However, I have to warn you that the release build itself can be very slow.
So, if you really don't mind the wait, here are the build commands for each platform:
- Linux:
cargo build --release --all-targets -j $(nproc)
- Mac:
cargo build --release --all-targets -j $(sysctl -n hw.ncpu)
- Windows (PowerShell):
cargo build --release --all-targets -j $env:NUMBER_OF_PROCESSORS
- Windows (CMD):
cargo build --release --all-targets -j %NUMBER_OF_PROCESSORS%
After all, that's short-term pain for long-term gain😂.
Then, you can run the command below to check the query results of bi_1 to bi_20:
cargo run --release --example bi_<x> # (where x in [1..=20])