-
Notifications
You must be signed in to change notification settings - Fork 1.2k
Dependency Graph based ExecutionStrategy #1409
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
- Added DependencyGraph basic tests - Added "start" and "finish" methods to TraverserVisitor and modified Traverser to fire "start" and "finish" events
- got rid of TraverserVisitor "start" and "finish" notifications to keep it simple - added default, size and copy constructors to DependencyGraph
…ap that improved reliability.
…Edge.disconnectEndpoints methods
…rsively parent contexts or provide default values - simplified field collection in ExecutionPlanBuilder to recursively search fr vars/results using TraverserContext.computeXXXIfAbsent methods
…ved ExecutionPlanBuilder inside it. - Created ExecutionStrategy interface - Created Execution class
…ertices when [automatically] resolved - updated ExecutionPlan to replace FieldVertices dependency on OperationVertex with dependency on DocumentVertex
- added DependencyGraphContext signature interface
- propagated DependencyGraphContext into topological sorting process
- added javadoc comments
|
Thanks a lot @gkesler for this effort! Looking forward to look into it more, but give us some time :-) A first glimpse showed a couple of changed files which are not really changes: could you make sure that you are using the graphql-java code style: https://github.com/graphql-java/graphql-java/blob/master/graphql-java-code-style.xml. Thanks! |
|
@andimarek just merged with latest and greatest from master. |
|
Unfortunately, I am a dinosaur using NetBeans. |
…olved nodes from traversing - reverted a fragment in Traverser to add to visited *after* visiting a node
|
Hi @gkesler, sorry for the long delay. One thing I would be interested is how Interfaces/Union will work with that: As far as I understand the But Interfaces/Union can only be evaluated while executing, so I would expect that some kind of feedback loop into the ExecutionPlan must happen while executing. But this would somehow contradict the idea of the prepared Thanks again for this PR. P.S.: My idea complains a couple of times about |
|
@andimarek As I understand the problem statement, during execution of a query that references interface/union fields some of the execution paths may be not executed due to runtime type resolution of fragments. This could be easily implemented in the ExecutionPlan with adding a new type of Vertex, call it ConditionalVertex, that would be recorded into the execution plan for every inline fragment or fragment spread and its auto close method (it is delegated to ExecutionPlanContext) and it is implemented for FieldVertex not to resolve further in case its source is empty or null. Does this contradict the idea of prepared execution plan? I think it perfectly fits in it. The plan represents a query execution state machine, i.e. it has all possible states and state transitions to guarantee execution of the query. State transitions can have guard conditions that trigger them only under some circumstances. State machine execution is finished when state machine reaches the final state. During the execution not every state might be entered, it depends on the particular sequence of triggering events/accumulated data. |
|
thanks @gkesler for the explanation. I wanna go through some steps to document the idea here a bit and to understand the whole approach better. First thing: Taking the example test with the following query: {foo {
id
bar {
id
name
}
}}we get the following execution plan: and the direction of the edge is the dependency: so That correct? |
|
accurate picture @andimarek Dependency arrow points at the vertex resolved first. So in The execution of the query is based on topological sorting of graph vertices.
Please note, I have introduced dependencies "foo" --> "document", "operation" --> "document" and "operation" --> all nodes for the convenience. Each dependency edge is associated with an action that gets executed when source vertex is resolved (either via auto resolution or external field resolution). This way when "document" vertex is auto resolved its dependent "operation" vertex is notified and at that time we can inject some bootstrapping code to prepare operation execution, for instance initializing root and context values for the operation execution. "operation" vertex depends on each "field" vertices mainly because of 2 reasons:
|
|
@gkesler I have some trouble to understand "foo" -> "document" dependency. PS |
|
@amatiushkin It does not have to be this way, but it is quite convenient. When a vertex is resolved, the algorithm executes actions associated with all that vertex outbound edges. Those actions prepare contexts for the next vertices resolutions. As well as actions associated with edges that point to "operation" vertex perform join functions. "document" vertex does not require external resolution, i.e. data fetching. Therefore it is resolved automatically. Consider it to be a state machine "initial" state. When resolved (state entered) it fires action implemented in https://github.com/graphql-java/graphql-java/pull/1409/files#diff-42fc714ebd9e1cc4e5a3edd8801738d1 L.102 that sets up source for the very first vertex. Please note that action is different from the consequent similar actions between field vertices. |
|
thanks @gkesler @amatiushkin . I would like to see how we can model the following query: # Write your query or mutation here
{
nodes {
id
... on Cat {
eatsTuna
friends {
... on Human {
name
}
}
}
... on Dog {
breed
friends {
... on Cat {
name
}
}
}
}
}
with the following schema: type Query {
nodes: [Node]
}
interface Node {
id: ID
}
type Human implements Node {
id: ID
name: String
}
type Dog implements Node {
id: ID
breed: String
friends: [Node]
}
type Cat implements Node {
id: ID
eatsTuna: Bool
friends: [Node]
}
I see the challenge here in the combination of Fragments and lists: |
|
@andimarek, thanks a lot for the example. Please, note, I have added 1 more vertex type (in yellow on the diagram) associated with fragment conditions. These nodes don't need an external resolution via sending external requests, but instead they resolve automatically to the list of source objects if the type condition match or to empty list if it does not, so the dependent vertices won't need resolution. The topological order is:
So everything, except "fragment" vertices is straightforward and similar to what is in the PR. Each dependent vertex receives a list of resolved sources from its dependency. Let's consider "fragment Cat" in #3.
Topological ordering is essentially a BFS traversal that keeps going a particular path until it sees a vertex that cannot be automatically resolved. So, in case of new |
|
thanks @gkesler. I see how this could work for the described query. Let me now add the final complexity: merged fields. I think in order build the full dependency graph for a general query we need to analyze it more than we do so far: just traversing is not enough. We actually need to operate on merged fields, without actually executing the query. For example: query myQyery($myVar: Boolean) {
nodes {
id
... on Cat {
eatsTuna
friends {
... on Human {
name
}
... @skip(if: $myVar) {
... on Cat {
...CatF1
friends {
... on Human {
name @skip(if: $myVar)
}
}
}
}
}
... {
friends {
id
}
}
friends {
id
... on Cat {
id
}
}
}
... on Dog {
breed
friends {
...CatF1
...CatF2
}
}
}
}
fragment CatF1 on Cat {
eatsTuna
friends {
...CatF2
}
}
fragment CatF2 on Cat {
eatsTuna
}Sorry for this rather complex query. The point is: fields must be merged together and Fragments need to be inlined in order to build the correct dependency graph. A simpler example is: query myQyery($myVar: Boolean) {
nodes {
id
... {
id
}
id @skip(if: $myVar)
... on Human {
id
}
... on Cat {
id
}
}
}Here I think building a traverser which gives us a merged fields dependency tree is a pre condition for this kind of execution strategy. I always wanted to build such an analyzer, because it also useful for other use cases, but we haven't done it so far. |
|
@andimarek thanks for the new challenging question. The major benefit of using dependency graph is that once the graph is populated, it is always executed using the very same algorithm based on the topological sorting. IMHO, this leads to an interesting consequence that the linked graph could be transformed into a matrix, which can drastically reduce execution overhead. Merging fields capability is yet great benefit of the graph. Let's assume we have created a
|
|
@gkesler yes I agree that the ExecutionPlan is probably the right place to create exactly this kind of MergedField based dependency graph. |
|
absolutely @andimarek. Let me create a PR that just adds DependencyGraph. thanks! |
|
@gkesler this is a long overdue, but we introduced and successfully used a new structure called NormalizedExecutableOperation I believe that this is the right structure to be used as input for creating an execution plan. |




In order to simplify complexity of ExecutionStrategy, especially the batched one, it makes sense to re-implement it with a formal DependencyGraph.
GraphQL spec pretty much defines an algorithm to execute request. Unfortunately, if implemented literally, it leads to a famous N+1 problems.
DependencyGraph allows to separate concerns and focus strictly on the topological sorting of the field resolution instructions.
Additionally, looking forward, DependencyGraph based execution will drastically simplify:
@bbakerman @andimarek
This is a POC that DependencyGraph based execution works.
I would be happy to work with you to make it into the production.