Add first-class embedding primitives across network API, DSL, graph_ops, and pipeline #1270
Merged
SkBlaz merged 3 commits into SkBlaz/py3plex:master from SkBlaz/py3plex:copilot/add-node-and-edge-embedding-support on Mar 14, 2026
Co-authored-by: SkBlaz <10035780+SkBlaz@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Add first-class node and edge embedding support in py3plex" to "Add first-class embedding primitives across network API, DSL, graph_ops, and pipeline" on Mar 14, 2026
This PR makes embeddings a first-class ML primitive in py3plex with multilayer-native behavior and end-to-end integration across the builder/DSL architecture, graph_ops, and pipelines. It introduces a unified embedding surface while reusing and extending existing embedding infrastructure (`netmf`, `metapath2vec`, DSL embedding stage).

New embedding package (`py3plex/ml/embedding`)
- `base.py` (base interface + result type re-export)
- `node2vec.py`, `deepwalk.py`, `netmf.py`, `line.py`, `metapath2vec.py`
- `multiplex.py` (multiplex-aware variants)
- `trainer.py`, `utils.py`, `similarity.py`, `evaluation.py`
- `py3plex/ml/__init__.py` and `py3plex/ml/embedding/__init__.py`

Unified network entry point
- `multi_layer_network.embed(...)` to instantiate and run embedding models via a single API.
- Supported methods: `node2vec`, `deepwalk`, `netmf`, `line`, `metapath2vec`, plus multiplex-aware methods (`multiplex_node2vec`, `supra_adjacency`, `layer_regularized`).

EmbeddingResult promoted to first-class result object
- `py3plex/embeddings/base.py` with:
  - node lookup (`embedding[node_id]`)
  - `vectors`, `nodes`, `dimension`
  - `to_pandas()`, `to_numpy()`, `to_arrow()`
  - similarity helpers (`similarity`, `knn`, `most_similar`)
  - clustering (`cluster`)
  - persistence (`save`/`load` for parquet/arrow/npy/npz)

DSL integration
- `EmbeddingSpec` and `Q.embed(...)` support node2vec/deepwalk/line-oriented parameters (`p`, `q`, `window_size`, `negative_samples`, `workers`, `order`, alias `dimensions`).
- Integration with `network.embed(...)`, exposing embedding vectors via query attributes for downstream export/use.

graph_ops integration
- `NodeFrame.embed(...)` to attach embedding vectors as an `embedding` column in node frames.

Pipeline integration
- `NodeEmbedding` pipeline step (method/dimensions-driven).
- `NodeEmbedding` in public exports and added compatibility shim for `py3plex.pipeline.steps.embedding`.

Docs and package surface

Example
Original prompt
This section details the original issue you should resolve
<issue_title>emb fclass</issue_title>
<issue_description>Implement first-class node and edge embedding support in py3plex as a core ML primitive integrated with the builder/DSL architecture.
Goals:
Embeddings must work naturally with multilayer networks.
API must integrate with existing DSL (Q), pipelines, and graph_ops.
Backend should support scalable implementations (NumPy/JAX/PyTorch optional).
Embeddings must be reusable across downstream tasks (link prediction, clustering, classification).
Design and implement the following components.
Create a new module:
py3plex/ml/embedding/
Submodules:
base.py
node2vec.py
deepwalk.py
netmf.py
line.py
metapath2vec.py
multiplex.py
trainer.py
utils.py
Define a base embedding interface.
File: base.py
Requirements:
class BaseEmbedding:
    name: str
All embedding models must inherit from this base.
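A minimal sketch of what such a base interface could look like. Only the class name `BaseEmbedding` and the `name` attribute come from the issue text; the `dimensions` argument, the `fit` method, its edge-list input, and the `ZeroEmbedding` subclass are assumptions made for illustration and are not the py3plex API.

```python
from abc import ABC, abstractmethod
from typing import Dict, Hashable, List, Tuple

NodeKey = Tuple[Hashable, Hashable]  # (node_id, layer)


class BaseEmbedding(ABC):
    """Hypothetical base interface; only `name` is taken from the issue."""

    name: str = "base"

    def __init__(self, dimensions: int = 128) -> None:
        self.dimensions = dimensions

    @abstractmethod
    def fit(self, edges: List[Tuple[NodeKey, NodeKey]]) -> Dict[NodeKey, List[float]]:
        """Return one vector per (node_id, layer) key (assumed contract)."""


class ZeroEmbedding(BaseEmbedding):
    """Toy subclass that only demonstrates the inheritance contract."""

    name = "zero"

    def fit(self, edges):
        # Collect every endpoint that appears in the edge list.
        keys = {key for edge in edges for key in edge}
        return {key: [0.0] * self.dimensions for key in keys}


model = ZeroEmbedding(dimensions=4)
vectors = model.fit([(("a", "L1"), ("b", "L1"))])
```

Keeping the contract this small means each concrete model only has to provide `name` and one training method.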
Add a unified embedding entry point on the network object.
Extend multi_layer_network with:
net.embed(
    method="node2vec",
    dimensions=128,
    walk_length=40,
    num_walks=10,
    context_size=10,
    workers=4
)
Return object:
EmbeddingResult
Implement EmbeddingResult.
Capabilities:
embedding.vectors
embedding.nodes
embedding.dimension
embedding[node_id]
embedding.to_pandas()
embedding.to_numpy()
embedding.to_arrow()
embedding.similarity(node_a, node_b)
embedding.knn(node, k=10)
Storage:
dict[(node_id, layer)] -> vector
Support multiplex nodes.
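The storage layout and a few of the listed capabilities can be sketched in plain Python. The class name `EmbeddingResultSketch` is hypothetical, and the choice of cosine similarity for `similarity` and `knn` is an assumption; the issue does not fix the metric at this point.

```python
import math
from typing import Dict, Hashable, List, Tuple

NodeKey = Tuple[Hashable, Hashable]  # (node_id, layer)


class EmbeddingResultSketch:
    """Illustrative container keyed by (node_id, layer); not the real class."""

    def __init__(self, vectors: Dict[NodeKey, List[float]]) -> None:
        if not vectors:
            raise ValueError("at least one vector is required")
        self.vectors = dict(vectors)

    @property
    def nodes(self) -> List[NodeKey]:
        return list(self.vectors)

    @property
    def dimension(self) -> int:
        return len(next(iter(self.vectors.values())))

    def __getitem__(self, key: NodeKey) -> List[float]:
        return self.vectors[key]

    def similarity(self, a: NodeKey, b: NodeKey) -> float:
        """Cosine similarity (assumed metric); 0.0 if either vector is all zeros."""
        va, vb = self.vectors[a], self.vectors[b]
        dot = sum(x * y for x, y in zip(va, vb))
        norm = math.sqrt(sum(x * x for x in va)) * math.sqrt(sum(y * y for y in vb))
        return dot / norm if norm else 0.0

    def knn(self, key: NodeKey, k: int = 10) -> List[Tuple[NodeKey, float]]:
        """Return the k most similar other keys, best first."""
        scored = [(other, self.similarity(key, other)) for other in self.vectors if other != key]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:k]


emb = EmbeddingResultSketch({
    ("a", "L1"): [1.0, 0.0],
    ("a", "L2"): [0.9, 0.1],
    ("b", "L1"): [0.0, 1.0],
})
nearest = emb.knn(("a", "L1"), k=1)
```

Because the key is the `(node_id, layer)` pair, the same node can carry a different vector in each layer.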
Implement Node2Vec.
File: node2vec.py
Features:
biased random walks
parameters p, q
skipgram training
negative sampling
support multiplex networks
Interface:
Node2VecEmbedding(
    dimensions=128,
    walk_length=80,
    num_walks=10,
    p=1.0,
    q=1.0,
    window_size=10,
    negative_samples=5,
)
Steps:
generate random walks
build training corpus
train skipgram
produce embeddings
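The first step, biased walk generation, can be sketched as follows. This is a simplified single-layer version under stated assumptions: the graph is an undirected adjacency dict with string node ids, and the function names `biased_walk` and `generate_walks` are illustrative. Weights follow the standard node2vec rule: `1/p` for returning to the previous node, `1` for a neighbor shared with the previous node, and `1/q` otherwise.

```python
import random
from typing import Dict, List, Optional, Set

Graph = Dict[str, Set[str]]


def biased_walk(graph: Graph, start: str, walk_length: int,
                p: float = 1.0, q: float = 1.0,
                rng: Optional[random.Random] = None) -> List[str]:
    """One second-order (node2vec-style) random walk of up to walk_length nodes."""
    rng = rng or random.Random()
    walk = [start]
    while len(walk) < walk_length:
        current = walk[-1]
        neighbors = sorted(graph[current])  # sorted only for reproducibility
        if not neighbors:
            break  # dead end
        if len(walk) == 1:
            # No previous node yet, so the first step is uniform.
            walk.append(rng.choice(neighbors))
            continue
        previous = walk[-2]
        weights = []
        for candidate in neighbors:
            if candidate == previous:
                weights.append(1.0 / p)   # return to where we came from
            elif candidate in graph[previous]:
                weights.append(1.0)       # stay close to the previous node
            else:
                weights.append(1.0 / q)   # move outward
        walk.append(rng.choices(neighbors, weights=weights, k=1)[0])
    return walk


def generate_walks(graph: Graph, num_walks: int, walk_length: int,
                   p: float = 1.0, q: float = 1.0, seed: int = 0) -> List[List[str]]:
    """Start num_walks walks from every node, in shuffled order."""
    rng = random.Random(seed)
    walks = []
    for _ in range(num_walks):
        starts = sorted(graph)
        rng.shuffle(starts)
        for start in starts:
            walks.append(biased_walk(graph, start, walk_length, p, q, rng))
    return walks


toy_graph: Graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
walks = generate_walks(toy_graph, num_walks=2, walk_length=5, p=0.5, q=2.0)
```

With `p=1.0` and `q=1.0` all weights are equal, which gives the unbiased walk that DeepWalk uses.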
Implement DeepWalk.
File: deepwalk.py
Same interface as Node2Vec but without bias parameters.
Implement NetMF.
File: netmf.py
Requirements:
spectral approximation of DeepWalk
use sparse matrices
truncated SVD
Interface:
NetMFEmbedding(
    dimensions=128,
    window=10,
    negative=1,
)
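To make the `window` and `negative` parameters concrete, here is a dense toy sketch of the NetMF matrix, log(max(vol(G)/(bT) * (sum over r=1..T of P^r) * D^-1, 1)) with P = D^-1 A, followed by an SVD. It deliberately departs from the requirements above: it uses dense NumPy arrays and a full SVD, which is only reasonable for a handful of nodes, whereas the issue asks for sparse matrices and truncated SVD. The function name `netmf_dense` is hypothetical, and the sketch assumes every node has at least one edge.

```python
import numpy as np


def netmf_dense(adjacency, dimensions=2, window=10, negative=1):
    """Dense illustration of NetMF; assumes no isolated nodes."""
    A = np.asarray(adjacency, dtype=float)
    degrees = A.sum(axis=1)
    volume = degrees.sum()

    # Random-walk transition matrix P = D^-1 A.
    P = A / degrees[:, None]

    # Accumulate P^1 + P^2 + ... + P^window.
    power = np.eye(A.shape[0])
    walk_sum = np.zeros_like(A)
    for _ in range(window):
        power = power @ P
        walk_sum += power

    # Right-multiply by D^-1 (divide column j by degree j) and rescale.
    M = (volume / (negative * window)) * walk_sum / degrees[None, :]

    # Element-wise truncated logarithm, then factorize.
    log_M = np.log(np.maximum(M, 1.0))
    U, S, _ = np.linalg.svd(log_M)
    return U[:, :dimensions] * np.sqrt(S[:dimensions])


# 4-cycle with one chord; every node has degree >= 2.
toy_adjacency = [
    [0, 1, 1, 1],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [1, 0, 1, 0],
]
netmf_vectors = netmf_dense(toy_adjacency, dimensions=2, window=3, negative=1)
```

A scalable variant would keep `A` in a sparse format and replace the full SVD with a truncated solver.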
Implement LINE.
File: line.py
Support:
order=1
order=2
Optimization:
negative sampling
stochastic gradient descent
Implement MetaPath2Vec for multiplex networks.
File: metapath2vec.py
Features:
meta-path guided random walks
layer-aware walks
Example metapath:
["author","paper","venue","paper","author"]
Interface:
MetaPath2VecEmbedding(
    metapaths=[...],
    dimensions=128,
    walk_length=40
)
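A meta-path guided walk can be sketched as below. The sketch assumes a symmetric metapath (first and last type equal, as in the example above) so that the pattern can repeat, and it represents node types with a plain dict. The function name `metapath_walk` and the toy data are illustrative, not part of py3plex.

```python
import random
from typing import Dict, List, Set


def metapath_walk(graph: Dict[str, Set[str]], node_types: Dict[str, str],
                  start: str, metapath: List[str], walk_length: int,
                  rng: random.Random) -> List[str]:
    """Walk that only steps to neighbors whose type matches the metapath."""
    if metapath[0] != metapath[-1]:
        raise ValueError("this sketch assumes a symmetric metapath")
    if node_types[start] != metapath[0]:
        raise ValueError("start node must have the first metapath type")
    pattern = metapath[:-1]  # drop the repeated last type so the pattern cycles
    walk = [start]
    while len(walk) < walk_length:
        wanted = pattern[len(walk) % len(pattern)]
        candidates = sorted(n for n in graph[walk[-1]] if node_types[n] == wanted)
        if not candidates:
            break  # no neighbor of the required type
        walk.append(rng.choice(candidates))
    return walk


toy_types = {"a1": "author", "a2": "author", "p1": "paper", "p2": "paper", "v1": "venue"}
toy_edges = [("a1", "p1"), ("a2", "p1"), ("a2", "p2"), ("p1", "v1"), ("p2", "v1")]
toy_graph: Dict[str, Set[str]] = {node: set() for node in toy_types}
for u, v in toy_edges:
    toy_graph[u].add(v)
    toy_graph[v].add(u)

metapath = ["author", "paper", "venue", "paper", "author"]
walk = metapath_walk(toy_graph, toy_types, "a1", metapath, walk_length=9,
                     rng=random.Random(0))
walk_types = [toy_types[n] for n in walk]
```

The resulting walks feed the same skip-gram training step as ordinary random walks.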
Implement multiplex-aware embeddings.
File: multiplex.py
Add:
MultiplexNode2Vec
SupraAdjacencyEmbedding
LayerRegularizedEmbedding
Capabilities:
cross-layer transitions
layer weighting
inter-layer edge handling
Random walks must support (node, layer) states.
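A walk over (node, layer) states might look like the following sketch. The data layout (`intra` for within-layer neighbors, `couplings` for the layers a state can switch to) and the single `switch_prob` parameter are assumptions chosen for brevity; layer weighting could replace the uniform choice among target layers.

```python
import random
from typing import Dict, List, Optional, Set, Tuple

State = Tuple[str, str]  # (node, layer)


def multiplex_walk(intra: Dict[str, Dict[str, Set[str]]],
                   couplings: Dict[State, Set[str]],
                   start: State, walk_length: int,
                   switch_prob: float = 0.2,
                   rng: Optional[random.Random] = None) -> List[State]:
    """Random walk whose states are (node, layer) pairs.

    With probability switch_prob (or when the node has no neighbors in the
    current layer) the walker keeps its node and moves to a coupled layer;
    otherwise it moves to a neighbor inside the current layer.
    """
    rng = rng or random.Random()
    walk = [start]
    while len(walk) < walk_length:
        node, layer = walk[-1]
        other_layers = sorted(couplings.get((node, layer), ()))
        neighbors = sorted(intra[layer].get(node, ()))
        if other_layers and (not neighbors or rng.random() < switch_prob):
            walk.append((node, rng.choice(other_layers)))   # cross-layer transition
        elif neighbors:
            walk.append((rng.choice(neighbors), layer))     # intra-layer step
        else:
            break  # isolated state
    return walk


intra = {
    "L1": {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}},
    "L2": {"a": {"c"}, "c": {"a"}},
}
couplings = {
    ("a", "L1"): {"L2"}, ("a", "L2"): {"L1"},
    ("c", "L1"): {"L2"}, ("c", "L2"): {"L1"},
}
walk = multiplex_walk(intra, couplings, ("a", "L1"), walk_length=10,
                      switch_prob=0.3, rng=random.Random(1))
```

Each step changes either the node or the layer, never both, which keeps inter-layer edges explicit in the walk.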
Implement embedding trainer.
File: trainer.py
Responsibilities:
random walk generation
negative sampling
batching
parallelization
APIs:
generate_walks(network)
train_skipgram(walks)
optimize_embeddings()
Allow backend selection:
backend="numpy"
backend="jax"
backend="torch"
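Between walk generation and skip-gram training, the trainer has to turn walks into (center, context) pairs and draw negative samples. A backend-independent sketch in plain Python follows; the function names `skipgram_pairs` and `sample_negatives` are hypothetical, and the 0.75 exponent on node frequencies is the word2vec convention rather than something the issue specifies.

```python
import random
from collections import Counter
from typing import Hashable, List, Sequence, Tuple


def skipgram_pairs(walks: Sequence[Sequence[Hashable]],
                   window_size: int) -> List[Tuple[Hashable, Hashable]]:
    """Emit (center, context) pairs for every position within the window."""
    pairs = []
    for walk in walks:
        for i, center in enumerate(walk):
            lo = max(0, i - window_size)
            hi = min(len(walk), i + window_size + 1)
            for j in range(lo, hi):
                if j != i:
                    pairs.append((center, walk[j]))
    return pairs


def sample_negatives(walks: Sequence[Sequence[Hashable]], positive: Hashable,
                     k: int, rng: random.Random) -> List[Hashable]:
    """Draw k negatives with probability proportional to count ** 0.75.

    Raises IndexError if the walks contain no node other than `positive`.
    """
    counts = Counter(node for walk in walks for node in walk)
    candidates = sorted(n for n in counts if n != positive)
    weights = [counts[n] ** 0.75 for n in candidates]
    return rng.choices(candidates, weights=weights, k=k)


toy_walks = [["a", "b", "c"]]
pairs = skipgram_pairs(toy_walks, window_size=1)
negatives = sample_negatives(toy_walks, positive="b", k=2, rng=random.Random(0))
```

The pair list is the same regardless of backend, so only the optimization step needs NumPy, JAX, or PyTorch variants.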
Add DSL integration.
Extend Q builder.
Example:
result = (
    Q.nodes()
    .embed("node2vec", dim=128)
    .execute(net)
)
Output:
QueryResult with embedding vectors
Columns:
node
layer
embedding
Add graph_ops integration.
Example:
(
    nodes(net)
    .embed(method="node2vec", dim=64)
    .mutate(cluster=lambda x: kmeans(x["embedding"]))
)
Embedding must appear as column:
embedding
Add pipeline step.
File:
py3plex/pipeline/steps/embedding.py
Step:
NodeEmbedding(
    method="node2vec",
    dimensions=128
)
Usage:
Pipeline([
    ("embed", NodeEmbedding(method="node2vec")),
    ("cluster", NodeClustering())
])
Add similarity utilities.
File:
similarity.py
Functions:
cosine_similarity
euclidean_distance
dot_similarity
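The three functions are small enough to sketch directly. The signatures below (two equal-length sequences of floats) are an assumption, as is returning 0.0 from `cosine_similarity` when either vector has zero norm.

```python
import math
from typing import Sequence


def dot_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between a and b; 0.0 if either vector is all zeros."""
    norm = math.sqrt(dot_similarity(a, a)) * math.sqrt(dot_similarity(b, b))
    return dot_similarity(a, b) / norm if norm else 0.0


def euclidean_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Straight-line distance between a and b."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
```

Note that the first two are similarities (larger means closer) while the third is a distance (smaller means closer), so ranking code has to sort them in opposite directions.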
Add methods:
embedding.similarity(a,b)
embedding.most_similar(node,k=10)
Add clustering helpers.
embedding.cluster(method="kmeans", k=10)
embedding.cluster(method="spectral")
Return:
node -> cluster_id
Add persistence.
EmbeddingResult must support:
embedding.save("embeddings.parquet")
embedding.load("embeddings.parquet")
Formats:
parquet
arrow
numpy
Add evaluation utilities.
File:
evaluation.py
Functions:
evaluate_link_prediction()
evaluate_node_classification()
evaluate_clustering()
Metrics:
ROC-AUC
F1
NMI
ARI
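As an illustration of the first metric, link prediction can be scored by ranking held-out edges against sampled non-edges and computing ROC-AUC. The sketch below is an assumption-heavy toy: the parameters of `evaluate_link_prediction` are hypothetical (the issue lists the function without arguments), edges are scored by the dot product of their endpoint vectors, and AUC is computed by direct pairwise comparison, which is quadratic and only suitable for small inputs.

```python
from typing import Dict, Hashable, List, Sequence, Tuple

Edge = Tuple[Hashable, Hashable]


def roc_auc(pos_scores: Sequence[float], neg_scores: Sequence[float]) -> float:
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def evaluate_link_prediction(vectors: Dict[Hashable, List[float]],
                             positive_edges: Sequence[Edge],
                             negative_edges: Sequence[Edge]) -> float:
    """ROC-AUC of dot-product edge scores (hypothetical signature)."""
    def score(edge: Edge) -> float:
        u, v = edge
        return sum(x * y for x, y in zip(vectors[u], vectors[v]))

    return roc_auc([score(e) for e in positive_edges],
                   [score(e) for e in negative_edges])


toy_vectors = {
    "a": [1.0, 0.0], "b": [0.9, 0.1],
    "c": [0.0, 1.0], "d": [0.1, 0.9],
}
auc = evaluate_link_prediction(
    toy_vectors,
    positive_edges=[("a", "b"), ("c", "d")],
    negative_edges=[("a", "c"), ("b", "c")],
)
```

An AUC of 0.5 corresponds to random ranking, so embeddings are only useful for this task when the score is clearly above that.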
Performance requirements.
random walk generation parallelized
sparse matrix operations
avoid dense adjacency matrices
s...