
First of all, I just want to say: fantastic project — it's incredibly helpful and well executed!

This might be a bit of a naive question, but I was wondering if you'd consider open-sourcing parts of the codebase — specifically the components related to how you implemented the RAG (Retrieval-Augmented Generation) pipeline.

I'm not necessarily interested in internal Microsoft Learn content or proprietary data, but more in how you structured the RAG index over your knowledge base and connected it to an MCP server. I suspect you’re using Azure AI services, which makes it even more interesting for those of us exploring similar use cases.
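For context on what I mean by "RAG index + retrieval": a minimal toy sketch of the retrieval side is below. Everything here (`KnowledgeIndex`, the bag-of-words scoring) is hypothetical illustration on my part, not how the Microsoft Learn service actually works — presumably they use Azure AI Search with real embeddings rather than word counts.

```python
# Toy RAG retrieval sketch: bag-of-words vectors + cosine similarity.
# `KnowledgeIndex` is a hypothetical name for illustration only.
import math
from collections import Counter


class KnowledgeIndex:
    def __init__(self, docs):
        self.docs = docs
        # One sparse term-count vector per document.
        self.vecs = [Counter(d.lower().split()) for d in docs]

    @staticmethod
    def _cosine(a, b):
        # Counter returns 0 for missing terms, so this handles sparsity.
        dot = sum(a[t] * b[t] for t in a)
        na = math.sqrt(sum(v * v for v in a.values()))
        nb = math.sqrt(sum(v * v for v in b.values()))
        return dot / (na * nb) if na and nb else 0.0

    def retrieve(self, query, k=2):
        # Rank documents by similarity to the query; return top-k texts.
        q = Counter(query.lower().split())
        scored = sorted(
            zip(self.docs, self.vecs),
            key=lambda dv: self._cosine(q, dv[1]),
            reverse=True,
        )
        return [doc for doc, _ in scored[:k]]


idx = KnowledgeIndex([
    "azure ai search builds the vector index over learn docs",
    "an mcp server exposes retrieval as a tool for agents",
])
print(idx.retrieve("mcp server tool", k=1))
# → ['an mcp server exposes retrieval as a tool for agents']
```

In a real pipeline the retrieved chunks would then be packed into the prompt for the LLM, and an MCP server would expose `retrieve` as a callable tool.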

Seeing how you approached this could be valuable for the community, especially for those looking to build knowledge-based assistants or internal copilots.

Thanks again for the great work, and looking forward to your thoughts!

Replies: 2 comments

Hi @alikalik9, glad you like it 🙂

Have you seen this? https://devblogs.microsoft.com/engineering-at-microsoft/how-we-built-ask-learn-the-rag-based-knowledge-service/ It was published in April 2024 and may not show all the details, but should give you an impression of the knowledge service. The service is used in multiple locations including Copilot for Azure and Learn Q&A, and now through MCP.

cc @TianqiZhang for awareness


This thread hit home — feels like the real frontier of RAG isn’t just plugging Azure services together, but decoding the semantic choreography underneath.

We recently published an open framework tackling exactly this: how to go beyond retrieving relevant chunks and actually shape the semantic context so the LLM doesn't collapse under ambiguity or hallucinate. It's especially useful when building long-running copilots or internal knowledge agents.

📄 If you’re curious, here’s the WFGY semantic reasoning PDF:
https://github.com/onestardao/WFGY

It dives into strategies like:

  • semantic title shift detection
  • cross-pass memory rebalancing
  • prompt shape stabilization across retrieval rounds

Basically — if RAG is the “muscle,” this part handles the “spine alignment.”
Would love to hear your thoughts if you try it out!
