There is a moment every developer hits when building an AI agent: the demo works, the model responds, the tool calls fire, and everything feels magical… right up until you ask the agent something it does not know.
Then the magic gets a little less magical.
You ask about your internal documentation, your product catalog, your project notes, your support history, or that one architectural decision nobody wrote down in the “official” place. The model gives a confident answer, but confidence is not the same thing as correctness. That is where Retrieval-Augmented Generation, better known as RAG, comes into the picture.
Microsoft Agent Framework gives .NET developers a structured way to build AI agents, attach tools, manage state, and orchestrate workflows. Build5Nines.SharpVector gives .NET developers a lightweight, in-memory vector database for semantic search. Put them together, and you get something practical: an agent that can reason over user requests while retrieving relevant local knowledge from your own application data.
In this article, let’s walk through when this pairing makes sense, how to wire it up in C#, and what tradeoffs to keep in mind before you ship it to production.
Why Agents Need Retrieval
A plain AI agent is useful, but it is also limited by what the underlying model knows, what you put into the prompt, and what tools you give it. That last part is where things get interesting.
Microsoft Agent Framework supports agents that can call tools, and those tools can be ordinary C# methods exposed to the agent through AIFunctionFactory.Create. Microsoft describes function tools as custom code that agents can call when needed, which is exactly the seam we need for integrating a local semantic search engine.
The common mistake is trying to stuff everything into the system prompt. That may work for three paragraphs of instructions, but it does not work for a documentation site, internal wiki, customer knowledge base, or a folder full of markdown files. Prompts are not databases. They are more like sticky notes on a monitor: handy, visible, and very easy to overload.
The better approach is to keep the agent focused and give it a retrieval tool. When the agent needs domain-specific context, it can call that tool, retrieve relevant snippets, and then generate a response grounded in that retrieved information.
That is the core RAG pattern:
- Store your documents as searchable vector data.
- Search for relevant content using the user’s question.
- Add the retrieved content to the context the model sees, either in the prompt or as a tool result.
- Let the model produce a useful answer from the retrieved context.
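To make those four steps concrete, here is a toy sketch in plain C# with no external packages. It swaps real embeddings for simple word-count vectors and cosine similarity, so it only illustrates the shape of the loop, not how SharpVector or any embedding model works internally. The `ToyRag` class and the sample documents are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Step 1: store documents alongside a vector representation.
var documents = new[]
{
    "Failed invoice jobs are retried three times before moving to the dead-letter queue.",
    "The payment export process runs nightly and writes files to a storage container."
};
var index = documents.Select(d => (Text: d, Vector: ToyRag.Vectorize(d))).ToList();

// Step 2: search using the user's question.
var question = "How many times are failed invoice jobs retried?";
var queryVector = ToyRag.Vectorize(question);
var best = index.OrderByDescending(e => ToyRag.Cosine(queryVector, e.Vector)).First();

// Steps 3 and 4: put the retrieved text into the prompt a model would receive.
var prompt = $"Context:\n{best.Text}\n\nQuestion: {question}";
Console.WriteLine(prompt);

static class ToyRag
{
    // Word-count vector; a real system would use an embedding model instead.
    public static Dictionary<string, int> Vectorize(string text) =>
        text.ToLowerInvariant()
            .Split(new[] { ' ', '.', ',', '?', '!' }, StringSplitOptions.RemoveEmptyEntries)
            .GroupBy(w => w)
            .ToDictionary(g => g.Key, g => g.Count());

    // Cosine similarity between two sparse word-count vectors.
    public static double Cosine(Dictionary<string, int> a, Dictionary<string, int> b)
    {
        double dot = a.Sum(kv => b.TryGetValue(kv.Key, out var n) ? kv.Value * n : 0);
        double magA = Math.Sqrt(a.Values.Sum(v => (double)v * v));
        double magB = Math.Sqrt(b.Values.Sum(v => (double)v * v));
        return magA == 0 || magB == 0 ? 0 : dot / (magA * magB);
    }
}
```

The retry document wins here because it shares several words with the question. Word overlap is a crude stand-in for semantic similarity, which is exactly the part a vector database with real embeddings improves on.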
SharpVector handles the local vector search portion. Microsoft Agent Framework handles the agent portion.
Where SharpVector Fits in the Agent Architecture
Build5Nines.SharpVector is a lightweight, in-memory, semantic search text vector database built for .NET applications. It supports storing text, metadata, vectorizing content, and performing similarity search over that content. The project documentation positions it for semantic search, recommendation systems, semantic analysis, AI-enhanced features, and RAG-style applications.
Microsoft Agent Framework, on the other hand, is about building agents and workflows. Microsoft’s documentation describes it as supporting individual agents that process inputs, call tools and MCP servers, and generate responses, along with graph-based workflows for multi-step orchestration.
That gives us a clean mental model:
| Layer | Responsibility |
|---|---|
| Microsoft Agent Framework | Agent runtime, instructions, tool calling, workflows |
| SharpVector | Local semantic search and vector storage |
| Your C# code | Document loading, chunking, metadata, business logic |
| LLM / model provider | Language understanding and response generation |
SharpVector is not trying to be your agent framework. Agent Framework is not trying to be your embedded vector database. That separation is a good thing. It keeps the architecture understandable.
A good architecture should not feel like a bowl of spaghetti with a chatbot sitting on top. It should feel like a set of small components doing clear jobs.
When This Combination Makes Sense
Using SharpVector with Microsoft Agent Framework is a great fit when you want local or embedded retrieval inside a .NET application. That could be a console app, worker service, ASP.NET Core API, desktop application, internal developer tool, or proof-of-concept agent.
This pairing is especially useful when you are:
- Prototyping a RAG agent without deploying a full vector database.
- Building a local-first AI assistant for documentation or notes.
- Embedding semantic search inside a .NET application.
- Creating an internal agent for a small-to-medium knowledge base.
- Testing an agentic workflow before moving to Azure AI Search, Cosmos DB, PostgreSQL with pgvector, or another larger vector store.
- Building demos, labs, samples, or developer tooling where infrastructure should stay minimal.
This is not about pretending an in-memory vector database replaces every enterprise search platform. It does not. But not every problem needs a distributed system before lunch.
Sometimes you just need a fast, simple, local retrieval layer that lets your agent answer questions from your own content. That is where SharpVector fits nicely.
The Common Way: Ask the Agent and Hope
Let’s start with the pattern many developers naturally try first.
You create an agent, give it instructions, and ask it questions about your business or application. Something like this:
```csharp
using Azure.AI.Projects;
using Azure.Identity;
using Microsoft.Agents.AI;

AIAgent agent = new AIProjectClient(
        new Uri("https://your-foundry-service.services.ai.azure.com/api/projects/your-foundry-project"),
        new AzureCliCredential())
    .AsAIAgent(
        model: "gpt-5.4-mini",
        instructions: """
            You are a helpful assistant for our internal developer documentation.
            Answer questions clearly and concisely.
            """);

Console.WriteLine(await agent.RunAsync(
    "How do we configure the background worker for invoice processing?"));
```
This may work if the model already knows the answer or if your prompt includes enough context. But if the answer lives in your private documentation, source repository notes, runbooks, or markdown files, the model has no reliable way to know it.
That is how hallucinations sneak in. The agent is not being malicious. It is just trying to be helpful without the right data. Think of it as a very smart coworker who joined the company five minutes ago and has not been given access to the wiki yet.
The Better Way: Give the Agent a Search Tool
Instead of hoping the model knows your content, we can expose a C# method as a function tool. That method searches SharpVector and returns relevant snippets.
Microsoft Agent Framework lets you create function tools from C# methods using AIFunctionFactory.Create, and descriptions can be added with DescriptionAttribute to help the model understand when and how to call the tool.
Here is the high-level flow:
```text
User question
    ↓
Agent receives the question
    ↓
Agent decides it needs domain knowledge
    ↓
Agent calls SearchKnowledgeBase(...)
    ↓
SharpVector returns relevant text snippets
    ↓
Agent uses those snippets to answer
```
This is a clean and practical division of responsibility. The agent does the reasoning and response generation. SharpVector does the retrieval. Your code controls what data gets indexed and what metadata comes back.
Installing the Packages
For the SharpVector side, install the core package:
dotnet add package Build5Nines.SharpVector
If you want to use OpenAI or Azure OpenAI embeddings, the SharpVector documentation lists Build5Nines.SharpVector.OpenAI as the integration package:
dotnet add package Build5Nines.SharpVector.OpenAI
For Microsoft Agent Framework, the exact packages depend on the agent provider you are using. Microsoft’s quickstart currently shows the Foundry package as a prerelease package:
dotnet add package Microsoft.Agents.AI.Foundry --prerelease
You will also commonly need Azure identity support when using Azure AI Foundry:
dotnet add package Azure.Identity
The examples below focus on the integration pattern. Always check the current package versions and provider-specific setup for your application.
Creating a Simple SharpVector Knowledge Base
Let’s start by building a tiny in-memory knowledge base. In a real application, you would probably load markdown files, HTML pages, database records, PDFs converted to text, support tickets, or product documentation. For this example, we will keep it simple.
```csharp
using Build5Nines.SharpVector;

var vectorDatabase = new BasicMemoryVectorDatabase();

vectorDatabase.AddText(
    """
    Invoice processing runs as a background worker.
    The worker is configured using the InvoiceWorker section in appsettings.json.
    The polling interval is controlled by PollingIntervalSeconds.
    """,
    """
    {"source":"docs/invoice-worker.md","title":"Invoice Worker Configuration"}
    """);

vectorDatabase.AddText(
    """
    Failed invoice jobs are retried three times before being moved to the dead-letter queue.
    Operations can reprocess dead-lettered invoices from the admin portal.
    """,
    """
    {"source":"docs/invoice-retry-policy.md","title":"Invoice Retry Policy"}
    """);

vectorDatabase.AddText(
    """
    The payment export process runs nightly at 2:00 AM UTC
    and writes files to the configured storage account container.
    """,
    """
    {"source":"docs/payment-export.md","title":"Payment Export"}
    """);

var results = vectorDatabase.Search("How do invoice retries work?");

foreach (var item in results.Texts)
{
    Console.WriteLine(item.Text);
    Console.WriteLine(item.Metadata);
}
```
SharpVector’s basic usage follows this same pattern: create a vector database, add text with metadata, and perform semantic search over the stored text.
The metadata is important. Do not treat it as an afterthought. It lets you return source names, URLs, document titles, IDs, tenant IDs, timestamps, or other information your agent can use to cite or explain where an answer came from.
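As a small illustration of why structured metadata pays off, here is a sketch that turns a metadata string in the JSON shape used above into a short citation line. The `Citations` helper is hypothetical and uses only System.Text.Json. Because the metadata in these examples is just a string, the sketch falls back to the raw text when it is not a JSON object.

```csharp
using System;
using System.Text.Json;

var metadata = """{"source":"docs/invoice-retry-policy.md","title":"Invoice Retry Policy"}""";
Console.WriteLine(Citations.Format(metadata));
// prints: Invoice Retry Policy (docs/invoice-retry-policy.md)

static class Citations
{
    // Hypothetical helper: builds "Title (source)" from a metadata JSON string.
    public static string Format(string metadataJson)
    {
        try
        {
            using var doc = JsonDocument.Parse(metadataJson);
            var root = doc.RootElement;
            if (root.ValueKind != JsonValueKind.Object)
                return metadataJson.Trim();

            var title = ReadString(root, "title") ?? "Untitled";
            var source = ReadString(root, "source") ?? "unknown source";
            return $"{title} ({source})";
        }
        catch (JsonException)
        {
            // Not JSON at all: return the metadata text unchanged.
            return metadataJson.Trim();
        }
    }

    // Returns the property value only when it exists and is a JSON string.
    private static string? ReadString(JsonElement obj, string name) =>
        obj.TryGetProperty(name, out var value) && value.ValueKind == JsonValueKind.String
            ? value.GetString()
            : null;
}
```

Whether you format citations in the tool output or leave the raw JSON for the model to read is a design choice. Formatting them yourself keeps the citation style consistent across answers.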
Wrapping SharpVector as an Agent Tool
Now let’s wrap the search behavior in a class that can be exposed to Microsoft Agent Framework.
The goal is not to return a giant pile of text. The goal is to return a compact, useful context block that the agent can use. In RAG applications, more context is not always better context. Five relevant snippets usually beat fifty noisy ones.
```csharp
using System.ComponentModel;
using System.Text;
using Build5Nines.SharpVector;

public sealed class KnowledgeBaseTools
{
    private readonly BasicMemoryVectorDatabase _vectorDatabase;

    public KnowledgeBaseTools(BasicMemoryVectorDatabase vectorDatabase)
    {
        _vectorDatabase = vectorDatabase;
    }

    [Description("Searches the internal knowledge base for relevant documentation snippets.")]
    public string SearchKnowledgeBase(
        [Description("The user's question or search query.")] string query)
    {
        var results = _vectorDatabase.Search(query);
        var response = new StringBuilder();

        foreach (var item in results.Texts.Take(5))
        {
            response.AppendLine("CONTENT:");
            response.AppendLine(item.Text);
            response.AppendLine();
            response.AppendLine("METADATA:");
            response.AppendLine(item.Metadata);
            response.AppendLine("---");
        }

        return response.ToString();
    }
}
```
This method is intentionally boring. That is a compliment. Agent tools should be predictable, testable, and boring enough that you trust them in production.
The description attributes matter because they help the model understand what the tool does and what the parameter means. A model choosing between tools is not reading your mind. It is reading the names, descriptions, and schema you give it.
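If you are curious what that metadata looks like from code, the sketch below reads the descriptions back with plain reflection. This is not how Agent Framework builds its tool schema internally; it only shows that the method name, parameter names, types, and descriptions are all discoverable at runtime. `DemoTools` is a hypothetical stand-in with a stubbed search method.

```csharp
using System;
using System.ComponentModel;
using System.Reflection;

var method = typeof(DemoTools).GetMethod(nameof(DemoTools.SearchKnowledgeBase))!;

// Description on the method itself.
var toolDescription = method.GetCustomAttribute<DescriptionAttribute>()?.Description;
Console.WriteLine($"{method.Name}: {toolDescription}");

// Description on each parameter.
foreach (var parameter in method.GetParameters())
{
    var parameterDescription = parameter.GetCustomAttribute<DescriptionAttribute>()?.Description;
    Console.WriteLine($"  {parameter.Name} ({parameter.ParameterType.Name}): {parameterDescription}");
}

public sealed class DemoTools
{
    [Description("Searches the internal knowledge base for relevant documentation snippets.")]
    public string SearchKnowledgeBase(
        [Description("The user's question or search query.")] string query)
        => $"(stub results for {query})";
}
```

A quick check like this is also handy in a unit test: if someone removes a description, the test fails before the model starts guessing what the tool is for.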
Adding the SharpVector Tool to a Microsoft Agent Framework Agent
Now we can create an agent and provide the SharpVector-backed search method as a tool.
```csharp
using Azure.AI.Projects;
using Azure.Identity;
using Build5Nines.SharpVector;
using Microsoft.Agents.AI;
using Microsoft.Extensions.AI;

// Create and populate the local vector database.
var vectorDatabase = new BasicMemoryVectorDatabase();

vectorDatabase.AddText(
    """
    Invoice processing runs as a background worker.
    The worker is configured using the InvoiceWorker section in appsettings.json.
    The polling interval is controlled by PollingIntervalSeconds.
    """,
    """
    {"source":"docs/invoice-worker.md","title":"Invoice Worker Configuration"}
    """);

vectorDatabase.AddText(
    """
    Failed invoice jobs are retried three times before being moved to the dead-letter queue.
    Operations can reprocess dead-lettered invoices from the admin portal.
    """,
    """
    {"source":"docs/invoice-retry-policy.md","title":"Invoice Retry Policy"}
    """);

// Create the tool object.
var knowledgeTools = new KnowledgeBaseTools(vectorDatabase);

// Expose the C# method as an Agent Framework function tool.
var searchTool = AIFunctionFactory.Create(
    knowledgeTools.SearchKnowledgeBase);

// Create the agent with the search tool attached.
AIAgent agent = new AIProjectClient(
        new Uri("https://your-foundry-service.services.ai.azure.com/api/projects/your-foundry-project"),
        new AzureCliCredential())
    .AsAIAgent(
        model: "gpt-5.4-mini",
        instructions: """
            You are a helpful internal documentation assistant.
            When the user asks about internal systems, configuration, operations,
            troubleshooting, or project-specific behavior, use the knowledge base
            search tool before answering.
            Base your answer on the retrieved context.
            If the knowledge base does not contain enough information, say so clearly.
            """,
        tools: [searchTool]);

var answer = await agent.RunAsync(
    "What happens when invoice processing fails?");

Console.WriteLine(answer);
```
That is the essential integration. SharpVector is now available as a semantic retrieval tool that the agent can call when it needs internal knowledge.
The important part is not the number of lines of code. The important part is the boundary. The model is not directly rummaging through your app. Your tool decides what gets searched, how results are ranked, how many snippets are returned, and what metadata is exposed.
That boundary is where responsible AI engineering starts to become real software engineering.
Using OpenAI or Azure OpenAI Embeddings with SharpVector
SharpVector includes local embedding functionality, which is useful for lightweight scenarios and local experimentation. The documentation also shows support for OpenAI and Azure OpenAI embedding clients through Build5Nines.SharpVector.OpenAI, including BasicOpenAIMemoryVectorDatabase.
Here is what setup can look like with Azure OpenAI embeddings:
```csharp
using Azure;
using Azure.AI.OpenAI;
using Build5Nines.SharpVector.OpenAI;

var endpoint = new Uri("https://your-resource-name.openai.azure.com/");
var apiKey = new AzureKeyCredential("your-api-key");

// This should be the name of your embedding model deployment.
var embeddingDeploymentName = "text-embedding-ada-002";

var azureOpenAIClient = new AzureOpenAIClient(endpoint, apiKey);
var embeddingClient = azureOpenAIClient.GetEmbeddingClient(embeddingDeploymentName);

var vectorDatabase = new BasicOpenAIMemoryVectorDatabase(embeddingClient);

await vectorDatabase.AddTextAsync(
    """
    The reporting API supports CSV and JSON output.
    CSV exports are intended for finance users, while JSON output is intended
    for system integration.
    """,
    """
    {"source":"docs/reporting-api.md","title":"Reporting API"}
    """);

var results = await vectorDatabase.SearchAsync(
    queryText: "How do I export reports for finance?",
    threshold: 0.001f,
    pageIndex: 0,
    pageCount: 5);
```
Using higher-quality embeddings can improve retrieval relevance, especially once your text becomes more varied and domain-specific. Local embeddings are great for simplicity. Hosted embedding models are often better for semantic accuracy. That is the tradeoff.
As always, be mindful of what data you send to any external service. Internal documentation, customer records, regulated data, and source code may require additional review before they are embedded using a hosted model.
Chunking Matters More Than Developers Want It To
Here is an uncomfortable truth about RAG: the vector database is often blamed for problems caused by bad chunking.
If you add an entire 40-page document as one entry, the search result may be too broad. If you split every sentence into a separate entry, the result may lack enough context to be useful. Chunking is where a lot of RAG quality is won or lost.
Fortunately, Build5Nines.SharpVector already includes text chunking support through the TextDataLoader<TKey, TValue> class. Instead of writing your own chunking helper, you can let SharpVector split documents using built-in strategies such as paragraph, sentence, fixed-length, or overlapping-window chunking. The SharpVector documentation recommends chunking large documents to improve semantic match precision and reduce noise in search results.
For RAG scenarios, the overlapping window strategy is often a practical starting point because it keeps nearby context together while still breaking large documents into searchable chunks.
```csharp
using Build5Nines.SharpVector;
using Build5Nines.SharpVector.Data;

var vectorDatabase = new BasicMemoryVectorDatabase();
var loader = new TextDataLoader<int, string>(vectorDatabase);

var documentText = await File.ReadAllTextAsync("docs/invoice-worker.md");

loader.AddDocument(documentText, new TextChunkingOptions<string>
{
    Method = TextChunkingMethod.OverlappingWindow,

    // Size of each chunk.
    ChunkSize = 180,

    // Number of words to overlap between chunks.
    OverlapSize = 30,

    RetrieveMetadata = chunk =>
    {
        return """
            {"source":"docs/invoice-worker.md","title":"Invoice Worker Configuration"}
            """;
    }
});
```
With this in place, there is no need for a hand-written chunking helper. The important part is the RetrieveMetadata delegate. SharpVector associates metadata with each generated chunk, which means your search results can still return source information back to the agent.
If your content is well-structured documentation, TextChunkingMethod.Paragraph may be a better fit:
```csharp
loader.AddDocument(documentText, new TextChunkingOptions<string>
{
    Method = TextChunkingMethod.Paragraph,

    RetrieveMetadata = chunk =>
    {
        return """
            {"source":"docs/invoice-worker.md","title":"Invoice Worker Configuration"}
            """;
    }
});
```
For long operational docs, runbooks, or exported knowledge base articles, overlapping windows are usually a safer default. For clean markdown or prose-style documentation, paragraph chunking may produce more natural retrieval results. The right answer depends on your content, so test with real user questions instead of synthetic examples. RAG quality is one of those places where your production data will humble your assumptions very quickly.
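To build intuition for what an overlapping window produces, here is a tiny standalone sketch. It is an illustration of the general idea only, written from scratch with a hypothetical `ChunkingSketch` class, and it is not SharpVector's implementation. It assumes word-based windows where each chunk repeats the last few words of the previous one.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Ten placeholder words: "w1 w2 ... w10".
var sample = string.Join(' ', Enumerable.Range(1, 10).Select(i => $"w{i}"));

foreach (var chunk in ChunkingSketch.OverlappingWindows(sample, chunkSize: 4, overlap: 2))
{
    Console.WriteLine(chunk);
}
// prints:
// w1 w2 w3 w4
// w3 w4 w5 w6
// w5 w6 w7 w8
// w7 w8 w9 w10

static class ChunkingSketch
{
    // Yields word windows of `chunkSize`, each sharing `overlap` words with the previous one.
    public static IEnumerable<string> OverlappingWindows(string text, int chunkSize, int overlap)
    {
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException(nameof(chunkSize));
        if (overlap < 0 || overlap >= chunkSize) throw new ArgumentOutOfRangeException(nameof(overlap));

        var words = text.Split((char[]?)null, StringSplitOptions.RemoveEmptyEntries);
        var step = chunkSize - overlap;

        for (var start = 0; start < words.Length; start += step)
        {
            yield return string.Join(' ', words.Skip(start).Take(chunkSize));

            // Stop once a window has reached the end of the text.
            if (start + chunkSize >= words.Length) yield break;
        }
    }
}
```

Notice that a sentence straddling the boundary between two chunks still appears whole in at least one of them when the overlap is large enough. That is the property that makes this strategy forgiving for long, loosely structured documents.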
Building a Reusable Knowledge Base Service
As your application grows, you probably do not want vector database setup, document loading, metadata formatting, and search behavior scattered through Program.cs. A cleaner approach is to create a small service that owns indexing and retrieval, then expose that service to the agent through a focused tool method.
Build5Nines.SharpVector already includes document chunking support through TextDataLoader<TKey, TValue>, so we do not need to maintain our own ChunkText(...) helper. The TextDataLoader can automatically split documents using strategies like paragraph, sentence, fixed-length, or overlapping-window chunking, while assigning metadata to every generated chunk. That metadata is stored with the vector and returned in search results, which is exactly what we want for RAG-style agent responses.
```csharp
using System.Text;
using System.Text.Json;
using Build5Nines.SharpVector;
using Build5Nines.SharpVector.Data;

public sealed class LocalKnowledgeBase
{
    private readonly BasicMemoryVectorDatabase _vectorDatabase = new();
    private readonly TextDataLoader<int, string> _loader;

    public LocalKnowledgeBase()
    {
        _loader = new TextDataLoader<int, string>(_vectorDatabase);
    }

    public void AddDocument(
        string text,
        string source,
        string title,
        TextChunkingMethod chunkingMethod = TextChunkingMethod.OverlappingWindow)
    {
        _loader.AddDocument(text, new TextChunkingOptions<string>
        {
            Method = chunkingMethod,

            // Used by FixedLength and OverlappingWindow chunking.
            ChunkSize = 180,

            // Used by OverlappingWindow chunking.
            // This helps preserve nearby context between chunks.
            OverlapSize = 30,

            RetrieveMetadata = chunk =>
            {
                return JsonSerializer.Serialize(new
                {
                    source,
                    title,
                    chunkSize = chunk.Length,
                    indexedAtUtc = DateTime.UtcNow.ToString("o")
                });
            }
        });
    }

    public string Search(string query, int maxResults = 5)
    {
        var results = _vectorDatabase.Search(query);
        var builder = new StringBuilder();

        foreach (var item in results.Texts.Take(maxResults))
        {
            builder.AppendLine("CONTENT:");
            builder.AppendLine(item.Text);
            builder.AppendLine();
            builder.AppendLine("METADATA:");
            builder.AppendLine(item.Metadata);
            builder.AppendLine("---");
        }

        return builder.ToString();
    }
}
```
This version keeps chunking inside SharpVector where it belongs. The service decides how documents are loaded, which chunking strategy to use, and what metadata should travel with each chunk. The agent does not need to know any of that. It only needs a reliable way to search for relevant context.
Then your agent tool stays small and focused:
```csharp
using System.ComponentModel;

public sealed class KnowledgeBaseAgentTools
{
    private readonly LocalKnowledgeBase _knowledgeBase;

    public KnowledgeBaseAgentTools(LocalKnowledgeBase knowledgeBase)
    {
        _knowledgeBase = knowledgeBase;
    }

    [Description("Searches internal documentation and returns relevant context for answering the user.")]
    public string SearchInternalDocs(
        [Description("The question or topic to search for.")] string query)
    {
        return _knowledgeBase.Search(query);
    }
}
```
You can load documents into the knowledge base like this:
```csharp
var knowledgeBase = new LocalKnowledgeBase();

var invoiceWorkerDoc = await File.ReadAllTextAsync("docs/invoice-worker.md");

knowledgeBase.AddDocument(
    text: invoiceWorkerDoc,
    source: "docs/invoice-worker.md",
    title: "Invoice Worker Configuration",
    chunkingMethod: TextChunkingMethod.OverlappingWindow);
```
For long runbooks, operational docs, and exported knowledge base articles, OverlappingWindow is usually a practical starting point because it keeps nearby context together. For clean markdown or documentation with well-structured paragraphs, Paragraph chunking may produce more natural retrieval results. SharpVector supports multiple chunking methods, but the right choice depends on your content and retrieval goals, so test against real developer or user questions instead of guessing.
Using SharpVector in an Agent Workflow
Microsoft Agent Framework is not limited to a single conversational agent with a handful of tools. It also supports workflow-style orchestration, which is useful when you want more control over the steps an AI application follows. Instead of letting the agent decide whether retrieval is needed, you can make retrieval an explicit part of the workflow.
That distinction matters. Tool calling is flexible, but workflows are intentional. If every answer in your application must be grounded in internal documentation, then retrieval should not be optional. It should be a required step.
With SharpVector, that means your workflow can index documents using the same LocalKnowledgeBase service we created earlier, with SharpVector’s built-in text chunking support handling the document splitting. Then, when a user asks a question, the workflow retrieves relevant context before the agent generates an answer.
The flow looks like this:
```text
User request
    ↓
Search SharpVector knowledge base
    ↓
Return relevant chunked context with metadata
    ↓
Send question + retrieved context to the agent
    ↓
Generate grounded response
```
In this pattern, SharpVector is not exposed as a tool the model chooses to call. Instead, your application calls SharpVector deterministically before invoking the agent. That gives you a little less model autonomy, but a lot more predictability.
Here is a simplified example using the reusable LocalKnowledgeBase service from the previous section:
```csharp
using Azure.AI.Projects;
using Azure.Identity;
using Build5Nines.SharpVector;
using Build5Nines.SharpVector.Data; // needed for TextChunkingMethod
using Microsoft.Agents.AI;

var knowledgeBase = new LocalKnowledgeBase();

knowledgeBase.AddDocument(
    text: await File.ReadAllTextAsync("docs/invoice-worker.md"),
    source: "docs/invoice-worker.md",
    title: "Invoice Worker Configuration",
    chunkingMethod: TextChunkingMethod.OverlappingWindow);

knowledgeBase.AddDocument(
    text: await File.ReadAllTextAsync("docs/invoice-retry-policy.md"),
    source: "docs/invoice-retry-policy.md",
    title: "Invoice Retry Policy",
    chunkingMethod: TextChunkingMethod.OverlappingWindow);

AIAgent agent = new AIProjectClient(
        new Uri("https://your-foundry-service.services.ai.azure.com/api/projects/your-foundry-project"),
        new AzureCliCredential())
    .AsAIAgent(
        model: "gpt-5.4-mini",
        instructions: """
            You are a practical internal documentation assistant.
            Answer the user's question using only the provided retrieved context.
            If the context does not contain enough information, say that the
            documentation does not provide enough information.
            """);

var question = "How many times are failed invoices retried?";

// Retrieval happens before the agent runs.
// This makes grounding a deterministic application behavior,
// not a model decision.
var retrievedContext = knowledgeBase.Search(question, maxResults: 5);

var prompt = $"""
    User question:
    {question}

    Retrieved context:
    {retrievedContext}

    Write a clear, concise answer based only on the retrieved context.
    """;

var response = await agent.RunAsync(prompt);

Console.WriteLine(response);
```
This approach works well for documentation assistants, support copilots, compliance workflows, and internal engineering tools where every response should be grounded in a known set of retrieved content. It also makes testing easier because retrieval is a normal C# method call. You can run the same question through knowledgeBase.Search(...), inspect the returned chunks and metadata, and tune your chunking strategy before the model ever sees the prompt.
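Here is a minimal sketch of what such a retrieval check could look like. It is deliberately framework-free: the `search` delegate is a stub standing in for something like `knowledgeBase.Search(...)`, and `RetrievalCheck` plus the question list are hypothetical. The idea is simply to pair real questions with the source you expect to come back and count the hits.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stub search function; in a test project, replace it with the real retrieval call.
Func<string, string> search = query =>
    query.Contains("retr", StringComparison.OrdinalIgnoreCase)
        ? "CONTENT:\nFailed invoice jobs are retried three times.\n\nMETADATA:\n{\"source\":\"docs/invoice-retry-policy.md\"}\n---"
        : "";

var cases = new (string Question, string ExpectedSource)[]
{
    ("How many times are failed invoices retried?", "docs/invoice-retry-policy.md"),
    ("Where is the polling interval configured?", "docs/invoice-worker.md"),
};

var hits = RetrievalCheck.CountHits(search, cases);
Console.WriteLine($"{hits}/{cases.Length} questions retrieved the expected source");
// prints: 1/2 questions retrieved the expected source

static class RetrievalCheck
{
    // Counts the questions whose retrieved context mentions the expected source.
    public static int CountHits(
        Func<string, string> search,
        IEnumerable<(string Question, string ExpectedSource)> cases) =>
        cases.Count(c => search(c.Question).Contains(c.ExpectedSource, StringComparison.Ordinal));
}
```

The stub misses the second question on purpose. A miss like that is the signal to revisit chunk size, overlap, or metadata before touching the prompt.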
The tradeoff is that deterministic retrieval can sometimes be unnecessary. If the user says “hello” or asks the agent to summarize a conversation, you may not need to search the knowledge base. In those cases, tool-based retrieval gives the agent more flexibility. But when correctness matters more than conversational cleverness, putting SharpVector directly in the workflow is often the better engineering choice.
A good rule of thumb is this: use a tool when retrieval is optional, and use a workflow step when retrieval is required.
Common Mistakes to Avoid
The first mistake is treating RAG as a magic accuracy button. Adding vector search improves the agent’s access to information, but it does not automatically make every answer correct. You still need good instructions, good chunking, good metadata, and realistic tests.
The second mistake is returning too much context. Developers love to solve uncertainty by adding more data. Models, however, can get distracted by irrelevant context. Return the best few snippets you can, not the entire company wiki.
The third mistake is hiding uncertainty. Your agent should be allowed to say, “I do not have enough information in the knowledge base to answer that.” That is not a failure. That is honesty, and users will trust it more than a beautifully formatted guess.
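One way to make that honesty easier is to have the tool say so explicitly when nothing comes back, rather than handing the model an empty string. The sketch below assumes a hypothetical `ContextFormatter` that receives snippets as simple text and metadata pairs. The `NO_RESULTS` marker is an invented convention, not part of SharpVector or Agent Framework.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

var empty = ContextFormatter.Format(Array.Empty<(string Text, string Metadata)>());
Console.WriteLine(empty);
// prints: NO_RESULTS: The knowledge base returned nothing relevant for this query.

static class ContextFormatter
{
    public const string NoResults =
        "NO_RESULTS: The knowledge base returned nothing relevant for this query.";

    // Formats up to maxResults snippets, or returns an explicit marker when there are none.
    public static string Format(
        IEnumerable<(string Text, string Metadata)> snippets,
        int maxResults = 5)
    {
        var top = snippets.Take(maxResults).ToList();
        if (top.Count == 0) return NoResults;

        var builder = new StringBuilder();
        foreach (var (text, metadata) in top)
        {
            builder.AppendLine("CONTENT:").AppendLine(text).AppendLine();
            builder.AppendLine("METADATA:").AppendLine(metadata).AppendLine("---");
        }
        return builder.ToString();
    }
}
```

Pair a marker like this with an instruction telling the agent to admit when the documentation does not cover the question, and the "I do not know" path becomes a tested behavior instead of a hope.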
The fourth mistake is skipping source metadata. Even if you do not display citations in the first version, store metadata from day one. Source names, document IDs, URLs, timestamps, and categories become incredibly useful later.
The fifth mistake is using an in-memory database for a problem that needs durable, distributed storage. SharpVector is excellent for embedded local search, development, testing, demos, and smaller application scenarios. If you need high availability, multi-node indexing, tenant-scale filtering, or massive document volumes, you should evaluate a managed or persistent vector search platform.
Practical Checklist for Developers
Before you wire SharpVector into an Agent Framework application, ask a few practical questions:
- What content should the agent be allowed to search?
- How will documents be chunked?
- What metadata should be stored with each chunk?
- Should retrieval be optional through a tool, or mandatory in a workflow?
- Are local embeddings good enough, or do you need hosted embedding models?
- Does the data require privacy, compliance, or security review?
- How will you test answer quality?
- What happens when no relevant result is found?
- Do you need persistence, or is rebuilding the in-memory index acceptable?
That last question is especially important. In-memory search is wonderfully simple, but memory is not a filing cabinet. If your application restarts, you need a plan to rebuild the index or persist the data using SharpVector’s persistence capabilities or another storage layer.
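If rebuilding on startup is your plan, it can be as simple as walking a docs folder and re-adding every file. The sketch below is one hedged way to do that. `IndexRebuilder` is hypothetical, and the `addDocument` callback is a stand-in for something like the `LocalKnowledgeBase.AddDocument(text, source, title)` method shown earlier, so the example runs without any SharpVector dependency.

```csharp
using System;
using System.IO;

// Demo: create a temporary docs folder with one markdown file, then rebuild from it.
var root = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
Directory.CreateDirectory(root);
File.WriteAllText(
    Path.Combine(root, "invoice-worker.md"),
    "Invoice processing runs as a background worker.");

var indexed = IndexRebuilder.Rebuild(
    root,
    (text, source, title) => Console.WriteLine($"Indexed {source} as \"{title}\""));

Console.WriteLine($"{indexed} document(s) indexed");
Directory.Delete(root, recursive: true);

static class IndexRebuilder
{
    // Reads every .md file under docsRoot and hands it to addDocument(text, source, title).
    public static int Rebuild(string docsRoot, Action<string, string, string> addDocument)
    {
        var count = 0;
        foreach (var path in Directory.EnumerateFiles(docsRoot, "*.md", SearchOption.AllDirectories))
        {
            var text = File.ReadAllText(path);
            var source = Path.GetRelativePath(docsRoot, path).Replace('\\', '/');
            var title = Path.GetFileNameWithoutExtension(path);

            addDocument(text, source, title);
            count++;
        }
        return count;
    }
}
```

For a handful of documents this is usually fast enough to run at every startup. Once indexing time or embedding cost becomes noticeable, that is the point to look at persisting the index instead.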
When to Move Beyond SharpVector
SharpVector is a great local vector database option for .NET applications, but good architecture includes knowing when to outgrow a tool.
You should consider a larger search platform when you need distributed indexing, millions of records, advanced filtering, multi-tenant isolation, high availability, centralized operations, complex security trimming, or operational monitoring at enterprise scale.
That does not make SharpVector less useful. In fact, it makes its role clearer. SharpVector is a great way to build, learn, prototype, embed, and ship lightweight semantic search without dragging a large infrastructure footprint into every project.
The upgrade path can be straightforward if you keep your retrieval behind an interface:
```csharp
public interface IKnowledgeRetriever
{
    Task<string> SearchAsync(string query, CancellationToken cancellationToken = default);
}
```
Today that interface can use SharpVector. Tomorrow it can use Azure AI Search, PostgreSQL with pgvector, Cosmos DB, Elasticsearch, or another vector store. Your agent does not need to know. That is the kind of boring abstraction that saves teams later.
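As a sketch of how thin the first implementation can be, here is a hypothetical adapter that wraps any synchronous search function behind the interface. The `DelegateKnowledgeRetriever` name and the stub search are illustrative assumptions. In a real application the delegate would call into your SharpVector-backed service.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Stub search function used only to keep this example self-contained.
IKnowledgeRetriever retriever =
    new DelegateKnowledgeRetriever(query => $"CONTENT:\n(stub result for '{query}')");

Console.WriteLine(await retriever.SearchAsync("invoice retries"));

public interface IKnowledgeRetriever
{
    Task<string> SearchAsync(string query, CancellationToken cancellationToken = default);
}

// Adapter: exposes a synchronous search function through the async interface.
public sealed class DelegateKnowledgeRetriever : IKnowledgeRetriever
{
    private readonly Func<string, string> _search;

    public DelegateKnowledgeRetriever(Func<string, string> search) =>
        _search = search ?? throw new ArgumentNullException(nameof(search));

    public Task<string> SearchAsync(string query, CancellationToken cancellationToken = default)
    {
        // In-memory search is synchronous, so honor cancellation up front and wrap the result.
        cancellationToken.ThrowIfCancellationRequested();
        return Task.FromResult(_search(query));
    }
}
```

With the LocalKnowledgeBase service from earlier, the wiring could look like `new DelegateKnowledgeRetriever(q => knowledgeBase.Search(q))`. Swapping in a remote vector store later then means writing one new class that implements the same interface, with real asynchronous calls inside.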
A More Complete Example
Here is a compact end-to-end example that shows the basic shape of the application, with the knowledge base service using SharpVector’s built-in text chunking support. The agent still sees a simple search tool, while document loading and chunking are handled by SharpVector through TextDataLoader<TKey, TValue> and TextChunkingOptions<TValue> rather than a custom chunking helper.
```csharp
using System.ComponentModel;
using System.Text;
using System.Text.Json;
using Azure.AI.Projects;
using Azure.Identity;
using Build5Nines.SharpVector;
using Build5Nines.SharpVector.Data;
using Microsoft.Agents.AI;
using Microsoft.Extensions.AI;

var knowledgeBase = new LocalKnowledgeBase();

knowledgeBase.AddDocument(
    text: """
        Invoice processing runs as a background worker.
        Configure it using the InvoiceWorker section in appsettings.json.
        PollingIntervalSeconds controls how often the worker checks for new invoices.
        """,
    source: "docs/invoice-worker.md",
    title: "Invoice Worker Configuration",
    chunkingMethod: TextChunkingMethod.OverlappingWindow);

knowledgeBase.AddDocument(
    text: """
        Failed invoice jobs are retried three times.
        After the third failure, the invoice is moved to the dead-letter queue.
        Operations users can reprocess dead-lettered invoices from the admin portal.
        """,
    source: "docs/invoice-retry-policy.md",
    title: "Invoice Retry Policy",
    chunkingMethod: TextChunkingMethod.OverlappingWindow);

var tools = new KnowledgeBaseAgentTools(knowledgeBase);

AIAgent agent = new AIProjectClient(
        new Uri("https://your-foundry-service.services.ai.azure.com/api/projects/your-foundry-project"),
        new AzureCliCredential())
    .AsAIAgent(
        model: "gpt-5.4-mini",
        instructions: """
            You are a practical internal documentation assistant.
            Use the SearchInternalDocs tool when answering questions about internal
            application behavior, configuration, operations, or troubleshooting.
            Answer from the retrieved context.
            If the retrieved context does not contain the answer, say that the
            documentation does not provide enough information.
            """,
        tools: [AIFunctionFactory.Create(tools.SearchInternalDocs)]);

var response = await agent.RunAsync(
    "How many times are failed invoices retried before they go to dead-letter?");

Console.WriteLine(response);

public sealed class KnowledgeBaseAgentTools
{
    private readonly LocalKnowledgeBase _knowledgeBase;

    public KnowledgeBaseAgentTools(LocalKnowledgeBase knowledgeBase)
    {
        _knowledgeBase = knowledgeBase;
    }

    [Description("Searches internal documentation and returns relevant context.")]
    public string SearchInternalDocs(
        [Description("The user's question or search query.")] string query)
    {
        return _knowledgeBase.Search(query);
    }
}

public sealed class LocalKnowledgeBase
{
    private readonly BasicMemoryVectorDatabase _vectorDatabase = new();
    private readonly TextDataLoader<int, string> _loader;

    public LocalKnowledgeBase()
    {
        _loader = new TextDataLoader<int, string>(_vectorDatabase);
    }

    public void AddDocument(
        string text,
        string source,
        string title,
        TextChunkingMethod chunkingMethod = TextChunkingMethod.OverlappingWindow)
    {
        _loader.AddDocument(text, new TextChunkingOptions<string>
        {
            Method = chunkingMethod,

            // Used by FixedLength and OverlappingWindow chunking.
            ChunkSize = 180,

            // Used by OverlappingWindow chunking.
            // This helps preserve nearby context between chunks.
            OverlapSize = 30,

            RetrieveMetadata = chunk =>
            {
                return JsonSerializer.Serialize(new
                {
                    source,
                    title,
                    chunkSize = chunk.Length,
                    indexedAtUtc = DateTime.UtcNow.ToString("o")
                });
            }
        });
    }

    public string Search(string query, int maxResults = 5)
    {
        var results = _vectorDatabase.Search(query);
        var builder = new StringBuilder();

        foreach (var item in results.Texts.Take(maxResults))
        {
            builder.AppendLine("CONTENT:");
            builder.AppendLine(item.Text);
            builder.AppendLine();
            builder.AppendLine("METADATA:");
            builder.AppendLine(item.Metadata);
            builder.AppendLine("---");
        }

        return builder.ToString();
    }
}
```
This version keeps the agent-facing tool small and moves chunking and indexing into a reusable knowledge base class, LocalKnowledgeBase. That is the right separation of concerns. The agent does not need to know whether the content was split by paragraph, sentence, fixed length, or overlapping window. It only needs a reliable tool that returns useful context.
For long runbooks, operational documentation, exported knowledge base articles, and internal markdown files, TextChunkingMethod.OverlappingWindow is a strong default because it preserves nearby context across chunks. For documentation that is already cleanly organized into short paragraphs, TextChunkingMethod.Paragraph may be a better fit.
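To make the idea of overlap concrete, here is a minimal hand-rolled sketch of an overlapping window over words. It is not SharpVector's implementation: the ChunkWithOverlap helper is hypothetical, and measuring sizes in words is an assumption made purely for illustration, so check the SharpVector documentation for how it interprets ChunkSize and OverlapSize.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative only: a hypothetical helper, NOT how SharpVector chunks text.
// Sizes are counted in words here, which is an assumption for this sketch.
static List<string> ChunkWithOverlap(string text, int chunkSize, int overlapSize)
{
    if (chunkSize <= 0)
        throw new ArgumentOutOfRangeException(nameof(chunkSize));
    if (overlapSize < 0 || overlapSize >= chunkSize)
        throw new ArgumentOutOfRangeException(nameof(overlapSize));

    var words = text.Split(
        new[] { ' ', '\t', '\r', '\n' },
        StringSplitOptions.RemoveEmptyEntries);

    var chunks = new List<string>();
    var step = chunkSize - overlapSize; // how far the window advances each time

    for (var start = 0; start < words.Length; start += step)
    {
        chunks.Add(string.Join(' ', words.Skip(start).Take(chunkSize)));

        // Stop once the current window already reaches the end of the text.
        if (start + chunkSize >= words.Length)
            break;
    }

    return chunks;
}

var chunks = ChunkWithOverlap("a b c d e f g h", chunkSize: 4, overlapSize: 2);

foreach (var chunk in chunks)
    Console.WriteLine(chunk);

// Prints:
// a b c d
// c d e f
// e f g h
```

Each chunk repeats the tail of the previous one, so a sentence that straddles a boundary still appears whole in at least one chunk. The cost is a larger index, because the overlapping words are stored more than once.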
The bigger lesson is simple: keep retrieval boring, explicit, and testable. Your agent will be much easier to improve when document loading, chunking, search, and response generation each have a clear job.
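One way to make retrieval testable is to pull the result formatting out of the search method so it can be checked without a vector database. The sketch below assumes a hypothetical FormatContext helper that produces the same CONTENT/METADATA layout as LocalKnowledgeBase.Search. The tuple shape and the sample hits are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

// Hypothetical pure helper: same output layout as LocalKnowledgeBase.Search,
// but it takes plain tuples so no vector database is needed to test it.
static string FormatContext(
    IEnumerable<(string Text, string Metadata)> hits,
    int maxResults = 5)
{
    var builder = new StringBuilder();

    foreach (var hit in hits.Take(maxResults))
    {
        builder.AppendLine("CONTENT:");
        builder.AppendLine(hit.Text);
        builder.AppendLine();
        builder.AppendLine("METADATA:");
        builder.AppendLine(hit.Metadata);
        builder.AppendLine("---");
    }

    return builder.ToString();
}

// Sample hits invented for this sketch.
var hits = new[]
{
    (Text: "Failed invoice jobs are retried three times.",
     Metadata: "{\"source\":\"docs/invoice-retry-policy.md\"}"),
    (Text: "Invoice processing runs as a background worker.",
     Metadata: "{\"source\":\"docs/invoice-worker.md\"}")
};

var context = FormatContext(hits, maxResults: 1);

// A plain check stands in for a unit test assertion.
if (!context.Contains("retried three times") || context.Contains("background worker"))
    throw new InvalidOperationException("Formatter did not respect maxResults.");

Console.WriteLine(context);
```

In a real project the check would live in a test framework of your choice. The point is only that the formatting step can be verified with no model call and no index.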
Final Thoughts
The value of combining Build5Nines.SharpVector with Microsoft Agent Framework is not that it creates some overly complicated AI architecture. The value is the opposite: it gives .NET developers a simple, understandable way to build agents that can search local knowledge before answering.
Microsoft Agent Framework gives you the agent and workflow foundation. SharpVector gives you embedded semantic retrieval. Together, they let you build AI applications that are more grounded, more useful, and easier to reason about.
This matters because the future of AI development is not just “call a model and hope.” The better pattern is to connect models to the right data, the right tools, and the right workflow boundaries. That is where developers, architects, DevOps engineers, and cloud teams can bring real engineering discipline to AI systems.
Key Takeaways
- SharpVector is a strong fit for local RAG in .NET applications, especially when you want lightweight semantic search without deploying a separate vector database.
- Microsoft Agent Framework can expose C# methods as function tools, making it natural to wrap SharpVector search as an agent capability.
- Good chunking and metadata matter as much as the vector database itself.
- Use SharpVector for embedded, local, prototype, demo, and small-to-medium knowledge scenarios, and consider a dedicated vector database when you need durable storage, larger scale, or managed operations.
- Keep retrieval behind a clean abstraction so you can swap storage implementations later without rewriting your agent.
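The last takeaway can be sketched with a small interface. Everything here is hypothetical: IKnowledgeRetriever, KeywordRetriever, and DocsTools are illustrative names rather than types from SharpVector or Microsoft Agent Framework, and the stand-in implementation uses naive substring matching instead of semantic search.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// The tool class only sees the interface, so the backing store can change.
IKnowledgeRetriever retriever = new KeywordRetriever(new[]
{
    "Failed invoice jobs are retried three times.",
    "PollingIntervalSeconds controls how often the worker checks for new invoices."
});

var tools = new DocsTools(retriever);
var answer = tools.SearchInternalDocs("retried");

Console.WriteLine(answer);

// Hypothetical seam between the agent tool and whatever stores the knowledge.
public interface IKnowledgeRetriever
{
    string Search(string query, int maxResults = 5);
}

// Stand-in implementation for this sketch: naive substring matching.
// A SharpVector-backed class could implement the same interface.
public sealed class KeywordRetriever : IKnowledgeRetriever
{
    private readonly IReadOnlyList<string> _documents;

    public KeywordRetriever(IEnumerable<string> documents)
    {
        _documents = documents.ToList();
    }

    public string Search(string query, int maxResults = 5)
    {
        return string.Join(
            Environment.NewLine,
            _documents
                .Where(d => d.Contains(query, StringComparison.OrdinalIgnoreCase))
                .Take(maxResults));
    }
}

// Agent-facing tool that depends on the interface, not on a concrete store.
public sealed class DocsTools
{
    private readonly IKnowledgeRetriever _retriever;

    public DocsTools(IKnowledgeRetriever retriever)
    {
        _retriever = retriever;
    }

    public string SearchInternalDocs(string query)
    {
        return _retriever.Search(query);
    }
}
```

With a seam like this, moving from an in-memory index to a hosted vector store means writing one new implementation of the interface. The tool method the agent calls stays the same.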
Have you started adding RAG or semantic search to your .NET agents yet? I’d love to hear what patterns, tools, and lessons you’re finding useful as the .NET AI ecosystem keeps evolving.