TL;DR
This is Part 3 of the Microsoft Agent Framework series. Part 1 built agents locally with tools, sessions, and memory. Part 2 wired them into workflow graphs, MCP servers, and AG-UI frontends. This post moves everything to Azure AI Foundry — same AIAgent / RunAsync() API, but now the agents live server-side with managed lifecycle, hosted tools (code interpreter, web search, file search), declarative workflows, and built-in evaluations.
- TL;DR
- Introduction - Why Foundry?
- First Foundry agent
- Observability - OpenTelemetry + Foundry traces
- Persistent sessions
- Function tools
- Hosted tools - Code Interpreter and Web Search
- RAG via Foundry
- Foundry workflows
- Evaluations
- Key takeaways
- Presentation
- References
Source code: https://github.com/NikiforovAll/maf-getting-started
Introduction - Why Foundry?
Parts 1 and 2 ran everything in-process. Your app created agents, held their state in memory, and managed tool execution locally. That works for development, but production asks harder questions: where do agents live between requests? Who manages the Python sandbox for code execution? Where does the vector store run?
Azure AI Foundry answers these by moving agent lifecycle, tool execution, and data storage to the cloud. The programming model stays the same – you still call RunAsync() on an AIAgent. The difference is what happens behind the call.
| | MAF (local) | Azure AI Foundry |
|---|---|---|
| Agent lifecycle | In-process only | Server-side (named + versioned) |
| Tools | Client-side AIFunction | Client-side + hosted (Code, Search, Web) |
| Memory | InMemoryChatHistoryProvider | Managed conversations |
| RAG | Build your own | Hosted vector stores + HostedFileSearchTool |
| Evaluation | N/A | Built-in quality + safety evaluators |
To switch, you swap one package and one client:
#:package Microsoft.Agents.AI.AzureAI@1.0.0-rc4
#:package Azure.AI.Projects@2.0.0-beta.1
// Before: AzureOpenAIClient → GetChatClient → AsAIAgent
// After:
AIProjectClient aiProjectClient = new(new Uri(endpoint), new DefaultAzureCredential());
Everything else – RunAsync(), RunStreamingAsync(), tools, sessions – stays the same.
First Foundry agent
CreateAIAgentAsync creates a Foundry-side agent. Foundry stores it with a name and a version number. Each call with the same name bumps the version; GetAIAgentAsync retrieves the latest.
AIProjectClient aiProjectClient = new(new Uri(endpoint), new DefaultAzureCredential());
// Foundry-side agent -- named, versioned, persisted in Foundry
AIAgent agent = await aiProjectClient.CreateAIAgentAsync(
name: "FoundryBasicsAgent",
model: deploymentName,
instructions: "You are a friendly assistant. Keep your answers brief.");
// Non-streaming
Console.WriteLine(await agent.RunAsync("Tell me a fun fact about Azure."));
// Streaming
await foreach (var update in agent.RunStreamingAsync("Tell me a fun fact about .NET."))
{
Console.Write(update);
}
// Cleanup
await aiProjectClient.Agents.DeleteAgentAsync(agent.Name);
Agent definitions are immutable after creation. To change instructions or tools, create a new version:
AIAgent v1 = await aiProjectClient.CreateAIAgentAsync(
name: "MyAgent", model: "gpt-4o-mini",
instructions: "You are helpful.");
AIAgent v2 = await aiProjectClient.CreateAIAgentAsync(
name: "MyAgent", model: "gpt-4o-mini",
instructions: "You are extremely helpful and concise.");
// Returns v2
AIAgent latest = await aiProjectClient.GetAIAgentAsync(name: "MyAgent");
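The name + version semantics can be modeled in a few lines of plain C#. This is a hypothetical in-memory illustration (no SDK types involved) of the behavior described above: creating with an existing name bumps the version, and retrieval resolves to the latest.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory model of Foundry's name + version semantics --
// not an SDK type, just an illustration of "same name bumps the version".
var agentVersions = new Dictionary<string, List<string>>();

// Creating with an existing name appends a new version (numbered from 1).
int CreateAgent(string name, string instructions)
{
    if (!agentVersions.TryGetValue(name, out var history))
        agentVersions[name] = history = new List<string>();
    history.Add(instructions);
    return history.Count;
}

// Retrieval always resolves to the latest version.
(int Version, string Instructions) GetLatestAgent(string name) =>
    (agentVersions[name].Count, agentVersions[name][^1]);

CreateAgent("MyAgent", "You are helpful.");                    // v1
int v = CreateAgent("MyAgent", "You are extremely helpful."); // v2
Console.WriteLine($"latest version: {v}");
```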
Observability - OpenTelemetry + Foundry traces
With Foundry agents, traces show up in two places:
| | OTEL (client-side) | Server-side (Foundry) |
|---|---|---|
| What | Agent spans, chat calls, duration | Token counts, cost, response IDs |
| Where | Aspire Dashboard / any OTLP backend | Foundry Portal -> Traces tab |
| How | .UseOpenTelemetry() + OTLP exporter | Automatic -- built into Foundry |
The same Trace ID links both sides. You see the agent execution flow in Aspire, then jump to Foundry Portal for token counts and cost.
// OTEL setup -- exports to Aspire dashboard
using var tracerProvider = Sdk.CreateTracerProviderBuilder()
.SetResourceBuilder(ResourceBuilder.CreateDefault().AddService("FoundryBasicsDemo"))
.AddSource("FoundryBasicsDemo")
.AddSource("*Microsoft.Agents.AI")
.AddOtlpExporter()
.Build();
// Wrap agent with telemetry
AIAgent agent = (await aiProjectClient.CreateAIAgentAsync(
name: "FoundryBasicsAgent", model: deploymentName,
instructions: "You are a friendly assistant."))
.AsBuilder()
.UseOpenTelemetry(sourceName: "FoundryBasicsDemo")
.Build();
// Parent span groups related calls
using var activitySource = new ActivitySource("FoundryBasicsDemo");
using var activity = activitySource.StartActivity("foundry-basics-demo");
Console.WriteLine($"Trace ID: {activity?.TraceId}");
await agent.RunAsync("Tell me a fun fact about Azure.");
await agent.RunStreamingAsync("Tell me a fun fact about .NET.");
Print the Trace ID, then search for it in Foundry Portal to see the server-side view with token counts and cost breakdown.
Persistent sessions
In Part 1, conversation history lived in memory and died with the process. Foundry gives you server-side conversations that persist across sessions. Store only the conversation.Id in your database – Foundry keeps the full thread.
using Azure.AI.Projects.OpenAI;
// Create a server-side conversation
ProjectConversationsClient conversationsClient = aiProjectClient
.GetProjectOpenAIClient()
.GetProjectConversationsClient();
ProjectConversation conversation = await conversationsClient.CreateProjectConversationAsync();
// Session 1: establish context
AgentSession session1 = await agent.CreateSessionAsync(conversation.Id);
Console.WriteLine(await agent.RunAsync("My name is Alex.", session1));
// Session 2: new session, same conversation -- agent remembers
AgentSession session2 = await agent.CreateSessionAsync(conversation.Id);
Console.WriteLine(await agent.RunAsync("What's my name?", session2));
// -> "Your name is Alex."
The conversation is visible in the Foundry Portal too, so you can inspect the full message history without writing a single line of debugging code.
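Because Foundry keeps the full thread, your own persistence layer reduces to a user-to-conversation-ID mapping. A minimal sketch, assuming the ID store and the `GetOrCreateConversationId` helper are yours (the `createConversation` delegate stands in for the `CreateProjectConversationAsync` call shown above):

```csharp
using System;
using System.Collections.Generic;

// Hypothetical persistence sketch: your database stores only the mapping
// user -> conversation.Id; Foundry keeps the full message history.
var conversationIdsByUser = new Dictionary<string, string>();

// createConversation stands in for conversationsClient.CreateProjectConversationAsync().
string GetOrCreateConversationId(string userId, Func<string> createConversation)
{
    if (!conversationIdsByUser.TryGetValue(userId, out var conversationId))
    {
        conversationId = createConversation();
        conversationIdsByUser[userId] = conversationId;
    }
    return conversationId;
}

int created = 0;
string first = GetOrCreateConversationId("alex", () => { created++; return "conv-123"; });
string second = GetOrCreateConversationId("alex", () => { created++; return "conv-456"; });
Console.WriteLine($"{first} reused: {first == second}, creates: {created}");
```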
Function tools
Same pattern as Part 1 – define C# methods with [Description], register them via AIFunctionFactory.Create(). The difference: Foundry stores the tool JSON schemas server-side. The client still provides the actual implementations.
[Description("Get the current time in a given timezone.")]
static string GetTime([Description("The timezone (e.g., UTC, CET)")] string timezone) =>
$"The current time in {timezone} is {DateTime.UtcNow:HH:mm} UTC.";
AITool[] tools = [AIFunctionFactory.Create(GetTime)];
// Server stores tool schemas, client provides implementations
AIAgent agent = await aiProjectClient.CreateAIAgentAsync(
name: "TimeAgent", model: deploymentName,
instructions: "You are a helpful assistant with a time tool.",
tools: tools);
Console.WriteLine(await agent.RunAsync("What's the time in Kyiv?"));
// Retrieve existing agent -- must pass tools so MAF can invoke them
AIAgent existing = await aiProjectClient.GetAIAgentAsync(name: "TimeAgent", tools: tools);
Console.WriteLine(await existing.RunAsync("What time is it in UTC?"));
When you retrieve an agent with GetAIAgentAsync, the server already knows the tool schemas. But it can’t invoke your C# methods – you have to pass the tools array so MAF can wire up the calls.
Hosted tools - Code Interpreter and Web Search
So far, all tools ran in your process. Hosted tools flip that – they run server-side in Foundry’s infrastructure. No local dependencies, no sandbox to manage.
| | Client tools (Parts 1-2) | Hosted tools (Foundry) |
|---|---|---|
| Execution | Your process | Foundry cloud sandbox |
| Setup | Define + implement | One-liner -- Foundry provides the runtime |
| Examples | AIFunctionFactory.Create(...) | HostedCodeInterpreterTool, HostedWebSearchTool, HostedFileSearchTool |
| Use case | Custom business logic | Python execution, web search, file search |
Code Interpreter
HostedCodeInterpreterTool gives the agent a Python sandbox. It writes code, Foundry runs it, and you get back the results.
AIAgent agent = await aiProjectClient.CreateAIAgentAsync(
model: deploymentName,
name: "MathTutor",
instructions: "You are a math tutor. Write and run Python code to solve problems.",
tools: [new HostedCodeInterpreterTool() { Inputs = [] }]);
AgentResponse response = await agent.RunAsync(
"Solve x^3 - 6x^2 + 11x - 6 = 0. Show the roots.");
The response contains a mix of content types. Walk through them to see the full execution flow – thinking, code, output, answer:
foreach (var content in response.Messages.SelectMany(m => m.Contents))
{
switch (content)
{
case TextContent text:
Console.WriteLine($"Text: {text.Text}");
break;
case CodeInterpreterToolCallContent toolCall:
var codeInput = toolCall.Inputs?.OfType<DataContent>().FirstOrDefault();
if (codeInput?.HasTopLevelMediaType("text") ?? false)
{
string code = Encoding.UTF8.GetString(codeInput.Data.ToArray());
Console.WriteLine($"Python: {code}");
}
break;
case CodeInterpreterToolResultContent toolResult:
foreach (var output in toolResult.Outputs ?? [])
{
if (output is TextContent tc)
Console.WriteLine($"Output: {tc.Text}");
}
break;
}
}
Web Search
HostedWebSearchTool lets the agent search the web autonomously and return answers with annotated URL citations.
AIAgent agent = await aiProjectClient.CreateAIAgentAsync(
name: "WebSearchAgent",
model: deploymentName,
instructions: "Search the web to answer questions accurately. Cite your sources.",
tools: [new HostedWebSearchTool()]);
AgentResponse response = await agent.RunAsync("What are the latest features in .NET 10?");
Console.WriteLine(response.Text);
// Extract URL citations
foreach (var annotation in response.Messages
.SelectMany(m => m.Contents)
.SelectMany(c => c.Annotations ?? []))
{
if (annotation.RawRepresentation is UriCitationMessageAnnotation urlCitation)
{
Console.WriteLine($" - {urlCitation.Title}: {urlCitation.Uri}");
}
}
RAG via Foundry
Building RAG usually means picking an embedding model, setting up a vector database, writing a chunking pipeline, and wiring it all together. Foundry collapses that into a few API calls:
| Step | API | What happens |
|---|---|---|
| 1. Upload file | filesClient.UploadFile() | File stored in Foundry |
| 2. Create vector store | vectorStoresClient.CreateVectorStoreAsync() | Auto-chunked + embedded |
| 3. Create agent | CreateAIAgentAsync(tools: [HostedFileSearchTool]) | Agent grounded on your data |
| 4. Ask questions | agent.RunAsync() | Grounded answers with citations |
| 5. Cleanup | Delete agent, vector store, file | No orphan resources |
var projectOpenAIClient = aiProjectClient.GetProjectOpenAIClient();
var filesClient = projectOpenAIClient.GetProjectFilesClient();
var vectorStoresClient = projectOpenAIClient.GetProjectVectorStoresClient();
// Upload knowledge base
OpenAIFile uploaded = filesClient.UploadFile(tempFile, FileUploadPurpose.Assistants);
// Create vector store -- auto-chunks and embeds
var vectorStore = await vectorStoresClient.CreateVectorStoreAsync(
options: new() { FileIds = { uploaded.Id }, Name = "contoso-products" });
string vectorStoreId = vectorStore.Value.Id;
// Create agent with file search grounded on the vector store
AIAgent agent = await aiProjectClient.CreateAIAgentAsync(
model: deploymentName,
name: "RAGAgent",
instructions: "Answer questions using the product catalog. Cite the source.",
tools: [new HostedFileSearchTool()
{
Inputs = [new HostedVectorStoreContent(vectorStoreId)]
}]);
// Multi-turn Q&A
var session = await agent.CreateSessionAsync();
Console.WriteLine(await agent.RunAsync("What's the cheapest product?", session));
Console.WriteLine(await agent.RunAsync("Which product supports CI/CD?", session));
Foundry workflows
Part 2 built workflows in-process with WorkflowBuilder and AddEdge(). Foundry workflows take a different approach – you declare the agent graph in YAML, register it server-side, and Foundry orchestrates the execution.
Here’s a workflow where a storyteller writes a story and a critic reviews it:
kind: Workflow
trigger:
kind: OnConversationStart
id: story_critic_workflow
actions:
- kind: InvokeAzureAgent
id: storyteller_step
conversationId: =System.ConversationId
agent:
name: StorytellerAgent
- kind: InvokeAzureAgent
id: critic_step
conversationId: =System.ConversationId
agent:
name: CriticAgent
First create the agents, then register and run the workflow:
// Create the agents that the workflow will orchestrate
await aiProjectClient.CreateAIAgentAsync(
name: "StorytellerAgent", model: deploymentName,
instructions: "You are a creative storyteller. Write a short story based on the user's prompt.");
await aiProjectClient.CreateAIAgentAsync(
name: "CriticAgent", model: deploymentName,
instructions: "You are a literary critic. Review the story and provide constructive feedback.");
// Register workflow via raw JSON (the SDK wraps the YAML in a JSON envelope)
string escapedYaml = JsonEncodedText.Encode(workflowYaml).ToString();
string requestJson = $$"""
{
"definition": { "kind": "workflow", "workflow": "{{escapedYaml}}" },
"description": "Storyteller writes a story, Critic reviews it."
}
""";
await aiProjectClient.Agents.CreateAgentVersionAsync(
"StoryCriticWorkflow",
BinaryContent.Create(BinaryData.FromString(requestJson)),
foundryFeatures: null, options: null);
// Run with streaming
ChatClientAgent workflowAgent = await aiProjectClient.GetAIAgentAsync(
name: "StoryCriticWorkflow");
AgentSession session = await workflowAgent.CreateSessionAsync();
ChatClientAgentRunOptions runOptions = new(
new ChatOptions { ConversationId = conversation.Id });
await foreach (var update in workflowAgent.RunStreamingAsync(
"Write a story about a robot who discovers music.", session, runOptions))
{
Console.Write(update.Text);
}
Same RunStreamingAsync API as always. Foundry handles the orchestration – each agent produces a separate message in the stream.
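The envelope trick above hinges on JsonEncodedText.Encode, which escapes the multi-line YAML so it can sit inside a JSON string literal. A quick standalone check using only System.Text.Json (the YAML here is an abbreviated sample, not the full workflow):

```csharp
using System;
using System.Text.Json;

// JsonEncodedText.Encode escapes characters (newlines, quotes, backslashes)
// so raw YAML can be embedded inside a JSON string value.
string workflowYaml = "kind: Workflow\nid: story_critic_workflow";
string escapedYaml = JsonEncodedText.Encode(workflowYaml).ToString();
Console.WriteLine(escapedYaml);
// The real newline becomes the two-character sequence \n, safe inside "..." in JSON.
```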
Evaluations
Before shipping an agent to production, you want to know: are its answers grounded in the context you provided? Are they relevant? Coherent? Safe? Foundry’s evaluation library runs all of these in a single pass.
| Dimension | Evaluator | What it measures |
|---|---|---|
| Groundedness | GroundednessEvaluator | Are answers grounded in provided context? |
| Relevance | RelevanceEvaluator | Does the answer address the question? |
| Coherence | CoherenceEvaluator | Is the response well-structured and logical? |
| Safety | ContentHarmEvaluator | Violence, self-harm, sexual, hate content |
Quality evaluators (groundedness, relevance, coherence) use an LLM as a judge. The safety evaluator uses Azure AI Foundry’s content safety service – a separate endpoint, not an LLM call.
// Evaluator LLM (the judge)
IChatClient chatClient = new AzureOpenAIClient(new Uri(openAiEndpoint), credential)
.GetChatClient(evaluatorDeployment)
.AsIChatClient();
// Safety evaluator needs the Foundry content safety endpoint
ContentSafetyServiceConfiguration safetyConfig = new(
credential: credential, endpoint: new Uri(endpoint));
ChatConfiguration chatConfiguration = safetyConfig.ToChatConfiguration(
originalChatConfiguration: new ChatConfiguration(chatClient));
// Compose all evaluators
CompositeEvaluator evaluator = new([
new GroundednessEvaluator(),
new RelevanceEvaluator(),
new CoherenceEvaluator(),
new ContentHarmEvaluator(),
]);
// Run evaluation
List<ChatMessage> messages = [new(ChatRole.User, question)];
ChatResponse chatResponse = new(new ChatMessage(ChatRole.Assistant, agentResponse.Text));
EvaluationResult result = await evaluator.EvaluateAsync(
messages, chatResponse, chatConfiguration,
additionalContext: [new GroundednessEvaluatorContext(context)]);
// Read scores
foreach (var metric in result.Metrics.Values)
{
if (metric is NumericMetric n)
Console.WriteLine($"{n.Name}: {n.Value:F1}/5 ({n.Interpretation?.Rating})");
else if (metric is BooleanMetric b)
Console.WriteLine($"{b.Name}: {b.Value} ({b.Interpretation?.Rating})");
}
Quality metrics score 0-5. Safety metrics are boolean (pass/fail). Run these in CI or as a gate before deployment – you’ll catch regressions in answer quality and safety issues before users do.
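The gate itself is trivial to write. A hypothetical sketch (metric names and the 4.0 cutoff are illustrative, not prescribed by the evaluation library) over scores you would read out of an EvaluationResult:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical deployment gate: quality metrics (0-5 scale) must clear a
// threshold and no safety flag may trip. Metric names and the 4.0 cutoff
// are illustrative, not prescribed by the evaluation library.
static bool PassesGate(
    IReadOnlyDictionary<string, double> qualityScores,
    IReadOnlyDictionary<string, bool> safetyFlags,
    double minQuality = 4.0) =>
    qualityScores.Values.All(score => score >= minQuality)
    && safetyFlags.Values.All(flagged => !flagged);

var quality = new Dictionary<string, double>
    { ["Groundedness"] = 4.5, ["Relevance"] = 5.0, ["Coherence"] = 4.0 };
var safety = new Dictionary<string, bool>
    { ["Violence"] = false, ["HateUnfairness"] = false };
Console.WriteLine($"Ship it: {PassesGate(quality, safety)}");
```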
Key takeaways
- AIProjectClient.CreateAIAgentAsync() – same AIAgent API, server-managed lifecycle with name + version semantics
- UseOpenTelemetry() + Aspire – client spans correlated with Foundry server traces via shared Trace ID
- ProjectConversation – server-side persistent conversations, store only the ID
- HostedCodeInterpreterTool / HostedWebSearchTool / HostedFileSearchTool – hosted tools that run in Foundry, zero infrastructure on your side
- File upload + vector store + HostedFileSearchTool – RAG without managing an embedding pipeline or vector database
- Declarative YAML workflows – server-side multi-agent orchestration, same RunStreamingAsync API
- CompositeEvaluator – quality + safety scoring in a single pass before production
Presentation
References