.NET + AI Agents, Advanced Part 1: The AI Foundation of the .NET Platform

2025-08-16 19:00:03

I. Overview of M.E.AI

1. Introduction

Microsoft.Extensions.AI (MEAI) is a set of libraries that acts as the foundational AI abstraction layer of the .NET ecosystem. It gives .NET developers a unified way to integrate and interact with AI services: a small set of common abstractions represents the generative AI components in an application, so the same code can interoperate with different AI services. MEAI's core functionality revolves around two key interfaces: IChatClient and IEmbeddingGenerator<TInput, TEmbedding>.

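2. Core Features and Benefits of MEAI

IChatClient: the client interface for chat-style AI services, with support for multimodal messages and streaming responses.
IEmbeddingGenerator<TInput, TEmbedding>: a generic interface for generating vector embeddings from various input types.
Dependency injection (DI) and middleware: MEAI builds on .NET's established DI and middleware patterns, so features such as automatic function (tool) invocation, telemetry, and caching can be layered onto a client with little effort.
Provider independence: MEAI decouples application code from any specific AI service, so developers can switch between AI providers without changing application code. This improves portability and also simplifies testing and mocking.
Multimodal support: IChatClient accepts text, image, and audio content in messages.
Streaming responses: IChatClient can stream output, so an application can process a response incrementally as the AI service produces it.
Extensibility: developers can implement their own chat clients and embedding generators for other AI services.
Semantic Kernel integration: MEAI interoperates with Semantic Kernel (SK), so developers can use SK's higher-level features on top of MEAI's unified interfaces and abstractions.
Microsoft Agent Framework integration: MEAI also integrates with Microsoft Agent Framework for building agent applications.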

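3. Getting Started

To use Microsoft.Extensions.AI in a .NET application, install the Microsoft.Extensions.AI NuGet package in the project, then install the client implementation package for the AI provider you use.

For services compatible with the OpenAI API, create an OpenAIClient and adapt it to the MEAI interfaces with the Microsoft.Extensions.AI.OpenAI package.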

OpenAIClientOptions clientOptions = new OpenAIClientOptions();
clientOptions.Endpoint = new Uri(deepseekUri);

OpenAIClient aiClient = new(new ApiKeyCredential(deepseekApiKey), clientOptions);

For Ollama, use OllamaApiClient from the OllamaSharp package, which implements IChatClient.

var aiClient = new OllamaApiClient(arguments.Uri, arguments.Model);

For Azure OpenAI, use AzureOpenAIClient from the Azure.AI.OpenAI package.

var aiClient = new AzureOpenAIClient( endpoint: endpoint, credential: new ApiKeyCredential(apiKey));

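4. Using IChatClient

IChatClient defines three key methods:

GetResponseAsync(IEnumerable<ChatMessage> messages, ChatOptions? options = null, CancellationToken ct = default) → Task<ChatResponse>
GetStreamingResponseAsync(IEnumerable<ChatMessage> messages, ChatOptions? options = null, CancellationToken ct = default) → IAsyncEnumerable<ChatResponseUpdate>
GetService(Type serviceType, object? serviceKey = null) → object? (exposes metadata and inner services, such as ChatClientMetadata or OpenTelemetry components)

Core request/response objects:

ChatMessage: a single message with a Role and Contents (multimodal items such as TextContent, DataContent for images and other binary data, and FunctionCallContent); Text is the concatenation of all TextContent items.
ChatOptions: per-request behavior settings such as Temperature, TopP, TopK, StopSequences, MaxOutputTokens, ModelId, ToolMode, Tools, ResponseFormat (Text/Json/JsonSchema), ConversationId, and AllowMultipleToolCalls; it can be copied with Clone().
ChatResponse: the returned messages plus metadata (aggregated Text, FinishReason, Usage, ModelId, ConversationId, ContinuationToken).
ChatResponseUpdate: an incremental streaming update; each update carries Contents (including UsageContent), MessageId/ResponseId, and so on, and is the element type of the IAsyncEnumerable<ChatResponseUpdate> stream.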

// 1. Build the AI service client
OpenAIClientOptions clientOptions = new OpenAIClientOptions();
// Keys holds configuration values; this example targets Qwen's qwen-max model
clientOptions.Endpoint = new Uri(Keys.QwenEndpoint);
// Create the OpenAI client
OpenAIClient aiClient = new(new ApiKeyCredential(Keys.QwenApiKey), clientOptions);

// 2. Get an IChatClient instance
var provider = aiClient.GetChatClient("qwen-max");
// Adapt it to the standard IChatClient
var chatClient = provider.AsIChatClient();

// 3. Chat through IChatClient
// Non-streaming response
var response = await chatClient.GetResponseAsync("Hello, who are you?");
response.Display(); // Display() is assumed to be a notebook display helper; in a console app, print response.Text instead

// Streaming response
var responseStreaming = chatClient.GetStreamingResponseAsync("Hello, who are you?");
await foreach (var chunk in responseStreaming)
{
    Console.Write(chunk.Text);
    await Task.Delay(100); // slow the output down for demonstration
}

// 4. Read service metadata through IChatClient
var chatClientMetadata = chatClient.GetRequiredService<ChatClientMetadata>();
chatClientMetadata.Display();

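5. Using IEmbeddingGenerator<TInput, TEmbedding>

Core concepts and methods

Interface: IEmbeddingGenerator<TInput, TEmbedding>; the common form is IEmbeddingGenerator<string, Embedding<float>>, which takes strings and produces vector embeddings.
Main method: GenerateAsync(IEnumerable<TInput> values, EmbeddingGenerationOptions? options = null, CancellationToken ct = default), which returns GeneratedEmbeddings<TEmbedding>.
Common extension methods:
- GenerateAsync(value): generates an embedding for a single input.
- GenerateVectorAsync(value): returns the vector data directly as ReadOnlyMemory<T>.
- GenerateAndZipAsync(values): returns each input paired with its embedding.
Options: EmbeddingGenerationOptions, with settings such as ModelId, Dimensions, AdditionalProperties, and RawRepresentationFactory.
Metadata: generator.GetRequiredService<EmbeddingGeneratorMetadata>() returns information about the underlying implementation and model.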

// 1. Get an IEmbeddingGenerator instance
OpenAIClientOptions clientOptions = new OpenAIClientOptions();
clientOptions.Endpoint = new Uri(Keys.QwenEndpoint);
OpenAIClient client = new(new ApiKeyCredential(Keys.QwenApiKey), clientOptions);
var embeddingGenerator = client.GetEmbeddingClient("text-embedding-v4").AsIEmbeddingGenerator();

// 2. Prepare the input texts to embed
var documents = new[]
{
    "Microsoft.Extensions.AI 为 .NET 提供统一 AI 抽象。",
    "向量嵌入可以用来执行语义搜索和相似度匹配。"
};

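// 3. Call the generation methods to obtain vectors
// IEmbeddingGenerator supports both batch and single-item generation. Choose GenerateAsync (returns an aggregate result with metadata) or GenerateVectorAsync (returns the raw vector as ReadOnlyMemory<T>) depending on your needs.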
GeneratedEmbeddings<Embedding<float>> generatedEmbeddings = await embeddingGenerator.GenerateAsync(documents);
var singleVector = await embeddingGenerator.GenerateVectorAsync("嵌入生成器适合语义检索");
Console.WriteLine($"向量维度: {singleVector.Length}");

// 4. Read the generator metadata
// GetRequiredService<EmbeddingGeneratorMetadata> exposes details such as the provider and the default model, which is useful for logging, monitoring, and diagnostics.
var embeddingMetadata = embeddingGenerator.GetRequiredService<EmbeddingGeneratorMetadata>();
embeddingMetadata.Display();

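II. Function Calling

With Microsoft.Extensions.AI (MEAI), a .NET application can give a large language model automated function calling, so the model can trigger business logic or external services on demand.

In assistant and enterprise scenarios, a model has to do more than answer questions: it needs to call backend tools to check inventory, fetch the weather, submit a ticket, and so on. MEAI's unified tool abstraction lets the model reach these controlled capabilities safely while staying independent of the service provider.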

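1. Core Components

AIFunction / AIFunctionFactory: wrap a callable function as a tool (AITool); AIFunctionFactory.Create builds one from a static method, an instance method, or a delegate.
ChatOptions.Tools / ChatOptions.ToolMode: list the tools available to a request and control when the model may call them (disabled, automatic, or required); the usual value is ChatToolMode.Auto.
FunctionInvocationContext / AIFunctionArguments: when the model requests a function call, MEAI parses the arguments and passes them in with the invocation context; the return value is fed back into the conversation.
FunctionCallContent / FunctionResultContent: carry the model's call request and the tool's result; MEAI threads both back into the chat history automatically.

2. Workflow

Build the model client: use OpenAI or any other service that provides an IChatClient implementation.
Register the tools: wrap business functions as tools the model can call, optionally with a JSON Schema and a description.
Configure ChatOptions: set ChatToolMode.Auto, supply the reusable tool list, and optionally allow multiple calls.
Run the conversation: send the user message; MEAI handles the model's function requests automatically and returns a final answer that incorporates the tool results.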
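The last step, where the middleware keeps calling the model until it stops asking for tools, can be pictured with the sketch below. It is a simplified illustration in plain C#, not MEAI's implementation: the fake model, the tuple-shaped reply (an empty ToolName means "final answer"), and the names RunToolLoop and maxIterations are all assumptions made for this example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical tool registry: tool name -> implementation taking one string argument.
var tools = new Dictionary<string, Func<string, string>>
{
    ["get_weather"] = city => $"{city}: sunny, 25°C"
};

// Hypothetical stand-in for a model: requests a tool once, then answers using the tool result.
(string ToolName, string ToolArg, string Text) FakeModel(List<string> history)
{
    bool hasToolResult = history.Exists(m => m.StartsWith("tool:", StringComparison.Ordinal));
    if (hasToolResult) return ("", "", "Answer based on " + history[^1]);
    return ("get_weather", "Beijing", "");
}

// Calls the model repeatedly, executing requested tools and feeding results back,
// until the model returns plain text or the iteration budget is used up.
string RunToolLoop(List<string> history, int maxIterations)
{
    for (int i = 0; i < maxIterations; i++)
    {
        var reply = FakeModel(history);
        if (reply.ToolName.Length == 0) return reply.Text;

        // Unknown tools are reported back to the model instead of crashing the loop.
        string result = tools.TryGetValue(reply.ToolName, out var tool)
            ? tool(reply.ToolArg)
            : $"error: unknown tool {reply.ToolName}";
        history.Add("tool:" + result);
    }
    return "stopped: iteration limit reached";
}

var answer = RunToolLoop(new List<string> { "user: Is it sunny in Beijing?" }, maxIterations: 5);
Console.WriteLine(answer);
```

The iteration cap plays the same protective role as a setting like MaximumIterationsPerRequest: a model that keeps requesting tools cannot loop forever.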

3. A Simple Example

Create the tool functions

// Get the available tool set
public IList<AITool> GetTools()
{
    var travelTools = new TravelToolset();

    // When a tool set contains several functions, annotate them with descriptions and create them all at once via reflection
    IList<AITool> batchRegisteredTools =
        typeof(TravelToolset)
        .GetMethods(BindingFlags.Instance | BindingFlags.Public | BindingFlags.DeclaredOnly)
        .Select(method => AIFunctionFactory.Create(
            method,
            travelTools,
            name: method.Name.ToLowerInvariant(),
            description: method.GetCustomAttribute<DescriptionAttribute>()?.Description))
        .Cast<AITool>()
        .ToList();

    foreach (var tool in batchRegisteredTools)
    {
        Console.WriteLine($"Registered tool: {tool.Name} - {tool.Description}");
    }
    return batchRegisteredTools;

    //// The list can be merged with other tools and assigned to ChatOptions.Tools
    //ChatOptions batchOptions = new()
    //{
    //    ToolMode = ChatToolMode.Auto,
    //    Tools = batchRegisteredTools
    //};
}

// Wrap business logic as tools, each with a name, a description, and input parameters.
public record WeatherReport(string City, int TemperatureCelsius, bool WillRain);

public class TravelToolset
{
    [Description("查询指定城市的实时天气")]
    public WeatherReport QueryWeather(string city)
    {
        int temperature = Random.Shared.Next(-5, 36);
        bool willRain = Random.Shared.NextDouble() > 0.6;
        return new WeatherReport(city, temperature, willRain);
    }

    [Description("根据天气提供穿搭建议")]
    public string SuggestOutfit(string city)
    {
        var weather = QueryWeather(city);
        return weather switch
        {
            { WillRain: true } => $"{city} 可能会下雨，建议携带雨具并穿防水外套。",
            { TemperatureCelsius: < 5 } => $"{city} 温度 {weather.TemperatureCelsius}℃，请穿冬装并注意保暖。",
            { TemperatureCelsius: > 28 } => $"{city} 今天很热（{weather.TemperatureCelsius}℃），可以选择短袖和透气面料。",
            _ => $"{city} 气温 {weather.TemperatureCelsius}℃，穿上舒适的日常装束即可。"
        };
    }
}

Enabling function calling: create a ChatClientBuilder and call UseFunctionInvocation to enable function invocation.

// 1. Create the tool set
var tools = GetTools();

// 2. Enable function invocation
var client = chatClient.AsBuilder()
    .UseFunctionInvocation(configure: options =>
    {
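        options.AdditionalTools = tools; // register additional tools here, e.g. a time tool
        options.AllowConcurrentInvocation = true; // invoke multiple requested function calls concurrently (default: false)
        options.IncludeDetailedErrors = true; // include detailed error information (default: false)
        options.MaximumConsecutiveErrorsPerRequest = 3; // maximum consecutive errors per request, guards against endless loops (default: 3)
        options.MaximumIterationsPerRequest = 5; // maximum iterations per request, guards against endless loops (default: 40)
        options.TerminateOnUnknownCalls = false; // whether to end the conversation when the model calls an unknown function
        options.FunctionInvoker = (context, cancellationToken) =>
        {
            var functionCall = context.Function;
            // Note: AdditionalProperties is the function's own metadata, not the call arguments; log context.Arguments to see the values the model passed
            Console.WriteLine($"Invoking function: {functionCall.Name} with arguments: {functionCall.AdditionalProperties}");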
            return context.Function.InvokeAsync(context.Arguments, cancellationToken);
        };
    })
    .Build();

// 3. Configure ChatOptions and let function calls run automatically
var messages = new List<ChatMessage>
{
    new ChatMessage(ChatRole.System, "你是出行助手，善于调用工具给出穿搭建议。"),
    new ChatMessage(ChatRole.User, "帮我查看今天北京的天气，并告诉我需要带雨伞吗？")
};

ChatOptions options = new()
{
    ToolMode = ChatToolMode.Auto, // let the model decide whether to call a tool
    AllowMultipleToolCalls = true, // allow the model to request several tool calls in one response
    Tools = tools
};

var weatherResponse = await client.GetResponseAsync(messages, options);
weatherResponse.Display();

Output:

Registered tool: queryweather - 查询指定城市的实时天气
Registered tool: suggestoutfit - 根据天气提供穿搭建议
Invoking function: queryweather with arguments: Microsoft.Shared.Collections.EmptyReadOnlyDictionary`2[System.String,System.Object]

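ChatOptions offers many settings; the properties that matter for function calling are:

Tools: the collection of tools that can be called.
ToolMode: the tool-calling mode.
AllowMultipleToolCalls: whether several tools may be called in one response.

ToolMode in detail: ChatOptions.ToolMode supports four modes:

Auto: the model decides whether to call a tool; this is the behavior when no mode is specified.
RequireAny: the model must call at least one of the tools in Tools.
None: tool calling is disabled.
RequireSpecific(string functionName): the model must call the tool with the given name.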

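III. Configuring ChatOptions

ChatOptions is the unified configuration container passed to IChatClient. It lets a single request adjust the generation strategy, tool-calling behavior, and service-specific features. The core properties are grouped by capability below:

1. Conversation Context

ConversationId: binds the request to a conversation identifier, so that a service that keeps conversation state can restore it.
Instructions: adds one-off system instructions that narrow the scenario or place extra requirements on the model.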

var contextMessages = new List<ChatMessage>
{
    new(ChatRole.System, "你是贴心的行程规划助手。"),
    new(ChatRole.User, "帮我安排一个五一北京两日游的行程计划。")
};

ChatOptions contextOptions = new()
{
    ConversationId = "planner-2024-05-01",
    Instructions = "回答请保持中文，并按时间顺序给出活动安排。"
};

var contextResponse = await chatClient.GetResponseAsync(contextMessages, contextOptions);
contextResponse.Display();

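2. Generation Strategy

ModelId / ResponseFormat: override the default model or force an output format (plain text, generic JSON, or a custom JSON Schema).
Temperature, TopP, TopK: adjust the sampling strategy to trade diversity against determinism.
MaxOutputTokens: caps the number of tokens generated.
FrequencyPenalty, PresencePenalty, Seed: discourage repetition, encourage new content, or make results reproducible.
StopSequences: truncates the output as soon as it hits one of the given sequences, which is useful for stopping the model before it continues with unwanted content.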

var generationMessages = new List<ChatMessage>
{
    new(ChatRole.User, "请给出三条不同风格的励志语录，以便我放在海报上。")
};

ChatOptions generationOptions = new()
{
    ModelId = "qwen-max", // override the default model (adjust to the models available to you)
    Temperature = 0.7f,
    TopP = 0.9f,
    MaxOutputTokens = 512,
    StopSequences = new[] { "[DONE]" }
};

var generationResponse = await chatClient.GetResponseAsync(generationMessages, generationOptions);
generationResponse.Display();

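3. Tool Calling

ToolMode: sets the tool-calling strategy (disabled, automatic, or required for a specific tool or any tool).
Tools: the tools available to the current request (usually the ones created for function calling).
AllowMultipleToolCalls: lets the model request several tool calls in a single response.

Combined with the UseFunctionInvocation middleware, the tool list can be controlled per request and the model can chain several function calls.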

string GetCurrentWeather(string city)
{
    var temperature = Random.Shared.Next(-5, 36);
    var willRain = Random.Shared.NextDouble() > 0.6;
    return $"{city} 当前 {temperature}℃，{(willRain ? "可能下雨，请带伞" : "天气晴朗")}。";
}

AITool weatherTool = AIFunctionFactory.Create(
    (string city) => GetCurrentWeather(city),
    name: "get_current_weather",
    description: "查询指定城市的实时天气");

var toolMessages = new List<ChatMessage>
{
    new(ChatRole.System, "你是穿搭顾问，善于结合天气给出建议。"),
    new(ChatRole.User, "今天北京需要带雨伞吗？")
};

ChatOptions toolOptions = new()
{
    ToolMode = ChatToolMode.Auto,
    AllowMultipleToolCalls = true,
    Tools = new[] { weatherTool }
};

var toolEnabledClient = chatClient.AsBuilder()
    .UseFunctionInvocation()
    .Build();

var toolResponse = await toolEnabledClient.GetResponseAsync(toolMessages, toolOptions);
toolResponse.Display();

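4. Background Execution and Resumption

AllowBackgroundResponses: enables long-running background tasks and resuming interrupted streams (the feature is experimental).
ContinuationToken: the token used in background mode to poll for a result or resume a streaming response.

5. Extension Points

AdditionalProperties: passes custom key-value pairs through to the underlying provider.
RawRepresentationFactory: when you know which IChatClient implementation sits underneath, use it to construct and return that provider's own options object.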

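IV. Conversation Caching

The caching support in Microsoft.Extensions.AI (MEAI) cuts down on repeated calls to the language model, which lowers cost and improves response time.

Caching is implemented with CachingChatClient and DistributedCachingChatClient.

1. Core Components

CachingChatClient: the abstract base class that defines the core logic of a caching chat client, including cache key generation, read/write operations, and streaming response handling.
DistributedCachingChatClient: the concrete implementation built on IDistributedCache, which can store entries in Redis, SQL Server, or another distributed cache.
CoalesceStreamingUpdates: controls whether streaming updates are merged before being stored, which makes cache storage more compact (default: true).
EnableCaching method: decides whether a given request is cached; by default, requests with a ConversationId are excluded.
Cache key generation: a unique identifier computed by serializing the message content, the ChatOptions, and any additional values to JSON and hashing the result.

2. A Simple Implementation

Enabling the caching middleware: use ChatClientBuilder and the UseDistributedCache extension method to add the caching layer to the client pipeline.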

// 1. Create an in-memory distributed cache instance
IDistributedCache distributedCache = new MemoryDistributedCache(Options.Create(new MemoryDistributedCacheOptions 
{
    // Global cache options can be configured here
}));

Console.WriteLine("缓存存储已配置");

// 2. Enable the caching middleware
// Use ChatClientBuilder to build a client with caching
var cachedChatClient = new ChatClientBuilder(chatClient)
    .UseDistributedCache(distributedCache)
    .Build();

// Adjust caching behavior, e.g. coalescing streaming updates or adding custom cache-key values
if (cachedChatClient is DistributedCachingChatClient distributedCachingClient)
{
    distributedCachingClient.CoalesceStreamingUpdates = true; // coalesce streaming updates (default: true)

    // Optional: add extra cache-key values to separate cache partitions
    // distributedCachingClient.CacheKeyAdditionalValues = new[] { "v1", "production" };
}

Console.WriteLine("缓存客户端已构建");
cachedChatClient.Display();

Testing the cache: non-streaming responses

var testMessage = "用一句话介绍什么是 Microsoft.Extensions.AI？";
// First request: calls the model
Console.WriteLine("第一次请求（调用模型）...");
var sw1 = Stopwatch.StartNew();
var response1 = await cachedChatClient.GetResponseAsync(testMessage);
sw1.Stop();

Console.WriteLine($"响应内容: {response1.Text}");
Console.WriteLine($" 耗时: {sw1.ElapsedMilliseconds}ms");
Console.WriteLine($"Token 使用: {response1.Usage?.TotalTokenCount ?? 0}\n");

// Second identical request: should be served from the cache
Console.WriteLine("第二次请求（从缓存返回）...");
var sw2 = Stopwatch.StartNew();
var response2 = await cachedChatClient.GetResponseAsync(testMessage);
sw2.Stop();

Console.WriteLine($"响应内容: {response2.Text}");
Console.WriteLine($" 耗时: {sw2.ElapsedMilliseconds}ms");
Console.WriteLine($"Token 使用: {response2.Usage?.TotalTokenCount ?? 0}");

Console.WriteLine($"\n缓存加速比: {(double)sw1.ElapsedMilliseconds / Math.Max(1, sw2.ElapsedMilliseconds):F2}x"); // Math.Max avoids dividing by 0 ms on a fast cache hit
Console.WriteLine($"两次响应内容一致: {response1.Text == response2.Text}");

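Testing the cache: streaming responses. Caching also works in streaming mode. The behavior depends on CoalesceStreamingUpdates:
- true (default): the streaming updates are merged into a complete response before caching, then split back into a stream when read from the cache
- false: the original sequence of streaming updates is cached as-is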

var streamTestMessage = "列出 MEAI 的三个核心接口。";
// First streaming request
Console.WriteLine("第一次流式请求（调用模型）...");
var sw3 = Stopwatch.StartNew();
await foreach (var update in cachedChatClient.GetStreamingResponseAsync(streamTestMessage))
{
    Console.Write(update.Text);
}
sw3.Stop();
Console.WriteLine($"\n 耗时: {sw3.ElapsedMilliseconds}ms\n");

// Second identical streaming request (served from the cache)
Console.WriteLine("第二次流式请求（从缓存返回）...");
var sw4 = Stopwatch.StartNew();
await foreach (var update in cachedChatClient.GetStreamingResponseAsync(streamTestMessage))
{
    Console.Write(update.Text);
}
sw4.Stop();
Console.WriteLine($"\n 耗时: {sw4.ElapsedMilliseconds}ms");

Console.WriteLine($"\n流式缓存加速比: {(double)sw3.ElapsedMilliseconds / Math.Max(1, sw4.ElapsedMilliseconds):F2}x"); // Math.Max avoids dividing by 0 ms on a fast cache hit

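3. Advanced Configuration and Best Practices

How cache keys are generated: DistributedCachingChatClient builds the cache key from the following inputs:
- The message content (the ChatMessage collection)
- The chat options (ChatOptions)
- The cache version number
- Custom additional values (CacheKeyAdditionalValues)
These values are serialized to JSON and hashed to produce a unique identifier.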
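The "serialize to JSON, then hash" idea can be sketched with nothing but the base class library. This is an illustration of the technique, not the library's actual key format: ComputeCacheKey, its parameters, the choice of SHA-256, and the cacheVersion constant are all assumptions made for this example.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;
using System.Text.Json;

// Builds a deterministic key: serialize every input to JSON, then hash the bytes.
static string ComputeCacheKey(string[] messages, double? temperature, string[] additionalValues)
{
    // Hypothetical version constant; changing it invalidates every earlier key.
    const int cacheVersion = 1;
    string json = JsonSerializer.Serialize(new object?[] { cacheVersion, messages, temperature, additionalValues });
    byte[] hash = SHA256.HashData(Encoding.UTF8.GetBytes(json));
    return Convert.ToHexString(hash);
}

string a = ComputeCacheKey(new[] { "user: hello" }, 0.7, new[] { "prod" });
string b = ComputeCacheKey(new[] { "user: hello" }, 0.7, new[] { "prod" });
string c = ComputeCacheKey(new[] { "user: hello" }, 0.7, new[] { "dev" });

Console.WriteLine(a == b); // identical inputs produce the same key
Console.WriteLine(a == c); // a different additional value produces a different key
```

Because every input feeds the hash, changing any option (or any additional value) yields a different key, which is what makes partitioning by additional values possible.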
Custom caching policy: derive from the abstract CachingChatClient class to implement your own caching logic:

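public class CustomCachingChatClient : CachingChatClient
{
  // The base class needs the inner client it wraps
  public CustomCachingChatClient(IChatClient innerClient) : base(innerClient) { }

  protected override bool EnableCaching(IEnumerable<ChatMessage> messages, ChatOptions? options)
  {
    // Custom condition for enabling the cache
    // Example: only cache requests that do not contain a sensitive keyword
    var messageText = string.Join(" ", messages.Select(m => m.Text));
    return !messageText.Contains("机密") && base.EnableCaching(messages, options);
  }
  // Implement the abstract members (cache key generation, cache reads and writes)...
}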

Using a Redis distributed cache

using Microsoft.Extensions.Caching.StackExchangeRedis;

var redisCache = new RedisCache(Options.Create(new RedisCacheOptions
{
    Configuration = "localhost:6379",
    InstanceName = "MEAICache:"
}));

var cachedClient = chatClient
    .AsBuilder()
    .UseDistributedCache(redisCache)
    .Build();

Cache key partitioning: CacheKeyAdditionalValues creates separate cache partitions for different scenarios:

var productionClient = chatClient
    .AsBuilder()
    .UseDistributedCache(distributedCache, configure: c =>
    {
        c.CacheKeyAdditionalValues = new[] { "prod", "v2", "zh-CN" };
    })
    .Build();

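This is especially useful for multi-language support (a separate cache per language), version management (a new version never hits the old version's cache), and environment isolation (separate caches for development, test, and production).

When not to use the cache: by default, only the first case below disables caching automatically; the others are situations in which you should disable it yourself:
- A ConversationId is set: the request carries conversation state, so the response may differ
- The request contains sensitive data: avoid caching requests that include personal information
- Freshness matters: stock quotes, breaking news, and similar data
- Randomized responses: scenarios that need a different result each time
These rules can be customized by overriding the EnableCaching method.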

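4. Caveats and Limitations

Important: DistributedCachingChatClient stores cache entries using JSON serialization, which has the following limitations:
- ChatMessage.RawRepresentation is not serialized
- object values in AdditionalProperties are deserialized as JsonElement
- Custom types may not round-trip completely
If your application depends on these properties, use the cache with care or implement custom serialization logic.
Cache versioning: when the cache serialization format changes in a breaking way, MEAI bumps its internal cache version number (v2 at the time of writing), which automatically invalidates older cache entries.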

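V. Context Window Compression with Chat Reducer

Multi-turn conversations raise the following challenges:

Context window limits: most LLMs cap the context length (for example, GPT-4's 8K/32K tokens), and exceeding the cap makes the request fail.
Cost control: more input tokens mean higher API cost, especially in high-frequency conversations.
Performance: an overly long context noticeably increases inference time and hurts the user experience.
Redundant information: not every historical message matters to the current exchange; early small talk may no longer be relevant.

Chat Reducer addresses these problems by compressing the history while preserving conversation quality.

1. Chat Reducer Core Concepts

The IChatReducer interface: IChatReducer is the abstraction MEAI defines for conversation compression.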

public interface IChatReducer
{
    Task<IEnumerable<ChatMessage>> ReduceAsync(IEnumerable<ChatMessage> messages, CancellationToken cancellationToken);
}

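The interface takes a set of ChatMessage objects and returns the compressed message list. MEAI ships two ready-made implementations:

MessageCountingChatReducer (message-count reducer): controls conversation length by limiting the number of messages, keeping the latest N non-system messages.

Key characteristics:
- Always keeps the first system message (if there is one)
- Keeps the most recent N non-system messages (user and assistant messages)
- Automatically excludes messages that contain function calls or results
- Suited to fixed-size window scenarios
Use cases:
- Customer-service conversations (only the last few exchanges matter)
- Quick Q&A systems
- Scenarios with a strict token budget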
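The counting rule itself is small enough to sketch without the library. The snippet below is an illustration only: it models messages as (Role, Text) tuples, the function name ReduceByCount is hypothetical, and it deliberately leaves out the exclusion of function-call messages described above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Keeps the first system message (if any) plus the most recent targetCount non-system messages.
static List<(string Role, string Text)> ReduceByCount(
    IEnumerable<(string Role, string Text)> messages, int targetCount)
{
    var all = messages.ToList();
    var result = new List<(string Role, string Text)>();

    int systemIndex = all.FindIndex(m => m.Role == "system");
    if (systemIndex >= 0) result.Add(all[systemIndex]);

    // TakeLast preserves the original order of the messages it keeps.
    result.AddRange(all.Where(m => m.Role != "system").TakeLast(targetCount));
    return result;
}

var history = new List<(string Role, string Text)>
{
    ("system", "You are a support assistant."),
    ("user", "Q1"), ("assistant", "A1"),
    ("user", "Q2"), ("assistant", "A2"),
    ("user", "Q3"), ("assistant", "A3"),
};

var reduced = ReduceByCount(history, targetCount: 3);
Console.WriteLine(string.Join(" | ", reduced.Select(m => $"{m.Role}:{m.Text}")));
// prints: system:You are a support assistant. | assistant:A2 | user:Q3 | assistant:A3
```

Note that a count of 3 keeps three messages, not three question/answer rounds, so the window can start in the middle of a round.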

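SummarizingChatReducer (summarizing reducer): uses the AI to generate a summary, compressing the conversation history into a short description of the context.

Key characteristics:
- When the message count exceeds the threshold, it calls the AI to generate a summary
- The summary is stored in the message's AdditionalProperties
- Supports progressive summarization (a new summary incorporates the previous one)
- Keeps the most recent messages plus the historical summary so the context stays coherent
Use cases:
- Long consultation sessions (such as medical or legal consultations)
- Multi-turn conversations that need to preserve the full meaning of the context
- Complex collaborative tasks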

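2. Using MessageCountingChatReducer

Scenario: a customer-service bot only needs the 3 most recent messages; older history is dropped automatically.

Compression rules:

Keep 1 system message (the first System message)
Keep the 3 most recent non-system messages (User and Assistant messages)
Drop earlier history (after 6 rounds there are 12 non-system messages, so the 9 oldest are removed)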

// 1. Create the MessageCountingChatReducer
#pragma warning disable MEAI001
// Create a counting reducer that keeps at most 3 non-system messages
var countingReducer = new MessageCountingChatReducer(targetCount: 3);
Console.WriteLine("MessageCountingChatReducer 已创建（保留 3 条消息）");

// 2. Attach the reducer to the chat client
var reducingClient = chatClient.AsBuilder()
    .UseChatReducer(reducer: countingReducer)
    .Build();
Console.WriteLine("带 Reducer 的 Chat Client 已构建");

// 3. Simulate a multi-turn conversation and observe the compression
var messages = new List<ChatMessage>
{
    new ChatMessage(ChatRole.System, "你是一个专业的客服助手。")
};
// Simulate 6 rounds of conversation
var questions = new[]
{
    "我的订单号是多少？",
    "订单什么时候发货？",
    "可以修改收货地址吗？",
    "运费是多少？",
    "支持货到付款吗？",
    "如何申请退款？"
};
foreach (var question in questions)
{
    messages.Add(new ChatMessage(ChatRole.User, question));

    // Call the client that has the reducer attached
    var response = await reducingClient.GetResponseAsync(messages);
    messages.Add(new ChatMessage(ChatRole.Assistant, response.Text));

    Console.WriteLine($"用户: {question}");
    Console.WriteLine($"助手: {response.Text}");
    Console.WriteLine($"当前消息总数: {messages.Count} 条\n");
}
Console.WriteLine("════════════════════════════════════════");
Console.WriteLine("注意：虽然本地 messages 列表包含所有历史消息，");
Console.WriteLine("但 Reducer 会在每次调用 API 前自动压缩，");
Console.WriteLine("实际发送给模型的只有最近 3 条 + 系统消息。");
Console.WriteLine("════════════════════════════════════════");

// 4. Verify the compression
// Call the reducer manually to inspect the compressed result
var reducedMessages = await countingReducer.ReduceAsync(messages, CancellationToken.None);
Console.WriteLine($"压缩前: {messages.Count} 条消息");
Console.WriteLine($"压缩后: {reducedMessages.Count()} 条消息\n");
Console.WriteLine("压缩后的消息列表：");
reducedMessages.Display();

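3. Using SummarizingChatReducer

Scenario: a medical consultation has to preserve the full meaning of the context, so the AI generates summaries to make sure historical information is not lost.

How summarizing compression works:

Trigger: summarization starts when the message count exceeds targetCount + threshold
Summary generation: the AI condenses the oldest N messages (current message count - targetCount) into a short summary
Storage: the summary is stored in the message's AdditionalProperties["_summary_"]
Progressive compression: a new summary incorporates the previous one, so the context stays complete
Keep the latest: the most recent targetCount original messages are always kept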
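The trigger arithmetic described above can be written out as a tiny helper. This is only a sketch of the stated rule: CountToSummarize is a hypothetical name, and whether system messages count toward messageCount is not modeled here.

```csharp
using System;

// Returns how many of the oldest messages would be summarized,
// or 0 if the threshold has not been exceeded yet.
static int CountToSummarize(int messageCount, int targetCount, int threshold)
{
    bool triggered = messageCount > targetCount + threshold;
    return triggered ? messageCount - targetCount : 0;
}

// targetCount = 2, threshold = 1: nothing is summarized up to 3 messages.
Console.WriteLine(CountToSummarize(3, targetCount: 2, threshold: 1)); // 0
// With 6 messages, the oldest 4 are summarized and the newest 2 are kept verbatim.
Console.WriteLine(CountToSummarize(6, targetCount: 2, threshold: 1)); // 4
```

A larger threshold therefore delays the first summary and makes each summarization call cover more messages.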

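Before and after compression:

Original messages: 7 (System + 3 rounds of conversation)
After compression: 3-4 (System + summary marker + the 2 most recent messages)
Semantic completeness: preserved (through the summary)
Token savings: significant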

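// 1. Create the SummarizingChatReducer
#pragma warning disable MEAI001
// Create the summarizing reducer
// targetCount: keep the 2 most recent messages
// threshold: summarization triggers once the count exceeds targetCount + threshold (2 + 1 = 3, i.e. from the 4th message on)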
var summarizingReducer = new SummarizingChatReducer(
    chatClient: chatClient,
    targetCount: 2,
    threshold: 1
);
Console.WriteLine("SummarizingChatReducer 已创建");
Console.WriteLine(" - 目标消息数: 2 条");
Console.WriteLine(" - 触发阈值: 超过 3 条时生成摘要");

// 2. Attach it to the chat client and test
var summarizingClient = chatClient.AsBuilder()
    .UseChatReducer(reducer: summarizingReducer)
    .Build();

// Prepare the test messages
var medicalMessages = new List<ChatMessage>
{
    new ChatMessage(ChatRole.System, "你是一位专业的医疗咨询助手。"),
    new ChatMessage(ChatRole.User, "我最近经常头痛，是什么原因？"),
    new ChatMessage(ChatRole.Assistant, "头痛可能由多种因素引起，包括压力、睡眠不足、脱水、眼疲劳等。建议您注意休息，保持规律作息。"),
    new ChatMessage(ChatRole.User, "我每天睡眠时间大概 5 小时，工作压力比较大。"),
    new ChatMessage(ChatRole.Assistant, "睡眠不足和工作压力确实是导致头痛的常见原因。建议您尽量保证 7-8 小时睡眠，适当放松。"),
    new ChatMessage(ChatRole.User, "除了头痛，我还感觉眼睛很干涩。"),
};
Console.WriteLine($"初始消息数: {medicalMessages.Count} 条");
// Add a 7th message. With targetCount = 2 and threshold = 1, any history longer than 3 messages qualifies for summarization
medicalMessages.Add(new ChatMessage(ChatRole.Assistant, "眼睛干涩可能与长时间使用电子设备、空调环境等有关。建议使用人工泪液，并定时休息眼睛。"));
Console.WriteLine($"当前消息数: {medicalMessages.Count} 条（已超过阈值 3 条）\n");

// Call the reducer to trigger summarization
var summarizedMessages = await summarizingReducer.ReduceAsync(medicalMessages, CancellationToken.None);
Console.WriteLine($"压缩后消息数: {summarizedMessages.Count()} 条\n");
Console.WriteLine("压缩后的消息结构：");
summarizedMessages.Display();

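4. Customizing the Summarization Prompt

SummarizingChatReducer lets you customize the prompt used to generate summaries so that it fits the needs of a particular domain.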

#pragma warning disable MEAI001
// Create a reducer that will use a custom summarization prompt
var customReducer = new SummarizingChatReducer(
    chatClient: chatClient,
    targetCount: 2,
    threshold: 1
);

// Set a summarization prompt tailored to the medical domain
customReducer.SummarizationPrompt = """
请为以下医疗咨询对话生成简洁的临床摘要（不超过 3 句话）：

要求：
- 提取患者主诉症状和时长
- 记录已提供的初步建议
- 保留关键医学信息（用药史、过敏史等，如有）
- 使用专业医学术语
- 不要添加推测性诊断或建议

格式：【患者主诉】症状描述 | 【已知信息】关键背景 | 【初步建议】已给出的建议
""";
Console.WriteLine("已设置自定义医疗摘要提示词");

// Test the effect of the custom prompt
var testMessages = new List<ChatMessage>
{
    new ChatMessage(ChatRole.System, "你是医疗助手。"),
    new ChatMessage(ChatRole.User, "我持续低烧 3 天，体温 37.8℃。"),
    new ChatMessage(ChatRole.Assistant, "建议多休息、多喝水，监测体温变化。如超过 38.5℃ 或持续不退，请就医。"),
    new ChatMessage(ChatRole.User, "我有青霉素过敏史。"),
    new ChatMessage(ChatRole.Assistant, "感谢告知过敏史，这在就医时非常重要。请向医生明确说明青霉素过敏。"),
};
var customSummary = await customReducer.ReduceAsync(testMessages, CancellationToken.None);
Console.WriteLine("压缩后的消息结构：");
customSummary.Display();

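5. Practical Advice and Best Practices

Choosing a reducer type

Scenario	Recommended reducer	Reason
Customer-service bot	MessageCountingChatReducer	Only the last few exchanges matter; older history has little value
Technical support	MessageCountingChatReducer	Issues are usually independent and need no long-term context
Medical consultation	SummarizingChatReducer	The full medical history is needed; the summary keeps the meaning continuous
Legal consultation	SummarizingChatReducer	Case details matter, and key information must not be lost
Tutoring	SummarizingChatReducer	Learning progress has to be tracked over a long period
Quick Q&A	MessageCountingChatReducer	Conversations are short and need no elaborate summary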

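Parameter tuning suggestions

// Conservative: keep more messages, suited to context-sensitive scenarios
new MessageCountingChatReducer(targetCount: 10);
// Aggressive: keep only the most recent exchange, suited to cost-first scenarios
new MessageCountingChatReducer(targetCount: 2);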

// Frequent summarization: threshold = 0, summarize as soon as targetCount is exceeded
new SummarizingChatReducer(chatClient, targetCount: 3, threshold: 0);
// Deferred summarization: a larger threshold means fewer API calls
new SummarizingChatReducer(chatClient, targetCount: 5, threshold: 3);

Combining with other middleware

Chat Reducer composes with other MEAI middleware (such as UseFunctionInvocation, logging, and caching):

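var client = chatClient.AsBuilder()
    .UseChatReducer(reducer: summarizingReducer)  // compress the messages first
    .UseFunctionInvocation()                      // then handle function calls
    .Build();
// Mind the order: put the reducer at the front of the pipeline so compression finishes before the API is called.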
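Why registration order matters can be seen in a small stand-alone sketch of a wrapping pipeline. It assumes that the first middleware registered becomes the outermost layer, which is how the builder calls in this section are described; the Named helper and the string-based handlers are hypothetical and exist only for this illustration.

```csharp
using System;
using System.Collections.Generic;

// Each middleware wraps the next handler; a handler maps a request string to a response string.
var factories = new List<Func<Func<string, string>, Func<string, string>>>();
var log = new List<string>();

// Hypothetical middleware factory that records when it runs, then delegates inward.
Func<Func<string, string>, Func<string, string>> Named(string name) =>
    next => request =>
    {
        log.Add(name);
        return next(request);
    };

factories.Add(Named("reducer"));             // registered first: outermost layer
factories.Add(Named("function-invocation")); // registered second: inner layer

// Build from the inside out so the first registered middleware ends up outermost.
Func<string, string> pipeline = request => { log.Add("model"); return "response to " + request; };
for (int i = factories.Count - 1; i >= 0; i--)
{
    pipeline = factories[i](pipeline);
}

Console.WriteLine(pipeline("hello"));
Console.WriteLine(string.Join(" -> ", log));
// prints: reducer -> function-invocation -> model
```

Under that assumption, a reducer registered first runs once per incoming request, before any inner layer gets to call the model.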

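Handling function-call messages

Both reducers automatically exclude messages that contain FunctionCallContent or FunctionResultContent, so the function-calling context is not broken. In practice:
- System messages: kept
- Ordinary user/assistant messages: subject to compression
- Function-call messages: skipped automatically (not counted toward targetCount)

Performance and cost considerations
- MessageCountingChatReducer:
  - No extra API calls
  - No added latency
  - May lose important context
- SummarizingChatReducer:
  - Each summarization requires one LLM call (extra cost)
  - Adds roughly 1-3 seconds of latency
  - Preserves the full meaning of the conversation
Optimization tip: use a smaller model (such as GPT-3.5) dedicated to summary generation to lower cost.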

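VI. Tool Reduction

Tool Reduction is an optimization strategy provided by Microsoft.Extensions.AI. Based on the user input and the conversation context, it filters out irrelevant tools and keeps only the set relevant to the current request, which improves model performance and lowers cost.

It is implemented as middleware: before the request is sent to the AI model, the tool list is filtered and trimmed according to the configured strategy.

1. How Tool Reduction Works

Tool registration: the developer registers every available tool (say, 100 tools)
User request: the user starts a conversation and asks a specific question
Filtering: the Tool Reduction middleware selects the relevant tools according to its strategy (for example, embedding similarity)
Trimmed request: only the selected subset (say, 5 tools) is sent to the model
Model call: the model chooses from the trimmed tool list and calls a tool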
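The filtering step can be sketched as "rank tools by cosine similarity to the query, keep the top N". The snippet below is a stand-alone illustration of that idea, not the library's strategy class: the 3-dimensional vectors are hand-written stand-ins for real embeddings, and the variable names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Cosine similarity between two vectors of equal length.
static double Cosine(float[] a, float[] b)
{
    double dot = 0, normA = 0, normB = 0;
    for (int i = 0; i < a.Length; i++)
    {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
}

// Hand-written vectors standing in for embeddings of each tool's name and description.
var toolEmbeddings = new Dictionary<string, float[]>
{
    ["GetWeather"]    = new[] { 0.9f, 0.1f, 0.0f },
    ["GetStockPrice"] = new[] { 0.0f, 0.9f, 0.1f },
    ["SendEmail"]     = new[] { 0.1f, 0.0f, 0.9f },
};
float[] queryEmbedding = { 0.8f, 0.2f, 0.1f }; // stands in for a weather question

int toolLimit = 2;
var selected = toolEmbeddings
    .OrderByDescending(kv => Cosine(kv.Value, queryEmbedding))
    .Take(toolLimit)
    .Select(kv => kv.Key)
    .ToList();

Console.WriteLine(string.Join(", ", selected));
// prints: GetWeather, GetStockPrice
```

With real embeddings the tool vectors would typically be computed once and cached, while the query vector is computed per request.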

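3. Use Cases

Scenario	Description	Examples
Enterprise knowledge base	Dozens of query tools for specialized domains	Finance queries, HR queries, IT support
Customer-service assistant	A large number of service tools of different kinds	Order lookup, refund processing, shipment tracking
Developer assistant	Tools for many programming languages and frameworks	Python tools, JavaScript tools, database tools
Multimodal applications	Image, text, and audio processing tools	Tools for the matching modality are selected by task type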

4. Basic Practice

Registering multiple tools: to demonstrate the effect of Tool Reduction, we create a set of tool functions from different domains.

// Weather tool
[Description("获取指定城市的当前天气信息")]
string GetWeather([Description("城市名称")] string city) => $"{city}的天气：晴朗，温度 25°C";

// News tool
[Description("获取最新的科技新闻")]
string GetTechNews() => "最新科技新闻：AI 技术突破...";

// Stock tool
[Description("查询股票价格")]
string GetStockPrice([Description("股票代码")] string symbol) => $"{symbol} 当前价格：$150.00";

// Translation tool
[Description("将文本翻译成英文")]
string TranslateToEnglish([Description("待翻译的文本")] string text) => $"Translation: {text}";

// Time tool
[Description("获取当前时间")]
string GetCurrentTime() => DateTime.Now.ToString("HH:mm:ss");

// Calculator tool
[Description("执行数学计算")]
double Calculate([Description("第一个数字")] double a, [Description("第二个数字")] double b, [Description("运算符 (+, -, *, /)")] string operation)
{
    return operation switch
    {
        "+" => a + b,
        "-" => a - b,
        "*" => a * b,
        "/" => a / b,
        _ => 0
    };
}

// Email tool
[Description("发送电子邮件")]
string SendEmail([Description("收件人")] string to, [Description("邮件主题")] string subject) => $"邮件已发送到 {to}，主题：{subject}";

// Calendar tool
[Description("添加日历事件")]
string AddCalendarEvent([Description("事件标题")] string title, [Description("事件日期")] string date) => $"事件已添加：{title} on {date}";

// Database tool
[Description("查询数据库")]
string QueryDatabase([Description("SQL 查询语句")] string query) => $"查询结果：[模拟数据]";

// File tool
[Description("读取文件内容")]
string ReadFile([Description("文件路径")] string path) => $"文件内容：[{path}]";

Wrap the tools with AIFunctionFactory

Console.WriteLine("10 个工具函数已定义");

// Wrap each function as an AIFunction via AIFunctionFactory
var allTools = new List<AIFunction>
{
    AIFunctionFactory.Create(GetWeather),
    AIFunctionFactory.Create(GetTechNews),
    AIFunctionFactory.Create(GetStockPrice),
    AIFunctionFactory.Create(TranslateToEnglish),
    AIFunctionFactory.Create(GetCurrentTime),
    AIFunctionFactory.Create(Calculate),
    AIFunctionFactory.Create(SendEmail),
    AIFunctionFactory.Create(AddCalendarEvent),
    AIFunctionFactory.Create(QueryDatabase),
    AIFunctionFactory.Create(ReadFile)
};

Console.WriteLine($"已注册 {allTools.Count} 个工具");
allTools.Select(t => new { 工具名称 = t.Name, 描述 = t.Description }).Display();

Use the UseToolReduction() middleware to enable tool reduction.

// UseToolReduction() takes a strategy object that implements the IToolReductionStrategy interface
public interface IToolReductionStrategy
{
    Task<IEnumerable<AITool>> SelectToolsForRequestAsync(
        IEnumerable<ChatMessage> messages,
        ChatOptions? options,
        CancellationToken cancellationToken = default);
}

public class EmbeddingToolReductionStrategy : IToolReductionStrategy
{
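    /// <summary>
    /// Built-in strategy based on embedding similarity (outline only; implementation omitted)
    /// How it works:
    /// 1. Generates an embedding for each tool's name and description (cached)
    /// 2. Generates an embedding for the user's conversation messages
    /// 3. Computes cosine similarity and selects the N most similar tools
    /// 4. Tools marked as required are always kept and do not count toward toolLimit
    /// </summary>
    /// <param name="embeddingGenerator">Embedding generator used to compute the semantic similarity between tools and the query</param>
    /// <param name="toolLimit">Maximum number of tools to keep (not counting required tools)</param>
    public EmbeddingToolReductionStrategy(IEmbeddingGenerator<string, Embedding<float>> embeddingGenerator, int toolLimit) { /* implementation omitted */ }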
}

Create a middleware to observe how many tools are sent with each call.

public static class ChatClientBuilderExtensions
{
    // Adds a middleware that logs the tools sent with each call
    public static ChatClientBuilder UseToolListLogging(this ChatClientBuilder builder)
    {
        return builder.Use(
            getResponseFunc: async (messages, options, innerClient, cancellationToken) =>
            {
                Console.WriteLine($"[Tools] {options?.Tools?.Count ?? 0} 个工具被发送到模型：");
                if (options?.Tools != null)
                {
                    foreach (var tool in options.Tools)
                    {
                        Console.WriteLine($" - 工具名称: {tool.Name}, 描述: {tool.Description}");
                    }
                }
                var response = await innerClient.GetResponseAsync(messages, options, cancellationToken);

                return response;
            },
            getStreamingResponseFunc: null);
    }
}

Using EmbeddingToolReductionStrategy

#pragma warning disable MEAI001
// Create the embedding-based tool reduction strategy
// Arguments: embeddingGenerator (the embedding generator) and toolLimit (keep at most 5 tools)
var strategy = new EmbeddingToolReductionStrategy(embeddingGenerator, toolLimit: 5);

// Build a client with Tool Reduction enabled
var reducingClient = baseChatClient.AsBuilder()
    .UseToolReduction(strategy)  // pass the strategy instance
    .UseFunctionInvocation()
    .UseToolListLogging() // add the monitoring middleware
    .Build();

Console.WriteLine("已创建启用 Tool Reduction 的 ChatClient");

/* Key points:
- UseToolReduction() must be given an IToolReductionStrategy instance
- EmbeddingToolReductionStrategy filters tools automatically by semantic similarity
- toolLimit controls the maximum number of tools kept (not counting required tools)
- Tool Reduction should be registered before UseFunctionInvocation()
*/

Comparison Test

Test 1: without Tool Reduction

// Create a client without Tool Reduction
var normalClient = baseChatClient.AsBuilder()
    .UseFunctionInvocation()
    .UseToolListLogging()
    .Build();

// Configure ChatOptions with all tools
var chatOptions = new ChatOptions
{
    Tools = [..allTools],
    ToolMode = ChatToolMode.Auto
};

Console.WriteLine("━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━");
Console.WriteLine("     测试场景：查询北京天气（无 Reduction）    ");
Console.WriteLine("━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\n");

var response = await normalClient.GetResponseAsync(
    "北京今天天气怎么样？",
    chatOptions
);
new
{
    回答 = response.Text,
    发送的工具数量 = chatOptions.Tools?.Count ?? 0,
    调用的工具 = response.Messages
        .SelectMany(m => m.Contents) // function calls live in each message's Contents
        .OfType<FunctionCallContent>()
        .Select(f => f.Name)
        .ToList()
}.Display();

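Observations:

The model received the descriptions of all 10 tools

The model correctly chose the GetWeather tool

But the descriptions of the 9 unrelated tools took up context window space for nothing

Test 2: run the same test with the client that has Tool Reduction enabled.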

Console.WriteLine("━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━");
Console.WriteLine("   测试场景：查询北京天气（启用 Reduction）    ");
Console.WriteLine("━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\n");

var response2 = await reducingClient.GetResponseAsync(
    "北京今天天气怎么样？",
    new ChatOptions
    {
        Tools = [..allTools],
        ToolMode = ChatToolMode.Auto
    }
);
new
{
    回答 = response2.Text,
    原始工具数量 = allTools.Count,
    调用的工具 = response2.Messages
        .SelectMany(m => m.Contents) // function calls live in each message's Contents
        .OfType<FunctionCallContent>()
        .Select(f => f.Name)
        .ToList()
}.Display();

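Key improvements:

Tool Reduction analyzes the user request automatically

The tools most similar to the request are kept (up to toolLimit), such as GetWeather

Less relevant tools (stocks, email, database, and so on) are filtered out

The model receives far fewer tools, which can help it pick the right one

Verifying the selection manually: the SelectToolsForRequestAsync method of IToolReductionStrategy lets us inspect the filtered result in advance.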

Console.WriteLine("\n━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━");
Console.WriteLine("         手动调用 Strategy 查看筛选结果        ");
Console.WriteLine("━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\n");

// Prepare the test messages
var testMessages = new[]
{
    new ChatMessage(ChatRole.User, "上海的天气如何？"),
    new ChatMessage(ChatRole.User, "帮我写一封邮件给张三，告诉他明天的会议时间是上午10点。"),
};

// Prepare the ChatOptions
var testOptions = new ChatOptions
{
    Tools = [..allTools]
};

// Call the strategy's SelectToolsForRequestAsync method
var selectedTools = await strategy.SelectToolsForRequestAsync(
    testMessages,
    testOptions,
    CancellationToken.None
);
Console.WriteLine($"原始工具数量: {allTools.Count}");
Console.WriteLine($"筛选后工具数量: {selectedTools.Count()}");
Console.WriteLine($"\n筛选后的工具列表:");
selectedTools.Select(t => new { 工具名称 = t.Name, 描述 = t.Description }).Display();

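VII. Structured Output

In real-world AI applications we often need to turn model output into structured data that the program can use directly. Traditionally the model returns free text, and we have to parse it and pull out the key fields by hand, which is error-prone and costly to maintain.

Microsoft.Extensions.AI supports structured output: we define the output format up front as a JSON Schema, and a model that supports schema-constrained output returns data in exactly that structure.

1. Core concepts

Response format (ChatResponseFormat): ChatResponseFormat is the class in Microsoft.Extensions.AI that defines the response format. It supports the following modes:
- ChatResponseFormat.Text: plain text (the default)
- ChatResponseFormat.Json: a free-form JSON object (no predefined schema)
- ChatResponseFormatJson.ForJsonSchema: structured output that conforms to a predefined JSON Schema
JSON Schema: JSON Schema is a standard for describing the structure of JSON data, including types, fields, and constraints. Microsoft.Extensions.AI uses JSON Schema to constrain the format of the model's output.
AIJsonUtilities: AIJsonUtilities.CreateJsonSchema() is a convenience method that generates a JSON Schema from a C# type, so the schema does not have to be written by hand.
ChatOptions: the ChatOptions.ResponseFormat property specifies the expected response format and carries the JSON Schema to the model.

Important: Qwen, DeepSeek, and similar models from Chinese providers do not fully support ChatResponseFormatJson.ForJsonSchema. The recommended approach is:

Use ChatResponseFormat.Json (no schema constraint)
Describe the expected JSON format in detail in the System Message
Make deserialization more tolerant of malformed output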
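
For a provider that does honor schema-constrained output, the ForJsonSchema route described above might look roughly like the sketch below. It only builds the schema and the ChatOptions and makes no network call. The type PersonSketch, the schema name, and the schema description are illustrative assumptions, not part of the original walkthrough.

```csharp
using System;
using System.Text.Json;
using System.Text.Json.Serialization;
using Microsoft.Extensions.AI;

// Generate a JSON Schema from the C# type instead of writing it by hand.
JsonElement schema = AIJsonUtilities.CreateJsonSchema(typeof(PersonSketch));

// Attach the schema to the request options (names below are illustrative).
ChatOptions schemaOptions = new()
{
    ResponseFormat = ChatResponseFormat.ForJsonSchema(
        schema,
        schemaName: "person_info",
        schemaDescription: "Basic facts about one person")
};

// Print the generated schema so it can be inspected before sending a request.
Console.WriteLine(schema.GetRawText());

// Hypothetical model type used only for this sketch.
public class PersonSketch
{
    [JsonPropertyName("name")]
    public string? Name { get; set; }

    [JsonPropertyName("age")]
    public int? Age { get; set; }
}
```

Passing schemaOptions to GetResponseAsync would then ask the provider to constrain its output to this schema; whether the constraint is actually enforced depends on the model behind the client.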

2. Structured output for a single object

Define the data model

/// <summary>
/// Data model for personal information
/// </summary>
public class PersonInfo
{
    [JsonPropertyName("name")]
    public string? Name { get; set; }
    
    [JsonPropertyName("age")]
    public int? Age { get; set; }
    
    [JsonPropertyName("occupation")]
    public string? Occupation { get; set; }
    
    [JsonPropertyName("location")]
    public string? Location { get; set; }
}

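Configure the JSON response format (no schema): use ChatResponseFormat.Json rather than ForJsonSchema:

// Use ChatResponseFormat.Json (the recommended option for Qwen/DeepSeek-style models)
ChatOptions chatOptions = new()
{
    ResponseFormat = ChatResponseFormat.Json  // note: not ForJsonSchema
};

Specify the JSON structure in the prompt: with no schema constraint in place, the System Message has to describe the expected JSON format in detail: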

var systemMessage = """
你是一个信息提取助手。请严格按照以下 JSON 格式返回结果，不要添加任何其他文本：

{
    "name": "姓名（字符串）",
    "age": "年龄（整数）",
    "occupation": "职业（字符串）",
    "location": "工作地点（字符串）"
}

示例输出：
{
    "name": "张三",
    "age": 30,
    "occupation": "工程师",
    "location": "北京"
}
""";
var userMessage = "请提取：刘洋是一名42岁的数据科学家，目前在深圳工作。";
var messages = new[]
{
    new ChatMessage(ChatRole.System, systemMessage),
    new ChatMessage(ChatRole.User, userMessage)
};

Send the request and parse the result

Console.WriteLine("━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━");
Console.WriteLine("           结构化输出测试           ");
Console.WriteLine("━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\n");

try
{
    var response = await chatClient.GetResponseAsync(messages, chatOptions);

    Console.WriteLine("DeepSeek 响应完成\n");
    Console.WriteLine("响应详情：");
    new
    {
        模型 = response.ModelId,
        Token使用 = response.Usage?.TotalTokenCount ?? 0,
        完成原因 = response.FinishReason
    }.Display();

    Console.WriteLine("\n原始 JSON 响应：");
    Console.WriteLine(response.Text);

    // Deserialize into a strongly typed object
    var personInfo = JsonSerializer.Deserialize<PersonInfo>(
        response.Text ?? "{}",
        JsonSerializerOptions.Web);

    Console.WriteLine("\n反序列化成功\n");
    Console.WriteLine("提取的个人信息：");
    personInfo.Display();
}
catch (Exception ex)
{
    Console.WriteLine($"错误：{ex.Message}");
}

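3. Structured output for complex objects

Define the data structure for product review analysis (it contains a nested object):

/// <summary>
/// Sentiment analysis result
/// </summary>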
public class SentimentAnalysis
{
    [JsonPropertyName("sentiment")]
    public string? Sentiment { get; set; } // Positive, Neutral, Negative
    
    [JsonPropertyName("confidence")]
    public double Confidence { get; set; } // 0.0 - 1.0
}

/// <summary>
/// Product review analysis result
/// </summary>
public class ProductReviewAnalysis
{
    [JsonPropertyName("product_name")]
    public string? ProductName { get; set; }
    
    [JsonPropertyName("rating")]
    public int Rating { get; set; } // 1-5
    
    [JsonPropertyName("sentiment")]
    public SentimentAnalysis? Sentiment { get; set; }
    
    [JsonPropertyName("key_points")]
    public List<string>? KeyPoints { get; set; }
    
    [JsonPropertyName("recommendation")]
    public bool Recommendation { get; set; }
}

Implement structured output for the nested object

ChatOptions chatOptions = new()
{
    ResponseFormat = ChatResponseFormat.Json  // note: not ForJsonSchema
};

var systemMessage = """
你是一个产品评论分析专家。请严格按照以下 JSON 格式返回分析结果：

{
    "product_name": "产品名称（字符串）",
    "rating": "评分（1-5的整数）",
    "sentiment": {
        "sentiment": "情感（Positive/Neutral/Negative）",
        "confidence": "置信度（0.0-1.0的小数）"
    },
    "key_points": ["关键点1", "关键点2", "关键点3"],
    "recommendation": "是否推荐（true/false）"
}

重要说明：
1. rating 必须是 1-5 之间的整数
2. sentiment 只能是 Positive、Neutral 或 Negative
3. confidence 是 0 到 1 之间的小数
4. key_points 是字符串数组，提取 3-5 个关键要点
5. recommendation 是布尔值
""";

var userMessage = @"
小米14真是太赞了！骁龙8Gen3性能强劲，运行大型游戏毫无压力。
徕卡镜头的拍照效果惊艳，特别是人像模式。续航也很给力，重度使用一天没问题。
价格很实惠，性价比超高！强烈推荐给预算有限但追求性能的朋友。
";

var messages = new[]
{
    new ChatMessage(ChatRole.System, systemMessage),
    new ChatMessage(ChatRole.User, $"请分析以下产品评论：\n\n{userMessage}")
};

Console.WriteLine("━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━");
Console.WriteLine("       复杂对象结构化输出测试       ");
Console.WriteLine("━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\n");

try
{
    var reviewResponse = await chatClient.GetResponseAsync(messages, chatOptions);

    Console.WriteLine("分析完成\n");
    Console.WriteLine("原始 JSON 响应：");
    Console.WriteLine(reviewResponse.Text);

    // Deserialize into the complex object
    var reviewAnalysis = JsonSerializer.Deserialize<ProductReviewAnalysis>(
        reviewResponse.Text ?? "{}",
        JsonSerializerOptions.Web);

    Console.WriteLine("\n反序列化成功\n");
    Console.WriteLine("分析结果：");
    reviewAnalysis.Display();
}
catch (Exception ex)
{
    Console.WriteLine($"错误：{ex.Message}");
    Console.WriteLine($"详细信息：{ex}");
}

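4. Best practices for structured output

Comparison with OpenAI ForJsonSchema:

Aspect	OpenAI (ForJsonSchema)	Qwen/DeepSeek (Json)
Schema definition	JSON Schema generated automatically	Format described by hand
Format guarantee	Strictly follows the schema	Depends on prompt quality
Configuration	ChatResponseFormatJson.ForJsonSchema	ChatResponseFormat.Json
Robustness	High (the model is forced to comply)	Medium (needs detailed instructions)
Best suited for	Strict data constraints	Flexible data extraction

Tips for Qwen/DeepSeek-style models:

Describe the format in detail
- Provide a complete JSON template in the System Message
- State the type and constraints of every field
- Include a sample output
Insist on the return format
- Explicitly require "return strictly in JSON format"
- Add "do not add any other text"
- Use an "Important notes" section to stress the constraints
Harden error handling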

try
{
    var result = JsonSerializer.Deserialize<T>(response, options);
}
catch (JsonException ex)
{
    // Fallback: extract the JSON portion with a regular expression,
    // or return a default value
}
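
The fallback hinted at in the catch block above could be sketched as follows. This is a minimal illustration using only System.Text.Json, not the author's implementation: the helper LenientJson and the type PersonDraft are hypothetical names, and the "first `{` to last `}`" heuristic is an assumption that works for a single top-level object wrapped in extra prose or a Markdown fence.

```csharp
using System;
using System.Text.Json;

// A reply wrapped in prose and a Markdown fence, with the age sent as a string.
string reply = "Sure! Here is the result:\n```json\n{ \"name\": \"Liu Yang\", \"age\": \"42\" }\n```";

PersonDraft? person = LenientJson.TryParse<PersonDraft>(reply);
Console.WriteLine(person is null ? "could not parse" : $"{person.Name} ({person.Age})");

// Hypothetical helper: tolerant extraction and deserialization of one JSON object.
public static class LenientJson
{
    // Web defaults plus tolerance for comments and trailing commas.
    private static readonly JsonSerializerOptions Options = new(JsonSerializerDefaults.Web)
    {
        ReadCommentHandling = JsonCommentHandling.Skip,
        AllowTrailingCommas = true
    };

    public static T? TryParse<T>(string? raw) where T : class
    {
        if (string.IsNullOrWhiteSpace(raw)) return null;

        // Keep only the span from the first '{' to the last '}'.
        int start = raw.IndexOf('{');
        int end = raw.LastIndexOf('}');
        if (start < 0 || end <= start) return null;

        string candidate = raw.Substring(start, end - start + 1);
        try
        {
            return JsonSerializer.Deserialize<T>(candidate, Options);
        }
        catch (JsonException)
        {
            // Still malformed: let the caller decide on a default value.
            return null;
        }
    }
}

// Hypothetical target type for the example.
public class PersonDraft
{
    public string? Name { get; set; }
    public int? Age { get; set; }
}
```

Returning null instead of throwing keeps the calling code simple: it can retry the request, fall back to a default, or log the raw text for inspection.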

Use JsonSerializerOptions.Web

// Web defaults (.NET 9+): camelCase names, case-insensitive matching, numbers accepted as quoted strings
var result = JsonSerializer.Deserialize<PersonInfo>(
    json, 
    JsonSerializerOptions.Web
);

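Troubleshooting common problems:

Problem	Cause	Solution
Extra text around the JSON	The model added an explanation	Stress "return JSON only" in the prompt
Missing fields	The format description is not clear enough	Provide a complete JSON template example
Type mismatch	The model misunderstood the field	State the field type explicitly (for example "integer" or "boolean")
Comments in the output	The model added JSON comments	Prompt "do not add comments", or strip comments in preprocessing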

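VIII. Using Dependency Injection

1. Core components

AddChatClient: extension method that registers an IChatClient in the DI container and supports configuring a middleware pipeline.
AddEmbeddingGenerator: extension method that registers an IEmbeddingGenerator in the DI container.
IServiceCollection: the service collection interface of the .NET DI container, used to register services.
ServiceProvider: the service provider that resolves and supplies registered service instances.
ChatClientBuilder: the chat client builder, used to configure the middleware pipeline through chained calls.
Service lifetimes: Singleton, Scoped, and Transient are supported.

2. Basic dependency injection

Configuring the middleware pipeline: AddChatClient returns a ChatClientBuilder, so middleware can be added through chained calls.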

// Create a new service collection
var services = new ServiceCollection();

// Configure the cache service
services.AddSingleton<IDistributedCache>(sp => 
    new MemoryDistributedCache(Options.Create(new MemoryDistributedCacheOptions())));

// Register the ChatClient and configure the middleware pipeline
services.AddChatClient(chatClient)
    .UseDistributedCache() // add the caching middleware
    .UseFunctionInvocation(); // add the function-invocation middleware

var serviceProvider = services.BuildServiceProvider();
var enhancedClient = serviceProvider.GetRequiredService<IChatClient>();

Console.WriteLine("带中间件管道的 ChatClient 已配置");
enhancedClient.Display();

// Measure the effect of the cache
Console.WriteLine("\n第一次请求（调用模型）...");
var sw1 = Stopwatch.StartNew();
var response1 = await enhancedClient.GetResponseAsync("什么是 MEAI？");
sw1.Stop();
Console.WriteLine($" 耗时: {sw1.ElapsedMilliseconds}ms");

Console.WriteLine("\n第二次请求（从缓存返回）...");
var sw2 = Stopwatch.StartNew();
var response2 = await enhancedClient.GetResponseAsync("什么是 MEAI？");
sw2.Stop();
Console.WriteLine($" 耗时: {sw2.ElapsedMilliseconds}ms");
// Guard against a 0 ms cached call so the ratio stays finite
Console.WriteLine($"缓存加速比: {(double)sw1.ElapsedMilliseconds / Math.Max(1, sw2.ElapsedMilliseconds):F2}x");

Creating the client dynamically with a factory method

var services = new ServiceCollection();

// Register the configuration service (simulates reading from a config file)
services.AddSingleton(new AIConfiguration
{
    Provider = "Qwen",
    ModelName = "Qwen-max",
    EnableCaching = true,
    EnableFunctionCalling = true
});

// Register the ChatClient with a factory method
services.AddChatClient(serviceProvider =>
{
    var config = serviceProvider.GetRequiredService<AIConfiguration>();
    Console.WriteLine($"从配置创建 ChatClient: Provider={config.Provider}, Model={config.ModelName}");

    // A real factory would build the underlying client from the configuration;
    // this demo simply returns the existing chatClient
    return chatClient;
})
.Use(
    getResponseFunc: (messages, options, innerClient, cancellationToken) =>
    {
        // Inline middleware: log the request, then forward it to the inner client
        Console.WriteLine($"[中间件] 处理请求，消息数: {messages.Count()}");
        return innerClient.GetResponseAsync(messages, options, cancellationToken);
    },
    getStreamingResponseFunc: null); // simplified example: no streaming handler supplied

var serviceProvider = services.BuildServiceProvider();
var configuredClient = serviceProvider.GetRequiredService<IChatClient>();

var testResponse = await configuredClient.GetResponseAsync("测试配置化客户端");
Console.WriteLine($"\n响应: {testResponse.Text?[..Math.Min(50, testResponse.Text.Length)]}...");

// Configuration class definition
public class AIConfiguration
{
    public string Provider { get; set; } = "";
    public string ModelName { get; set; } = "";
    public bool EnableCaching { get; set; }
    public bool EnableFunctionCalling { get; set; }
}

Registering multiple keyed clients

var services = new ServiceCollection();

// Register the "fast" client (quick responses, no caching)
services.AddKeyedChatClient("fast", chatClient)
    .Use(getResponseFunc: (messages, options, innerClient, cancellationToken) =>
    {
        Console.WriteLine("[Fast Client] 快速处理请求");
        return innerClient.GetResponseAsync(messages, options, cancellationToken);
    }, getStreamingResponseFunc: null);

// Register the "cached" client (with caching and function invocation)
services.AddSingleton<IDistributedCache>(sp =>
    new MemoryDistributedCache(Options.Create(new MemoryDistributedCacheOptions())));

services.AddKeyedChatClient("cached", chatClient)
    .UseDistributedCache()
    .UseFunctionInvocation()
    .Use(
        getResponseFunc: (messages, options, innerClient, cancellationToken) =>
        {
            Console.WriteLine("[Cached Client] 带缓存处理请求");
            return innerClient.GetResponseAsync(messages, options, cancellationToken);
        }
    , getStreamingResponseFunc: null);

var serviceProvider4 = services.BuildServiceProvider();

// Resolve the individual clients by key
var fastClient = serviceProvider4.GetRequiredKeyedService<IChatClient>("fast");
var cachedClient = serviceProvider4.GetRequiredKeyedService<IChatClient>("cached");

Console.WriteLine("━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━");
Console.WriteLine("测试 Fast Client:");
var fastResponse = await fastClient.GetResponseAsync("简单问题");

Console.WriteLine("\n━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━");
Console.WriteLine("测试 Cached Client:");
var cachedResponse = await cachedClient.GetResponseAsync("复杂问题");

Console.WriteLine("\n多客户端配置成功");

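3. How AddChatClient works

AddChatClient is the core registration extension method in MEAI; it simplifies registering and configuring AI clients.

Method signatures and overloads

// Signatures as found in recent Microsoft.Extensions.AI releases; check the version you use.

// 1. Register an existing instance
public static ChatClientBuilder AddChatClient(this IServiceCollection serviceCollection, IChatClient innerClient, ServiceLifetime lifetime = ServiceLifetime.Singleton)

// 2. Create the client with a factory method
public static ChatClientBuilder AddChatClient(this IServiceCollection serviceCollection, Func<IServiceProvider, IChatClient> innerClientFactory, ServiceLifetime lifetime = ServiceLifetime.Singleton)

// 3. Register a keyed client (Keyed Service)
public static ChatClientBuilder AddKeyedChatClient(this IServiceCollection serviceCollection, object? serviceKey, IChatClient innerClient, ServiceLifetime lifetime = ServiceLifetime.Singleton)

// 4. Keyed client + factory method
public static ChatClientBuilder AddKeyedChatClient(this IServiceCollection serviceCollection, object? serviceKey, Func<IServiceProvider, IChatClient> innerClientFactory, ServiceLifetime lifetime = ServiceLifetime.Singleton)

Return value: ChatClientBuilder

All of these overloads return a ChatClientBuilder, which comes with a rich set of extension methods for configuring middleware: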

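// Simplified outline of the builder (member bodies omitted)
public sealed class ChatClientBuilder
{
    // Add custom middleware by wrapping the inner client
    public ChatClientBuilder Use(Func<IChatClient, IChatClient> clientFactory);
    public ChatClientBuilder Use(Func<IChatClient, IServiceProvider, IChatClient> clientFactory);

    // Build the pipeline; pass the IServiceProvider when middleware resolves services from DI
    public IChatClient Build(IServiceProvider? services = null);

    // Commonly used extension methods:
    // - UseFunctionInvocation() - function calling
    // - UseDistributedCache() - distributed caching
    // - UseChatReducer() - message reduction
    // - UseLogging() - logging
    // - UseOpenTelemetry() - telemetry tracing
}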
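
To make the middleware idea concrete, here is a small offline sketch of a custom middleware built on DelegatingChatClient and plugged in through the builder's Use method. Both TimingChatClient and EchoChatClient are hypothetical types written for this illustration (the echo client stands in for a real provider so the pipeline can run without network access); they are not part of MEAI.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.AI;

// Wrap the stand-in client with the timing middleware.
IChatClient pipeline = new ChatClientBuilder(new EchoChatClient())
    .Use(inner => new TimingChatClient(inner))
    .Build();

ChatResponse reply = await pipeline.GetResponseAsync("ping");
Console.WriteLine(reply.Text);

// Hypothetical middleware: measures how long the inner client takes.
public sealed class TimingChatClient : DelegatingChatClient
{
    public TimingChatClient(IChatClient innerClient) : base(innerClient) { }

    public override async Task<ChatResponse> GetResponseAsync(
        IEnumerable<ChatMessage> messages,
        ChatOptions? options = null,
        CancellationToken cancellationToken = default)
    {
        var stopwatch = Stopwatch.StartNew();
        ChatResponse response = await base.GetResponseAsync(messages, options, cancellationToken);
        Console.WriteLine($"[timing] {stopwatch.ElapsedMilliseconds} ms");
        return response;
    }
}

// Hypothetical offline client: echoes the last message back.
public sealed class EchoChatClient : IChatClient
{
    public Task<ChatResponse> GetResponseAsync(
        IEnumerable<ChatMessage> messages,
        ChatOptions? options = null,
        CancellationToken cancellationToken = default)
    {
        string last = messages.LastOrDefault()?.Text ?? "";
        var message = new ChatMessage(ChatRole.Assistant, $"echo: {last}");
        return Task.FromResult(new ChatResponse(message));
    }

    public async IAsyncEnumerable<ChatResponseUpdate> GetStreamingResponseAsync(
        IEnumerable<ChatMessage> messages,
        ChatOptions? options = null,
        [EnumeratorCancellation] CancellationToken cancellationToken = default)
    {
        // Streaming is reduced to a single update in this sketch.
        ChatResponse response = await GetResponseAsync(messages, options, cancellationToken);
        yield return new ChatResponseUpdate(ChatRole.Assistant, response.Text);
    }

    public object? GetService(Type serviceType, object? serviceKey = null) =>
        serviceKey is null && serviceType.IsInstanceOfType(this) ? this : null;

    public void Dispose() { }
}
```

The same Use call works on the builder returned by AddChatClient, so a middleware like this can be registered once at startup and applied to every resolved IChatClient.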

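Lifetime management

By default AddChatClient registers the service with the Singleton lifetime, which means only one instance is created for the whole application.

If a different lifetime is needed, the client can be registered manually:

// Scoped lifetime (one instance per request; suits ASP.NET Core)
services.AddScoped<IChatClient>(serviceProvider => 
{
    // .Result blocks the calling thread; acceptable in a demo, avoid it in production code
    var baseClient = AIClientHelper.GetDefaultChatClient().Result;
    return baseClient.AsBuilder()
        .UseDistributedCache()
        .Build(serviceProvider); // pass the provider so the cache middleware can resolve IDistributedCache
});

// Transient lifetime (a new instance on every resolve)
services.AddTransient<IChatClient>(serviceProvider => 
{
    return AIClientHelper.GetDefaultChatClient().Result;
});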

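Choosing a lifetime

Lifetime	When to use	Pros	Cons
Singleton	Stateless services, fixed configuration	Best performance, low memory use	Cannot hold per-request state
Scoped	ASP.NET Core web applications	Supports per-request state, released automatically	One instance created per request
Transient	Short-lived tasks that need isolation	Fully isolated, suits concurrent scenarios	Higher overhead, frequent GC

Note: IChatClient implementations in MEAI are usually stateless, so the Singleton lifetime is recommended.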

4. Best practices and design patterns

Recommended practices

Prefer interface types

// Good: depend on the abstraction
public class MyService
{
    private readonly IChatClient _chatClient;
    public MyService(IChatClient chatClient) => _chatClient = chatClient;
}

// Bad: depend on a concrete implementation
public class MyService
{
    private readonly OpenAIChatClient _chatClient;
    public MyService(OpenAIChatClient chatClient) => _chatClient = chatClient;
}

Configure the middleware pipeline in one place

// Good: configure it once in Program.cs or Startup.cs
services.AddChatClient(baseChatClient)
    .UseLogging()
    .UseDistributedCache()
    .UseFunctionInvocation();

// Bad: configuring it inside business code (hard to maintain)
var client = baseChatClient.AsBuilder()
    .UseLogging()
    .Build();

Manage parameters through the configuration system

// Good: read from configuration
var config = configuration.GetSection("AI").Get<AIConfig>();
services.AddChatClient(CreateClientFromConfig(config));

// Bad: hard-coded
services.AddChatClient(new OpenAIChatClient("hardcoded-key"));

Register keyed clients for different scenarios

services.AddKeyedChatClient("fast", fastClient); // fast responses
services.AddKeyedChatClient("accurate", accurateClient); // higher quality
services.AddKeyedChatClient("cached", cachedClient); // with caching

Combine with health checks

services.AddHealthChecks().AddCheck<AIServiceHealthCheck>("ai-service");

public class AIServiceHealthCheck : IHealthCheck
{
    private readonly IChatClient _chatClient;
    
    public AIServiceHealthCheck(IChatClient chatClient) => _chatClient = chatClient;
    
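    public async Task<HealthCheckResult> CheckHealthAsync(
        HealthCheckContext context, CancellationToken cancellationToken = default)
    {
        try
        {
            // Note: every probe is a real (billable) model call
            await _chatClient.GetResponseAsync("health check", cancellationToken: cancellationToken);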
            return HealthCheckResult.Healthy();
        }
        catch (Exception ex)
        {
            return HealthCheckResult.Unhealthy(exception: ex);
        }
    }
}

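Common pitfalls

Forgetting to build the ServiceProvider

// Wrong: the provider was never built
var services = new ServiceCollection();
services.AddChatClient(client);
var chatClient = services.GetRequiredService<IChatClient>(); // fails: IServiceCollection cannot resolve services

// Right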
var serviceProvider = services.BuildServiceProvider();
var chatClient = serviceProvider.GetRequiredService<IChatClient>();

Lifetime mismatch

// Wrong: a Singleton that depends on a Scoped service
services.AddSingleton<IChatClient>(...); // Singleton
services.AddScoped<IDistributedCache>(...); // Scoped

// Right: keep the lifetimes consistent
services.AddSingleton<IChatClient>(...);
services.AddSingleton<IDistributedCache>(...);
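
A lifetime mismatch like the one above can be caught early by turning on scope validation when the provider is built. The sketch below uses only Microsoft.Extensions.DependencyInjection; RequestState and ReportService are hypothetical types standing in for a scoped dependency and a singleton that consumes it.

```csharp
using System;
using Microsoft.Extensions.DependencyInjection;

var services = new ServiceCollection();
services.AddScoped<RequestState>();      // scoped dependency
services.AddSingleton<ReportService>();  // singleton that consumes it

// ValidateScopes makes the container reject scoped services captured by singletons.
using ServiceProvider provider = services.BuildServiceProvider(
    new ServiceProviderOptions { ValidateScopes = true });

bool caught = false;
try
{
    provider.GetRequiredService<ReportService>();
}
catch (InvalidOperationException ex)
{
    caught = true;
    Console.WriteLine($"Lifetime mismatch detected: {ex.Message}");
}

// Hypothetical types used only for this sketch.
public sealed class RequestState { }

public sealed class ReportService
{
    public ReportService(RequestState state) { }
}
```

ASP.NET Core enables this validation by default in the Development environment; in a console app or a notebook it has to be requested explicitly as shown.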

Running asynchronous work in a constructor

// Wrong
public class MyService
{
    public MyService(IChatClient client)
    {
        // Constructors cannot be async
        var response = client.GetResponseAsync("test").Result; // may deadlock
    }
}

// Right: defer the work to a method call
public class MyService
{
    private readonly IChatClient _client;
    public MyService(IChatClient client) => _client = client;
    
    public async Task InitializeAsync()
    {
        var response = await _client.GetResponseAsync("test");
    }
}

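Forgetting to dispose the ServiceProvider

// Wrong: may leak resources, since disposable singletons are never released
var serviceProvider = services.BuildServiceProvider();
var client = serviceProvider.GetRequiredService<IChatClient>();

// Right: a using declaration disposes the provider automatically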
using var serviceProvider = services.BuildServiceProvider();
var client = serviceProvider.GetRequiredService<IChatClient>();

5. Architecture patterns

Repository pattern

public interface IAIRepository
{
    Task<string> GenerateResponse(string prompt);
    Task<List<float>> GenerateEmbedding(string text);
}

public class AIRepository : IAIRepository
{
    private readonly IChatClient _chatClient;
    private readonly IEmbeddingGenerator<string, Embedding<float>> _embeddingGenerator;
    
    public AIRepository(
        IChatClient chatClient, 
        IEmbeddingGenerator<string, Embedding<float>> embeddingGenerator)
    {
        _chatClient = chatClient;
        _embeddingGenerator = embeddingGenerator;
    }
    
    public async Task<string> GenerateResponse(string prompt)
    {
        var response = await _chatClient.GetResponseAsync(prompt);
        return response.Text ?? "";
    }
    
    public async Task<List<float>> GenerateEmbedding(string text)
    {
        // For a single input, GenerateAsync returns one Embedding<float>
        var embedding = await _embeddingGenerator.GenerateAsync(text);
        return embedding.Vector.ToArray().ToList();
    }
}

Strategy pattern (switching between multiple clients)

public interface IAIClientFactory
{
    IChatClient GetClient(AIClientType type);
}

public enum AIClientType { Fast, Accurate, Cached }

public class AIClientFactory : IAIClientFactory
{
    private readonly IServiceProvider _serviceProvider;
    
    public AIClientFactory(IServiceProvider serviceProvider)
    {
        _serviceProvider = serviceProvider;
    }
    
    public IChatClient GetClient(AIClientType type)
    {
        return type switch
        {
            AIClientType.Fast => _serviceProvider.GetRequiredKeyedService<IChatClient>("fast"),
            AIClientType.Accurate => _serviceProvider.GetRequiredKeyedService<IChatClient>("accurate"),
            AIClientType.Cached => _serviceProvider.GetRequiredKeyedService<IChatClient>("cached"),
            _ => throw new ArgumentException("未知的客户端类型")
        };
    }
}

6. Integration with other MEAI features

With logging

services.AddLogging(builder => builder.AddConsole());

services.AddChatClient(baseChatClient).UseLogging(); // logs requests and responses automatically

With configuration validation

services.AddOptions<AIConfig>()
    .Bind(configuration.GetSection("AI"))
    .ValidateDataAnnotations() // validate the configuration
    .ValidateOnStart(); // validate at startup

With telemetry tracing

services.AddOpenTelemetry()
    .WithTracing(builder => builder.AddSource("Microsoft.Extensions.AI"));

services.AddChatClient(baseChatClient)
    .UseOpenTelemetry(sourceName: "Microsoft.Extensions.AI"); // traces calls; the source name must match AddSource above

7. A complete enterprise-style configuration example

var builder = WebApplication.CreateBuilder(args);

// 1. Configure logging
builder.Logging.AddConsole();

// 2. Configure caching
builder.Services.AddStackExchangeRedisCache(options =>
{
    options.Configuration = builder.Configuration.GetConnectionString("Redis");
});

// 3. Configure the AI service
builder.Services.AddChatClient(sp =>
{
    var config = sp.GetRequiredService<IConfiguration>();
    var endpoint = config["AI:Endpoint"];
    var apiKey = config["AI:ApiKey"];
    var model = config["AI:Model"];
    
    return new AzureOpenAIClient(new Uri(endpoint), new ApiKeyCredential(apiKey))
        .GetChatClient(model)
        .AsIChatClient();
})
.UseLogging() // logging
.UseDistributedCache() // caching
.UseFunctionInvocation() // function calling
.UseOpenTelemetry(); // telemetry

// 4. Register business services
builder.Services.AddTransient<IAIRepository, AIRepository>();
builder.Services.AddTransient<CustomerSupportService>();

var app = builder.Build();

app.MapPost("/api/chat", async (IChatClient client, string message) =>
{
    var response = await client.GetResponseAsync(message);
    return Results.Ok(response.Text);
});

app.Run();

Category: Reading Notes
Tags: AI
