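1. Service Registration and Secure Key Management

1.1 Dependency Injection Configuration

In Program.cs, the startup file of an ASP.NET Core project, register the large-model service client with AddHttpClient. An interface abstraction is recommended so that the concrete implementation stays decoupled from business logic:

// Program.cs configuration example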
var builder = WebApplication.CreateBuilder(args);
builder.Services.AddHttpClient<IAiService, GenericAiClient>();

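Because business code depends only on IAiService, switching to a different large-model provider means changing this one registration rather than the callers, which is consistent with the open/closed principle. For scenarios that need more configuration (timeouts, retry policies, a custom message handler), the typed-client registration can be extended further:

builder.Services.AddHttpClient<IAiService, GenericAiClient>()
    .ConfigurePrimaryHttpMessageHandler(() => new HttpClientHandler 
    {
        // Development-only example: this disables TLS certificate validation and must never ship to production
        ServerCertificateCustomValidationCallback = (_, _, _, _) => true
    });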
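The client in section 3.1 posts to the relative path "v1/completions", which only resolves when the HttpClient has a base address. A minimal sketch of setting the base address and a timeout at registration time follows. The endpoint URL and the 60-second value are placeholders, not values from any real provider, and the two types at the bottom are stand-ins so that the fragment compiles on its own.

```csharp
// Program.cs (ASP.NET Core, .NET 6+ with implicit usings)
var builder = WebApplication.CreateBuilder(args);

builder.Services.AddHttpClient<IAiService, GenericAiClient>(client =>
{
    // Hypothetical endpoint; relative paths such as "v1/completions" resolve against it.
    client.BaseAddress = new Uri("https://api.example.com/");
    // Assumed value; long generations may need a larger timeout.
    client.Timeout = TimeSpan.FromSeconds(60);
});

var app = builder.Build();
app.Run();

// Minimal stand-ins for the types defined elsewhere in this article.
public interface IAiService { }

public sealed class GenericAiClient : IAiService
{
    private readonly HttpClient _httpClient;
    public GenericAiClient(HttpClient httpClient) => _httpClient = httpClient;
}
```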

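1.2 Tiered Key Management

Key security is a core concern in production, and key access should follow the principle of least privilege:

Development: manage local secrets with the dotnet user-secrets tool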
```
dotnet user-secrets set "AiService:ApiKey" "sk-dev-xxxx"
```
Production: store the key in a mainstream cloud provider's key management service (KMS) and inject it through an environment variable:
```
var apiKey = Environment.GetEnvironmentVariable("AI_SERVICE_API_KEY") 
           ?? throw new InvalidOperationException("Missing API key");
```
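For scenarios with high security requirements, consider combining this with a hardware security module (HSM) so that keys are stored in physically isolated hardware.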
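The environment-variable lookup can be wrapped in a small helper that fails fast at startup and never exposes the full key in diagnostics. This is a sketch only: ApiKeyLoader and its two methods are hypothetical names introduced for illustration.

```csharp
using System;

public static class ApiKeyLoader
{
    // Reads the key from the given environment variable and fails fast when it is missing or blank.
    public static string Load(string variableName = "AI_SERVICE_API_KEY")
    {
        var value = Environment.GetEnvironmentVariable(variableName);
        if (string.IsNullOrWhiteSpace(value))
            throw new InvalidOperationException($"Missing API key: set {variableName}.");
        return value.Trim();
    }

    // Masks a key for diagnostics so that only the last four characters are visible.
    public static string Mask(string apiKey) =>
        apiKey.Length <= 4 ? "****" : "****" + apiKey[^4..];
}
```

If the key should instead flow through IConfiguration, which is how GenericAiClient in section 3.1 reads it, the default environment-variable configuration provider maps a variable named AiService__ApiKey to the AiService:ApiKey key.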

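2. Standardized Service Interface Design

2.1 Core Interface Definition

Design the service interface around the single responsibility principle. It should cover at least the following basic capabilities:

public interface IAiService
{
    // Text generation
    Task<AiResponse> GenerateTextAsync(
        string prompt, 
        GenerationOptions? options = null,
        CancellationToken ct = default);
    // Streaming response (suited to long-form generation)
    IAsyncEnumerable<AiChunk> GenerateStreamAsync(
        string prompt, 
        CancellationToken ct = default);
}

GenerationOptions can carry parameters such as temperature and maximum generation length, and AiResponse should wrap the raw data returned by the model together with its metadata.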
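The article does not define these data types, so the following is one possible shape, written as immutable records. Every property name and default value here is an assumption chosen to match how the types are used elsewhere in the article (options.MaxTokens, response.Usage.TotalTokens), not a specific provider's schema.

```csharp
using System;

// Hypothetical request options; defaults are illustrative only.
public sealed record GenerationOptions
{
    public double Temperature { get; init; } = 0.7; // sampling temperature
    public int MaxTokens { get; init; } = 1024;     // upper bound on generated length
}

// Token accounting reported by the provider.
public sealed record TokenUsage(int PromptTokens, int CompletionTokens)
{
    public int TotalTokens => PromptTokens + CompletionTokens;
}

// Parsed response plus metadata and the untouched payload.
public sealed record AiResponse
{
    public string Text { get; init; } = string.Empty; // generated text
    public string? ModelId { get; init; }             // model that served the request
    public TokenUsage? Usage { get; init; }           // null when the provider reports no usage
    public string? RawJson { get; init; }             // raw provider payload for troubleshooting
}

// One increment of a streamed response.
public sealed record AiChunk(string Text);
```

Records give value equality and `with` expressions, so a caller can derive a variant such as `defaults with { MaxTokens = 256 }` without mutating shared options.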

2.2 Exception Handling Conventions

Define a unified exception hierarchy that distinguishes business exceptions from technical exceptions:

public class AiServiceException : Exception
{
    public HttpStatusCode StatusCode { get; init; }
    public string? ModelId { get; init; }
}
// Usage example
try 
{
    var result = await aiService.GenerateTextAsync(prompt);
}
catch (AiServiceException ex) when (ex.StatusCode == HttpStatusCode.TooManyRequests)
{
    // Implement rate-limit retry logic here
}
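One way to make the business/technical distinction concrete is to classify by status code. The mapping below is an assumption for illustration (rate limiting, request timeouts and 5xx responses treated as retryable technical failures, other 4xx responses as business errors); AiErrorClassifier is a hypothetical helper.

```csharp
using System;
using System.Net;

public static class AiErrorClassifier
{
    // Technical (transient) failures: retrying after a delay may succeed.
    public static bool IsTransient(HttpStatusCode status) =>
        status == HttpStatusCode.TooManyRequests    // 429: rate limited
        || status == HttpStatusCode.RequestTimeout  // 408
        || (int)status >= 500;                      // provider-side faults

    // Business failures: the request itself is wrong, so retrying unchanged cannot help.
    public static bool IsBusinessError(HttpStatusCode status) =>
        (int)status >= 400 && (int)status < 500 && !IsTransient(status);
}
```

An exception filter can then read `catch (AiServiceException ex) when (AiErrorClassifier.IsTransient(ex.StatusCode))`, which keeps the retry decision in one place.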

3. Production-Grade Client Implementation

3.1 Basic Implementation Skeleton

public class GenericAiClient : IAiService
{
    private readonly HttpClient _httpClient;
    private readonly string _apiKey;
    private readonly ILogger<GenericAiClient> _logger;
    public GenericAiClient(
        HttpClient httpClient,
        IConfiguration config,
        ILogger<GenericAiClient> logger)
    {
        _httpClient = httpClient;
        _apiKey = config["AiService:ApiKey"]
            ?? throw new InvalidOperationException("Missing configuration value AiService:ApiKey");
        _logger = logger;
        _httpClient.DefaultRequestHeaders.Authorization = 
            new AuthenticationHeaderValue("Bearer", _apiKey);
    }
    public async Task<AiResponse> GenerateTextAsync(
        string prompt, 
        GenerationOptions options, 
        CancellationToken ct)
    {
        var request = new AiRequest { Prompt = prompt, Options = options };
        var response = await _httpClient.PostAsJsonAsync(
            "v1/completions", 
            request, 
            ct);
        if (!response.IsSuccessStatusCode)
        {
            var errorData = await response.Content.ReadAsStringAsync(ct);
            _logger.LogError("AI API Error: {StatusCode} - {ErrorData}", 
                response.StatusCode, errorData);
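            // Exception.Message is read-only and cannot be set in an object initializer,
            // so the response body travels in the exception's Data dictionary instead
            var exception = new AiServiceException { StatusCode = response.StatusCode };
            exception.Data["ResponseBody"] = errorData;
            throw exception;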
        }
        return await response.Content.ReadFromJsonAsync<AiResponse>(ct);
    }
}
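The skeleton above does not show GenerateStreamAsync, which IAiService also declares. The parsing half of such a method might look like the sketch below. It assumes the provider streams server-sent-events style lines prefixed with "data: " and ends with a "[DONE]" marker; both are assumptions about the wire format, and StreamChunkReader is a hypothetical helper that yields raw text rather than AiChunk so that it stands alone.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Tasks;

public static class StreamChunkReader
{
    private const string Prefix = "data: ";

    // Yields the payload of each "data: ..." line until the assumed "[DONE]" marker or end of stream.
    public static async IAsyncEnumerable<string> ReadChunksAsync(
        TextReader reader,
        [EnumeratorCancellation] CancellationToken ct = default)
    {
        while (true)
        {
            ct.ThrowIfCancellationRequested();
            var line = await reader.ReadLineAsync();
            if (line is null) yield break;                                   // end of stream
            if (!line.StartsWith(Prefix, StringComparison.Ordinal)) continue; // skip blank or unrelated lines
            var payload = line.Substring(Prefix.Length);
            if (payload == "[DONE]") yield break;                            // assumed terminator
            yield return payload;
        }
    }
}
```

Inside the client, the reader would typically wrap the response stream obtained after sending the request with HttpCompletionOption.ResponseHeadersRead, so that chunks are consumed as they arrive instead of after the whole body has been buffered.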

3.2 Advanced Extensions

3.2.1 Request Retry Mechanism

Use the Polly library to retry with exponential backoff:

var retryPolicy = Policy
    .Handle<HttpRequestException>()
    .OrResult<HttpResponseMessage>(r => r.StatusCode == HttpStatusCode.TooManyRequests)
    .WaitAndRetryAsync(3, retryAttempt => 
        TimeSpan.FromSeconds(Math.Pow(2, retryAttempt)));
var response = await retryPolicy.ExecuteAsync(async () => 
    await _httpClient.PostAsJsonAsync("v1/completions", request, ct));
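With three retries and Math.Pow(2, retryAttempt), the policy above waits 2, 4 and 8 seconds. When many clients are throttled at once, a fixed schedule makes them all retry at the same instant, so a cap and some random jitter are common additions. The helper below is a sketch using only the base class library; the Backoff name, the cap and the one-second jitter window are assumptions.

```csharp
using System;

public static class Backoff
{
    // Delay doubles per attempt (attempt is 1-based, like Polly's retryAttempt) and is capped at maxDelay.
    public static TimeSpan BaseDelay(int attempt, TimeSpan maxDelay)
    {
        var seconds = Math.Pow(2, attempt);
        return TimeSpan.FromSeconds(Math.Min(seconds, maxDelay.TotalSeconds));
    }

    // Adds up to one second of random jitter so concurrent clients do not retry in lockstep.
    public static TimeSpan WithJitter(int attempt, TimeSpan maxDelay, Random random) =>
        BaseDelay(attempt, maxDelay) + TimeSpan.FromMilliseconds(random.Next(0, 1000));
}
```

It can be passed as the sleep-duration delegate, for example `retryAttempt => Backoff.WithJitter(retryAttempt, TimeSpan.FromSeconds(30), Random.Shared)` on .NET 6 or later.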

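3.2.2 Response Caching Strategy

For high-frequency query scenarios, responses can be cached in memory:

public class CachingAiClientDecorator : IAiService
{
    private readonly IAiService _innerClient;
    private readonly IMemoryCache _cache;
    public CachingAiClientDecorator(IAiService innerClient, IMemoryCache cache)
    {
        _innerClient = innerClient;
        _cache = cache;
    }
    public async Task<AiResponse> GenerateTextAsync(
        string prompt,
        GenerationOptions? options = null,
        CancellationToken ct = default)
    {
        var cacheKey = $"{prompt}:{options?.MaxTokens}";
        if (_cache.TryGetValue(cacheKey, out AiResponse? cachedResult) && cachedResult is not null)
        {
            return cachedResult;
        }
        var result = await _innerClient.GenerateTextAsync(prompt, options, ct);
        _cache.Set(cacheKey, result, TimeSpan.FromMinutes(5));
        return result;
    }
    // Streaming responses are passed through uncached
    public IAsyncEnumerable<AiChunk> GenerateStreamAsync(
        string prompt,
        CancellationToken ct = default) =>
        _innerClient.GenerateStreamAsync(prompt, ct);
}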
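The cache key in the decorator embeds the whole prompt and only one option, so two requests that differ in temperature would share an entry and long prompts produce long keys. One alternative, sketched here, hashes the prompt together with every option that affects the output. AiCacheKey is a hypothetical helper, and the choice of temperature and maximum tokens as the relevant options is an assumption.

```csharp
using System;
using System.Globalization;
using System.Security.Cryptography;
using System.Text;

public static class AiCacheKey
{
    // Builds a fixed-length key from the prompt and the options that influence generation.
    public static string Create(string prompt, double temperature, int maxTokens)
    {
        // Invariant culture keeps the key stable across machines with different locale settings.
        var material = temperature.ToString("R", CultureInfo.InvariantCulture)
            + "|" + maxTokens.ToString(CultureInfo.InvariantCulture)
            + "|" + prompt;
        var hash = SHA256.HashData(Encoding.UTF8.GetBytes(material));
        return "ai:" + Convert.ToHexString(hash);
    }
}
```

Caching is only meaningful when repeated identical requests are expected to give an acceptable shared answer; with a high sampling temperature, callers may prefer to bypass the cache entirely.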

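4. Performance Optimization Practices

4.1 Connection Pool Management

Make sure HttpClient connections are reused so that each request does not pay for a new DNS lookup and TCP handshake:

// Configure SocketsHttpHandler in Program.cs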
builder.Services.AddHttpClient<IAiService, GenericAiClient>()
    .ConfigurePrimaryHttpMessageHandler(() => new SocketsHttpHandler
    {
        PooledConnectionLifetime = TimeSpan.FromMinutes(5),
        PooledConnectionIdleTimeout = TimeSpan.FromMinutes(1),
        EnableMultipleHttp2Connections = true
    });

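4.2 Asynchronous Pipeline Optimization

For high-concurrency scenarios, a Channel can be used to implement the producer-consumer pattern:

public class BatchAiClient
{
    private readonly Channel<string> _promptChannel;
    private readonly IAiService _innerClient;
    public BatchAiClient(IAiService innerClient)
    {
        _innerClient = innerClient;
        _promptChannel = Channel.CreateUnbounded<string>();
        _ = ProcessBatchAsync(); // Start the background consumer
    }
    // Producer side: enqueue a prompt for batched processing
    public ValueTask EnqueueAsync(string prompt, CancellationToken ct = default) =>
        _promptChannel.Writer.WriteAsync(prompt, ct);
    private async Task ProcessBatchAsync()
    {
        var batch = new List<string>();
        await foreach (var prompt in _promptChannel.Reader.ReadAllAsync())
        {
            batch.Add(prompt);
            if (batch.Count >= 10) // Batch-size threshold
            {
                // Assumes a batch endpoint; GenerateBatchAsync is not part of IAiService as defined in section 2.1
                var results = await _innerClient.GenerateBatchAsync(batch);
                // Process results...
                batch.Clear();
            }
        }
    }
}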
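In the class above, callers have no way to receive their individual result, and prompts left in a partial batch wait until the threshold is reached. A sketch that addresses both points pairs each prompt with a TaskCompletionSource and flushes whatever is queued, up to the batch size. PromptBatcher and the processBatch delegate are hypothetical; the delegate stands in for whatever batch call the provider offers and is assumed to return one result per prompt in the same order.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Channels;
using System.Threading.Tasks;

public sealed class PromptBatcher
{
    private readonly Channel<(string Prompt, TaskCompletionSource<string> Completion)> _channel =
        Channel.CreateUnbounded<(string Prompt, TaskCompletionSource<string> Completion)>();
    private readonly Func<IReadOnlyList<string>, Task<IReadOnlyList<string>>> _processBatch;
    private readonly int _batchSize;
    private readonly Task _worker;

    public PromptBatcher(
        Func<IReadOnlyList<string>, Task<IReadOnlyList<string>>> processBatch,
        int batchSize = 10)
    {
        _processBatch = processBatch;
        _batchSize = batchSize;
        _worker = ConsumeAsync(); // runs until the channel is completed
    }

    // Producer: enqueue a prompt and get a task that completes with its own result.
    public Task<string> SubmitAsync(string prompt)
    {
        var tcs = new TaskCompletionSource<string>(TaskCreationOptions.RunContinuationsAsynchronously);
        if (!_channel.Writer.TryWrite((prompt, tcs)))
            tcs.SetException(new InvalidOperationException("Batcher has been completed."));
        return tcs.Task;
    }

    // Stops accepting work and waits for queued prompts to drain.
    public Task CompleteAsync()
    {
        _channel.Writer.TryComplete();
        return _worker;
    }

    private async Task ConsumeAsync()
    {
        var batch = new List<(string Prompt, TaskCompletionSource<string> Completion)>();
        while (await _channel.Reader.WaitToReadAsync())
        {
            // Take what is immediately available, up to the batch size, so nothing is stranded.
            while (batch.Count < _batchSize && _channel.Reader.TryRead(out var item))
                batch.Add(item);
            await FlushAsync(batch);
        }
    }

    private async Task FlushAsync(List<(string Prompt, TaskCompletionSource<string> Completion)> batch)
    {
        if (batch.Count == 0) return;
        try
        {
            var results = await _processBatch(batch.Select(p => p.Prompt).ToList());
            for (var i = 0; i < batch.Count; i++)
                batch[i].Completion.TrySetResult(results[i]);
        }
        catch (Exception ex)
        {
            // A failed batch fails every caller still waiting on it.
            foreach (var pending in batch) pending.Completion.TrySetException(ex);
        }
        finally
        {
            batch.Clear();
        }
    }
}
```

Flushing partial batches trades some batching efficiency for bounded latency; a timer-based flush would be another option when larger batches matter more than response time.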

5. Monitoring and Observability

5.1 Logging Conventions

The following key information should be logged:

// Request log
_logger.LogInformation("AI Request {@Request} from {UserId}", 
    new { prompt, options }, User.Identity?.Name);
// Response log (sanitized)
_logger.LogInformation("AI Response with {TokenCount} tokens", 
    response.Usage.TotalTokens);
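Prompts often contain user data, so logging them verbatim can conflict with the sanitization applied to responses. One option, sketched below, is to log only the prompt's length and a short hash prefix, which still lets identical prompts be correlated. PromptLogSanitizer is a hypothetical helper, and the 12-character fingerprint length is an arbitrary choice.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class PromptLogSanitizer
{
    // Describes a prompt by length and SHA-256 prefix without including any of its content.
    public static string Describe(string prompt)
    {
        var hash = SHA256.HashData(Encoding.UTF8.GetBytes(prompt));
        var fingerprint = Convert.ToHexString(hash).Substring(0, 12);
        return $"len={prompt.Length} sha256={fingerprint}";
    }
}
```

The result can be passed to the logger in place of the raw prompt, for example as the value of a {PromptInfo} placeholder.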

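5.2 Metrics Monitoring

Integrate App Metrics or Prometheus to record:

Request latency distribution (P50/P90/P99)
Error rate (broken down by status code)
Token consumption rate
Cache hit rate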
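The four measurements listed above could be captured as follows. Note that this sketch swaps in the built-in System.Diagnostics.Metrics API (.NET 6 and later) rather than App Metrics or a Prometheus client library, on the assumption that an exporter forwards the data to the monitoring backend, where percentiles and rates are computed. The meter and instrument names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics.Metrics;

public sealed class AiMetrics : IDisposable
{
    public const string MeterName = "Demo.AiService"; // hypothetical meter name

    private readonly Meter _meter = new(MeterName);
    private readonly Histogram<double> _latencyMs;
    private readonly Counter<long> _errors;
    private readonly Counter<long> _tokens;
    private readonly Counter<long> _cacheLookups;

    public AiMetrics()
    {
        _latencyMs = _meter.CreateHistogram<double>("ai.request.duration", unit: "ms");
        _errors = _meter.CreateCounter<long>("ai.request.errors");
        _tokens = _meter.CreateCounter<long>("ai.tokens.consumed");
        _cacheLookups = _meter.CreateCounter<long>("ai.cache.lookups");
    }

    // Latency samples; the backend derives P50/P90/P99 from the histogram.
    public void RecordLatency(TimeSpan elapsed) => _latencyMs.Record(elapsed.TotalMilliseconds);

    // Errors tagged by status code so they can be broken down later.
    public void RecordError(int statusCode) =>
        _errors.Add(1, new KeyValuePair<string, object?>("status_code", statusCode));

    // Running total of tokens; the consumption rate is the counter's rate of change.
    public void RecordTokens(long count) => _tokens.Add(count);

    // Lookups tagged hit/miss; hit rate is hits divided by all lookups.
    public void RecordCacheLookup(bool hit) =>
        _cacheLookups.Add(1, new KeyValuePair<string, object?>("hit", hit));

    public void Dispose() => _meter.Dispose();
}
```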

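6. Security Best Practices

Input validation: enforce a length limit and sensitive-word filtering on user-supplied prompts
Output filtering: prevent the model from returning malicious code or prohibited content
Rate limiting: QPS control per user or tenant
Data isolation: keep each tenant's data isolated in transit and at rest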
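The input-validation item can be sketched as a small pre-flight check run before a prompt reaches the model. PromptValidator is a hypothetical helper; the length limit and blocked-term list are supplied by the caller, and simple substring matching is only a starting point for real content filtering.

```csharp
using System;
using System.Collections.Generic;

public static class PromptValidator
{
    // Returns null when the prompt is acceptable, otherwise a short rejection reason.
    public static string? Validate(string? prompt, int maxLength, IEnumerable<string> blockedTerms)
    {
        if (string.IsNullOrWhiteSpace(prompt)) return "Prompt is empty.";
        if (prompt.Length > maxLength) return $"Prompt exceeds {maxLength} characters.";
        foreach (var term in blockedTerms)
        {
            // Case-insensitive match so trivial capitalization changes do not bypass the filter.
            if (prompt.Contains(term, StringComparison.OrdinalIgnoreCase))
                return "Prompt contains a blocked term.";
        }
        return null;
    }
}
```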

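With the standardized approach above, developers can quickly build a secure, reliable, and high-performance large-model service integration in an ASP.NET Core project. In practice, tune parameters such as the caching strategy and retry policy to the specific business requirements, and keep monitoring the running service to guide iterative optimization.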

A Complete Guide to Seamlessly Integrating Large-Model Services into ASP.NET Core Projects