I. Core Architecture for Environment Setup

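When integrating Spring AI with large language models (LLMs), the system architecture must balance flexibility and extensibility. A typical layered architecture consists of the following modules:

Model service layer: encapsulates the core logic for interacting with the LLM and supports switching between models at runtime
Business adaptation layer: exposes model capabilities as API endpoints that business code can call
Monitoring and management layer: collects call logs and performance metrics and raises alerts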

// Example: abstract interface for the model service
public interface ModelService {
    String generateText(String prompt, Map<String, Object> params);
    Stream<String> streamGenerate(String prompt);
    boolean validateInput(String input);
}

Managing the different model implementations through dependency injection is recommended, for example:

@Configuration
public class ModelConfig {
    @Bean
    @Qualifier("llmService")
    public ModelService llmService() {
        // Choose the model implementation here (hard-coded in this example)
        return new LlamaAdapter(); // or new QianWenAdapter()
    }
}

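II. Core Dependencies and Version Management

A stable environment requires strict control over dependency versions. Recommended combination:

Spring Boot 3.2.x + Spring AI 1.1.x
HTTP client: WebClient (reactive) or RestTemplate
Serialization: Jackson 2.15+

Key dependencies (Maven). Artifact coordinates and the supported Spring Boot baseline have changed between Spring AI releases, so check both against the reference documentation for the version you use: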

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-core</artifactId>
    <version>1.1.0</version>
</dependency>
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-http</artifactId>
    <version>1.1.0</version>
</dependency>

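Resolving version conflicts:

Analyze the dependency tree with mvn dependency:tree
Exclude conflicting transitive dependencies with <exclusions>
Centralize dependency versions in the parent POM's <dependencyManagement> section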

III. Implementing the API Call Layer

1. Basic request wrapper

public class LlmApiClient {
    private final WebClient webClient;
    private final String apiKey;
    public LlmApiClient(String baseUrl, String apiKey) {
        this.webClient = WebClient.builder()
            .baseUrl(baseUrl)
            .defaultHeader("Authorization", "Bearer " + apiKey)
            .build();
        this.apiKey = apiKey;
    }
    public Mono<String> callCompletion(String prompt) {
        return webClient.post()
            .uri("/v1/completions")
            .bodyValue(new CompletionRequest(prompt))
            .retrieve()
            .bodyToMono(CompletionResponse.class)
            .map(CompletionResponse::getContent);
    }
}

2. Handling streaming responses

Long-form text generation calls for parsing SSE (Server-Sent Events) responses:

public Flux<String> streamCompletion(String prompt) {
    return webClient.post()
        .uri("/v1/completions/stream")
        .bodyValue(new StreamRequest(prompt))
        .accept(MediaType.TEXT_EVENT_STREAM)
        .retrieve()
        .bodyToFlux(String.class)
        .map(this::parseStreamEvent);
}
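// Requires com.fasterxml.jackson.databind.ObjectMapper and JsonProcessingException
private static final ObjectMapper MAPPER = new ObjectMapper();
private String parseStreamEvent(String data) {
    // WebClient's SSE decoder already strips the "data:" prefix, so `data` is the
    // JSON payload, e.g. {"content":"..."}; parse it instead of splitting strings
    try {
        return MAPPER.readTree(data).path("content").asText("");
    } catch (JsonProcessingException e) {
        return ""; // skip events whose payload is not JSON
    }
}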
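
To make the SSE wire format concrete, here is a minimal sketch of the framing rules in plain Java, without WebClient: events are separated by a blank line, and each `data:` line contributes one line of payload. The `SseFrameParser` class and its `parseData` method are hypothetical names introduced for illustration; a production client would normally leave this work to the framework's SSE decoder.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: extracts the data payload of each event from a raw SSE stream. */
final class SseFrameParser {
    private SseFrameParser() {}

    static List<String> parseData(String rawStream) {
        List<String> events = new ArrayList<>();
        StringBuilder data = new StringBuilder();
        boolean hasData = false;
        // The -1 limit keeps trailing empty strings so the final blank line is seen
        for (String line : rawStream.split("\\r?\\n", -1)) {
            if (line.isEmpty()) {
                // A blank line terminates the current event
                if (hasData) {
                    events.add(data.toString());
                }
                data.setLength(0);
                hasData = false;
            } else if (line.startsWith("data:")) {
                String value = line.substring("data:".length());
                // A single leading space after the colon is not part of the value
                if (value.startsWith(" ")) {
                    value = value.substring(1);
                }
                if (hasData) {
                    data.append('\n'); // multiple data lines are joined with newlines
                }
                data.append(value);
                hasData = true;
            }
            // Other fields (event:, id:, retry:) and comment lines are ignored in this sketch
        }
        return events;
    }
}
```

An event that is not followed by a blank line is treated as incomplete and dropped, which is why a truncated stream yields only the events that were fully received.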

IV. Multi-Model Adaptation

1. Adapter pattern implementation

public abstract class ModelAdapter implements ModelService {
    protected final RestTemplate restTemplate;
    public ModelAdapter() {
        this.restTemplate = new RestTemplateBuilder()
            .setConnectTimeout(Duration.ofSeconds(10))
            .setReadTimeout(Duration.ofSeconds(30))
            .build();
    }
    @Override
    public boolean validateInput(String input) {
        return input != null && input.length() <= getMaxInputLength();
    }
    protected abstract int getMaxInputLength();
}
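public class QianWenAdapter extends ModelAdapter {
    @Override
    public String generateText(String prompt, Map<String, Object> params) {
        // Model-specific call logic
        HttpHeaders headers = new HttpHeaders();
        headers.setContentType(MediaType.APPLICATION_JSON);
        // ... build the request body, send it, and return the generated text
    }
    // streamGenerate(...) and getMaxInputLength() must also be implemented; omitted here
}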

2. Dynamic routing strategy

Configuration-based model routing:

@Service
public class ModelRouter {
    @Autowired
    private List<ModelService> modelServices;
    private final Map<String, ModelService> routeMap = new ConcurrentHashMap<>();
    @PostConstruct
    public void init() {
        // Load routing rules from configuration (hard-coded here for brevity)
        routeMap.put("default", modelServices.get(0));
        routeMap.put("high_quality", modelServices.stream()
            .filter(s -> s instanceof PremiumModelService)
            .findFirst()
            .orElseThrow());
    }
    public ModelService getModel(String routeKey) {
        return Optional.ofNullable(routeMap.get(routeKey))
            .orElseThrow(() -> new IllegalArgumentException("Invalid route key"));
    }
}
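
The router above registers its routes in code. A minimal sketch of a configuration-driven variant follows, written in plain Java without Spring. `RouteTable`, `TextModel`, and the "route key to model name" mapping are assumptions made for illustration and are not part of Spring AI; in a real application the route map might come from bound configuration properties.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the article's ModelService, reduced to one method. */
interface TextModel {
    String generate(String prompt);
}

/** Illustrative route table built from "routeKey -> modelName" configuration entries. */
final class RouteTable {
    private final Map<String, TextModel> routes = new HashMap<>();

    RouteTable(Map<String, String> routeConfig, Map<String, TextModel> modelsByName) {
        for (Map.Entry<String, String> entry : routeConfig.entrySet()) {
            TextModel model = modelsByName.get(entry.getValue());
            if (model == null) {
                // Fail at startup rather than on the first request that uses the route
                throw new IllegalStateException("Route '" + entry.getKey()
                    + "' refers to unknown model '" + entry.getValue() + "'");
            }
            routes.put(entry.getKey(), model);
        }
    }

    TextModel resolve(String routeKey) {
        TextModel model = routes.get(routeKey);
        if (model == null) {
            throw new IllegalArgumentException("Invalid route key: " + routeKey);
        }
        return model;
    }
}
```

Validating every route in the constructor means a typo in the configuration surfaces when the application starts, not when the first request hits the affected route.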

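V. Production Optimization Practices

1. Performance tuning

Connection pool and timeout configuration:

@Bean
public HttpClient httpClient() {
    // Bounded connection pool; the sizes below are illustrative, not tuned values
    ConnectionProvider provider = ConnectionProvider.builder("llm-pool")
        .maxConnections(50)
        .pendingAcquireTimeout(Duration.ofSeconds(5))
        .build();
    return HttpClient.create(provider)
        .responseTimeout(Duration.ofSeconds(30))
        .doOnConnected(conn ->
            conn.addHandlerLast(new ReadTimeoutHandler(30))
                .addHandlerLast(new WriteTimeoutHandler(10)));
}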

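Asynchronous processing:

Use the @Async annotation, for example @Async("modelExecutor"), to move model calls off the calling thread

Configure a dedicated thread pool: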

@Configuration
@EnableAsync
public class AsyncConfig {
    @Bean(name = "modelExecutor")
    public Executor modelExecutor() {
        ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
        executor.setCorePoolSize(10);
        executor.setMaxPoolSize(20);
        executor.setQueueCapacity(100);
        return executor;
    }
}
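
As a framework-free illustration of what a dedicated pool provides, the sketch below runs a blocking model call on its own executor with CompletableFuture and bounds the wait with a timeout. `AsyncModelCaller` and the `Function<String, String>` standing in for the model call are hypothetical names used only for this example.

```java
import java.time.Duration;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

/** Illustrative only: offloads a blocking call to a dedicated executor. */
final class AsyncModelCaller {
    private final ExecutorService executor;
    private final Function<String, String> blockingCall; // stand-in for the real model call

    AsyncModelCaller(ExecutorService executor, Function<String, String> blockingCall) {
        this.executor = executor;
        this.blockingCall = blockingCall;
    }

    CompletableFuture<String> call(String prompt, Duration timeout) {
        return CompletableFuture
            // Run on the dedicated pool so slow model calls do not occupy request threads
            .supplyAsync(() -> blockingCall.apply(prompt), executor)
            // Fails the future with TimeoutException; the running task itself is not interrupted
            .orTimeout(timeout.toMillis(), TimeUnit.MILLISECONDS);
    }
}
```

Because orTimeout only completes the future and does not cancel the underlying work, the HTTP client's own read timeout is still needed to free the worker thread.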

2. Exception handling

Implement tiered exception handling:

@ControllerAdvice
public class ModelExceptionHandler {
    @ExceptionHandler(ModelTimeoutException.class)
    public ResponseEntity<ErrorResponse> handleTimeout(ModelTimeoutException ex) {
        return ResponseEntity.status(504) // Gateway Timeout: the upstream model did not respond in time
            .body(new ErrorResponse("MODEL_TIMEOUT", "Model response exceeded timeout"));
    }
    @ExceptionHandler(ModelRateLimitException.class)
    public ResponseEntity<ErrorResponse> handleRateLimit(ModelRateLimitException ex) {
        return ResponseEntity.status(429)
            .header("Retry-After", String.valueOf(ex.getRetrySeconds()))
            .body(new ErrorResponse("RATE_LIMITED", ex.getMessage()));
    }
}

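VI. Security and Compliance

Masking sensitive information:
- Automatically mask sensitive data in request and response logs
- Use AOP to intercept model-call logging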
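
A minimal sketch of log masking in plain Java, assuming secrets appear either as bearer tokens or as `apiKey` / `api_key` JSON fields. The `LogRedactor` class and both patterns are illustrative assumptions and would need to be extended for real payloads; in a Spring application the same function could be invoked from an AOP advice around model calls.

```java
import java.util.regex.Pattern;

/** Illustrative only: masks a few common secret shapes before a message is logged. */
final class LogRedactor {
    // Matches "Bearer <token>" in header dumps (case-insensitive)
    private static final Pattern BEARER =
        Pattern.compile("(?i)(bearer\\s+)[A-Za-z0-9._~+/=-]+");
    // Matches "apiKey":"value" or "api_key": "value" in JSON-like text
    private static final Pattern API_KEY_FIELD =
        Pattern.compile("(?i)(\"api_?key\"\\s*:\\s*\")[^\"]*(\")");

    private LogRedactor() {}

    static String redact(String message) {
        if (message == null) {
            return null;
        }
        // Keep the prefix (group 1) so the log still shows that a credential was present
        String masked = BEARER.matcher(message).replaceAll("$1***");
        return API_KEY_FIELD.matcher(masked).replaceAll("$1***$2");
    }
}
```

The patterns deliberately over-match rather than under-match: masking an innocent word after "bearer" is harmless, while leaking a token is not.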

Integrating authentication:

public class AuthInterceptor implements ClientHttpRequestInterceptor {
    @Override
    public ClientHttpResponse intercept(HttpRequest request, byte[] body, 
        ClientHttpRequestExecution execution) throws IOException {
        // Add the authentication header dynamically
        String token = TokenProvider.getToken();
        request.getHeaders().set("X-API-KEY", token);
        return execution.execute(request, body);
    }
}

Encrypting data in transit:
- Enforce HTTPS
- Encrypt sensitive parameters (for example with JWE)

VII. Monitoring and Operations

1. Metrics collection

@Bean
public MeterRegistryCustomizer<MeterRegistry> metricsCommonTags() {
    return registry -> registry.config().commonTags("application", "llm-service");
}
@Timed(value = "model.call", description = "Time spent calling model API")
public String callModel(String prompt) {
    // Model call logic
}

2. Log tracing

Use MDC to track a request ID across log statements
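
MDC stores its values per thread, so a request ID set on the request thread is not visible inside tasks that run on a separate executor unless it is copied over. The sketch below shows the capture-and-restore pattern with a plain ThreadLocal as a stand-in for MDC. `RequestContext` and `wrap` are hypothetical names; with SLF4J the same shape would use `MDC.getCopyOfContextMap()` and `MDC.setContextMap(...)`.

```java
import java.util.function.Supplier;

/** Illustrative only: a ThreadLocal request ID with explicit propagation to worker threads. */
final class RequestContext {
    private static final ThreadLocal<String> REQUEST_ID = new ThreadLocal<>();

    private RequestContext() {}

    static void set(String id) { REQUEST_ID.set(id); }
    static String get() { return REQUEST_ID.get(); }
    static void clear() { REQUEST_ID.remove(); }

    /** Captures the caller's request ID and installs it around the task on the worker thread. */
    static <T> Supplier<T> wrap(Supplier<T> task) {
        String captured = REQUEST_ID.get(); // read on the submitting thread
        return () -> {
            String previous = REQUEST_ID.get();
            REQUEST_ID.set(captured);
            try {
                return task.get();
            } finally {
                // Restore the previous value so pooled threads do not leak IDs between requests
                if (previous == null) {
                    REQUEST_ID.remove();
                } else {
                    REQUEST_ID.set(previous);
                }
            }
        };
    }
}
```

A task submitted without `wrap` sees no request ID on the worker thread, which is the usual reason trace IDs disappear from logs written inside asynchronous model calls.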

Structured log example:

{
  "timestamp": "2023-11-15T10:30:45.123Z",
  "level": "INFO",
  "traceId": "abc123",
  "service": "llm-gateway",
  "message": "Model call completed",
  "model": "qianwen-v2",
  "durationMs": 452,
  "tokens": 128
}

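With the architecture and implementation details above, developers can build a highly available, extensible environment for integrating Spring AI with large models. In practice, pay particular attention to: 1) compatibility testing against each model API; 2) context propagation across asynchronous boundaries; 3) end-to-end load testing of the production environment. Roll out gradually with a blue-green deployment strategy, and use chaos engineering to verify the system's fault tolerance.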
