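I. Pre-deployment Preparation and Environment Requirements
1.1 Base environment
Deploying Kafka with Docker Compose requires the following:
- Docker 20.10.0 or later (the latest stable release is recommended)
- Docker Compose 1.29.0 or later
- Server resources:
  - Single node: 2 CPU cores, 4 GB RAM, 20 GB disk
  - Cluster (3 nodes): 4 CPU cores, 8 GB RAM, 50 GB disk
- Operating system: Linux (Ubuntu 20.04+ or CentOS 7+)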
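The version minimums above can be checked before deploying. The following is a minimal sketch, assuming GNU `sort` with `-V` support is available on the host; the helper names `version_ge` and `check_min` are hypothetical and not part of Docker.

```shell
#!/usr/bin/env bash
# version_ge A B: succeeds when version A is greater than or equal to version B.
# Relies on GNU sort -V (version sort); the smaller version sorts first.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$2" ]
}

# check_min NAME CURRENT MINIMUM: prints a one-line verdict, returns non-zero if too old.
check_min() {
  if version_ge "$2" "$3"; then
    echo "$1 $2 OK (>= $3)"
  else
    echo "$1 $2 too old (need >= $3)"
    return 1
  fi
}

# Example (requires Docker and Docker Compose to be installed):
#   check_min Docker  "$(docker version --format '{{.Server.Version}}')" 20.10.0
#   check_min Compose "$(docker-compose version --short)" 1.29.0
```

Keeping the comparison in a separate function makes it reusable for other version checks.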
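1.2 Network and storage planning
A dedicated network is recommended:
    docker network create kafka-net --driver bridge --subnet 172.20.0.0/16
Storage options:
- Development: Docker volumes (recommended)
- Production: bind-mount a host directory, or use NFS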
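Running `docker network create` a second time fails because the network already exists. A possible way to make the step repeatable is sketched below; the function name `ensure_network` is hypothetical, and the network name and subnet are the ones used in this section.

```shell
#!/usr/bin/env bash
# ensure_network NAME SUBNET: creates the bridge network only when it is missing.
ensure_network() {
  if docker network inspect "$1" >/dev/null 2>&1; then
    echo "network $1 already exists"
  else
    docker network create "$1" --driver bridge --subnet "$2"
  fi
}

# Example (requires a running Docker daemon):
#   ensure_network kafka-net 172.20.0.0/16
```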
II. Single-Node Deployment
2.1 Base configuration file
Create docker-compose-single.yml:
    version: '3.8'
    services:
      zookeeper:
        image: confluentinc/cp-zookeeper:7.3.0
        container_name: zookeeper
        environment:
          ZOOKEEPER_CLIENT_PORT: 2181
          ZOOKEEPER_TICK_TIME: 2000
        ports:
          - "2181:2181"
        networks:
          - kafka-net
      kafka:
        image: confluentinc/cp-kafka:7.3.0
        container_name: kafka
        depends_on:
          - zookeeper
        ports:
          - "9092:9092"
        environment:
          KAFKA_BROKER_ID: 1
          KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
          KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT
          KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://localhost:9092
          KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
          KAFKA_GROUP_INITIAL_REBALANCE_DELAY_MS: 0
        networks:
          - kafka-net
    networks:
      kafka-net:
        external: true
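2.2 Key settings explained
- ZooKeeper:
  - ZOOKEEPER_TICK_TIME: the basic time unit in milliseconds; it drives heartbeat detection
  - Suggested heap limit: -Xmx1g -Xms1g (the Confluent images read heap settings from the KAFKA_HEAP_OPTS environment variable)
- Kafka:
  - KAFKA_BROKER_ID: must be unique for each broker
  - KAFKA_ADVERTISED_LISTENERS: the address that clients are told to connect to
  - KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: must be 1 on a single node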
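To make the effect of the tick time concrete: by default ZooKeeper accepts client session timeouts between 2 and 20 ticks. The sketch below works that out for the value used in this section; the function name `zk_session_bounds` is hypothetical.

```shell
#!/usr/bin/env bash
# zk_session_bounds TICK_MS: prints ZooKeeper's default session-timeout bounds.
# ZooKeeper defaults: minSessionTimeout = 2 * tickTime, maxSessionTimeout = 20 * tickTime.
zk_session_bounds() {
  echo "min=$(( $1 * 2 ))ms max=$(( $1 * 20 ))ms"
}

zk_session_bounds 2000   # prints: min=4000ms max=40000ms
```

With ZOOKEEPER_TICK_TIME set to 2000, a broker session timeout of 18000 ms falls inside the accepted range.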
2.3 Start and verify
    docker-compose -f docker-compose-single.yml up -d
Verification commands:
    # Create a test topic
    docker exec -it kafka kafka-topics --create --topic test --bootstrap-server localhost:9092 --partitions 1 --replication-factor 1
    # Send a message
    docker exec -it kafka bash -c "echo 'test message' | kafka-console-producer --topic test --bootstrap-server localhost:9092"
    # Consume messages (press Ctrl+C to stop)
    docker exec -it kafka kafka-console-consumer --topic test --from-beginning --bootstrap-server localhost:9092
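The broker may need some time after `up -d` before it accepts requests, so the first verification command can fail if it is run too early. One option is a small polling helper, sketched here; the function name `retry` and the attempt and delay values are assumptions, while `kafka-broker-api-versions` is a tool shipped with Kafka.

```shell
#!/usr/bin/env bash
# retry ATTEMPTS DELAY CMD...: runs CMD until it succeeds or ATTEMPTS runs out.
retry() {
  retry_attempts=$1
  retry_delay=$2
  shift 2
  while [ "$retry_attempts" -gt 0 ]; do
    "$@" && return 0
    retry_attempts=$(( retry_attempts - 1 ))
    # Sleep only when another attempt will follow.
    if [ "$retry_attempts" -gt 0 ]; then sleep "$retry_delay"; fi
  done
  return 1
}

# Example: wait up to about 60 seconds for the single-node broker to answer.
#   retry 30 2 docker exec kafka kafka-broker-api-versions --bootstrap-server localhost:9092
```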
III. Cluster Deployment
3.1 Three-broker cluster configuration
Create docker-compose-cluster.yml:
    # HOST_IP must be set in the shell or in an .env file before starting
    version: '3.8'
    services:
      zookeeper:
        image: confluentinc/cp-zookeeper:7.3.0
        container_name: zookeeper
        environment:
          ZOOKEEPER_SERVER_ID: 1
          ZOOKEEPER_CLIENT_PORT: 2181
          ZOOKEEPER_TICK_TIME: 2000
          ZOOKEEPER_INIT_LIMIT: 5
          ZOOKEEPER_SYNC_LIMIT: 2
          ZOOKEEPER_SERVERS: zookeeper:2888:3888
        ports:
          - "2181:2181"
        networks:
          - kafka-net
      kafka1:
        image: confluentinc/cp-kafka:7.3.0
        container_name: kafka1
        depends_on:
          - zookeeper
        ports:
          - "9092:9092"
        environment:
          KAFKA_BROKER_ID: 1
          KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
          KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: INTERNAL:PLAINTEXT,EXTERNAL:PLAINTEXT
          KAFKA_ADVERTISED_LISTENERS: INTERNAL://kafka1:19092,EXTERNAL://${HOST_IP}:9092
          KAFKA_INTER_BROKER_LISTENER_NAME: INTERNAL
          KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 3
          KAFKA_MIN_INSYNC_REPLICAS: 2
        networks:
          - kafka-net
      kafka2:
        image: confluentinc/cp-kafka:7.3.0
        container_name: kafka2
        depends_on:
          - zookeeper
        ports:
          - "9093:9093"
        environment:
          KAFKA_BROKER_ID: 2
          KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
          KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: INTERNAL:PLAINTEXT,EXTERNAL:PLAINTEXT
          KAFKA_ADVERTISED_LISTENERS: INTERNAL://kafka2:19093,EXTERNAL://${HOST_IP}:9093
          KAFKA_INTER_BROKER_LISTENER_NAME: INTERNAL
          # keep the replication settings identical on every broker
          KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 3
          KAFKA_MIN_INSYNC_REPLICAS: 2
        networks:
          - kafka-net
      kafka3:
        image: confluentinc/cp-kafka:7.3.0
        container_name: kafka3
        depends_on:
          - zookeeper
        ports:
          - "9094:9094"
        environment:
          KAFKA_BROKER_ID: 3
          KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
          KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: INTERNAL:PLAINTEXT,EXTERNAL:PLAINTEXT
          KAFKA_ADVERTISED_LISTENERS: INTERNAL://kafka3:19094,EXTERNAL://${HOST_IP}:9094
          KAFKA_INTER_BROKER_LISTENER_NAME: INTERNAL
          KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 3
          KAFKA_MIN_INSYNC_REPLICAS: 2
        networks:
          - kafka-net
    networks:
      kafka-net:
        external: true
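3.2 Cluster configuration notes
- ZooKeeper ensemble mode:
  - Set the ZOOKEEPER_SERVERS environment variable
  - Give every node a unique ZOOKEEPER_SERVER_ID
- Kafka multi-broker configuration:
  - Every broker must have a unique KAFKA_BROKER_ID
  - Two listeners are recommended, one for internal traffic and one for external clients:
        KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: INTERNAL:PLAINTEXT,EXTERNAL:PLAINTEXT
        KAFKA_ADVERTISED_LISTENERS: INTERNAL://kafka1:19092,EXTERNAL://192.168.1.100:9092
  - Key parameters:
    - KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: ideally equal to the number of brokers
    - KAFKA_MIN_INSYNC_REPLICAS: 2 for a three-broker cluster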
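The relationship between the two key parameters can be worked out with simple arithmetic: for producers using acks=all, a partition keeps accepting writes as long as at least min.insync.replicas replicas are in sync, so it tolerates (replication factor minus min.insync.replicas) broker failures. A minimal sketch follows; the function name `tolerable_failures` is hypothetical.

```shell
#!/usr/bin/env bash
# tolerable_failures REPLICATION_FACTOR MIN_INSYNC_REPLICAS
# Prints how many brokers can fail while acks=all writes still succeed.
tolerable_failures() {
  if [ "$2" -gt "$1" ]; then
    echo "min.insync.replicas ($2) exceeds replication factor ($1)" >&2
    return 1
  fi
  echo $(( $1 - $2 ))
}

tolerable_failures 3 2   # prints 1: one broker may be down
tolerable_failures 3 3   # prints 0: any single failure blocks acks=all writes
```

This is why 2 rather than 3 is the usual choice for a three-broker cluster.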
3.3 Cluster verification
    # Create a topic with 3 replicas
    docker exec -it kafka1 kafka-topics --create --topic cluster-test --bootstrap-server kafka1:19092 --partitions 3 --replication-factor 3
    # Show the topic details
    docker exec -it kafka1 kafka-topics --describe --topic cluster-test --bootstrap-server kafka1:19092
    # Test high availability:
    # stop one broker, then confirm that messages can still be produced and consumed
    docker stop kafka2
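IV. Production Tuning Recommendations
4.1 Performance tuning parameters
    environment:
      KAFKA_NUM_PARTITIONS: 6               # default partition count for new topics
      KAFKA_LOG_RETENTION_HOURS: 168        # message retention time (7 days)
      KAFKA_LOG_SEGMENT_BYTES: 1073741824   # 1 GB segment size
      KAFKA_MESSAGE_MAX_BYTES: 1000012      # maximum message size
      KAFKA_NUM_NETWORK_THREADS: 3          # network threads
      KAFKA_NUM_IO_THREADS: 8               # I/O threads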
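Retention settings translate directly into disk usage, which is worth estimating before picking values. The sketch below is a rough calculation only: the ingest rate of 1 MiB/s is a hypothetical figure, the function name `retention_disk_gib` is made up for illustration, and compression and index overhead are ignored.

```shell
#!/usr/bin/env bash
# retention_disk_gib RATE_MIB_PER_S RETENTION_HOURS REPLICATION_FACTOR
# Cluster-wide disk needed to hold the retained data, in GiB (integer division).
retention_disk_gib() {
  echo $(( $1 * $2 * 3600 * $3 / 1024 ))
}

echo $(( 1024 * 1024 * 1024 ))   # prints 1073741824, the segment size used above
retention_disk_gib 1 168 3       # prints 1771 (about 1.7 TiB across the cluster)
```

Dividing the result by the number of brokers gives an approximate per-broker figure when partitions are evenly spread.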
4.2 Monitoring integration
Recommended setup:
- JMX export:
      environment:
        KAFKA_JMX_OPTS: "-Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.port=9999 -Dcom.sun.management.jmxremote.rmi.port=9999 -Djava.rmi.server.hostname=localhost"
- Prometheus + Grafana monitoring:
  - Use the bitnami/jmx-exporter container to collect metrics
  - Set up a Grafana dashboard (ID: 7589)
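4.3 Backup and recovery
- Regular backups (the output files are written inside the containers):
      # Record ZooKeeper status (the stat command may need to be allowed through 4lw.commands.whitelist)
      docker exec -it zookeeper bash -c "echo stat | nc localhost 2181 > /tmp/zookeeper_stat.log"
      # Save the topic configuration
      docker exec -it kafka1 bash -c "kafka-configs --bootstrap-server kafka1:19092 --entity-type topics --describe > /tmp/topics_config.log"
- Disaster recovery:
  - Use kafka-mirror-maker to migrate data
  - Test the recovery procedure:
        kafka-topics --create --topic restored-topic --bootstrap-server new-cluster:9092 --replication-factor 3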
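V. Troubleshooting Common Problems
5.1 Connection problems
- Port unreachable:
  - Check the firewall rules: iptables -L -n
  - Confirm that the port is listening: netstat -tulnp | grep 9092
- Wrong advertised address:
  - Make sure KAFKA_ADVERTISED_LISTENERS is configured correctly
  - Test external access: telnet <host-ip> 9092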
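The two checks above can be combined: extract the address that a listener advertises, then probe it. This is a sketch under a few assumptions: `bash` and coreutils `timeout` are available, the helper names `listener_addr` and `port_open` are hypothetical, and the listener string is the example value from section 3.2.

```shell
#!/usr/bin/env bash
# listener_addr LISTENERS NAME: prints the host:port advertised for listener NAME.
listener_addr() {
  printf '%s\n' "$1" | tr ',' '\n' | sed -n "s|^$2://||p"
}

# port_open HOST PORT: succeeds if a TCP connection opens within 3 seconds.
# Uses bash's /dev/tcp pseudo-device, so telnet does not need to be installed.
port_open() {
  timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

addr=$(listener_addr "INTERNAL://kafka1:19092,EXTERNAL://192.168.1.100:9092" EXTERNAL)
echo "$addr"   # prints 192.168.1.100:9092

# Example probe, run from the client machine:
#   port_open "${addr%:*}" "${addr##*:}" && echo reachable || echo unreachable
```

Probing the advertised address from the client side matters because that is the address clients are redirected to after the initial bootstrap connection.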
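5.2 Cluster synchronization problems
- UnderReplicatedPartitions warnings:
  - Check the output of kafka-topics --describe
  - Verify that the ISR list is complete
- ZooKeeper session expiry:
  - Adjust zookeeper.session.timeout.ms (default 18000 ms)
  - Check network latency: ping zookeeper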
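Checking the ISR list by eye gets tedious with many partitions. A possible filter is sketched below; it assumes the describe output uses the `Topic:`, `Partition:`, `Replicas:` and `Isr:` labels, and the function name `under_replicated` is hypothetical.

```shell
#!/usr/bin/env bash
# under_replicated: reads `kafka-topics --describe` output on stdin and prints
# "topic partition" for every partition whose ISR is shorter than its replica list.
under_replicated() {
  awk '
    {
      topic = part = reps = isr = ""
      for (i = 1; i < NF; i++) {
        if ($i == "Topic:")     topic = $(i + 1)
        if ($i == "Partition:") part  = $(i + 1)
        if ($i == "Replicas:")  reps  = $(i + 1)
        if ($i == "Isr:")       isr   = $(i + 1)
      }
      # Summary lines have no Partition: field and are skipped.
      if (part != "" && split(isr, a, ",") < split(reps, b, ",")) print topic, part
    }
  '
}

# Example:
#   docker exec kafka1 kafka-topics --describe --topic cluster-test \
#     --bootstrap-server kafka1:19092 | under_replicated
```

kafka-topics also offers a built-in `--under-replicated-partitions` option for `--describe`; the filter above is mainly useful when the full output has already been saved to a file.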
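5.3 Performance bottleneck analysis
- Producer latency:
  - Monitor the record-queue-time-avg metric
  - Tune the batch.size and linger.ms parameters
- Consumer lag:
  - Monitor the records-lag-max metric
  - Add consumer instances or adjust fetch.min.bytes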
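Consumer lag can also be read from the command line with `kafka-consumer-groups --describe`. The sketch below sums the LAG column of that output; it assumes the header row starts with `GROUP` and contains a `LAG` column, and the function name `total_lag` and the group name `my-group` are hypothetical.

```shell
#!/usr/bin/env bash
# total_lag: reads `kafka-consumer-groups --describe` output on stdin
# and prints the sum of the LAG column across all partitions.
total_lag() {
  awk '
    $1 == "GROUP" { for (i = 1; i <= NF; i++) if ($i == "LAG") col = i; next }
    col && $col ~ /^[0-9]+$/ { sum += $col }   # rows showing "-" are skipped
    END { print sum + 0 }
  '
}

# Example:
#   docker exec kafka1 kafka-consumer-groups --bootstrap-server kafka1:19092 \
#     --describe --group my-group | total_lag
```

A total that keeps growing between runs suggests the group cannot keep up with the producers.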
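VI. Advanced Deployment Options
6.1 Using Kafka Connect
Example service definition (add it under services: in docker-compose-cluster.yml):
      kafka-connect:
        image: confluentinc/cp-kafka-connect:7.3.0
        container_name: kafka-connect
        depends_on:
          - kafka1
        ports:
          - "8083:8083"
        environment:
          CONNECT_BOOTSTRAP_SERVERS: kafka1:19092
          CONNECT_REST_ADVERTISED_HOST_NAME: kafka-connect
          CONNECT_GROUP_ID: compose-connect-group
          CONNECT_CONFIG_STORAGE_TOPIC: docker-connect-configs
          CONNECT_OFFSET_STORAGE_TOPIC: docker-connect-offsets
          CONNECT_STATUS_STORAGE_TOPIC: docker-connect-status
          # the image expects converters to be set explicitly
          CONNECT_KEY_CONVERTER: org.apache.kafka.connect.json.JsonConverter
          CONNECT_VALUE_CONVERTER: org.apache.kafka.connect.json.JsonConverter
        networks:
          - kafka-net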
6.2 Integrating Schema Registry
      schema-registry:
        image: confluentinc/cp-schema-registry:7.3.0
        container_name: schema-registry
        depends_on:
          - kafka1
        ports:
          - "8081:8081"
        environment:
          SCHEMA_REGISTRY_HOST_NAME: schema-registry
          SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS: kafka1:19092
        networks:
          - kafka-net
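Both add-on services expose a REST API on the ports published above, so a quick health check is possible from the host. This is a minimal sketch, assuming `curl` is installed and the services are reachable on localhost; the function name `http_ok` is hypothetical, while `/connectors` and `/subjects` are standard endpoints of Kafka Connect and Schema Registry respectively.

```shell
#!/usr/bin/env bash
# http_ok URL: succeeds when the URL answers with a non-error HTTP status within 5 seconds.
http_ok() {
  curl -fs --max-time 5 -o /dev/null "$1"
}

for url in http://localhost:8083/connectors http://localhost:8081/subjects; do
  if http_ok "$url"; then
    echo "UP   $url"
  else
    echo "DOWN $url"
  fi
done
```

Both services can take a while to start, so a DOWN result shortly after `up -d` is not necessarily a failure.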
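VII. Summary and Best Practices
- Deployment principles:
  - A single node is suitable for development and testing
  - Production needs a cluster of at least 3 brokers
  - Set the replication factor to the number of brokers
- Monitoring and alerting:
  - Key metrics: UnderReplicatedPartitions, RequestLatency, DiskUsage
  - Alert thresholds: ISR shrinkage above 10%, disk usage above 80%
- Upgrade strategy:
  - Rolling upgrade: upgrade one broker at a time
  - Version compatibility: make sure the ZooKeeper and Kafka versions match
- Security recommendations:
  - Enable SASL_SSL authentication
  - Configure ACL-based access control
  - Rotate keys regularly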
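The 80% disk threshold listed above can be checked with a few lines of shell until a full monitoring stack is in place. The sketch below is illustrative only: the function names `disk_used_pct` and `over_threshold` are hypothetical, and `/` stands in for whichever directory holds the Kafka data.

```shell
#!/usr/bin/env bash
# disk_used_pct PATH: prints the used percentage of the filesystem holding PATH.
disk_used_pct() {
  df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# over_threshold PCT LIMIT: succeeds when PCT is strictly greater than LIMIT.
over_threshold() {
  [ "$1" -gt "$2" ]
}

pct=$(disk_used_pct /)
if over_threshold "$pct" 80; then
  echo "ALERT: / is ${pct}% full"
else
  echo "OK: / is ${pct}% full"
fi
```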
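With the Docker Compose configurations and operating steps in this article, developers can quickly set up a Kafka environment for a range of scenarios. For real deployments, validate the configuration in a test environment first, then migrate to production step by step.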