1. Preparing for integration
1.1 Registering and verifying on the Baidu AI Open Platform
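Developers first register on the Baidu AI Open Platform and complete real-name verification. Once verified, open the "Speech Technology" section, create a "Speech Recognition" application, and obtain its API Key and Secret Key. These two keys are the core credentials for all later authentication, so store them securely.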
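1.2 Basic Android project configuration
Create a new project in Android Studio or open an existing one, and make sure minSdkVersion ≥ 16 (API 16 corresponds to Android 4.1). Note that OkHttp 4.x, used below, itself requires Android 5.0 (API 21) or later, so 21 is the practical minimum with this dependency. Add the networking and JSON libraries to build.gradle (Module):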
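```groovy
dependencies {
    implementation 'com.squareup.okhttp3:okhttp:4.9.1' // HTTP and WebSocket client
    // JSON parsing. Android already bundles org.json, so this line may be unnecessary.
    implementation 'org.json:json:20231013'
}
```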
After syncing Gradle, add the required permissions to AndroidManifest.xml:
```xml
<uses-permission android:name="android.permission.INTERNET" />
<uses-permission android:name="android.permission.RECORD_AUDIO" />
<uses-permission android:name="android.permission.WRITE_EXTERNAL_STORAGE" />
```
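2. Core integration steps
2.1 Generating the access token
The Baidu speech recognition API authenticates with an API Key / Secret Key (AK/SK) pair, which is exchanged for an access token using the following code: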
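```java
import org.json.JSONObject;

import java.io.IOException;

import okhttp3.OkHttpClient;
import okhttp3.Request;
import okhttp3.Response;

public class AuthUtil {
    private static final String API_KEY = "your_api_key";
    private static final String SECRET_KEY = "your_secret_key";
    // Reuse one client; each OkHttpClient owns its own connection pool and threads.
    private static final OkHttpClient CLIENT = new OkHttpClient();

    /** Blocking call: exchanges the API Key and Secret Key for an access token. */
    public static String getAccessToken() throws Exception {
        Request request = new Request.Builder()
                .url("https://aip.baidubce.com/oauth/2.0/token?grant_type=client_credentials"
                        + "&client_id=" + API_KEY
                        + "&client_secret=" + SECRET_KEY)
                .build();
        try (Response response = CLIENT.newCall(request).execute()) {
            if (!response.isSuccessful() || response.body() == null) {
                throw new IOException("Token request failed: HTTP " + response.code());
            }
            JSONObject json = new JSONObject(response.body().string());
            return json.getString("access_token");
        }
    }
}
```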
Because this method blocks on network I/O, wrap it in an asynchronous task so that it never runs on the main thread.
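One way to wrap the blocking token call is sketched below. It is only an illustration under stated assumptions: AsyncTokenLoader and its Callback interface are hypothetical names, not part of OkHttp or any Baidu SDK, and the token fetch is passed in as a Callable so the helper stays independent of AuthUtil.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.Executor;

/** Hypothetical helper: runs a blocking token fetch off the calling thread. */
final class AsyncTokenLoader {

    /** Hypothetical callback; exactly one of the two methods is invoked per load. */
    interface Callback {
        void onToken(String token);
        void onError(Exception error);
    }

    private final Executor worker;          // runs the blocking fetch
    private final Executor callbackExec;    // delivers the result (e.g. the UI thread)
    private final Callable<String> fetcher; // the blocking call, e.g. AuthUtil::getAccessToken

    AsyncTokenLoader(Executor worker, Executor callbackExec, Callable<String> fetcher) {
        this.worker = worker;
        this.callbackExec = callbackExec;
        this.fetcher = fetcher;
    }

    /** Starts the fetch on the worker and reports the outcome on callbackExec. */
    void load(Callback callback) {
        worker.execute(() -> {
            try {
                String token = fetcher.call();
                callbackExec.execute(() -> callback.onToken(token));
            } catch (Exception e) {
                callbackExec.execute(() -> callback.onError(e));
            }
        });
    }
}
```

On Android, the worker could be Executors.newSingleThreadExecutor() and the callback executor could post to a Handler created with Looper.getMainLooper(), so that callbacks arrive on the UI thread.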
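2.2 Capturing audio
Recording can be implemented with MediaRecorder or AudioRecord. PCM is the recommended format, and because MediaRecorder only writes encoded container formats while AudioRecord exposes raw PCM samples, the example uses AudioRecord: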
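```java
import android.media.AudioFormat;
import android.media.AudioRecord;
import android.media.MediaRecorder;

import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;

public class AudioRecorder {
    private static final int SAMPLE_RATE = 16000; // sample rate recommended by Baidu
    private static final int CHANNEL_CONFIG = AudioFormat.CHANNEL_IN_MONO;
    private static final int AUDIO_FORMAT = AudioFormat.ENCODING_PCM_16BIT;

    private AudioRecord recorder;
    private Thread writerThread;
    // volatile: written by the caller's thread, read by the writer thread
    private volatile boolean isRecording = false;

    // Requires the RECORD_AUDIO permission to have been granted already.
    public void startRecording(File outputFile) throws IOException {
        int bufferSize = AudioRecord.getMinBufferSize(SAMPLE_RATE, CHANNEL_CONFIG, AUDIO_FORMAT);
        if (bufferSize <= 0) {
            throw new IOException("Unsupported recording parameters");
        }
        AudioRecord rec = new AudioRecord(MediaRecorder.AudioSource.MIC,
                SAMPLE_RATE, CHANNEL_CONFIG, AUDIO_FORMAT, bufferSize);
        if (rec.getState() != AudioRecord.STATE_INITIALIZED) {
            rec.release();
            throw new IOException("AudioRecord failed to initialize");
        }
        recorder = rec;
        rec.startRecording();
        isRecording = true;
        // Copy PCM data from the microphone to the file on a background thread.
        writerThread = new Thread(() -> {
            byte[] buffer = new byte[bufferSize];
            try (FileOutputStream fos = new FileOutputStream(outputFile)) {
                while (isRecording) {
                    int read = rec.read(buffer, 0, bufferSize);
                    if (read > 0) fos.write(buffer, 0, read);
                }
            } catch (IOException e) {
                e.printStackTrace();
            }
        });
        writerThread.start();
    }

    public void stopRecording() {
        if (recorder == null) return;
        isRecording = false;
        recorder.stop();
        try {
            writerThread.join(); // wait until the file is fully written and closed
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        recorder.release();
        recorder = null;
    }
}
```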
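The recorder writes raw PCM, which has no header, so most audio players cannot open the file directly. When checking a recording by ear during debugging, one option is to prepend a standard 44-byte WAV header. The sketch below builds that header in plain Java; WavHeader is a hypothetical helper name, and it assumes uncompressed PCM with a data length that fits in a signed 32-bit integer.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: builds the canonical 44-byte header for a PCM WAV file. */
final class WavHeader {
    private WavHeader() {}

    static byte[] build(int pcmDataLength, int sampleRate, int channels, int bitsPerSample) {
        int blockAlign = channels * bitsPerSample / 8; // bytes per sample frame
        int byteRate = sampleRate * blockAlign;        // bytes per second

        ByteBuffer b = ByteBuffer.allocate(44).order(ByteOrder.LITTLE_ENDIAN);
        b.put("RIFF".getBytes(StandardCharsets.US_ASCII));
        b.putInt(36 + pcmDataLength);                  // size of everything after this field
        b.put("WAVE".getBytes(StandardCharsets.US_ASCII));
        b.put("fmt ".getBytes(StandardCharsets.US_ASCII));
        b.putInt(16);                                  // size of the fmt chunk for PCM
        b.putShort((short) 1);                         // audio format 1 = uncompressed PCM
        b.putShort((short) channels);
        b.putInt(sampleRate);
        b.putInt(byteRate);
        b.putShort((short) blockAlign);
        b.putShort((short) bitsPerSample);
        b.put("data".getBytes(StandardCharsets.US_ASCII));
        b.putInt(pcmDataLength);                       // number of PCM bytes that follow
        return b.array();
    }
}
```

For the recorder settings used here, the call would be WavHeader.build(pcmLength, 16000, 1, 16); writing the returned bytes followed by the PCM data produces a playable .wav file. The raw PCM file, not the WAV copy, is what the recognition code sends.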
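2.3 Calling the API
Baidu speech recognition can be called in several ways; a persistent WebSocket connection is recommended for real-time recognition: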
```java
import android.util.Base64;

import org.json.JSONException;
import org.json.JSONObject;

import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.util.concurrent.TimeUnit;

import okhttp3.OkHttpClient;
import okhttp3.Request;
import okhttp3.Response;
import okhttp3.WebSocket;
import okhttp3.WebSocketListener;
import okio.ByteString;

public class SpeechRecognizer {
    // Endpoint as given in this article; verify it against Baidu's current documentation.
    private static final String WEBSOCKET_URL = "wss://vop.baidu.com/ws_speech?token=";

    public interface RecognitionListener {
        void onResult(String text);
        void onError(int code, String message);
    }

    public void recognize(File audioFile, RecognitionListener listener) {
        new Thread(() -> {
            try {
                String token = AuthUtil.getAccessToken();
                OkHttpClient client = new OkHttpClient.Builder()
                        .pingInterval(30, TimeUnit.SECONDS)
                        .build();
                Request request = new Request.Builder()
                        .url(WEBSOCKET_URL + token)
                        .build();
                client.newWebSocket(request, new WebSocketListener() {
                    @Override
                    public void onOpen(WebSocket webSocket, Response response) {
                        // Send the configuration message
                        String config = "{\"format\":\"pcm\",\"rate\":16000,\"channel\":1,\"cuid\":\"android_device\"}";
                        webSocket.send(config);
                        // Send the audio data in Base64-encoded chunks
                        try (FileInputStream fis = new FileInputStream(audioFile)) {
                            byte[] buffer = new byte[1024];
                            int bytesRead;
                            while ((bytesRead = fis.read(buffer)) != -1) {
                                webSocket.send(Base64.encodeToString(buffer, 0, bytesRead, Base64.NO_WRAP));
                            }
                            // End marker: OkHttp sends binary frames as ByteString, not byte[]
                            webSocket.send(ByteString.EMPTY);
                        } catch (IOException e) {
                            listener.onError(500, "Audio send failed");
                        }
                    }

                    @Override
                    public void onMessage(WebSocket webSocket, String text) {
                        try {
                            JSONObject json = new JSONObject(text);
                            if (json.has("result")) {
                                String result = json.getJSONArray("result").getString(0);
                                listener.onResult(result);
                            }
                        } catch (JSONException e) {
                            e.printStackTrace();
                        }
                    }

                    @Override
                    public void onFailure(WebSocket webSocket, Throwable t, Response response) {
                        listener.onError(400, t.getMessage());
                    }
                });
            } catch (Exception e) {
                listener.onError(500, e.getMessage());
            }
        }).start();
    }
}
```
3. Optimization and debugging tips
3.1 Performance optimization
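- Audio preprocessing: apply noise suppression before recognition to improve accuracy, for example with Android's NoiseSuppressor audio effect on devices that support it (AudioTrack is a playback class and does not reduce noise)
- Network optimization: set reasonable timeouts (15 seconds is suggested) and add a retry mechanism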
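- Memory management: stream the audio in chunks instead of loading the whole file, to avoid out-of-memory errors on large files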
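The retry mechanism suggested above could look roughly like the following. This is a minimal sketch, not part of OkHttp or the Baidu API: the Retry class, the withBackoff method, and the choice to double the delay after each failure are all illustrative assumptions.

```java
import java.util.function.Supplier;

/** Hypothetical helper: retries a task with exponentially growing delays. */
final class Retry {
    private Retry() {}

    /**
     * Runs the task up to maxAttempts times. After each failure it sleeps,
     * starting at initialDelayMs and doubling every time. If every attempt
     * fails, the last exception is rethrown.
     */
    static <T> T withBackoff(Supplier<T> task, int maxAttempts, long initialDelayMs) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long delay = initialDelayMs;
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return task.get();
            } catch (RuntimeException e) {
                last = e;
                if (attempt == maxAttempts) break; // no sleep after the final attempt
                try {
                    Thread.sleep(delay);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("Interrupted while waiting to retry", ie);
                }
                delay *= 2;
            }
        }
        throw last;
    }
}
```

Because the helper sleeps between attempts, it belongs on a background thread. The timeouts themselves are configured separately on OkHttpClient.Builder, for example with connectTimeout(15, TimeUnit.SECONDS) and readTimeout(15, TimeUnit.SECONDS).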
3.2 Common problems and solutions
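| Symptom | Likely cause | Solution |
|---|---|---|
| 401 Unauthorized | Invalid token | Check that the API Key and Secret Key are correct |
| 413 Request Entity Too Large | Audio too long | Limit the length of a single recognition request (≤ 60 seconds suggested) |
| 504 Gateway Timeout | Unstable network | Check the network connection and increase the retry count |
| No recognition result | Poor audio quality | Make sure the audio is 16 kHz, 16-bit, mono |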
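The 60-second limit and the 16 kHz, 16-bit, mono format in the table translate directly into byte counts: 16000 samples per second × 2 bytes per sample × 1 channel = 32000 bytes per second. The sketch below shows how a recording's size could be checked before sending; PcmMath is a hypothetical helper, and the constants simply restate the values from the table.

```java
/** Hypothetical helper: size and duration arithmetic for 16 kHz, 16-bit, mono PCM. */
final class PcmMath {
    static final int SAMPLE_RATE_HZ = 16_000;
    static final int BYTES_PER_SAMPLE = 2; // 16-bit samples
    static final int CHANNELS = 1;         // mono
    static final int MAX_SECONDS = 60;     // suggested limit per request

    private PcmMath() {}

    /** Bytes of PCM data produced per second of audio. */
    static long bytesPerSecond() {
        return (long) SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS;
    }

    /** Duration in seconds of a headerless PCM buffer or file of the given size. */
    static double durationSeconds(long pcmBytes) {
        return pcmBytes / (double) bytesPerSecond();
    }

    /** True if the recording is no longer than the suggested limit. */
    static boolean withinLimit(long pcmBytes) {
        return pcmBytes <= bytesPerSecond() * MAX_SECONDS;
    }
}
```

For example, PcmMath.withinLimit(audioFile.length()) could be checked after recording stops, rejecting or trimming files larger than 1,920,000 bytes.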
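3.3 Advanced features
- Offline command-word recognition: download the offline engine package to support on-device recognition
- Real-time transcription: use WebSocket to recognize speech while it is still being recorded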
- Multi-language support: add a parameter such as "language":"zh-CN" to the request
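4. Best practices
- Runtime permissions: on Android 6.0+, the recording permission (RECORD_AUDIO) must be requested at runtime
- Error handling: maintain a complete mapping from error codes to messages
- Logging: log key steps to make troubleshooting easier
- Resource release: make sure WebSocket, AudioRecord, and similar resources are released promptly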
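An error-code mapping of the kind recommended above might start as small as the following sketch. The three codes come from the troubleshooting table in section 3.2; the ErrorMessages class and the wording of each message are assumptions for illustration, and a real application would extend the map with the codes its chosen API actually returns.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper: maps error codes to user-facing hints. */
final class ErrorMessages {
    private static final Map<Integer, String> MESSAGES;

    static {
        Map<Integer, String> m = new HashMap<>();
        m.put(401, "Unauthorized: check that the API Key and Secret Key are correct");
        m.put(413, "Audio too long: keep a single request within 60 seconds");
        m.put(504, "Gateway timeout: check the network connection and retry");
        MESSAGES = Collections.unmodifiableMap(m);
    }

    private ErrorMessages() {}

    /** Returns a readable hint for the code, or a generic fallback for unknown codes. */
    static String describe(int code) {
        String message = MESSAGES.get(code);
        return message != null ? message : "Unknown error (code " + code + ")";
    }
}
```

A RecognitionListener.onError implementation could then show ErrorMessages.describe(code) to the user and write the raw code and message to the log.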
5. Complete usage example
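```java
// Assumes this runs inside an Activity method, with a TextView field tvResult
// and the RECORD_AUDIO permission already granted.

// 1. Initialize the recorder
AudioRecorder recorder = new AudioRecorder();
File audioFile = new File(getExternalCacheDir(), "temp.pcm");

// 2. Start recording (startRecording declares IOException)
try {
    recorder.startRecording(audioFile);
} catch (IOException e) {
    tvResult.setText("Error: " + e.getMessage());
    return;
}

// 3. Stop after 3 seconds (simulating the user speaking)
new Handler(Looper.getMainLooper()).postDelayed(() -> {
    recorder.stopRecording();

    // 4. Run speech recognition
    SpeechRecognizer recognizer = new SpeechRecognizer();
    recognizer.recognize(audioFile, new SpeechRecognizer.RecognitionListener() {
        @Override
        public void onResult(String text) {
            runOnUiThread(() -> tvResult.setText("Result: " + text));
        }

        @Override
        public void onError(int code, String message) {
            runOnUiThread(() -> tvResult.setText("Error: " + code + " - " + message));
        }
    });
}, 3000);
```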
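With the steps above, developers can fully integrate the Baidu speech recognition API in Android Studio. In real projects, adjust the parameters to the specific scenario and handle exceptions and resource management carefully. Refer to the official Baidu speech recognition documentation for the latest technical information.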