How can I implement streamed responses from OpenAI's GPT 3.5 Turbo API in my Android chatbot app? Currently I am using Retrofit to fetch the API response, but it takes around 15 to 20 seconds. I would like to optimize the response time by implementing streaming. This is my current code for fetching the API response:
public void callAPI(String question) {
    OkHttpClient.Builder httpClientBuilder = new OkHttpClient.Builder();
    httpClientBuilder.connectTimeout(60, TimeUnit.SECONDS); // Set the connect timeout
    httpClientBuilder.readTimeout(60, TimeUnit.SECONDS);    // Set the read timeout
    httpClientBuilder.writeTimeout(60, TimeUnit.SECONDS);   // Set the write timeout

    Retrofit retrofit = new Retrofit.Builder()
            .baseUrl("https://api.openai.com/v1/")
            .client(httpClientBuilder.build())
            .addConverterFactory(GsonConverterFactory.create())
            .build();

    ChatApiService chatApiService = retrofit.create(ChatApiService.class);

    JSONObject jsonBody = new JSONObject();
    try {
        jsonBody.put("model", "gpt-3.5-turbo");
        jsonBody.put("max_tokens", 4000);
        jsonBody.put("temperature", 0);
        jsonBody.put("stream", true);

        JSONArray messageArray = new JSONArray();

        JSONObject userMessage = new JSONObject();
        userMessage.put("role", "user");
        userMessage.put("content", question);
        messageArray.put(userMessage);

        JSONObject assistantMessage = new JSONObject();
        assistantMessage.put("role", "assistant");
        assistantMessage.put("content", SharedPreference.getString(context, BaseUrl.Key_last_answer));
        messageArray.put(assistantMessage);

        jsonBody.put("messages", messageArray);
    } catch (JSONException e) {
        e.printStackTrace();
    }

    RequestBody requestBody = RequestBody.create(MediaType.parse("application/json"), jsonBody.toString());

    Call<ResponseBody> call = chatApiService.getChatResponse(requestBody);
    call.enqueue(new Callback<ResponseBody>() {
        @Override
        public void onResponse(Call<ResponseBody> call, Response<ResponseBody> response) {
            if (response.isSuccessful()) {
                try {
                    JSONObject jsonObject = new JSONObject(response.body().string());
                    JSONArray jsonArray = jsonObject.getJSONArray("choices");
                    String result = jsonArray.getJSONObject(0)
                            .getJSONObject("message")
                            .getString("content");
                    mAnswer = result.trim();

                    // Handle the response
                    addResponse(mAnswer);
                    addToChatHistory();
                    speakAnswer();
                    SharedPreference.putString(context, BaseUrl.Key_last_answer, mAnswer);
                } catch (JSONException | IOException e) {
                    e.printStackTrace();
                }
            } else {
                if (response.code() == 429) {
                    addResponse("Oops, something went wrong. Please try again in a little while.");
                } else {
                    if (response.errorBody() != null) {
                        try {
                            addResponse("Failed to load response due to " + response.errorBody().string());
                        } catch (IOException e) {
                            e.printStackTrace();
                        }
                    }
                }
            }
        }

        @Override
        public void onFailure(Call<ResponseBody> call, Throwable t) {
            addResponse("Failed to load response due to " + t.getMessage());
        }
    });
}
I tried to integrate the OpenAI GPT 3.5 Turbo API into my Android chatbot app using Retrofit, but the response time is around 15 to 20 seconds, which is far too slow. To improve this, I want to implement streaming, and I am looking for advice on how to implement it and optimize the response time.
Any suggestions or code examples for streaming the GPT 3.5 Turbo API with Retrofit would be greatly appreciated.
First, add this to your endpoint:
@Streaming
@POST("chat/completions")
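For reference, a minimal sketch of what the full ChatApiService interface could look like with the streaming annotation applied is shown below. The interface and method names come from the code above; the relative path and the assumption that the Authorization header is attached elsewhere (for example via an OkHttp interceptor) are mine, so adjust them to your existing setup.

import okhttp3.RequestBody;
import okhttp3.ResponseBody;
import retrofit2.Call;
import retrofit2.http.Body;
import retrofit2.http.POST;
import retrofit2.http.Streaming;

public interface ChatApiService {

    // @Streaming tells Retrofit not to buffer the whole response body in memory,
    // so response.body().byteStream() delivers bytes as they arrive from the server.
    @Streaming
    @POST("chat/completions") // resolved against the baseUrl, which already ends in "/v1/"
    Call<ResponseBody> getChatResponse(@Body RequestBody requestBody);
}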
Update your response handling and read the data as an InputStream:
RequestBody requestBody = RequestBody.create(MediaType.parse("application/json"), jsonBody.toString());

Call<ResponseBody> call = chatApiService.getChatResponse(requestBody);
call.enqueue(new Callback<ResponseBody>() {
    @Override
    public void onResponse(@NonNull Call<ResponseBody> call, @NonNull Response<ResponseBody> response) {
        if (response.isSuccessful()) {
            // Process the streaming data
            if (response.body() != null) {
                InputStream inputStream = response.body().byteStream();
                BufferedInputStream bufferedInputStream = new BufferedInputStream(inputStream);
                getData(bufferedInputStream);
            } else {
                showError(errorSomeThing);
                defaultValues(message);
            }
        } else {
            // Handle unsuccessful response
            showError(errorSomeThing);
            defaultValues(message);
        }
    }

    @Override
    public void onFailure(@NonNull Call<ResponseBody> call, @NonNull Throwable t) {
        showError(networkError);
        defaultValues(message);
    }
});
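Each chunk the API streams back is a Server-Sent-Events style line prefixed with data:, which is why the reader below splits on that marker. Roughly (values shortened for illustration), the raw stream looks like this:

data: {"object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"Hel"},"finish_reason":null}]}

data: {"object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"lo"},"finish_reason":null}]}

data: [DONE]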
Create a function that handles the BufferedInputStream:
void getData(BufferedInputStream inputStream) {
    // Read the stream on a background thread so the UI thread is never blocked
    new Thread(() -> {
        Gson gson = new Gson();
        byte[] buffer = new byte[1024]; // Adjust the buffer size according to your needs
        int bytesRead;
        StringBuilder content = new StringBuilder();

        // Read data from the InputStream into the buffer until the end of the stream is reached
        try {
            while ((bytesRead = inputStream.read(buffer)) != -1) {
                // The data is text-based, so convert the bytes to a String and split on the "data:" prefix
                String[] data = new String(buffer, 0, bytesRead, StandardCharsets.UTF_8).split("data:");
                for (String responseString : data) {
                    String trimmedResponse = responseString.trim();
                    if (!trimmedResponse.isEmpty()) {
                        if (!trimmedResponse.equalsIgnoreCase("[DONE]")) {
                            try {
                                OpenAIChatResponseModel openAIChatResponseModel =
                                        gson.fromJson(trimmedResponse, OpenAIChatResponseModel.class);
                                if (openAIChatResponseModel != null && openAIChatResponseModel.getChoices() != null) {
                                    if (openAIChatResponseModel.getChoices().get(0).getDelta().getContent() != null) {
                                        content.append(openAIChatResponseModel.getChoices().get(0).getDelta().getContent());
                                        runOnUiThread(() -> {
                                            // Update your chat UI here with the partial text accumulated so far
                                            String responseSoFar = content.toString();
                                        });
                                    }
                                } else {
                                    showError("Something went wrong. Please try again later.");
                                }
                            } catch (JsonSyntaxException e) {
                                // Ignore chunks that do not parse as complete JSON objects
                            }
                        }
                    }
                }
            }
        } catch (IOException e) {
            // Handle any IOException that may occur during the reading process
        } finally {
            // Close the InputStream to release system resources
            try {
                inputStream.close();
            } catch (IOException e) {
                // Handle any IOException that may occur during the closing process
            }
        }
    }).start();
}
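The getData() method deserializes each chunk into an OpenAIChatResponseModel, which the answer does not show. A minimal Gson-compatible sketch, assuming the standard chat.completion.chunk shape and covering only the fields getData() actually reads, could look like this:

import java.util.List;

public class OpenAIChatResponseModel {
    // Only choices[0].delta.content is used above; other chunk fields are omitted
    private List<Choice> choices;

    public List<Choice> getChoices() {
        return choices;
    }

    public static class Choice {
        private Delta delta;

        public Delta getDelta() {
            return delta;
        }
    }

    public static class Delta {
        private String content;

        public String getContent() {
            return content;
        }
    }
}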
Hello mooaaazz, I tried your solution but it did not work for me. It returns the complete result every time instead of streaming it.