How can I implement streamed responses from OpenAI's GPT 3.5 Turbo API in my Android chatbot app? Currently I am using Retrofit to fetch the API response, but it takes around 15 to 20 seconds. I would like to optimize the response time by implementing streaming. This is my current code for fetching the API response:
public void callAPI(String question) {
    OkHttpClient.Builder httpClientBuilder = new OkHttpClient.Builder();
    httpClientBuilder.connectTimeout(60, TimeUnit.SECONDS); // Set the connect timeout
    httpClientBuilder.readTimeout(60, TimeUnit.SECONDS);    // Set the read timeout
    httpClientBuilder.writeTimeout(60, TimeUnit.SECONDS);   // Set the write timeout

    Retrofit retrofit = new Retrofit.Builder()
            .baseUrl("https://api.openai.com/v1/")
            .client(httpClientBuilder.build())
            .addConverterFactory(GsonConverterFactory.create())
            .build();

    ChatApiService chatApiService = retrofit.create(ChatApiService.class);

    JSONObject jsonBody = new JSONObject();
    try {
        jsonBody.put("model", "gpt-3.5-turbo");
        jsonBody.put("max_tokens", 4000);
        jsonBody.put("temperature", 0);
        jsonBody.put("stream", true);

        JSONArray messageArray = new JSONArray();

        JSONObject userMessage = new JSONObject();
        userMessage.put("role", "user");
        userMessage.put("content", question);
        messageArray.put(userMessage);

        JSONObject assistantMessage = new JSONObject();
        assistantMessage.put("role", "assistant");
        assistantMessage.put("content", SharedPreference.getString(context, BaseUrl.Key_last_answer));
        messageArray.put(assistantMessage);

        jsonBody.put("messages", messageArray);
    } catch (JSONException e) {
        e.printStackTrace();
    }

    RequestBody requestBody = RequestBody.create(MediaType.parse("application/json"), jsonBody.toString());

    Call<ResponseBody> call = chatApiService.getChatResponse(requestBody);
    call.enqueue(new Callback<ResponseBody>() {
        @Override
        public void onResponse(Call<ResponseBody> call, Response<ResponseBody> response) {
            if (response.isSuccessful()) {
                try {
                    JSONObject jsonObject = new JSONObject(response.body().string());
                    JSONArray jsonArray = jsonObject.getJSONArray("choices");
                    String result = jsonArray.getJSONObject(0)
                            .getJSONObject("message")
                            .getString("content");
                    mAnswer = result.trim();

                    // Handle the response
                    addResponse(mAnswer);
                    addToChatHistory();
                    speakAnswer();
                    SharedPreference.putString(context, BaseUrl.Key_last_answer, mAnswer);
                } catch (JSONException | IOException e) {
                    e.printStackTrace();
                }
            } else {
                if (response.code() == 429) {
                    addResponse("Oops, something went wrong. Please try again in a little while.");
                } else {
                    if (response.errorBody() != null) {
                        try {
                            addResponse("Failed to load response due to " + response.errorBody().string());
                        } catch (IOException e) {
                            e.printStackTrace();
                        }
                    }
                }
            }
        }

        @Override
        public void onFailure(Call<ResponseBody> call, Throwable t) {
            addResponse("Failed to load response due to " + t.getMessage());
        }
    });
}
I tried to integrate the OpenAI GPT 3.5 Turbo API into my Android chatbot app using Retrofit, but the response time is around 15 to 20 seconds, which is far too slow. To improve this, I want to implement streaming, and I am looking for advice on how to implement it and optimize the response time.
Any suggestions or code examples for streaming the GPT 3.5 Turbo API with Retrofit would be greatly appreciated.
First, add this to your endpoint:
@Streaming
@POST("chat/completions")
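For reference, a minimal sketch of what the full ChatApiService interface could look like with the streaming annotation applied is shown below. The interface and method names come from the code above; the relative path and the assumption that the Authorization header is attached elsewhere (for example via an OkHttp interceptor) are mine, so adjust them to your existing setup.

import okhttp3.RequestBody;
import okhttp3.ResponseBody;
import retrofit2.Call;
import retrofit2.http.Body;
import retrofit2.http.POST;
import retrofit2.http.Streaming;

public interface ChatApiService {

    // @Streaming tells Retrofit not to buffer the whole response body in memory,
    // so response.body().byteStream() delivers bytes as they arrive from the server.
    @Streaming
    @POST("chat/completions") // resolved against the baseUrl, which already ends in "/v1/"
    Call<ResponseBody> getChatResponse(@Body RequestBody requestBody);
}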
Update your response handling and read the data as an InputStream:
RequestBody requestBody = RequestBody.create(MediaType.parse("application/json"), jsonBody.toString());

Call<ResponseBody> call = chatApiService.getChatResponse(requestBody);
call.enqueue(new Callback<ResponseBody>() {
    @Override
    public void onResponse(@NonNull Call<ResponseBody> call, @NonNull Response<ResponseBody> response) {
        if (response.isSuccessful()) {
            // Process the streaming data
            if (response.body() != null) {
                InputStream inputStream = response.body().byteStream();
                BufferedInputStream bufferedInputStream = new BufferedInputStream(inputStream);
                getData(bufferedInputStream);
            } else {
                showError(errorSomeThing);
                defaultValues(message);
            }
        } else {
            // Handle unsuccessful response
            showError(errorSomeThing);
            defaultValues(message);
        }
    }

    @Override
    public void onFailure(@NonNull Call<ResponseBody> call, @NonNull Throwable t) {
        showError(networkError);
        defaultValues(message);
    }
});
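Each chunk the API streams back is a Server-Sent-Events style line prefixed with data:, which is why the reader below splits on that marker. Roughly (values shortened for illustration), the raw stream looks like this:

data: {"object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"Hel"},"finish_reason":null}]}

data: {"object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"lo"},"finish_reason":null}]}

data: [DONE]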
Create a function that handles the BufferedInputStream:
void getData(BufferedInputStream inputStream) {
    // Read the stream on a background thread so the UI thread is never blocked
    new Thread(() -> {
        Gson gson = new Gson();
        byte[] buffer = new byte[1024]; // Adjust the buffer size according to your needs
        int bytesRead;
        StringBuilder content = new StringBuilder();

        // Read data from the InputStream into the buffer until the end of the stream is reached
        try {
            while ((bytesRead = inputStream.read(buffer)) != -1) {
                // The data is text-based, so convert the bytes to a String and split on the "data:" prefix
                String[] data = new String(buffer, 0, bytesRead, StandardCharsets.UTF_8).split("data:");
                for (String responseString : data) {
                    String trimmedResponse = responseString.trim();
                    if (!trimmedResponse.isEmpty()) {
                        if (!trimmedResponse.equalsIgnoreCase("[DONE]")) {
                            try {
                                OpenAIChatResponseModel openAIChatResponseModel =
                                        gson.fromJson(trimmedResponse, OpenAIChatResponseModel.class);
                                if (openAIChatResponseModel != null && openAIChatResponseModel.getChoices() != null) {
                                    if (openAIChatResponseModel.getChoices().get(0).getDelta().getContent() != null) {
                                        content.append(openAIChatResponseModel.getChoices().get(0).getDelta().getContent());
                                        runOnUiThread(() -> {
                                            // Update your chat UI here with the partial text accumulated so far
                                            String responseSoFar = content.toString();
                                        });
                                    }
                                } else {
                                    showError("Something went wrong. Please try again later.");
                                }
                            } catch (JsonSyntaxException e) {
                                // Ignore chunks that do not parse as complete JSON objects
                            }
                        }
                    }
                }
            }
        } catch (IOException e) {
            // Handle any IOException that may occur during the reading process
        } finally {
            // Close the InputStream to release system resources
            try {
                inputStream.close();
            } catch (IOException e) {
                // Handle any IOException that may occur during the closing process
            }
        }
    }).start();
}
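The getData() method deserializes each chunk into an OpenAIChatResponseModel, which the answer does not show. A minimal Gson-compatible sketch, assuming the standard chat.completion.chunk shape and covering only the fields getData() actually reads, could look like this:

import java.util.List;

public class OpenAIChatResponseModel {
    // Only choices[0].delta.content is used above; other chunk fields are omitted
    private List<Choice> choices;

    public List<Choice> getChoices() {
        return choices;
    }

    public static class Choice {
        private Delta delta;

        public Delta getDelta() {
            return delta;
        }
    }

    public static class Delta {
        private String content;

        public String getContent() {
            return content;
        }
    }
}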
Hello mooaaazz, I tried your solution but it did not work for me. It returns the complete result every time instead of streaming it.