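I know that the okhttp3 library adds the header Accept-Encoding: gzip by default and automatically decodes the response for us.
The problem I'm dealing with is that the host only accepts a header like Accept-Encoding: gzip, deflate and fails if I leave out the deflate part. Now, when I add that header to the okhttp client manually, the library no longer decompresses the response for me.
I have tried several ways of reading the response and decompressing it manually, but I always end up with the exception java.util.zip.ZipException: Not in GZIP format. This is what I have tried so far: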
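// Decompressor
public static String decompressGZIP(InputStream inputStream) throws IOException
{
    // try-with-resources closes the stream even if decompression fails
    try (InputStream bodyStream = new GZIPInputStream(inputStream))
    {
        ByteArrayOutputStream outStream = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        int length;
        // read() returns -1 at end of stream
        while ((length = bodyStream.read(buffer)) != -1)
        {
            outStream.write(buffer, 0, length);
        }
        // Assumes a UTF-8 body; needs java.nio.charset.StandardCharsets
        return new String(outStream.toByteArray(), StandardCharsets.UTF_8);
    }
}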
// Run the scraper
scrape(api, new Callback()
{
    // Something went wrong
    @Override
    public void onFailure(@NonNull Call call, @NonNull IOException e)
    {
    }

    @Override
    public void onResponse(@NonNull Call call, @NonNull Response response) throws IOException
    {
        try
        {
            if (response.isSuccessful())
            {
                ResponseBody responseBody = response.body();
                InputStream responseBodyBytes = responseBody.byteStream();
                // decompressGZIP is the helper defined above
                String htmlResponse = decompressGZIP(responseBodyBytes);
            }
        }
        catch (ProtocolException e)
        {
            // ignored
        }
        finally
        {
            // Close the response on every path, including failed decompression
            response.close();
        }
    }
});
private Call scrape(Map<?, ?> api, Callback callback)
{
    MediaType JSON = MediaType.parse("application/json; charset=utf-8");
    String method = (String) api.get("method");
    String url = (String) api.get("url");
    Request.Builder requestBuilder = new Request.Builder().url(url);
    RequestBody requestBody;
    requestBuilder.header("User-Agent", "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:61.0) Gecko/20100101 Firefox/61.0");
    requestBuilder.header("Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8");
    requestBuilder.header("Accept-Language", "en-US,en;q=0.5");
    requestBuilder.header("Accept-Encoding", "gzip, deflate");
    requestBuilder.header("Connection", "keep-alive");
    requestBuilder.header("Upgrade-Insecure-Requests", "1");
    requestBuilder.header("Cache-Control", "max-age=0");
    Request request = requestBuilder.build();
    Call call = client.newCall(request);
    call.enqueue(callback);
    return call;
}
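Note that the response headers always contain Content-Encoding: gzip and Transfer-Encoding: chunked.
One more thing: I also tried the solution from this thread, but it still fails with
D/OkHttp: java.io.IOException: ID1ID2: actual 0x00003c68 != expected 0x00001f8b
Any help would be appreciated.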
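After 6 hours of digging I found the right solution and, as usual, it was easier than I thought. I was basically trying to decompress a page that was not gzip-compressed, which is why it failed. Once I request the second page (which is compressed), I get a gzipped response, and the code above handles it. Also, if anyone wants the solution: I used a modified interceptor, like the one in this answer, so you don't need a custom function to handle decompression.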
I modified the unzip method so that the okhttp interceptor can handle both compressed and uncompressed responses:
OkHttpClient.Builder clientBuilder = new OkHttpClient.Builder().addInterceptor(new UnzippingInterceptor());
OkHttpClient client = clientBuilder.build();
The interceptor looks like this:
private class UnzippingInterceptor implements Interceptor {
    @Override
    public Response intercept(Chain chain) throws IOException {
        Response response = chain.proceed(chain.request());
        return unzip(response);
    }

    // Adapted from okhttp3.internal.http.HttpEngine (the original method is private)
    private Response unzip(final Response response) throws IOException {
        ResponseBody body = response.body();
        if (body == null) {
            return response;
        }
        // Only gzipped responses need decompressing; everything else passes through untouched
        String contentEncoding = response.header("Content-Encoding");
        if (!"gzip".equalsIgnoreCase(contentEncoding)) {
            return response;
        }
        GzipSource gzipSource = new GzipSource(body.source());
        // The old encoding and length no longer describe the decompressed body
        Headers strippedHeaders = response.headers().newBuilder()
                .removeAll("Content-Encoding")
                .removeAll("Content-Length")
                .build();
        // -1 means the decompressed length is unknown
        return response.newBuilder()
                .headers(strippedHeaders)
                .body(ResponseBody.create(body.contentType(), -1L, Okio.buffer(gzipSource)))
                .build();
    }
}
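The root cause described above (trying to gunzip a page that was never compressed) can also be guarded against without trusting the headers at all. The error from the question, actual 0x00003c68 != expected 0x00001f8b, fits this: 0x3c 0x68 are the ASCII bytes for "<h", i.e. the start of a plain HTML page, while 0x1f 0x8b is the gzip magic number. Below is a minimal sketch using only JDK streams; GzipSniffer and its methods are hypothetical names for illustration and are not part of OkHttp, and the UTF-8 charset is an assumption.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper (not an OkHttp class): reads a body that may or may not be gzipped.
final class GzipSniffer {

    /** Returns the body as UTF-8 text, gunzipping only if it starts with 0x1f 0x8b. */
    static String readMaybeGzipped(InputStream in) throws IOException {
        BufferedInputStream buffered = new BufferedInputStream(in);
        // Peek at the first two bytes, then rewind so the decoder sees the full stream
        buffered.mark(2);
        int first = buffered.read();
        int second = buffered.read();
        buffered.reset();
        boolean gzipped = first == 0x1f && second == 0x8b;

        try (InputStream body = gzipped ? new GZIPInputStream(buffered) : buffered) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = body.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    /** Test helper: gzip-compresses the given bytes. */
    static byte[] gzip(byte[] data) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(sink)) {
            out.write(data);
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] page = "<html>hi</html>".getBytes(StandardCharsets.UTF_8);
        // Both the plain and the gzipped variant decode to the same text
        System.out.println(readMaybeGzipped(new ByteArrayInputStream(page)));
        System.out.println(readMaybeGzipped(new ByteArrayInputStream(gzip(page))));
    }
}
```

With OkHttp, the stream passed in would be response.body().byteStream(). Checking the Content-Encoding header, as the interceptor does, is the more conventional route; the byte check is only a fallback for servers whose headers cannot be trusted.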
In version 4.10.0, gzip is already handled automatically.
okhttp does not support deflate, as you can see in BridgeInterceptor.java or BridgeInterceptor.kt:
if (transparentGzip &&
    "gzip".equals(networkResponse.header("Content-Encoding"), ignoreCase = true) &&
    networkResponse.promisesBody()) {
  // ...
}
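Thanks to Aksenov Vladimir for the reply. Your answer saved me a lot of time. After I upgraded okhttp from 3.x to 4.11, everything worked.
Here are some additional details.
The relevant code is in okhttp3.internal.http.BridgeInterceptor: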
// If we add an "Accept-Encoding: gzip" header field we're responsible for also decompressing
// the transfer stream.
var transparentGzip = false
if (userRequest.header("Accept-Encoding") == null && userRequest.header("Range") == null) {
  transparentGzip = true
  requestBuilder.header("Accept-Encoding", "gzip")
}

// ...

if (transparentGzip &&
    "gzip".equals(networkResponse.header("Content-Encoding"), ignoreCase = true) &&
    networkResponse.promisesBody()) {
  val responseBody = networkResponse.body
  if (responseBody != null) {
    val gzipSource = GzipSource(responseBody.source())
    val strippedHeaders = networkResponse.headers.newBuilder()
        .removeAll("Content-Encoding")
        .removeAll("Content-Length")
        .build()
    responseBuilder.headers(strippedHeaders)
    val contentType = networkResponse.header("Content-Type")
    responseBuilder.body(RealResponseBody(contentType, -1L, gzipSource.buffer()))
  }
}
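The two conditions in that snippet explain the behaviour from the question: transparentGzip is only set when the caller did not supply Accept-Encoding (or Range), so adding "gzip, deflate" by hand switches automatic decompression off. The sketch below restates that decision in plain Java so it can be run without OkHttp. TransparentGzipRule, headers and decompresses are hypothetical names for illustration, and hasBody stands in for networkResponse.promisesBody().

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for the decision made in BridgeInterceptor; not OkHttp code.
final class TransparentGzipRule {

    /** Builds a case-insensitive header map from name/value pairs. */
    static Map<String, String> headers(String... nameValuePairs) {
        Map<String, String> map = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        for (int i = 0; i + 1 < nameValuePairs.length; i += 2) {
            map.put(nameValuePairs[i], nameValuePairs[i + 1]);
        }
        return map;
    }

    /** True if the library itself would add "Accept-Encoding: gzip" to the request. */
    static boolean addsGzipHeader(Map<String, String> requestHeaders) {
        return !requestHeaders.containsKey("Accept-Encoding")
                && !requestHeaders.containsKey("Range");
    }

    /** True if the response body would be gunzipped automatically. */
    static boolean decompresses(Map<String, String> requestHeaders,
                                String contentEncoding,
                                boolean hasBody) {
        return addsGzipHeader(requestHeaders)
                && "gzip".equalsIgnoreCase(contentEncoding)
                && hasBody;
    }

    public static void main(String[] args) {
        // No Accept-Encoding set by the caller: automatic decompression applies
        System.out.println(decompresses(headers(), "gzip", true));
        // Caller set the header manually: the caller has to decompress
        System.out.println(decompresses(headers("Accept-Encoding", "gzip, deflate"), "gzip", true));
    }
}
```

Under this rule the first call yields true and the second false, which matches the symptom described in the question.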
I recently had to implement this myself and found that the existing answers contain some bugs, so here is my take on how it should work today.
import java.util.ArrayList;
import java.util.Collections;
import java.util.zip.Inflater;
import okhttp3.Interceptor;
import okhttp3.OkHttpClient;
import okhttp3.ResponseBody;
import okio.BufferedSource;
import okio.GzipSource;
import okio.InflaterSource;
import okio.Okio;

var client = new OkHttpClient.Builder()
    .addInterceptor(
        (Interceptor.Chain chain) -> {
          var oldRequest = chain.request();
          // If the caller has passed their own Accept-Encoding,
          // they are indicating that they expect to handle it themselves.
          if (oldRequest.header("Accept-Encoding") != null) {
            return chain.proceed(oldRequest);
          }
          // Augment the request to say we accept multiple content encodings
          var newHeaders =
              oldRequest
                  .headers()
                  .newBuilder()
                  .add("Accept-Encoding", "deflate")
                  .add("Accept-Encoding", "gzip")
                  .build();
          var newRequest = oldRequest.newBuilder().headers(newHeaders).build();
          var oldResponse = chain.proceed(newRequest);
          // Replace the response's request with the original one
          var responseBuilder = oldResponse.newBuilder().request(oldRequest);
          // We might not have a body to decompress
          var body = oldResponse.body();
          if (body != null) {
            BufferedSource source = body.source();
            // The body may have been wrapped in an arbitrary encoding sequence
            // and the server returns them in the order it encoded them,
            // so we wrap them with decoders in reverse order.
            // Headers.values() returns an unmodifiable list, so copy it before reversing.
            var encodings = new ArrayList<>(oldResponse.headers().values("Content-Encoding"));
            Collections.reverse(encodings);
            for (var encoding : encodings) {
              if ("deflate".equalsIgnoreCase(encoding)) {
                // nowrap=true expects raw DEFLATE data; a server that sends the
                // zlib-wrapped form needs new Inflater() instead.
                var inflater = new Inflater(true);
                source = Okio.buffer(new InflaterSource(source, inflater));
              } else if ("gzip".equalsIgnoreCase(encoding)) {
                source = Okio.buffer(new GzipSource(source));
              }
            }
            // Strip encoding and length headers as we've already handled them
            var strippedHeaders =
                oldResponse
                    .headers()
                    .newBuilder()
                    .removeAll("Content-Encoding")
                    .removeAll("Content-Length")
                    .build();
            responseBuilder.headers(strippedHeaders);
            // body.contentType() is null-safe when the Content-Type header is missing
            var contentType = body.contentType();
            // Construct a new body with an unknown (-1) Content-Length
            var newBody = ResponseBody.create(contentType, -1L, source);
            responseBuilder.body(newBody);
          }
          return responseBuilder.build();
        })
    .build();
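The reverse-order decoding step can be tried out in isolation with JDK streams only. The sketch below is an illustration, not the interceptor itself: ContentDecoding and its methods are hypothetical names. It additionally splits comma-separated values such as "deflate, gzip", a case the interceptor above does not cover, and it assumes the zlib-wrapped form of deflate (the default for InflaterInputStream), so raw DEFLATE would need an InflaterInputStream built with new Inflater(true).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;
import java.util.zip.InflaterInputStream;

// Hypothetical helper (not an OkHttp class): undoes a chain of content encodings.
final class ContentDecoding {

    /** Splits values like ["deflate, gzip"] or ["deflate", "gzip"] into lower-case tokens. */
    static List<String> parseEncodings(List<String> headerValues) {
        List<String> result = new ArrayList<>();
        for (String value : headerValues) {
            for (String token : value.split(",")) {
                String trimmed = token.trim().toLowerCase(Locale.ROOT);
                if (!trimmed.isEmpty()) {
                    result.add(trimmed);
                }
            }
        }
        return result;
    }

    /** Wraps the body in decoders, last-applied encoding first. */
    static InputStream decode(InputStream body, List<String> headerValues) throws IOException {
        List<String> encodings = parseEncodings(headerValues);
        Collections.reverse(encodings);
        InputStream in = body;
        for (String encoding : encodings) {
            if (encoding.equals("gzip")) {
                in = new GZIPInputStream(in);
            } else if (encoding.equals("deflate")) {
                in = new InflaterInputStream(in); // assumes zlib-wrapped deflate
            } else if (!encoding.equals("identity")) {
                throw new IOException("Unsupported Content-Encoding: " + encoding);
            }
        }
        return in;
    }

    /** Test helper: applies one encoding ("gzip", otherwise zlib deflate). */
    static byte[] encode(byte[] data, String encoding) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (OutputStream out = "gzip".equals(encoding)
                ? new GZIPOutputStream(sink) : new DeflaterOutputStream(sink)) {
            out.write(data);
        }
        return sink.toByteArray();
    }

    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int n;
        while ((n = in.read(chunk)) != -1) {
            out.write(chunk, 0, n);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] original = "hello".getBytes(StandardCharsets.UTF_8);
        // Simulate a server that applied deflate first, then gzip
        byte[] wire = encode(encode(original, "deflate"), "gzip");
        InputStream decoded = decode(new ByteArrayInputStream(wire), Arrays.asList("deflate, gzip"));
        System.out.println(new String(readAll(decoded), StandardCharsets.UTF_8)); // prints "hello"
    }
}
```

Rejecting unknown encodings, rather than skipping them as the interceptor does, is a design choice in this sketch: it avoids handing still-encoded bytes to the caller with the Content-Encoding header already stripped.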