Audio streaming from a Spring server

Question

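I have a problem with audio streaming from my server.
The server uses gRPC to fetch a file from another server, chunk by chunk. As soon as my server receives a chunk, it writes it to the output stream and passes it on to the client, and it does this while the external server is still producing the rest.

I am trying to play the audio, but the client (Chrome) only starts playing it after it has been downloaded completely, even though I can see in the debugger that bytes are flowing.

I have a simple Java (Spring) server: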

    <dependencies>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
            <version>3.3.0</version>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-thymeleaf</artifactId>
            <version>3.3.0</version>
        </dependency>
    </dependencies>

@RestController
@RequestMapping("/api/stream")
public class StreamRestController {

    private final GrpcServive grpcService = new GrpcServive();

    @GetMapping
    public void stream(final HttpServletResponse response) throws Exception {
        StreamObserverPublisher streamObserverPublisher = new StreamObserverPublisher();
        final OutputStream os = response.getOutputStream();

        streamObserverPublisher.subscribe(new Flow.Subscriber<>() {
            @Override
            public void onSubscribe(Flow.Subscription s) {
                // Ignored...
            }

            @Override
            public void onNext(byte[] bytes) {
                try {
                    os.write(bytes);
                    response.flushBuffer();

                } catch (IOException e) {
                    throw new RuntimeException(e);
                }
            }

            @Override
            public void onError(Throwable t) {
                try {
                    os.close();
                } catch (IOException e) {
                    throw new RuntimeException(e);
                }
                throw new RuntimeException(t);
            }

            @Override
            public void onComplete() {
                try {
                    os.close();
                } catch (IOException e) {
                    throw new RuntimeException(e);
                }
            }
        });

        grpcService.synthesize(createRequest(), streamObserverPublisher);

        response.setStatus(200);
        response.setContentType("audio/wav");
        response.addHeader("Cache-Control", "no-cache, no-store");
        response.addHeader("Connection", "keep-alive");
        response.addHeader("Content-Transfer-Encoding", "chunked");
        response.addHeader("Accept-Ranges", "bytes");
    }

    private Synthesis.SynthesisRequest createRequest() {
        return Synthesis.SynthesisRequest.newBuilder()
                .setText("One, two, three, four, five, six, seven, eight, nine, ten.")
                .setContentType(TEXT)
                .setVoice("Kin_8000")
                .setAudioEncoding(WAV)
                .build();
    }
}

public class StreamObserverPublisher implements Flow.Publisher<byte[]>, StreamObserver<Synthesis.SynthesisResponse> {

    public Flow.Subscriber<? super byte[]> subscriber;

    @Override
    public void subscribe(Flow.Subscriber<? super byte[]> subscriber) {
        this.subscriber = subscriber;
    }

    @Override
    public void onNext(Synthesis.SynthesisResponse response) {
        subscriber.onNext(response.getData().toByteArray());
    }

    @Override
    public void onError(Throwable throwable) {
        subscriber.onError(throwable);
    }

    @Override
    public void onCompleted() {
        subscriber.onComplete();
    }
}

public class GrpcServive extends SmartSpeechGrpc.SmartSpeechImplBase {

    private final SslContext sslCtx;
    private final CallCredentials callCredentials;

    public GrpcServive() {
        try {
            sslCtx = GrpcSslContexts.forClient()
                    .trustManager(InsecureTrustManagerFactory.INSTANCE)
                    .build();

            callCredentials = new BearerToken(
                    "token..."
            );
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    @Override
    public void synthesize(Synthesis.SynthesisRequest request,
                           StreamObserver<Synthesis.SynthesisResponse> responseObserver
    ) {
        ManagedChannel channel = NettyChannelBuilder.forTarget("host...")
                .sslContext(sslCtx)
                .enableRetry()
                .build();

        Iterator<Synthesis.SynthesisResponse> iterator = SmartSpeechGrpc.newBlockingStub(channel)
                .withCallCredentials(callCredentials)
                .withWaitForReady()
                .synthesize(request);

        while (iterator.hasNext()) {
            responseObserver.onNext(iterator.next());
        }
        responseObserver.onCompleted();
    }
}

@SpringBootApplication
public class Application {
    public static void main(String[] args) {
        SpringApplication.run(Application.class, args);
    }
}

And a very simple Thymeleaf client:

<!DOCTYPE html>
<html lang="en" xmlns:th="http://www.thymeleaf.org">
<head>
    <title>JavaScript Progress Monitor</title>
</head>
<body>
<div>
    <audio controls preload="auto">
        <source src="/api/stream" type="audio/wav">
    </audio>
</div>
</body>
</html>

Is there any way to fix this?

Answer 1

My suggestion is to change the request to the speech synthesizer so that you receive the response in OGG format:


    private Synthesis.SynthesisRequest createRequest() {
        return Synthesis.SynthesisRequest.newBuilder()
                .setText("One, two, three, four, five, six, seven, eight, nine, ten.")
                .setContentType(TEXT)
                .setVoice("Kin_8000")
                .setAudioEncoding(OPUS)
                .build();
    }

This assumes that you are using this service and that it still supports OGG.
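One way to check that assumption is to look at the first bytes of the first chunk the synthesizer returns: an Ogg stream begins with the capture pattern `OggS`, while a WAV file begins with `RIFF` followed by `WAVE` at offset 8. The sketch below is only an illustration; `ContainerSniffer` and `detect` are names made up for this example and are not part of the service's API.

```java
import java.nio.charset.StandardCharsets;

public class ContainerSniffer {

    /** Returns "ogg", "wav" or "unknown" based on the magic bytes of the first chunk. */
    public static String detect(byte[] firstChunk) {
        if (firstChunk == null || firstChunk.length < 4) {
            return "unknown";
        }
        String magic = new String(firstChunk, 0, 4, StandardCharsets.US_ASCII);
        if (magic.equals("OggS")) {
            return "ogg"; // Ogg page capture pattern
        }
        if (magic.equals("RIFF") && firstChunk.length >= 12) {
            // RIFF is a generic container; WAV files carry "WAVE" at offset 8
            String form = new String(firstChunk, 8, 4, StandardCharsets.US_ASCII);
            if (form.equals("WAVE")) {
                return "wav";
            }
        }
        return "unknown";
    }
}
```

Calling `ContainerSniffer.detect(bytes)` on the first chunk, for example while debugging the subscriber's `onNext`, would show whether the service really switched containers before you change the content type.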

Then you need to change the content type to audio/ogg in both places where it is specified:

response.setContentType("audio/ogg");

and

    <audio controls preload="auto">
        <source src="/api/stream" type="audio/ogg">
    </audio>

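The reason for this suggestion is that OGG was designed for streaming, whereas the WAV specification requires the file size to be known in advance. I tried the solution from this answer to get Chrome to stream a WAV file, but it did not work for me. In Firefox, on the other hand, the WAV file starts playing before it has finished downloading, regardless of whether the bits that specify the chunk size are manipulated.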
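To make the size requirement concrete: the canonical 44-byte PCM WAV header carries two length fields, the RIFF chunk size at offset 4 and the data chunk size at offset 40, and both come before the first audio sample. A server relaying a synthesis that is still in progress does not know either value yet. The following is a minimal sketch of that header layout; `WavHeader` and `build` are illustrative names, and the header produced by the synthesis service may contain additional chunks.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

public class WavHeader {

    /** Builds a canonical 44-byte PCM WAV header for dataLength bytes of audio. */
    public static byte[] build(int sampleRate, int channels, int bitsPerSample, long dataLength) {
        int blockAlign = channels * bitsPerSample / 8;
        ByteBuffer b = ByteBuffer.allocate(44).order(ByteOrder.LITTLE_ENDIAN);
        b.put("RIFF".getBytes(StandardCharsets.US_ASCII));
        b.putInt((int) (36 + dataLength));      // offset 4: RIFF chunk size, needs the total length
        b.put("WAVE".getBytes(StandardCharsets.US_ASCII));
        b.put("fmt ".getBytes(StandardCharsets.US_ASCII));
        b.putInt(16);                           // size of the PCM fmt chunk
        b.putShort((short) 1);                  // audio format 1 = PCM
        b.putShort((short) channels);
        b.putInt(sampleRate);
        b.putInt(sampleRate * blockAlign);      // byte rate
        b.putShort((short) blockAlign);
        b.putShort((short) bitsPerSample);
        b.put("data".getBytes(StandardCharsets.US_ASCII));
        b.putInt((int) dataLength);             // offset 40: data chunk size, needs the total length
        return b.array();
    }
}
```

Workarounds that manipulate the chunk-size bits presumably write a placeholder into these two fields; as noted above, Chrome did not start playback early with that approach, while Firefox plays progressively either way.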
