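I'm trying to use node-fetch to capture the contents of a page and am running into an unexpected error. I have checked similar questions, but they don't seem to be related. I am fetching an HTTPS site with an HTTPS agent, yet the error I get is about HTTP. I wondered whether a redirect is responsible, but I can't see anything that would cause one. It only fails for this particular URL (for example, https://www.robinhood.com works fine), and I'm trying to figure out why. Here is a minimal example. Note that it uses some certificates I have saved locally; I'm not sure whether they are needed to reproduce the problem.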
//start SO example
import path from 'path';
import {fileURLToPath} from 'url';
import http from 'node:http';
import https from 'node:https';
import fetch from 'node-fetch';
import sslrootcas from 'ssl-root-cas';
import UserAgent from 'user-agents';

const siteURL = "https://robinhood.com/l/privacy";

// Recreate __dirname, which is not defined in ES modules
const __filename = fileURLToPath(import.meta.url);
const __dirname = path.dirname(__filename);

// Start from the ssl-root-cas bundle and add a locally saved intermediate certificate
const rootCas = sslrootcas.create();
rootCas.addFile(path.resolve(__dirname, 'intermediate.pem'));

// HTTPS-only agent, passed to fetch for this request
const myhttpsAgent = new https.Agent({ca: rootCas});

// const requestcheck = fetch("https://www.google.com", {
const requestcheck = fetch(siteURL, {
    method: "GET",
    headers: {"User-Agent": new UserAgent().toString()},
    agent: myhttpsAgent
});
This is the error I get:
node:internal/errors:477
ErrorCaptureStackTrace(err);
^
TypeError: Protocol "http:" not supported. Expected "https:"
at new NodeError (node:internal/errors:387:5)
at new ClientRequest (node:_http_client:177:11)
at request (node:http:96:10)
at file:///home/app/node_modules/node-fetch/src/index.js:94:20
at new Promise (<anonymous>)
at fetch (file:///home/app/node_modules/node-fetch/src/index.js:49:9)
at ClientRequest.<anonymous> (file:///home/app/node_modules/node-fetch/src/index.js:236:15)
at ClientRequest.emit (node:events:525:35)
at HTTPParser.parserOnIncomingClient [as onIncoming] (node:_http_client:674:27)
at HTTPParser.parserOnHeadersComplete (node:_http_common:128:17)
at TLSSocket.socketOnData (node:_http_client:521:22)
at TLSSocket.emit (node:events:525:35)
at addChunk (node:internal/streams/readable:315:12)
at readableAddChunk (node:internal/streams/readable:289:9)
at TLSSocket.Readable.push (node:internal/streams/readable:228:10)
at TLSWrap.onStreamRead (node:internal/stream_base_commons:190:23) {
code: 'ERR_INVALID_PROTOCOL'
}
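I wondered whether this might be due to a redirect, but I couldn't see anything that would cause it.
https://robinhood.com/l/privacy
redirects to https://robinhood.com/us/en/support/articles/privacy-policy
which then redirects to http://robinhood.com/us/en/support/articles/privacy-policy/
The last URL is plain HTTP, so the HTTPS-only agent is being used with the wrong protocol. The stack trace fits this: the failing request is created through node:http from inside the response handler of an earlier request.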
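One possible way around this is to pass a function as the `agent` option instead of a single agent. node-fetch documents that `agent` may be a function that receives the parsed URL of the current request and returns an agent, which is intended for redirect chains that cross between HTTP and HTTPS. The following is only a sketch: `makeAgentSelector` is a hypothetical helper name, not part of node-fetch, and the usage in the comments assumes the `rootCas` and `siteURL` variables from the question.

```javascript
// Hypothetical helper (not part of node-fetch): builds a callback suitable
// for node-fetch's `agent` option. The callback is given the parsed URL of
// the request being made and returns the agent that matches its protocol.
function makeAgentSelector(httpAgent, httpsAgent) {
  return (parsedURL) =>
    parsedURL.protocol === 'http:' ? httpAgent : httpsAgent;
}

// Usage sketch, assuming an ES module with node-fetch installed and the
// question's rootCas and siteURL in scope:
//
//   import http from 'node:http';
//   import https from 'node:https';
//   import fetch from 'node-fetch';
//
//   const agent = makeAgentSelector(
//     new http.Agent(),
//     new https.Agent({ ca: rootCas })
//   );
//   const response = await fetch(siteURL, { method: 'GET', agent });
```

With this in place the custom CA list still applies to the HTTPS hops, while the plain HTTP hop gets an `http.Agent`. Whether following a downgrade to plain HTTP is acceptable at all is a separate decision; setting `redirect: 'manual'` and inspecting the `Location` header yourself would be the more cautious alternative.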