Extracting HTML content from a GitLab URL

Question

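I am trying to get the HTML content from a GitLab URL, but I am stuck at the GitLab login page: even after providing a username and password, I get the HTML content of the sign-in page.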

Code:

    from bs4 import BeautifulSoup
    import requests

    username = "username"
    password = "password"
    url = "HTTP://gitlab.com/saikumar/webhooktslint"
    # Get the content from the site (HTTP basic auth with the account credentials)
    result = requests.get(url, auth=(username, password)).content
    soup = BeautifulSoup(result, 'lxml')
    for link in soup:
        print(link)

Output:

   Getting HTML content of sign_in page.

Expected output:

   Need to get the HTML content of the URL specified.

Answer 1

I don't see a webhooktslint repo on your gitlab.com/saikumar page, so it is probably a private repository.

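Check the python-gitlab CLI usage documentation and make sure your ~/.python-gitlab.cfg user configuration file is set up correctly, with a GitLab private token in it: that way you won't have to deal with credentials.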
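As a rough illustration of what that file contains, here is a minimal sketch that generates its text with Python's standard configparser. The section layout ([global] with a default server, plus one section per server holding url, private_token and api_version) follows the python-gitlab documentation as I understand it. The server name "gitlab", the timeout value and the token placeholder are assumptions for illustration only.

```python
import configparser
import io


def build_python_gitlab_config(server_name, url, private_token):
    """Return the text of a python-gitlab style config file (illustrative helper)."""
    # interpolation=None so a '%' in a token is stored literally.
    config = configparser.ConfigParser(interpolation=None)
    # [global] names the server section used when none is given on the command line.
    config["global"] = {
        "default": server_name,
        "ssl_verify": "true",
        "timeout": "5",  # assumed value, adjust as needed
    }
    # One section per GitLab server, holding its URL and the private token.
    config[server_name] = {
        "url": url,
        "private_token": private_token,
        "api_version": "4",
    }
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


# "YOUR_PRIVATE_TOKEN" is a placeholder, not a real token.
print(build_python_gitlab_config("gitlab", "https://gitlab.com", "YOUR_PRIVATE_TOKEN"))
```

Saving the printed text as ~/.python-gitlab.cfg (with a real token) should be enough for the gitlab command to pick it up. Keep the file readable only by your own user, since it holds a secret.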

The gitlab command from python-gitlab makes the HTTP calls for you (what you would otherwise do with curl), including getting the raw data of a file.

However, that same private token can also authenticate you when your own code does a GET against a private repo (if you are after the actual HTML page content).

The point is: to access a private repo, use a PAT (Personal Access Token) instead of your actual account password.
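To make the PAT approach concrete, here is a minimal sketch that sends the token in the PRIVATE-TOKEN header to GitLab's REST API (v4). It uses only the standard library, so it runs without extra installs. The same header dictionary can be passed as headers= to requests.get. The helper names, the file path and the branch name used below are hypothetical. As far as I know the token authenticates API calls rather than the web UI pages, so this fetches project data and file contents through the API instead of scraping the HTML page.

```python
import urllib.parse
import urllib.request

GITLAB_API = "https://gitlab.com/api/v4"


def auth_headers(token):
    """Header that GitLab's REST API accepts for personal access tokens."""
    return {"PRIVATE-TOKEN": token}


def project_url(project_path):
    """API URL for a project, given its 'namespace/name' path."""
    # The path is used as the project ID, so '/' must be encoded as '%2F'.
    encoded = urllib.parse.quote(project_path, safe="")
    return f"{GITLAB_API}/projects/{encoded}"


def raw_file_url(project_path, file_path, ref):
    """API URL that returns the raw content of one file at a given ref."""
    encoded_file = urllib.parse.quote(file_path, safe="")
    query = urllib.parse.urlencode({"ref": ref})
    return f"{project_url(project_path)}/repository/files/{encoded_file}/raw?{query}"


def fetch(url, token):
    """Send an authenticated GET and return the response body as bytes."""
    request = urllib.request.Request(url, headers=auth_headers(token))
    with urllib.request.urlopen(request) as response:
        return response.read()
```

With a real token, fetch(project_url("saikumar/webhooktslint"), token) should return the project's JSON metadata, and fetch(raw_file_url("saikumar/webhooktslint", "README.md", "master"), token) the raw file, assuming that file and branch exist. A 404 here can mean either that the project does not exist or that the token has no access to it.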
