如何使用Youtube API(Python)提取视频标题

问题描述 投票:0回答:1

我正在为自己制作一个小型 python 应用程序,用于从 YouTube 视频下载信息,为此我使用 YouTube api。

最近我看了这个视频,帮助我从Youtube视频中获取评论和他们的回复,并将它们导出到Excel文件,一切正常。然而,我现在遇到的问题是,我也想从 YouTube 视频中提取标题,但似乎无法让我的代码为其工作。

就像我之前提到的,我观看了链接的视频,并尝试向视频频道的作者发表评论以寻求帮助,不幸的是我没有得到回复。

我还尝试在 YouTube 上查找有关我的问题的任何其他有用视频,但找不到任何对我的问题有帮助的内容。

除了在那里询问和寻找其他视频之外,我还尝试查看文档并像这样迷惑代码,但这也不起作用。以下是我使用的代码:

# CODE FROM HERE: https://github.com/analyticswithadam/Python/blob/main/Pull_all_Comments_and_Replies_for_YouTube_Playlists.ipynb

from googleapiclient.discovery import build
import pandas as pd
import getpass

api_key = "API key here"
playlist_ids = ['Youtube Playlist Link Here']

# Build the YouTube client
youtube = build('youtube', 'v3', developerKey=api_key)

def get_all_video_ids_from_playlists(youtube, playlist_ids):
    all_videos = []  # Initialize a single list to hold all video IDs

    for playlist_id in playlist_ids:
        next_page_token = None

        # Fetch videos from the current playlist
        while True:
            playlist_request = youtube.playlistItems().list(
                part='contentDetails',
                playlistId=playlist_id,
                maxResults=50,
                pageToken=next_page_token)
            playlist_response = playlist_request.execute()

            all_videos += [item['contentDetails']['videoId'] for item in playlist_response['items']]

            next_page_token = playlist_response.get('nextPageToken')

            if next_page_token is None:
                break

    return all_videos

# Fetch all video IDs from the specified playlists
video_ids = get_all_video_ids_from_playlists(youtube, playlist_ids)

# Now you can pass video_ids to the next function
# next_function(video_ids)

'''
# Broken-ass title code

def get_vid_title(youtube, video_id):  # Added video_id as an argument
    all_titles = []
    next_page_token = None

    while True:
        title_request = youtube.channels().list(
            part="snippet",
            videoId=video_id,
            pageToken=next_page_token,
            textFormat="plainText",
            maxResults=100
        )
        title_response = title_request.execute()

        for item in title_response['items']:
            vid_title = item['snippet']['title']
            all_titles.append({
                'Title': vid_title['title']
            })
            print(vid_title['title'])

    return all_titles 
'''

# Function to get replies for a specific comment
def get_replies(youtube, parent_id, video_id):  # Added video_id as an argument
    replies = []
    next_page_token = None

    while True:
        reply_request = youtube.comments().list(
            part="snippet",
            parentId=parent_id,
            textFormat="plainText",
            maxResults=100,
            pageToken=next_page_token
        )
        reply_response = reply_request.execute()

        for item in reply_response['items']:
            comment = item['snippet']
            replies.append({
                'Timestamp': comment['publishedAt'],
                'Username': comment['authorDisplayName'],
                'VideoID': video_id,
                'Comment': comment['textDisplay'],
                'likeCount': comment['likeCount'],
                'Date': comment['updatedAt'] if 'updatedAt' in comment else comment['publishedAt']
            })

        next_page_token = reply_response.get('nextPageToken')
        if not next_page_token:
            break

    return replies


# Function to get all comments (including replies) for a single video
def get_comments_for_video(youtube, video_id):
    all_comments = []
    next_page_token = None

    while True:
        comment_request = youtube.commentThreads().list(
            part="snippet",
            videoId=video_id,
            pageToken=next_page_token,
            textFormat="plainText",
            maxResults=100
        )
        comment_response = comment_request.execute()

        for item in comment_response['items']:
            top_comment = item['snippet']['topLevelComment']['snippet']
            all_comments.append({
                'Timestamp': top_comment['publishedAt'],
                'Username': top_comment['authorDisplayName'],
                'VideoID': video_id,  # Directly using video_id from function parameter
                'Comment': top_comment['textDisplay'],
                'likeCount': top_comment['likeCount'],
                'Date': top_comment['updatedAt'] if 'updatedAt' in top_comment else top_comment['publishedAt']
            })

            # Fetch replies if there are any
            if item['snippet']['totalReplyCount'] > 0:
                all_comments.extend(get_replies(youtube, item['snippet']['topLevelComment']['id'], video_id))

        next_page_token = comment_response.get('nextPageToken')
        if not next_page_token:
            break

    return all_comments

# List to hold all comments from all videos
all_comments = []


for video_id in video_ids:
    video_comments = get_comments_for_video(youtube, video_id)
    all_comments.extend(video_comments)

# Create DataFrame
comments_df = pd.DataFrame(all_comments)


# Export whole dataset to the local machine as CSV File
csv_file = 'comments_data.csv'  # Name your file
comments_df.to_csv(csv_file, index=False)

任何帮助将不胜感激。

python python-3.x web-scraping youtube-api youtube-data-api
1个回答
0
投票

get_all_video_ids_from_playlists
使用:

part='snippet,contentDetails'

在这一行:

all_videos += [item['contentDetails']['videoId'] for item in playlist_response['items']]

更改如下:

for pl_item in playlist_response 
    all_videos.append({"Video_ID": item['contentDetails']['videoId'], {"Title" : {item['snippet']['title']}})

然后,在这一行:

for video_id in video_ids:
    video_comments = get_comments_for_video(youtube, video_id)

您正在通过

item
- 即
{"Video_ID: "xxxx", "Title": "xxxx"}
:

最后,关于你的

get_comments_for_video
功能:

# Function to get all comments (including replies) for a single video
def get_comments_for_video(youtube, video_id):

用途:

videoId=video_id["Video_ID"]

和:

'VideoID': video_id["Video_ID"],  # Directly using video_id from function parameter
'Title': video_id["Title"], # The title of the video.
© www.soinside.com 2019 - 2024. All rights reserved.