Use Python to Scrape Data on a YouTube Video

If you are researching a YouTube video, there are two good Python scripts for scraping data on it. To use these scripts (or any script that scrapes YouTube data), you first need a YouTube API key. This article walks through getting a YouTube API key, then finding and running the two Python scripts on a given video.

Getting a YouTube API key:

First, go to “https://console.developers.google.com/apis/credentials”.

Click to agree to the terms of service, and then click on “agree and continue”.

Next, click “select a project”.

Then click on “new project”.

On the next screen, click on “create”, and then on the following screen click on “select project”. On the left, click on “api & services” and then “credentials”.

Then click on “create credentials” and then “API key”.

You are not done yet. Next, click on “library” on the left, and on the next screen scroll down to YouTube and click on “youtube data api v3”.

On the next screen, click “enable”. Now your API key has access to YouTube data!

To return to your API key, click on “credentials” on the left and you are brought to a page that shows your key.

The API documentation is located here: “https://developers.google.com/youtube/v3/docs”
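If you want to check that the new key works before setting up either script, a small request against the API's videos endpoint is one way to do it. The sketch below is my own illustration, not part of either script: the function names are made up for this example, and “YOUR_API_KEY” and “VIDEO_ID” are placeholders you would replace with your own values.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://www.googleapis.com/youtube/v3/videos"


def build_videos_url(api_key, video_ids, parts=("snippet", "statistics")):
    """Build a videos.list request URL for one or more video IDs."""
    query = urllib.parse.urlencode({
        "part": ",".join(parts),
        "id": ",".join(video_ids),
        "key": api_key,
    })
    return f"{API_URL}?{query}"


def fetch_videos(api_key, video_ids):
    """Call the API and return the parsed JSON response (needs network access)."""
    with urllib.request.urlopen(build_videos_url(api_key, video_ids)) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # "YOUR_API_KEY" and "VIDEO_ID" are placeholders, not real values.
    print(build_videos_url("YOUR_API_KEY", ["VIDEO_ID"]))
```

With a real key and ID, calling fetch_videos should return a dictionary whose “items” list holds one entry per video. If the request fails with an HTTP error instead, the key is probably wrong or the YouTube Data API v3 has not been enabled for the project.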

Python Script for Scraping Video Info

There is a Python script on GitHub that will scrape the video’s info and present it in a CSV or Excel file for you.

To find and run the script, first follow the setup instructions here: “https://github.com/lamthuyvo/social-media-data-scripts/blob/master/README.md”

You will need git. This page shows how to install it: “https://git-scm.com/book/en/v2/Getting-Started-Installing-Git”

Then go to Terminal or Command Prompt and type the following 3 commands (without the quotes):

“git clone https://github.com/lamthuyvo/social-media-data-scripts.git”

“cd social-media-data-scripts”

“pip install -r requirements.txt”

Then, in Terminal / Command Prompt, navigate to “social-media-data-scripts/01-apis/scripts/”

Find the file named “secrets.py.example”, put your YouTube API key in it where it says “youtube api key =”, and then rename the file to “secrets.py”.

Now you need the video ID of the video you are researching. In a standard watch URL, the ID is the value that follows “v=”. See the screenshot below for an explanation of how to find the ID.

The screenshot is from “https://gist.github.com/jakebellacera/d81bbf12b99448188f183141e6696817”
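If you have a lot of links to go through, you can also pull the ID out with a few lines of Python. This is a minimal sketch of my own, assuming the link is a full URL in one of the common forms (watch, youtu.be, embed or shorts). The function name extract_video_id and the ID “AbCdEfGhIjK” are made up for the example.

```python
from urllib.parse import urlparse, parse_qs


def extract_video_id(url):
    """Return the video ID from a common YouTube URL form, or None."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    if host == "youtu.be":
        # Short links: https://youtu.be/<id>
        return parsed.path.lstrip("/").split("/")[0] or None
    if host in ("youtube.com", "m.youtube.com"):
        if parsed.path == "/watch":
            # Standard links: https://www.youtube.com/watch?v=<id>
            return parse_qs(parsed.query).get("v", [None])[0]
        for prefix in ("/embed/", "/shorts/"):
            if parsed.path.startswith(prefix):
                return parsed.path[len(prefix):].split("/")[0] or None
    return None


if __name__ == "__main__":
    # "AbCdEfGhIjK" is a made-up ID used only for illustration.
    print(extract_video_id("https://www.youtube.com/watch?v=AbCdEfGhIjK&t=10s"))
```

Links without “https://” in front, or links from other sites, return None in this sketch, so it is worth checking the result before pasting it into a script.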

Next, open the Python script “youtube-get-video-info.py” and enter the video ID on line 12, where it says “video_ids =”.

Run the script with the ID of the video (or videos) that you are researching, and an Excel file will appear in the “output” folder inside the “01-apis” folder. The Excel file will have the video’s information for each of the following parameters:

  • youtube_id,
  • publishedAt,
  • channelId,
  • channelTitle,
  • title,
  • description,
  • tags,
  • viewCount,
  • likeCount,
  • dislikeCount,
  • favoriteCount,
  • commentCount, and
  • topicCategories
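To see where these columns come from, it helps to know that the API returns each video as a nested record: most of the fields above sit under “snippet”, the counts sit under “statistics”, and topicCategories sits under “topicDetails”. The sketch below is my own illustration of flattening one such record into a single row. It is not the code from the GitHub script, and the “|” separator for tags and topics is an arbitrary choice made for this example.

```python
import csv
import io

FIELDS = [
    "youtube_id", "publishedAt", "channelId", "channelTitle", "title",
    "description", "tags", "viewCount", "likeCount", "dislikeCount",
    "favoriteCount", "commentCount", "topicCategories",
]


def flatten_video_item(item):
    """Flatten one item of a videos.list response into a single row."""
    snippet = item.get("snippet", {})
    stats = item.get("statistics", {})
    topics = item.get("topicDetails", {})
    row = {"youtube_id": item.get("id", "")}
    for key in ("publishedAt", "channelId", "channelTitle", "title", "description"):
        row[key] = snippet.get(key, "")
    # Lists are joined so that each one fits in a single spreadsheet cell.
    row["tags"] = "|".join(snippet.get("tags", []))
    for key in ("viewCount", "likeCount", "dislikeCount", "favoriteCount", "commentCount"):
        # The API may leave a count out of the response, so default to "".
        row[key] = stats.get(key, "")
    row["topicCategories"] = "|".join(topics.get("topicCategories", []))
    return row


def rows_to_csv(rows):
    """Serialize rows to CSV text with the columns in FIELDS order."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Any count that is missing from the response simply ends up as an empty cell, which is something to keep in mind if a column in your own output file turns out to be blank.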

Python Script to Scrape a Video’s Comments

Also, if you want to scrape all of the comments from a video, you can go here: “https://github.com/Jabrils/Download-All-YouTube-Comments-From-Any-Video?files=1”

Then go to the script “DumpAllComments.py” here: “https://github.com/Jabrils/Download-All-YouTube-Comments-From-Any-Video/blob/master/DumpAllComments.py”

To run the script, you need to install pytube by typing “pip install pytube” into Terminal.

Copy and paste the script into Sublime Text or whichever code editor you are using, and put your API key and the video ID in lines 7 and 8 as shown below:

Run the script and you will have a TSV (tab-separated values) file in the same folder as the script, containing all of the video’s comments.
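If you are curious how comments can be collected, the API hands them out one page at a time (up to 100 per page), along with a token for the next page. The sketch below is my own rough illustration of that loop, not the code from DumpAllComments.py. The function names and the two TSV column names are made up for this example, and it only collects top-level comments, not replies.

```python
import csv
import json
import urllib.parse
import urllib.request

COMMENTS_URL = "https://www.googleapis.com/youtube/v3/commentThreads"


def fetch_page(api_key, video_id, page_token=None):
    """Request one page of comment threads (needs network access)."""
    params = {
        "part": "snippet",
        "videoId": video_id,
        "maxResults": 100,  # the largest page size the API allows
        "textFormat": "plainText",
        "key": api_key,
    }
    if page_token:
        params["pageToken"] = page_token
    url = f"{COMMENTS_URL}?{urllib.parse.urlencode(params)}"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def iter_comments(api_key, video_id, fetch=fetch_page):
    """Yield (author, text) for every top-level comment, page by page."""
    token = None
    while True:
        page = fetch(api_key, video_id, token)
        for item in page.get("items", []):
            top = item["snippet"]["topLevelComment"]["snippet"]
            yield top["authorDisplayName"], top["textDisplay"]
        # Keep going until the API stops returning a next-page token.
        token = page.get("nextPageToken")
        if not token:
            break


def write_tsv(rows, path):
    """Write (author, text) rows to a tab-separated file."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle, delimiter="\t")
        writer.writerow(["author", "comment"])
        writer.writerows(rows)
```

For example, write_tsv(iter_comments("YOUR_API_KEY", "VIDEO_ID"), "comments.tsv") would write the comments to a file named “comments.tsv”, where both arguments in capitals are placeholders for your own key and video ID. The fetch argument is there so the loop can be tried with canned pages instead of live requests.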

Alternatively, you can click on the “gitpod” button below the script on GitHub and run the script in Gitpod. This is pretty simple. First, install pytube by typing “pip install pytube” in the terminal (at the bottom of the screen in the picture below). Then click on the file “dumpallcomments.py” on the left and enter the API key and video ID into the script.

Save the file and then click on the little green arrow on the top right.

A file with your comments will then appear above “dumpallcomments.py”. The file contents will look like the screenshot below:

That’s it! You are done!
