The scraper is written in Python and uses the Reddit API through PRAW. Learn how to scrape data from any subreddit on Reddit, including comments, votes, and submissions, and save the data to Google Sheets. Reddit offers a fairly extensive API that any developer can use to pull data from subreddits.

Reddit data in BigQuery: for those who do not know what BigQuery is, Google BigQuery is an enterprise data warehouse that enables super-fast SQL queries using the processing power of Google's infrastructure. The best part is that querying this data would be free.

Scraping Reddit comments works in a very similar way. submission_id is the small alphanumeric code that Reddit uses to uniquely identify submissions. I believe some of it … Someone back in 2015 used the Reddit API to scrape all past comments.

    # Older PRAW (pre-4.0) interface
    import praw
    r = praw.Reddit('Comment parser example by u/_Daimon_')
    subreddit = r.get_subreddit("python")
    comments = subreddit.get_comments()

However, this returns only the most recent 25 comments.

First, you need to understand that Reddit allows you to convert any of its pages into JSON output. You can do this by simply adding ".json" to the end of any Reddit URL.
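The ".json" trick described above can be sketched with the standard library alone. This is a minimal illustration, not the method of any tool mentioned here: the helper names (to_json_url, fetch_json, iter_comment_bodies) and the user agent string are made up for the example, and the parser assumes the usual thread layout in which comments are "t1" items nested inside "Listing" objects.

```python
import json
import urllib.request
from urllib.parse import urlsplit, urlunsplit


def to_json_url(url):
    """Append ".json" to the path of a Reddit URL, keeping any query string."""
    parts = urlsplit(url)
    path = parts.path.rstrip("/")
    if not path.endswith(".json"):
        path += ".json"
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


def fetch_json(url, user_agent="comment-scraper-sketch/0.1"):
    """Download a Reddit page as JSON (needs network access).

    The user agent is a placeholder; pick a descriptive one of your own.
    """
    request = urllib.request.Request(to_json_url(url), headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def iter_comment_bodies(node):
    """Recursively yield comment bodies from the JSON of a thread."""
    if isinstance(node, list):
        for item in node:
            yield from iter_comment_bodies(item)
    elif isinstance(node, dict):
        kind, data = node.get("kind"), node.get("data", {})
        if kind == "Listing":
            yield from iter_comment_bodies(data.get("children", []))
        elif kind == "t1":  # "t1" marks a comment
            yield data.get("body", "")
            # "replies" is an empty string when a comment has no replies
            yield from iter_comment_bodies(data.get("replies") or [])
```

With network access, something like list(iter_comment_bodies(fetch_json(thread_url))) should give the visible comment bodies of a thread; collapsed "load more" stubs are skipped rather than expanded.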

See the wyattshapiro/reddit_scraper repository on GitHub. These lists are where the posts and comments of the Reddit threads we scrape will be stored. To do this, we need to use some more Python.

Scraping Reddit Comments

To download the comments of a single thread in the /r/todayilearned/ subreddit, you can use reddit_url() to get the URLs of all threads, extract, for example, the most commented thread (I intentionally chose the second most commented thread), and then download the comments with reddit_content().

Hey Pompe, Reddit's API gives you about one request per second, which seems pretty reasonable for small-scale projects, or even for bigger projects if you build the backend to limit the requests and store the data yourself (either in a cache or in your own DB).
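The advice above about limiting requests to roughly one per second and storing results yourself could look something like the following. It is only a sketch under assumptions: ThrottledCache is a hypothetical helper, the one-second interval is taken from the comment above rather than from Reddit's documentation, and the cache is a plain in-memory dict rather than a database.

```python
import time


class ThrottledCache:
    """Call `fetch` at most once per `min_interval` seconds and cache results by key."""

    def __init__(self, fetch, min_interval=1.0, clock=time.monotonic, sleep=time.sleep):
        self._fetch = fetch              # any function that takes a key (e.g. a URL)
        self._min_interval = min_interval
        self._clock = clock              # injectable so the class can be tested
        self._sleep = sleep
        self._last_call = None
        self._cache = {}

    def get(self, key):
        # Cached results cost no request and no waiting.
        if key in self._cache:
            return self._cache[key]
        # Wait until at least `min_interval` has passed since the last real request.
        if self._last_call is not None:
            wait = self._min_interval - (self._clock() - self._last_call)
            if wait > 0:
                self._sleep(wait)
        self._last_call = self._clock()
        result = self._fetch(key)
        self._cache[key] = result
        return result
```

Wrapping whatever download function you use, for example ThrottledCache(download).get(url), keeps repeated lookups of the same thread from counting against the request budget.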

To scrape a given Reddit thread, run

    scrape_comments -u reddit-username -p reddit-password submission_id

Here reddit-username and reddit-password are your actual Reddit username and password, respectively. For usage information, run

    scrape_comments --help

You'll need the Reddit API if you want things truly current, such as within the past few days. RedditExtractoR provides an easy way to access Reddit comments and statistics.

Scrape a Reddit post submission for comments and save comments to JSON file. First, we will choose a specific post we'd like to scrape. I have searched online and consulted Reddit's API documentation, and I have managed to scrape the top 999 posts from a subreddit. I have also separately managed to scrape the comments of a post, but that operation needs the URL of the post.
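As a rough sketch of scraping one submission's comments and saving them to a JSON file, the following uses the current PRAW interface (reddit.submission, comments.replace_more, comments.list). It is not the scrape_comments tool itself. The helper names, the output file name, and the credential placeholders are assumptions made for illustration; PRAW is a third-party package and is only imported inside main().

```python
import json
import re


def submission_id_from_url(url):
    """Pull the submission ID out of a Reddit thread URL, or return None."""
    match = re.search(r"/comments/([A-Za-z0-9]+)", url)
    return match.group(1) if match else None


def comment_to_record(comment):
    """Convert a PRAW comment (or any object with the same attributes) to a dict."""
    return {
        "id": comment.id,
        # author is None when the account has been deleted
        "author": str(comment.author) if comment.author is not None else None,
        "body": comment.body,
        "score": comment.score,
    }


def scrape_comments(reddit, submission_id):
    """Return every loaded comment of one submission as a list of dicts."""
    submission = reddit.submission(id=submission_id)
    # Drop the "load more comments" stubs instead of fetching them.
    submission.comments.replace_more(limit=0)
    return [comment_to_record(c) for c in submission.comments.list()]


def save_json(records, path):
    """Write the records to `path` as pretty-printed UTF-8 JSON."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(records, handle, indent=2, ensure_ascii=False)


def main(thread_url, out_path="comments.json"):
    """Hypothetical entry point; the credentials below are placeholders."""
    import praw  # third-party: pip install praw

    reddit = praw.Reddit(
        client_id="YOUR_CLIENT_ID",
        client_secret="YOUR_CLIENT_SECRET",
        user_agent="comment-scraper-sketch/0.1",
    )
    save_json(scrape_comments(reddit, submission_id_from_url(thread_url)), out_path)
```

Passing limit=0 to replace_more keeps the request count low at the cost of skipping collapsed comments; passing limit=None should fetch them all, which is slower on large threads.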