9Gag Botblocker | Webscraper
I’ve been a 9Gag user for as long as I can remember, but over the last year or so the site has been flooded by bots reposting the same content over and over again. Since the admins of the site don’t take care of this problem, I decided to take matters into my own hands.
That’s why I created this web scraper, which identifies those bot accounts and automatically blocks them. Here is how it works: it loads the last 500 posts (this can be changed to any number) on the Fresh page and counts how many of those posts come from the same user. If a user has posted more than 4 times in the last 500 posts, the account is marked as a bot and BLOCKED!
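The counting rule described above can be sketched in a few lines of Python. This is only an illustration, not the code from the project: the function name find_suspected_bots and the max_posts parameter are made up for this example, and it assumes the scraper has already collected one username per loaded post.

```python
from collections import Counter


def find_suspected_bots(post_authors, max_posts=4):
    """Return usernames that appear more than max_posts times.

    post_authors: one username per scraped post (e.g. the last 500 posts).
    max_posts: hypothetical parameter name; 4 matches the rule in the text.
    """
    counts = Counter(post_authors)
    # Strictly more than max_posts, matching "more than 4 times".
    return sorted(user for user, n in counts.items() if n > max_posts)


# Example with made-up usernames: only "spam_account" crosses the threshold.
authors = ["spam_account"] * 5 + ["regular_user"] * 4 + ["lurker"]
print(find_suspected_bots(authors))
```

Keeping the threshold as a parameter makes it easy to tune if real users get flagged by accident.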
Mistakes I made in the beginning:
1. I thought it was a smart idea to reuse a web scraper I had made in the past for an exercise. However, that one used BeautifulSoup, which is fine for static sites. 9Gag loads its content dynamically, meaning you have to scroll to load more posts, so BeautifulSoup wouldn’t work for this project.
2. My initial plan was to make a web scraper that only flags the bot accounts so that I could block them by hand. Then a little voice said: why don’t you just AUTOMATE it? I could have thought of that sooner and planned it out better.
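To illustrate the dynamic loading problem from point 1, here is a rough sketch of a scroll loop. It assumes a Selenium-style browser driver with an execute_script method, which is an assumption on my part for this example and not necessarily what the project uses. The names scroll_to_load and count_loaded are hypothetical; count_loaded stands for whatever function counts the posts currently on the page.

```python
import time


def scroll_to_load(driver, count_loaded, target=500, pause=1.0, max_scrolls=200):
    """Scroll down until enough posts are loaded or the page stops growing.

    driver: assumed to expose execute_script(), like a Selenium WebDriver.
    count_loaded: hypothetical callable returning the number of loaded posts.
    """
    last_height = driver.execute_script("return document.body.scrollHeight")
    for _ in range(max_scrolls):
        if count_loaded(driver) >= target:
            break
        # Jump to the bottom so the site requests the next batch of posts.
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(pause)  # give the new posts time to load
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            break  # page did not grow, so stop instead of looping forever
        last_height = new_height
    return count_loaded(driver)
```

The height check is there so the loop gives up when nothing new loads, rather than scrolling endlessly.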
Things I want to adjust in the future:
For now it only works in Chrome; I also want it to work in other browsers. It also only works if you have an ad blocker active. Otherwise the scraper loads an ad after a while and then gets stuck.
Read more about it on my GitHub.
Author: Jim Versteeg
Date: January 21, 2025