Scraping the web with ♥ Python ♥

I have a bunch of reference folders where I dump stuff that inspires me. I use those images as my screensaver, which pops up after 3 minutes or so. Once in a while, I go on an inspiration-finding frenzy and fill those folders up with new stuff.

Here are some of my sources (but not all):
National Geographic POD
Flickr explore last 7 days
Inspire Me Now – Random
Astronomy POD
Earth POD
ConceptArt.org

My favorite must be National Geographic POD. Amazing photography, every single day. But instead of going through the manual labor of saving that picture to my ref folder every day (you lazy bastard!), I thought I’d automate it.

Here’s the script I wrote in Python that visits the NPOD (National Geographic Photo of the Day) page each time my PC boots up and saves the image to my reference folder. You’ll need to get the BeautifulSoup Python module, which greatly simplifies the code.

import os
import sys
import time
import urllib2
from urllib import urlretrieve
from BeautifulSoup import BeautifulSoup

# Make sure we have an internet connection (10 attempts)
#  because we might not be connected when this script runs at startup
for i in range(1, 11):
	try:
		print "Connection to interwebs attempt:", i
		urllib2.urlopen("http://google.com", timeout=2)
		print "...success!"
		break
	except Exception:
		print "...failed! Retrying."
		time.sleep(2)
else:
	# All 10 attempts failed, so there is nothing to download
	print "No connection, giving up"
	sys.exit(1)

# Find today's image in the html doc
url = "http://photography.nationalgeographic.com/photography/photo-of-the-day"
soup = BeautifulSoup(urllib2.urlopen(url).read())
npod_image = None
try:
	# Is there a high resolution wallpaper of today's image available?
	npod_image = soup('div', {'class' : 'download_link'})[0].a['href']
except Exception:
	# The div, its link or the href attribute is missing
	print "Getting wallpaper image failed"
	try:
		# No wallpaper, get the picture
		npod_image = soup('div', {'class' : 'primary_photo'})[0].a.img['src']
	except Exception:
		print "Getting main image failed"
		sys.exit(1)

# Save the image next to this script
script_path = os.path.abspath(os.path.dirname(__file__))
npod_image_dest = os.path.join(script_path, os.path.basename(npod_image))
print "Saving", npod_image, "to...\n   ", npod_image_dest
urlretrieve(npod_image, npod_image_dest)
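The connection check at the top of the script could also be pulled out into a small helper. Here is a minimal sketch of that idea, written for Python 3 rather than the Python 2 used above. The names `wait_for_connection`, `probe` and `probe_google` are made up for this example and are not part of the original script. The probe is passed in as an argument so the retry logic can be tried out without touching the network.

```python
import time
import urllib.request


def wait_for_connection(probe, attempts=10, delay=2.0):
    """Call probe() up to `attempts` times; return True on the first success.

    `probe` is any callable that raises an exception while the network is
    unreachable (a hypothetical helper, not part of the original script).
    """
    for attempt in range(1, attempts + 1):
        try:
            probe()
            return True
        except Exception:
            # No point in sleeping after the final failed attempt
            if attempt < attempts:
                time.sleep(delay)
    return False


def probe_google():
    # Same check as the script: try to open a well-known page
    urllib.request.urlopen("http://google.com", timeout=2).close()
```

Calling `wait_for_connection(probe_google)` before the scraping step returns `False` when all attempts fail, so the script can bail out instead of crashing on the first download.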

To make it run at start-up, create a batch file in your startup dir that runs the Python script.

C:\Python27\python.exe "E:\Waldo\Pictures\Reference\scripts\npod_scraper.py"
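Since the batch file runs the script on every boot, the same picture may get fetched more than once a day. Here is a minimal Python 3 sketch of one way to avoid that: derive the destination path from the image URL and skip the download if the file is already there. `dest_for` and `needs_download` are hypothetical names, and stripping the query string is an assumption about what the image URLs might look like; the original script does neither.

```python
import os
from urllib.parse import urlsplit


def dest_for(image_url, folder):
    """Build the local path for an image URL, ignoring any query string."""
    # urlsplit separates the path from the "?query" and "#fragment" parts
    name = os.path.basename(urlsplit(image_url).path)
    if not name:
        raise ValueError("no file name in URL: %r" % image_url)
    return os.path.join(folder, name)


def needs_download(image_url, folder):
    """Return True when the image has not been saved to `folder` yet."""
    return not os.path.exists(dest_for(image_url, folder))
```

With this in place, the save step would only call the download function when `needs_download(npod_image, script_path)` is true.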

I have a bunch of these scripts that pull stuff in that inspires me every day.

I love Python!

Hello world!

After I heard the news, it gave me a reason to renew my portfolio website. And hey, why not add a blog while I’m at it?

Disney hasn’t exactly confirmed they want to close the studio, but that’s just legal bullshizzle. We all know what’s happening here, and there’s nothing we can do to change their minds. Any effort put into saving the studio is useless. Anyway, I’m moving on and I’m definitely staying in Brighton!

The great news is that I now have a full month off, and it’s paid for. I can do whatever I want for 30 days, whooooooo! I’ll probably spend most of it on Athmos, a game I’ve been working on for a while with a bunch of guys I studied with.

The game I'm currently working on (Click to visit the blog)

Also check out my portfolio, there’s lots of stuff to see!