When you use Python to download a file served through the CloudFlare CDN, you must send a browser-like “User-Agent” header and also support SSLv3 connections. Otherwise you will get a file with the name you requested, but with the HTML below inside:

Error 1010 Ray ID: 29xxxxxxxxxed • 2016-05-08 19:20:51 UTC

Access denied

What happened?

The owner of this website (website.domain.org) has banned your access based on your browser’s signature (xxxxxxxxxxxxxxxx-ua47).

I tested implementations using urllib, urllib2, urllib3 and requests (which is built on urllib3). With all of them I ran into trouble: either I had to install too many extra packages, or there was no good support for SSLv3.

The solution was to use PycURL. If you use Tornado and consume any API with its curl-based asynchronous HTTP client, you probably already have it installed.

Check the example below:

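[code language="python"]
import pycurl

# Open the destination in binary mode so the image bytes are written unchanged.
with open('testimage.jpg', 'wb') as f:
    c = pycurl.Curl()
    # Send a browser-like User-Agent; the default one is what gets rejected.
    c.setopt(c.USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 6.1; it; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3 (.NET CLR 3.5.30729)')
    c.setopt(c.URL, 'https://website-in-cloudflare-cdn.domain.extension/imagex.jpg')
    # Stream the response body straight into the open file.
    c.setopt(c.WRITEDATA, f)
    c.perform()
    c.close()
[/code]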
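
Because the failure is silent (you still get a file with the right name, just with HTML inside), it can help to check the response before saving it. The snippet below is only a minimal sketch of that idea, not a definitive implementation. The helper names `looks_like_html` and `download_checked` are hypothetical, and it assumes that a blocked request comes back with a non-200 status and/or an HTML body, which I have not verified for every CloudFlare configuration.

```python
from io import BytesIO


def looks_like_html(payload: bytes) -> bool:
    """Heuristic: True if the payload starts like an HTML page, not a binary file."""
    head = payload.lstrip()[:64].lower()
    return head.startswith((b"<!doctype html", b"<html"))


def download_checked(url: str, path: str, user_agent: str) -> None:
    """Download url to path, refusing to save what looks like an error page."""
    import pycurl  # imported lazily so looks_like_html works without PycURL

    buffer = BytesIO()
    c = pycurl.Curl()
    try:
        c.setopt(c.URL, url)
        c.setopt(c.USERAGENT, user_agent)
        c.setopt(c.WRITEDATA, buffer)  # collect the body in memory first
        c.perform()
        status = c.getinfo(c.RESPONSE_CODE)
    finally:
        c.close()

    payload = buffer.getvalue()
    # Assumption: a blocked request yields a non-200 status and/or an HTML body.
    if status != 200 or looks_like_html(payload):
        raise RuntimeError("unexpected response (HTTP %d), nothing saved" % status)

    with open(path, "wb") as f:
        f.write(payload)
```

You would call `download_checked` with the same URL and User-Agent string as in the example above. The HTML check only makes sense for binary downloads such as images; if the file you want is itself an HTML page, rely on the status code alone.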

After this I found out (in here) that there is a setting for this if you are the site owner. The website admin can turn this feature off by doing the following:
Settings->CloudFlare Settings->Browser Integrity Check->Toggle Off.