2016-11-08

How to fix Python SSL errors when downloading https pages

This blog post explains how to fix Python SSL errors when downloading web pages over the https:// protocol in Python (e.g. by using urllib, urllib2, httplib or requests). It was written because many other online sources don't give direct and useful advice on how to fix the errors below.

How to fix SSL23_GET_SERVER_HELLO unknown protocol

This error looks like (possibly with a line number different from 504):
    self._sslobj.do_handshake()
SSLError: [Errno 1] _ssl.c:504: error:140770FC:SSL routines:SSL23_GET_SERVER_HELLO:unknown protocol

The fix is to upgrade your Python:

  • If you are using Python 2, upgrade to at least 2.7.7. Upgrading to the latest release (currently 2.7.11) is recommended though, or at least to 2.7.9, which has a backport of the Python 3 ssl module (including the ssl.SSLContext customizations from 3.4.x). I have tested that the error above disappears when upgrading from 2.7.6 to 2.7.7. If you can't easily upgrade the Python 2 on the target system, you may want to try StaticPython on Linux (the stacklessco2.7-static and stacklessxx2.7-static binaries have OpenSSL and a recent enough Python) or PyRun on Linux, macOS, FreeBSD and other Unix systems.
  • If you are using Python 3, upgrade to at least 3.4.3. It's recommended to upgrade to the latest (3.5.2 or later) though.
  • If you are unable to upgrade from Python 2.6.x or 2.7.x, try this workaround, which works in some cases (e.g. on Ubuntu 10.04) and with some websites:
    import ssl
    from functools import partial
    ssl.wrap_socket = partial(ssl.wrap_socket, ssl_version=ssl.PROTOCOL_TLSv1)

    There is a similar workaround for ssl.sslwrap_simple which also affects socket.ssl.

  • If you are unable to upgrade from Python 2.6.x, 2.7.x, 3.2.x or 3.3.x, use backports.ssl.
  • If you are unable to upgrade from Python 1.x – 2.5.x or 3.0.x – 3.1.x, then probably there is no easy fix for you.

Typically it's not necessary to upgrade your OpenSSL library just to fix this error: ancient versions such as OpenSSL 0.9.8k (released on 2009-03-25) also work if Python is upgraded. The latest release from the 0.9.8 series (currently 0.9.8zh), from the 1.0 series or from the 1.1 series should all work. But if you have an easy option to upgrade OpenSSL, then upgrade to at least the latest LTS (long-term-support) version (currently 1.0.2j).
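Before deciding what to upgrade, it helps to know which Python and which OpenSSL the failing script is actually using. The following is a minimal sketch for that; the function name describe_ssl_environment is made up for this example, and it assumes a Python whose ssl module has ssl.OPENSSL_VERSION (2.7 or 3.2 and later).

```python
import ssl
import sys


def describe_ssl_environment():
    """Return the Python and OpenSSL versions this interpreter is using."""
    return {
        "python": "%d.%d.%d" % tuple(sys.version_info[:3]),
        # Version string of the OpenSSL library the ssl module is linked against.
        "openssl": ssl.OPENSSL_VERSION,
        # True when the interpreter has ssl.SSLContext (2.7.9+ or 3.2+).
        "has_sslcontext": hasattr(ssl, "SSLContext"),
    }


if __name__ == "__main__":
    for key, value in sorted(describe_ssl_environment().items()):
        print("%s: %s" % (key, value))
```

Run it with the same python binary that runs the failing script, because a system can have several Python installations linked against different OpenSSL versions.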

How to fix SSL CERTIFICATE_VERIFY_FAILED

This error looks like (possibly with a line number different from 590):
    self._sslobj.do_handshake()
SSLError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:590)

Python started verifying server certificates by default only recently (in 2.7.9). This protects against man-in-the-middle attacks, because it lets the client be sure that the server is indeed who it claims to be.
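To see what this default means in practice, you can inspect the context returned by ssl.create_default_context(). This is only an illustrative sketch, assuming Python 2.7.9+ or 3.4+ (where that function exists); the variable names are made up for the example.

```python
import ssl

# create_default_context() is available in Python 2.7.9+ and 3.4+.
context = ssl.create_default_context()

# With the default settings, the server must present a certificate that
# chains to a trusted root, and the certificate must match the hostname.
verifies_certificate = context.verify_mode == ssl.CERT_REQUIRED
verifies_hostname = context.check_hostname

print("certificate required: %s" % verifies_certificate)
print("hostname checked: %s" % verifies_hostname)
```

If either check fails during the handshake, the connection is aborted with an SSLError such as the one above.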

As a quick (and insecure) fix, you can turn certificate verification off in either of these ways:

  • Set the PYTHONHTTPSVERIFY environment variable to 0 before the ssl module is loaded, e.g. run export PYTHONHTTPSVERIFY=0 before you start the Python script.
  • (alternatively) Add this to your code before doing the https:// request (it affects all requests from then on):
    import os, ssl
    if (not os.environ.get('PYTHONHTTPSVERIFY', '') and
        getattr(ssl, '_create_unverified_context', None)):
      ssl._create_default_https_context = ssl._create_unverified_context
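If you would rather not switch verification off for the whole process, a narrower (but still insecure) variant is to pass an unverified context to a single request. The sketch below assumes Python 2.7.9+ or 3.4.3+, where urlopen accepts a context argument; the helper names make_unverified_context and fetch_insecurely are made up for this example.

```python
import ssl


def make_unverified_context():
    """Build an SSLContext that skips certificate and hostname checks (insecure)."""
    context = ssl.create_default_context()
    # check_hostname must be disabled before verify_mode can be set to CERT_NONE.
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    return context


def fetch_insecurely(url):
    """Download url without verifying the server certificate."""
    try:
        from urllib.request import urlopen  # Python 3
    except ImportError:
        from urllib2 import urlopen  # Python 2.7.9+
    response = urlopen(url, context=make_unverified_context())
    try:
        return response.read()
    finally:
        response.close()
```

Only the requests made through fetch_insecurely skip verification this way; other https:// requests in the same program keep the default behavior.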

The proper, secure fix though is to install the latest root certificates on your computer, in a directory where the OpenSSL library used by Python finds them. Your operating system may be able to do it conveniently for you. For example, on Ubuntu 14.04, running this usually fixes it: sudo apt-get update && sudo apt-get install ca-certificates
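To find out where the OpenSSL library used by your Python looks for root certificates, you can ask the ssl module. This is a small sketch, assuming Python 2.7.9+ or 3.4+ (where ssl.get_default_verify_paths() exists); the report helper is made up for the example.

```python
import os
import ssl

# get_default_verify_paths() is available in Python 2.7.9+ and 3.4+.
paths = ssl.get_default_verify_paths()


def report(label, path):
    """Print a path and whether it exists on this machine."""
    exists = bool(path) and os.path.exists(path)
    print("%s: %s (%s)" % (label, path, "exists" if exists else "missing"))
    return exists


# Locations compiled into the OpenSSL library that Python is linked against.
found_cafile = report("CA bundle file", paths.openssl_cafile)
found_capath = report("CA directory", paths.openssl_capath)
```

If both locations are reported as missing, that is a likely reason for CERTIFICATE_VERIFY_FAILED, and installing the root certificates to one of them (or letting your operating system's package do it) is the thing to try.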