2016-11-08

How to fix Python SSL errors when downloading https pages

This blog post explains how to fix Python SSL errors when downloading web pages using the https:// protocol in Python (e.g. by using urllib, urllib2, httplib or requests). It was written because many other online sources don't give direct and useful advice on how to fix the errors below.

How to fix SSL23_GET_SERVER_HELLO unknown protocol

This error looks like (possibly with a line number different from 504):
    self._sslobj.do_handshake()
SSLError: [Errno 1] _ssl.c:504: error:140770FC:SSL routines:SSL23_GET_SERVER_HELLO:unknown protocol

The fix is to upgrade your Python:

  • If you are using Python 2, upgrade to at least 2.7.7. It's recommended to upgrade to the latest (currently 2.7.12) though, or at least 2.7.9 (which has the backported ssl module, including the ssl.SSLContext customizations from 3.4). I have tested that the error above disappears when upgrading from 2.7.6 to 2.7.7. If you can't easily upgrade the Python 2 on the target system, you may want to try StaticPython on Linux (the stacklessco2.7-static and stacklessxx2.7-static binaries have OpenSSL and a recent enough Python) or PyRun on Linux, macOS, FreeBSD and other Unix systems.
  • If you are using Python 3, upgrade to at least 3.4.3. It's recommended to upgrade to the latest (3.5.2) though.
  • If you are unable to upgrade from Python 2.x, try this workaround; it works in some cases (e.g. on Ubuntu 10.04) and on some websites:
    import ssl
    from functools import partial
    ssl.wrap_socket = partial(ssl.wrap_socket, ssl_version=ssl.PROTOCOL_TLSv1)

    There is a similar workaround for ssl.sslwrap_simple which also affects socket.ssl.

  • If you are unable to upgrade from Python 2.6.x, 3.2.x or 3.3.x, use backports.ssl.
  • If you are stuck on Python 1.x through 2.5, or on 3.0.x through 3.1.x, then there is probably no easy fix for you.

Typically it's not necessary to upgrade your OpenSSL library just to fix this error: ancient versions such as OpenSSL 0.9.8k (released on 2009-03-25) also work if Python is upgraded. The latest release from the 0.9.8 series (currently 0.9.8zh), from the 1.0 series or from the 1.1 series should all work. But if you have an easy option to upgrade, then upgrade to at least the latest LTS (long-term-support) version (currently 1.0.2j).
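
To see which side of these version limits a given interpreter is on, something like the following sketch may help. The function names are made up for this illustration, and the thresholds are simply the minimum versions quoted above (2.7.7 for Python 2, 3.4.3 for Python 3).

```python
import ssl
import sys


def tls_environment():
    """Return the Python and OpenSSL versions this interpreter is using."""
    return {
        "python": tuple(sys.version_info[:3]),
        # Version string of the OpenSSL library the ssl module was built with.
        "openssl": ssl.OPENSSL_VERSION,
        # ssl.SSLContext is present in Python 2.7.9+ and in Python 3.2+.
        "has_sslcontext": hasattr(ssl, "SSLContext"),
    }


def python_new_enough(version=None):
    """Apply the minimums from this post: 2.7.7 for Python 2, 3.4.3 for Python 3."""
    version = tuple(version or sys.version_info[:3])
    minimum = (2, 7, 7) if version[0] == 2 else (3, 4, 3)
    return version >= minimum
```

Calling tls_environment() in the interpreter that produces the error shows at a glance whether it's the Python or the OpenSSL side that is old.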

How to fix SSL CERTIFICATE_VERIFY_FAILED

This error looks like (possibly with a line number different from 590):
    self._sslobj.do_handshake()
SSLError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:590)

Server certificate verification by default has been introduced to Python recently (in 2.7.9). This protects against man-in-the-middle attacks, and it makes the client sure that the server is indeed who it claims to be.

As a quick (and insecure) fix, you can turn certificate verification off in either of these ways:

  • Set the PYTHONHTTPSVERIFY environment variable to 0 before the ssl module is loaded, e.g. run export PYTHONHTTPSVERIFY=0 before you start the Python script.
  • (alternatively) Add this to your code before doing the https:// request (it affects all requests from now on):
    import os, ssl
    if (not os.environ.get('PYTHONHTTPSVERIFY', '') and
        getattr(ssl, '_create_unverified_context', None)):
      ssl._create_default_https_context = ssl._create_unverified_context
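
If turning verification off for the whole process is too broad, a narrower (but still insecure) variant is to build an unverified context and pass it to a single request. This is only a sketch: the helper names are made up for illustration, and it assumes Python 2.7.9+ or 3.4.3+, where urlopen accepts a context argument.

```python
import ssl


def make_unverified_context():
    """Build an SSL context that skips certificate checks (insecure)."""
    context = ssl.create_default_context()
    # check_hostname has to be switched off before verify_mode can be CERT_NONE.
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    return context


def fetch_insecure(url):
    """Download url without verifying the server certificate."""
    try:
        from urllib.request import urlopen  # Python 3
    except ImportError:
        from urllib2 import urlopen  # Python 2
    response = urlopen(url, context=make_unverified_context())
    try:
        return response.read()
    finally:
        response.close()
```

Other requests in the same process keep using the default (verifying) context.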

The proper, secure fix though is to install the latest root certificates on your computer, in a directory where the OpenSSL library used by Python finds them. Your operating system may be able to do it conveniently for you. For example, on Ubuntu 14.04, running this usually fixes it: sudo apt-get update && sudo apt-get install ca-certificates
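
To find out where the OpenSSL library used by Python looks for root certificates, a sketch like this may be useful. The function name is hypothetical; ssl.get_default_verify_paths() is available in Python 2.7.9+ and 3.4+.

```python
import os
import ssl


def ca_locations():
    """Report where OpenSSL looks for root certificates and whether they exist."""
    paths = ssl.get_default_verify_paths()
    return {
        # Resolved locations (None if the file or directory doesn't exist).
        "cafile": paths.cafile,
        "capath": paths.capath,
        # Locations compiled into the OpenSSL library.
        "openssl_cafile": paths.openssl_cafile,
        "openssl_capath": paths.openssl_capath,
        "cafile_exists": bool(paths.cafile and os.path.isfile(paths.cafile)),
        "capath_exists": bool(paths.capath and os.path.isdir(paths.capath)),
    }
```

If both cafile_exists and capath_exists are false, Python probably has no root certificates to verify against, which would explain the CERTIFICATE_VERIFY_FAILED error.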

2016-07-15

How to compile Lepton (JPEG lossless recompressor by Dropbox) without autotools

This blog post explains how to compile Lepton, the recently released JPEG lossless recompressor by Dropbox without autotools on Linux.

You'll need a fairly recent C++ compiler with the development libraries installed. g++-4.4 is too old, g++-4.8 is good enough.

Compile with the following command (without the leading $):

$ git clone https://github.com/dropbox/lepton
$ cd lepton
$ # Optional: git reset --hard e3f6c46502d958aaba17fbe7c218f8ce2b8b3f48
$ g++ -std=c++0x -W -Wall -Wextra -Wno-unused-parameter -Wno-write-strings \
    -msse4.1 \
    -o lepton \
    -fno-exceptions -fno-rtti \
    -s -O2 \
    -DGIT_REVISION=\"fake\" \
    -I./src/vp8/decoder \
    -I./src/vp8/encoder \
    -I./src/vp8/model \
    -I./src/vp8/util \
    dependencies/md5/md5.c \
    src/io/MemMgrAllocator.cc \
    src/io/MemReadWriter.cc \
    src/io/Seccomp.cc \
    src/io/Zlib0.cc \
    src/io/ZlibCompression.cc \
    src/io/ioutil.cc \
    src/lepton/bitops.cc \
    src/lepton/fork_serve.cc \
    src/lepton/idct.cc \
    src/lepton/jpgcoder.cc \
    src/lepton/lepton_codec.cc \
    src/lepton/recoder.cc \
    src/lepton/simple_decoder.cc \
    src/lepton/simple_encoder.cc \
    src/lepton/socket_serve.cc \
    src/lepton/thread_handoff.cc \
    src/lepton/uncompressed_components.cc \
    src/lepton/validation.cc \
    src/lepton/vp8_decoder.cc \
    src/lepton/vp8_encoder.cc \
    src/vp8/decoder/boolreader.cc \
    src/vp8/decoder/decoder.cc \
    src/vp8/encoder/boolwriter.cc \
    src/vp8/encoder/encoder.cc \
    src/vp8/model/JpegArithmeticCoder.cc \
    src/vp8/model/model.cc \
    src/vp8/model/numeric.cc \
    src/vp8/util/billing.cc \
    src/vp8/util/debug.cc \
    src/vp8/util/generic_worker.cc \
    src/vp8/util/memory.cc \
    -lz \
    -lpthread \
;

$ ./lepton --help

It works both on i386 (force it with -m32) and amd64 (force it with -m64) architectures. Other architectures are not supported, because Lepton uses the SSE4.1 CPU instructions.
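
Since the binary needs SSE4.1, it may be worth checking the CPU before running it on another machine. Here is a minimal sketch, assuming a Linux system where /proc/cpuinfo lists the feature as sse4_1 on its flags lines (the function name is made up for this example):

```python
def has_sse41(cpuinfo_text):
    """Return True if any 'flags' line in /proc/cpuinfo text lists sse4_1."""
    for line in cpuinfo_text.splitlines():
        name, _, value = line.partition(":")
        # Each logical CPU has its own "flags : ..." line with space-separated names.
        if name.strip() == "flags" and "sse4_1" in value.split():
            return True
    return False
```

Typical usage would be has_sse41(open('/proc/cpuinfo').read()).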

It also works with -std=c++11, but it doesn't work with -std=c++98.

Lepton also has a copy of zlib embedded; you can compile it with gcc instead of g++, and use it instead of -lz.

These instructions were tested and found working with the following version of Lepton:

$ git rev-parse HEAD
# e3f6c46502d958aaba17fbe7c218f8ce2b8b3f48

2016-06-20

How to back up original photo files as is in Google

This blog post explains how to back up photos on Google Photos and Google Drive as is, keeping the original image files, bit-by-bit identical, without any scaling or reencoding.

TL;DR If you want to keep the original image files, upload the photos to Google Drive, which keeps the original files (bit-by-bit identical as uploaded), and their size counts against your Google storage quota (see your usage). Don't upload any image file to Google Photos.

TL;DR If you want unlimited image uploads with the option of downloading the original image file (bit-by-bit identical), consider options other than Google Photos (such as Flickr and Deviantart).

On Google Photos you can upload some images for free (i.e. those images don't count against your Google storage quota). This is the most important advantage for uploading to Google Photos (rather than Google Drive). But there are some important caveats:

  • You need to decide before uploading if you want free (it's called high quality) or not (original). Select it in the Google Photos settings. This setting won't affect photos uploaded from your Android devices by the photo backup app.
  • If you decide non-free (original), future uploads will be counted against your quota, no matter the size, the file format or the quality. That is, even small, low-quality JPEGs will count against your quota.
  • If you choose free and you upload a PNG file of at most 16 megapixels, the original file is kept, and you'd be able to download it later.
  • If you choose free and you upload a PNG file of more than 16 megapixels, then it will be scaled down and reencoded.
  • If you choose free and you upload a JPEG file, then the photo gets scaled down to 16 megapixels (no change if already small enough) and then reencoded with a quality loss (small enough that most humans don't notice), some metadata (e.g. EXIF) gets removed or rearranged, and only the scaled and reencoded JPEG file is available for download.
  • Google Photos does deduplication of your images. This has an unintended consequence. If you upload a photo 3 times to 3 different albums, and you move the photo to the trash, it will be removed from all 3 albums. There is no way to move some photos in an album to the trash without affecting other albums.
  • Google Photos does deduplication even across qualities. Thus if you upload an image as original first, and then upload it again as high quality, the high quality version will be ignored, and the original version will be present in both albums. It's also true the other way round: if you upload high quality first, then subsequent original uploads of the same image will be ignored.
  • Even with non-free (original), Google doesn't remember the original file name, as uploaded: it converts e.g. the .JPG extension to .jpg (lower case).
  • Immediately after the upload, the image info page shows incorrect information, and the Download link serves an incorrect (lower resolution) version of the image. For example, when I uploaded a new 1.1 MB JPEG file in high quality mode, the image info was showing 1.1 MB, but when downloading it, it became a 340 kB JPEG file. After reloading the image page, the image info was showing 550 kB, and downloading it yielded a file of that size. This makes experimenting with image upload sizes confusing.
  • This is as of 2016-06-20; the behavior of Google Photos may change in the future.

Because of these caveats and unexpected behavior, to avoid quality loss, my recommendation is not to use Google Photos for backing up JPEG image files.
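
To check whether a downloaded file really is bit-by-bit identical to what was uploaded, comparing checksums is more reliable than comparing the sizes shown on the image info page. A minimal sketch (the function names are made up for this example):

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_bit_identical(original_path, downloaded_path):
    """Return True if both files have exactly the same contents."""
    return sha256_of(original_path) == sha256_of(downloaded_path)
```

A reencoded or rescaled JPEG will produce a different digest even if it looks the same on screen.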

2016-05-30

How to install Hungarian spell checking and hyphenation for LibreOffice and OpenOffice on Linux

This blog post explains how to install Hungarian spell checking and hyphenation for LibreOffice and OpenOffice on Linux. The instructions were tested on Ubuntu Trusty, but they should work well on other Linux distributions as well with small modifications.

  1. The installation consists of downloading the right files, copying them to the right location, and restarting LibreOffice and/or OpenOffice.
  2. Install LibreOffice or OpenOffice with your favorite package manager if not already installed.
  3. Download the files from http://magyarispell.sf.net/ (no need to click now) by running these commands (without the leading $) in a terminal window:
    $ wget -O /tmp/hu_HU-1.6.1.tar.gz https://sourceforge.net/projects/magyarispell/files/Magyar%20Ispell/1.6.1/hu_HU-1.6.1.tar.gz/download #
    $ wget -O /tmp/huhyphn_v20110815_LibO.tar.gz https://sourceforge.net/projects/magyarispell/files/OOo%20Huhyphn/v20110815-0.1/huhyphn_v20110815_LibO.tar.gz/download #
    

    The commands are long, so please make sure to copy-paste each entire line (both end with #).

    Be patient, you may have to wait for 30 seconds for each download.

  4. Run these commands to copy the files to the right location:
    $ (cd /tmp && tar xzvf /tmp/hu_HU-1.6.1.tar.gz)
    $ (cd /tmp/hu_HU-1.6.1 && sudo cp hu_HU.aff hu_HU.dic /usr/share/hunspell/)
    $ (cd /tmp && tar xzvf /tmp/huhyphn_v20110815_LibO.tar.gz)
    $ (cd /tmp/huhyphn_v20110815_LibO && sudo cp hyph_hu_HU.dic /usr/share/hyphen/)
  5. If it's already running, exit from LibreOffice and OpenOffice.
  6. Start LibreOffice or OpenOffice, type asztall, select it, and change the language to Hungarian in Format / Character. Now the text should be underlined in red, and when you right-click, the suggestion asztal should be offered.
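
If the red underline doesn't show up, a quick thing to check is whether the three files from the copy step actually landed in the right directories. A small sketch, assuming the target paths used above (the function name and the root parameter are made up for this example; root only exists to make testing easier):

```python
import os

# Files copied in the installation steps, relative to the filesystem root.
EXPECTED_FILES = (
    "usr/share/hunspell/hu_HU.aff",
    "usr/share/hunspell/hu_HU.dic",
    "usr/share/hyphen/hyph_hu_HU.dic",
)


def missing_hungarian_files(root="/"):
    """Return the expected dictionary files that are absent under root."""
    return [
        os.path.join(root, relative)
        for relative in EXPECTED_FILES
        if not os.path.isfile(os.path.join(root, relative))
    ]
```

An empty list means that all three files are in place.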

2016-04-06

How to disable (reject) any root password on Debian and Ubuntu

This blog post explains how to disable (reject) any root password on Debian and Ubuntu, thus rejecting login attempts as root. Becoming root with sudo (by typing the calling user's password) or ssh (using a public key) remains possible.

TL;DR Run as root: passwd -d -l root

How to become root if password-based root logins are (or will be) disabled?

Before disabling password-based root logins, make sure you have other ways to become root. One possible way is running sudo (without arguments) from a non-root user. To make this work, first you have to install sudo by running (without the leading #) as root:

# apt-get install sudo

(Ubuntu systems come with sudo preinstalled, Debian systems don't have it by default.) Then run as root, replacing MyUser with your non-root login name:

# adduser MyUser sudo

After running this, running sudo as that user will ask for the user's password (not the root password), and when typed correctly, you will get a root shell, and will be able to run commands as root. (Type exit to exit from the root shell.)

An alternative to sudo for becoming root without a password is running ssh root@localhost. For that you need a properly configured sshd (with PermitRootLogin without-password or PermitRootLogin yes in /etc/ssh/sshd_config), creating an SSH key pair and appending the public key to /root/.ssh/authorized_keys. If you need help setting this up or using it, then please ask a Unix or Linux guru friend.

How to disable password-based root logins

To disable (reject) any root password on Debian and Ubuntu, run this (without the leading #) as root:

# passwd -d -l root

This effectively changes the 2nd field of the line starting with root: in /etc/shadow to !, so the line will start with root:!:, making login, su and ssh (when using password authentication, i.e. no public key) reject login attempts as root. Typically the password wouldn't even be asked for, but if it is, any password would be rejected. An alternative to the command above is editing the /etc/shadow file manually (as root), and adding the !. Also, the -d flag is not necessary: without it the password hash is still kept in /etc/shadow (but a ! is prepended to disable it).

Ubuntu comes with this default (root:!: in /etc/shadow), Debian doesn't.
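
To check the current state without eyeballing /etc/shadow, the 2nd field of the root line can be classified with a few lines of code. This is only a sketch: the function name and the return values are made up for illustration, and reading /etc/shadow itself requires root.

```python
def root_password_state(shadow_text):
    """Classify the root entry in the text of /etc/shadow.

    Returns "locked" if the 2nd field starts with "!", "empty" if it is
    empty, "other" for anything else (typically a password hash), and
    None if there is no line for root.
    """
    for line in shadow_text.splitlines():
        fields = line.split(":")
        if len(fields) >= 2 and fields[0] == "root":
            password_field = fields[1]
            if password_field.startswith("!"):
                return "locked"
            if password_field == "":
                return "empty"
            return "other"
    return None
```

Both root:!: (after passwd -d -l root) and root:! followed by a hash (after passwd -l root) are reported as locked.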

If you want to disable the root password in ssh only (and allow password-based root logins in login and su), then instead of running the command above, add (or change) the line

PermitRootLogin without-password

to /etc/ssh/sshd_config (as root), and then run (as root):

# /etc/init.d/ssh restart

Please note that there are ways to permit a root login without a password (or with an empty password), but this is very bad security practice, so this blog post doesn't explain how to do it.

How to enable password-based root logins

To enable password-based root logins again, run this as root:

# passwd root

It will ask you to specify the new password for root.

2016-02-22

keybase.txt

Please disregard this post, it's a cryptographic proof checked by keybase.io/pts.

==================================================================
https://keybase.io/pts
--------------------------------------------------------------------

I hereby claim:

  * I am an admin of https://ptspts.blogspot.com
  * I am pts (https://keybase.io/pts) on keybase.
  * I have a public key with fingerprint 537D 2E8D 6FAB 3265 9A1F  8767 33BB 974C 2FE0 93F2

To do so, I am signing this object:

{
    "body": {
        "key": {
            "eldest_kid": "01110f1fe32d1d6263e9674e11d7249ac66f46d9f8c54c896c16205dcf68203493b90a",
            "fingerprint": "537d2e8d6fab32659a1f876733bb974c2fe093f2",
            "host": "keybase.io",
            "key_id": "33bb974c2fe093f2",
            "kid": "01110f1fe32d1d6263e9674e11d7249ac66f46d9f8c54c896c16205dcf68203493b90a",
            "uid": "51f5f6cc0a558175f85f29dc164fe900",
            "username": "pts"
        },
        "service": {
            "hostname": "ptspts.blogspot.com",
            "protocol": "https:"
        },
        "type": "web_service_binding",
        "version": 1
    },
    "ctime": 1456178821,
    "expire_in": 157680000,
    "prev": "b14bb983d61e5b2097e313d7b246fa7e17030b598d6ae6a7a1545f36b5ef75c2",
    "seqno": 27,
    "tag": "signature"
}

which yields the signature:

-----BEGIN PGP MESSAGE-----
Version: GnuPG v1

owGtUj1rVEEUfSYaMSCYxk6L6dRlMx9vvhZsLLQRtBAbi2U+7mwe2bx5vveyMYRt
JYWFIGil4t9QG20sUu4fEG20CtrYWDizKDaWDgPDvXPOmTP33sdnV4uVDfbm2c3N
j0++nzj6aou7Rw8PD5CNfh+NDtA2LA+Yeuj68Xbl0QhhQggOJACjnnhBBQMtZAmE
eElLbZwQoRReB+V46ZQWjgiKuXdBKIpZqZnV2KABClU9gbZpq7pPspxJT0F5EYxl
VHBtSFBSSMas1bJ0NADWLNBE3IpdZiRz1nQwrGLKpWC8tPcP/H/2vbuU4yTwIJzD
hnNFJA+KB6p9IpUBNMYZ2EFbmx1I6Kbv0HyAUmJWOcg1zZ/4e5n20E7jpGtiP3Rx
J7GbNvbRxWkCbPV9042yQL/fZMYe2PFvrbGtap8qmRgzaLsq1mhEEtL1VRYnJRdE
KkXJAMGDpmphXGUEl0LhtPI7MEuSlpTW3tOKeUGAW4q1BEaYl5aWqScSiMQMW65T
iwwIIw3hJQ9MWA5BcpcL3cH9OqIRlcmomSTRrprUpt9tAc3XD6+fLDZWirVTK3nG
ivUz5/4MXnereH7xx+3zcfH05+LT26ub7+68evR+/q14celaWDv+fGXR3Fg9Xny5
fPrlh9cXfgE=
=N5V9
-----END PGP MESSAGE-----

And finally, I am proving ownership of this host by posting or
appending to this document.

View my publicly-auditable identity here: https://keybase.io/pts

==================================================================

micropython for Linux i386, statically linked

This blog post is to announce statically linked binaries for Linux i386 of MicroPython.

MicroPython (Python for microcontrollers) is an open source reimplementation (see sources on GitHub) of the Python 3.4 language for microcontrollers with very little RAM (as low as 60 kB). The CPython interpreter is not used at all: MicroPython has a completely separate implementation in C, supporting the full Python 3.4 language syntax, but with a much smaller standard library (i.e. far fewer modules and classes, and existing classes have fewer methods). Unicode strings (i.e. the str class) are supported though.

MicroPython can be cross-compiled to many different platforms, including multiple microcontrollers (including the ESP8266 ($5) and the pyboard ($40)) and to Unix systems (including Linux). The micropython binary seems to be 17.56 times smaller than the python binary for Linux i386 (both binaries were statically linked against uClibc using https://github.com/pts/pts-clang-xstatic/blob/master/README.pts-xstatic.txt, and optionally compressed with UPX). The detailed file sizes are:

The script for recompiling MicroPython for Linux i386, statically linked is also open source.

Please note that neither StaticPython nor MicroPython opens any external files (such as .so or .py or .zip) when starting up: the Python interpreter and the Python standard library (and the libc as well) are all statically linked into the binary executable.

2016-01-06

How to extract comments from a JPEG file

This blog post explains how to extract comments from a JPEG file. Each JPEG file consists of segments. Each segment describes parts of the image data or metadata. The comments are in segments with marker COM (0xfe). There can be any number of them, anywhere (usually before the SOS segment) in the file.

Use the rdjpgcom command-line tool to extract comments. The tool is part of libjpeg, and on Ubuntu and Debian systems it can be installed with (don't type the leading $):

$ sudo apt-get install libjpeg-progs

Once installed, use it like this to print all comments in the JPEG file, with a terminating newline added to each:

$ rdjpgcom file.jpg

If the file doesn't have any comment, the output of rdjpgcom is empty. Here is how to add comments:

$ wrjpgcom -c 'COMfoo' com0.jpg >com1.jpg
$ wrjpgcom -c 'COMbar' com1.jpg >com2.jpg

After adding the comments, it will look like this:

$ rdjpgcom com2.jpg
COMfoo
COMbar

If you also want to see the unprintable characters (unsafe on a terminal), pass the -raw flag:

$ rdjpgcom -raw file.jpg

If you need a library, use the following C code, which is a minimalistic reimplementation of rdjpgcom -a:

/*
 * getjpegcom.c: Get comments from JPEG files.
 * by pts@fazekas.hu at Wed Jan  6 20:48:07 CET 2016
 *
 *   $ gcc -W -Wall -Wextra -Werror -s -O2 -o getjpegcom getjpegcom.c
 *   $ ./getjpegcom <file.jpg
 */

#include <stdio.h>
#if defined(MSDOS) || defined(WIN32)
#include <fcntl.h>  /* setmode. */
#endif

/* Get and print all comments in a JPEG file. Comments are written to of,
 * with a newline appended as terminator.
 *
 * Returns 0 on success, or a negative number on error.
 */
static int get_jpeg_comments(FILE *f, FILE *of) {
  int c, m;
  unsigned ss;
#if defined(MSDOS) || defined(WIN32)
  setmode(fileno(f), O_BINARY);
  setmode(fileno(of), O_BINARY);
#endif
  if (ferror(f)) return -1;
  /* A typical JPEG file has markers in this order:
   *   d8 e0_JFIF e1 e1 e2 db db fe fe c0 c4 c4 c4 c4 da d9.
   *   The first fe marker (COM, comment) was near offset 30000.
   * A typical JPEG file after filtering through jpegtran:
   *   d8 e0_JFIF fe fe db db c0 c4 c4 c4 c4 da d9.
   *   The first fe marker (COM, comment) was at offset 20.
   */
  if ((c = getc(f)) < 0) return -2;  /* Truncated (empty). */
  if (c != 0xff) return -3;
  if ((c = getc(f)) < 0) return -2;  /* Truncated. */
  if (c != 0xd8) return -3;  /* Not a JPEG file, SOI expected. */
  for (;;) {
    /* printf("@%ld\n", ftell(f)); */
    if ((c = getc(f)) < 0) return -2;  /* Truncated. */
    if (c != 0xff) return -3;  /* Not a JPEG file, marker expected. */
    if ((m = getc(f)) < 0) return -2;  /* Truncated. */
    while (m == 0xff) {  /* Padding. */
      if ((m = getc(f)) < 0) return -2;  /* Truncated. */
    }
    if (m == 0xd8) return -4;  /* SOI unexpected. */
    if (m == 0xd9) break;  /* EOI. */
    if (m == 0xda) break;  /* SOS. Would need special escaping to process. */
    /* printf("MARKER 0x%02x\n", m); */
    if ((c = getc(f)) < 0) return -2;  /* Truncated. */
    ss = (c + 0U) << 8;
    if ((c = getc(f)) < 0) return -2;  /* Truncated. */
    ss += c;
    if (ss < 2) return -5;  /* Segment too short. */
    for (ss -= 2; ss > 0; --ss) {
      if ((c = getc(f)) < 0) return -2;  /* Truncated. */
      if (m == 0xfe) putc(c, of);  /* Emit comment char. */
    }
    if (m == 0xfe) putc('\n', of);  /* End of comment. */
  }
  return 0;
}

int main(int argc, char **argv) {
  (void)argc; (void)argv;
  return -get_jpeg_comments(stdin, stdout);
}

Here is how to compile and run it:

$ gcc -W -Wall -Wextra -Werror -s -O2 -o getjpegcom getjpegcom.c
$ ./getjpegcom <com2.jpg
COMfoo
COMbar
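
For Python users, roughly the same logic can be sketched as a function working on the bytes of the whole file. This is not a line-by-line port of the C code above: it returns the comments as a list instead of printing them, it raises ValueError instead of returning error codes, and the function name is just reused here for illustration.

```python
def get_jpeg_comments(data):
    """Return the COM (0xfe) segment payloads found before SOS or EOI.

    data is the whole file as bytes. Raises ValueError on malformed input.
    """
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG file, SOI expected")
    view = bytearray(data)  # Indexing yields ints in both Python 2 and 3.
    comments = []
    i = 2
    while True:
        if i >= len(view):
            raise ValueError("truncated")
        if view[i] != 0xFF:
            raise ValueError("not a JPEG file, marker expected")
        i += 1
        # Skip padding (extra 0xff bytes) before the marker code.
        while i < len(view) and view[i] == 0xFF:
            i += 1
        if i >= len(view):
            raise ValueError("truncated")
        marker = view[i]
        i += 1
        if marker == 0xD8:
            raise ValueError("SOI unexpected")
        if marker in (0xD9, 0xDA):  # EOI or SOS: stop scanning.
            return comments
        if i + 2 > len(view):
            raise ValueError("truncated")
        # The 2-byte big-endian segment size includes the size field itself.
        size = (view[i] << 8) | view[i + 1]
        if size < 2:
            raise ValueError("segment too short")
        if i + size > len(view):
            raise ValueError("truncated")
        if marker == 0xFE:
            comments.append(bytes(view[i + 2:i + size]))
        i += size
```

For the com2.jpg file created above, get_jpeg_comments(open('com2.jpg', 'rb').read()) should return the two comments as byte strings.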