2012-11-04

How to use Unicode accented characters in PuTTY in UTF-8 mode

This blog post explains how to set up PuTTY to connect to a Linux server in UTF-8 mode, so that all Unicode characters (including symbols and accented characters) will be transferred and interpreted correctly.

Follow these steps:

  • Download and install the newest PuTTY (0.62 or later).
  • Before connecting, configure PuTTY like this:
    • Window → Translation → Remote character set: UTF-8
    • Connection → Data → Environment variables: enter Variable: LC_CTYPE and Value: en_US.UTF-8, then click Add.
    • Save these settings in Session (e.g. in Session → Default Settings → Save).
  • Connect to the SSH server with PuTTY.
  • If everything (including typing, reading and copy-pasting non-ASCII Unicode characters) already works, stop here.
  • Run the following commands without the leading $ (they use sudo, so your account needs sudo rights on the server). These commands set up the en_US.UTF-8 locale on the server. The commands have been verified and found working on Debian and Ubuntu servers.
    $ if test -f /etc/locale.gen; then sudo perl -pi -e 's@^# *(en_US\.UTF-8 UTF-8)$@$1@' /etc/locale.gen; grep -qxF 'en_US.UTF-8 UTF-8' /etc/locale.gen || (echo; echo 'en_US.UTF-8 UTF-8') | sudo tee -a /etc/locale.gen >/dev/null; fi
    $ if test -d /var/lib/locales/supported.d; then grep -qxF 'en_US.UTF-8 UTF-8' /var/lib/locales/supported.d/en 2>/dev/null || (echo; echo 'en_US.UTF-8 UTF-8') | sudo tee -a /var/lib/locales/supported.d/en >/dev/null; fi
    $ sudo perl -pi -e 's@^(?=LC_CTYPE=|LC_ALL=)@#@' /etc/environment
    $ sudo /usr/sbin/locale-gen
    $ sudo /usr/sbin/update-locale LC_CTYPE LC_ALL
    
  • Close the SSH connection window and open a new connection.
  • Verify that everything (including typing, reading and copy-pasting non-ASCII Unicode characters) works.
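
As a rough way to do that verification from the server's shell, the sketch below prints a known non-ASCII character and asks the server which character map is active. The function names print_test_char and session_is_utf8 are made up for this example. It assumes the locale command is available, as it is on Debian and Ubuntu.

```shell
#!/bin/sh
# Hypothetical helper: print the two-byte UTF-8 encoding of "é" (U+00E9).
# The bytes are written as octal escapes so this script stays pure ASCII.
print_test_char() {
  printf '\303\251\n'
}

# Hypothetical helper: succeed only if the active locale's character map
# is UTF-8. Assumes the `locale` command exists on the server.
session_is_utf8() {
  [ "$(locale charmap 2>/dev/null)" = "UTF-8" ]
}

print_test_char
if session_is_utf8; then
  echo "locale charmap: UTF-8"
else
  echo "locale charmap is not UTF-8; check LC_CTYPE and the generated locales"
fi
```

If PuTTY is decoding UTF-8, the first line of output should show as a single é. Two odd characters in its place suggest that the Translation setting is still a single-byte character set.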

Contrary to information found elsewhere on the net, just setting Window → Translation → Received data assumed to be in which character set or Window → Translation → Remote character set to UTF-8 is not always enough. The server-side environment variables must also be set properly (e.g. LC_CTYPE set to en_US.UTF-8, as above), unless they are already correct. The UTF-8 locale definitions must also be generated on the server (using locale-gen), unless they are already generated.
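
To tell which of these two server-side requirements is missing, a small check like the following may help. It is only a sketch: normalize_locale_name and locale_is_generated are hypothetical helper names, and the comparison assumes that locale -a lists the locale in the form glibc usually prints (en_US.utf8).

```shell
#!/bin/sh
# Hypothetical helper: normalize a locale name so that en_US.UTF-8 and
# en_US.utf8 compare equal (lower-case, hyphens removed).
normalize_locale_name() {
  printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]' | sed 's/-//g'
}

# Hypothetical helper: succeed if the named locale appears in `locale -a`.
locale_is_generated() {
  want=$(normalize_locale_name "$1")
  locale -a 2>/dev/null | tr '[:upper:]' '[:lower:]' | sed 's/-//g' | grep -qxF "$want"
}

# Requirement 1: the environment variable reached the server-side shell.
echo "LC_CTYPE seen by the shell: ${LC_CTYPE:-<unset>}"

# Requirement 2: the locale definition has been generated on the server.
if locale_is_generated en_US.UTF-8; then
  echo "en_US.UTF-8 is generated"
else
  echo "en_US.UTF-8 is missing; run locale-gen as described above"
fi
```

If LC_CTYPE shows as unset even though it was added in PuTTY, the variable did not reach the shell, so the problem lies in the connection settings, not in the locale files.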

2 comments:

Erick said...

Thank you! I thought only setting the "Translation" value would be enough.

Media Vince said...

Thanks so much!!
This is by far the definitive resource for the issue on the web!
Cheers,
Vinz