Monospaced Thoughts

creativity in machine calligraphy

Encoding utilities


Very short post, just a tip on encoding/charset utilities to use with files.

iconv

The holy grail of encoding. If you didn’t know it before, just follow the link in the title and read about it. Pretty much everything else in the universe uses it to perform encoding conversion.
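If you have never looked at what an encoding conversion actually does, here is a minimal Python sketch of the idea: decode the bytes from one charset, encode them again in another. This is not how iconv is implemented, and the function name `convert_encoding` is made up for illustration.

```python
def convert_encoding(data: bytes, src: str, dst: str) -> bytes:
    """Decode bytes from the src charset, then re-encode them as dst."""
    # Strict error handling: bytes that are invalid in src, or characters
    # that dst cannot represent, raise an exception instead of being mangled.
    return data.decode(src, errors="strict").encode(dst, errors="strict")


# "café" stored as Latin-1: é is the single byte 0xE9 ...
latin1_bytes = b"caf\xe9"
# ... and the same character takes two bytes (0xC3 0xA9) in UTF-8.
utf8_bytes = convert_encoding(latin1_bytes, "latin-1", "utf-8")
```

Failing loudly on bad input is a deliberate choice here; silently dropping bytes is how files get quietly corrupted.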

detox

This one is my favorite: it renames files to make them “safe”. What counts as safe pretty much depends on the filesystem you’re using (put a Unicode character in a filename on a FAT32 partition and the universe collapses into a singularity). It supports sequences of filters that you can create and pass the filenames through. Very handy.
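The “sequence of filters” idea is easy to picture in a few lines of Python. This is only a sketch of the concept: the three filters below (`strip_accents`, `replace_unsafe`, `squeeze_underscores`) are hypothetical stand-ins I made up, not detox’s real filters, and nothing is renamed on disk.

```python
import re
import unicodedata
from typing import Callable, Iterable


def strip_accents(name: str) -> str:
    """Decompose accented characters, then drop whatever is not ASCII."""
    decomposed = unicodedata.normalize("NFKD", name)
    return decomposed.encode("ascii", "ignore").decode("ascii")


def replace_unsafe(name: str) -> str:
    """Replace everything except letters, digits, '.', '_' and '-'."""
    return re.sub(r"[^A-Za-z0-9._-]", "_", name)


def squeeze_underscores(name: str) -> str:
    """Collapse runs of underscores into a single one."""
    return re.sub(r"_+", "_", name)


def apply_filters(name: str, filters: Iterable[Callable[[str], str]]) -> str:
    """Pass the filename through each filter, in order."""
    for name_filter in filters:
        name = name_filter(name)
    return name


safe = apply_filters(
    "Caf\u00e9 del Mar (live).mp3",
    [strip_accents, replace_unsafe, squeeze_underscores],
)
# safe == "Cafe_del_Mar_live_.mp3"
```

Order matters: squeezing underscores before replacing unsafe characters would leave the double underscore in place.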

convmv

man page

convmv also renames files, but it deals specifically with the encoding of the filenames. For that job it’s more powerful, and it supports a wide range of encodings. Very good too.
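To make “deals specifically with encoding” concrete, here is a small Python sketch that works out which filenames would change when moving from an old charset to UTF-8. It only plans the renames and touches nothing on disk. The function name `plan_renames_to_utf8` is hypothetical, and skipping names that already decode as UTF-8 is my own guard against double-encoding, not a description of convmv’s internals.

```python
from typing import List, Tuple


def plan_renames_to_utf8(names: List[bytes], src: str) -> List[Tuple[bytes, bytes]]:
    """Return (old, new) pairs for raw filenames that need converting."""
    plan = []
    for raw in names:
        try:
            # Already valid UTF-8 (plain ASCII included): leave it alone,
            # otherwise it would end up encoded twice.
            raw.decode("utf-8")
            continue
        except UnicodeDecodeError:
            pass
        # Interpret the bytes in the old charset and write them as UTF-8.
        plan.append((raw, raw.decode(src).encode("utf-8")))
    return plan


names = [b"caf\xe9.txt", b"plain.txt", b"caf\xc3\xa9.txt"]
plan = plan_renames_to_utf8(names, "latin-1")
# plan == [(b"caf\xe9.txt", b"caf\xc3\xa9.txt")]
```

Working on `bytes` rather than `str` is the point: a filename is a sequence of bytes on most Unix filesystems, and its encoding is only a convention.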

recode

I couldn’t find its home page; the Debian package is “recode”.

“recode converts files between character sets and usages.” Like detox, it supports sequences; it’s powerful but a little cumbersome. Note that it converts file contents, not filenames.

uni2ascii

Converts file contents from Unicode to ASCII-encoded formats, like HTML encoding. Very nice when you need to put non-ASCII text in source code, web pages, etc.
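For a feel of what “ASCII-encoded formats” means, Python’s built-in codec error handlers produce two common ones. This is a sketch of the output style only, not of uni2ascii itself; the helper names are made up.

```python
def to_html_refs(text: str) -> str:
    """Replace non-ASCII characters with HTML numeric character references."""
    return text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")


def to_backslash_escapes(text: str) -> str:
    """Replace non-ASCII characters with Python-style backslash escapes."""
    return text.encode("ascii", errors="backslashreplace").decode("ascii")


sample = "caf\u00e9 \u2603"  # "café" followed by a snowman
html = to_html_refs(sample)             # "caf&#233; &#9731;"
escaped = to_backslash_escapes(sample)  # "caf\\xe9 \\u2603"
```

Either result is pure ASCII, so it survives any editor, terminal, or mail gateway without caring about the charset in use.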

dos2unix

Not exactly an encoding utility, but useful for getting rid of \r\n line endings in files created on lesser OSes.
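The conversion itself is about as simple as it gets. A minimal Python sketch of the same idea, with a made-up function name, working on bytes so the text encoding does not matter:

```python
def crlf_to_lf(data: bytes) -> bytes:
    """Turn Windows line endings (\\r\\n) into Unix ones (\\n)."""
    # Only the CR LF pair is replaced; a lone \r is left as it is.
    return data.replace(b"\r\n", b"\n")


unix_text = crlf_to_lf(b"first line\r\nsecond line\r\n")
# unix_text == b"first line\nsecond line\n"
```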
