James Slocum



“A tour of the lesser known coreutils - part 2”

2013-06-06

In part 1 I introduced six of the lesser known core utilities. One thing I forgot to mention is that on OSX, the coreutils commands installed by brew are prefixed with the letter ‘g’. So head is ghead on a Mac. With that out of the way, let’s keep the momentum going and dive right in with more commands!

factor

factor performs a prime factorization of its input. You can either pass the numbers you want to factor as arguments after the factor command, or run the command with no arguments and type numbers on standard input, one per line. To exit, just hit CTRL+d.

$ factor 5234
5234: 2 2617

$ factor
52893
52893: 3 3 3 3 653
1234567891234567891234567898
factor: `1234567891234567891234567898' is too large
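
Because factor prints the number followed by its prime factors, a prime shows up as a line with exactly one factor. Here is a minimal sketch of a primality check built on that. The is_prime name is a hypothetical helper of mine, not part of coreutils, and on OSX you would call gfactor instead.

```shell
#!/bin/sh
# is_prime is a hypothetical helper, not part of coreutils.
# It succeeds when factor reports exactly one prime factor.
is_prime() {
    # factor prints "N: p1 p2 ...". Word splitting yields "N:" plus
    # one word per factor, so a prime gives exactly two words.
    set -- $(factor "$1")
    [ "$#" -eq 2 ]
}

for n in 2617 5234; do
    if is_prime "$n"; then
        echo "$n is prime"
    else
        echo "$n is composite"
    fi
done
```

Running this reports 2617 as prime and 5234 as composite, which matches the 5234: 2 2617 output above.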

factor can be built with or without the gmp library. gmp is the GNU Multiple Precision library. It is a “big number” library that can handle numbers larger than the native word size of the processor (in my case 64 bits). The default version that comes with Ubuntu Linux is not built with gmp, and is therefore limited in the size of the number it can decompose.

To get around this limitation, you can install the gmp library (either from the apt archive, or from source) and recompile the coreutils package from source. Once you have re-compiled it you can replace the old factor with the new one, or name it something else and put it side by side with the old one.

$ echo "2^128-1" | bc | factor
factor: `340282366920938463463374607431768211455' is too large

# I recompiled the coreutils with gmp, and created big-factor
$ echo "2^128-1" | bc | big-factor
340282366920938463463374607431768211455: 3 5 17 257 641 65537 274177 6700417 67280421310721

On FreeBSD, when you install the coreutils you will be prompted to link against the gmp libraries. If you choose that option it will install the library for you, then build the coreutils with the proper linkage.

Fedora Linux users are lucky, as the default version of factor is built with gmp support.

If you are using OSX, remember that all of the coreutils commands installed by brew are prefixed with the letter ‘g’. You can either run gfactor, or create an alias.

alias factor=/usr/local/bin/gfactor

To make this permanent you should add this alias to your .bash_profile. Also, the default brew script builds coreutils without gmp support. To remedy this you must edit the script and change the line that says --without-gmp to --with-gmp. You must also have gmp installed. Just follow these steps.

$ brew install gmp
$ brew edit coreutils

#edit the line that says --without-gmp to --with-gmp (line 14 for me)
$ brew install coreutils

#alternatively you could use 'brew reinstall coreutils'

#confirm that gfactor is linked to gmp
$ otool -L /usr/local/bin/gfactor
/usr/local/bin/gfactor:
   /usr/local/lib/libgmp.10.dylib
   /usr/lib/libiconv.2.dylib
   /usr/lib/libSystem.B.dylib

base64

base64 is a simple but very useful program that encodes files into base64 text, and decodes base64 text back into files. By default, it prints the encoded base64 text to standard output, so you will need to redirect it to a file if you want to save it.

$ base64 hyphen.jpg
/9j/4AAQSkZJRgABAgEBkAGQAAD/4QBoRXhpZgAATU0AKgAAAAgABQESAAMAAAABAAEAAAEaAAUA
AAABAAAASgEbAAUAAAABAAAAUgEoAAMAAAABAAIAAIdpAAQAAAABAAAAWgAAAAAAAAGQAAAAAQAA
AZAAAAABAAAAAAAA/+0UylBob3Rvc2hvcCAzLjAAOEJJTQPtClJlc29sdXRpb24AAAAAEAGQAAAA
AQABAZAAAAABAAE4QklNBA0YRlggR2xvYmFsIExpZ2h0aW5nIEFuZ2xlAAAAAAQAAAAeOEJJTQQZ
EkZYIEdsb2JhbCBBbHRpdHVkZQAAAAAEAAAAHjhCSU0D8wtQcmludCBGbGFncwAAAAkAAAAAAAAA
AAEAOEJJTQQKDkNvcHlyaWdodCBGbGFnAAAAAAEAADhCSU0nEBRKYXBhbmVzZSBQcmludCBGbGFn
cwAAAAAKAAEAAAAAAAAAAjhCSU0D9RdDb2xvciBIYWxmdG9uZSBTZXR0aW5ncwAAAEgAL2ZmAAEA
...
...
kzT4Htaf89P+acVU4Y/yb/SlqYJYxfiRfqwQT7vX4eicOuKvTcVf/9k=

Using base64 can be a clever way to get around attachment limitations placed on email, or to embed a program or other files into a script. I have used this method in my packup.sh script that can be found on my github page.
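
As a rough sketch of the round trip, the script below encodes a small file, decodes the text again, and checks that nothing was lost. The file names are made up for the example, and it assumes the GNU flags (on OSX, use gbase64 or swap -d for -D).

```shell
#!/bin/sh
# Round-trip a small file through base64 (GNU flags assumed).
# All file names here are made up for the example.
tmpdir=$(mktemp -d)
printf 'hello, coreutils\n' > "$tmpdir/original.txt"

# Encode to text, then decode the text back into a file.
base64 "$tmpdir/original.txt" > "$tmpdir/encoded.b64"
base64 -d "$tmpdir/encoded.b64" > "$tmpdir/decoded.txt"

# cmp exits 0 when the two files are byte-for-byte identical.
if cmp -s "$tmpdir/original.txt" "$tmpdir/decoded.txt"; then
    result="match"
else
    result="differ"
fi
echo "$result"
```

The same two commands are all you need to embed a file in a script: paste the encoded text into a here-document and pipe it through base64 -d when the script runs.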

On FreeBSD you must install base64 separately from the coreutils. The base64 package is in /usr/ports/converters/base64. You can install the base64 package from there by running make install as root.

On OSX the base64 package comes packaged with the openssl utilities. Although the binary is different from the GNU version, it still functions mostly the same. The difference is in the flags that the command takes. The GNU version uses the ‘-d’ flag for decoding, while the openssl version uses ‘-D’. The GNU version also has a ‘-i’ flag (--ignore-garbage) that skips non-alphabet characters when decoding; the openssl version does not. To use the GNU version of base64 you can run the gbase64 command.

truncate

truncate can be used to adjust the size of a file. If you make a file smaller, it will chop the data off the end, and if you make it bigger, it will produce a hole. A hole is a section of a file that reads back as null bytes but does not actually get stored on the disk. On file systems that support sparse files, it is transparently recorded as meta-data to save space.

# Create and list size of empty file
$ touch empty.file
$ ls -las empty.file
0 -rw-rw-r-- 1 james james 0 Apr 19 21:03 empty.file

# set the size to 100 Megs and list it again
$ truncate -s 100M empty.file
$ ls -las empty.file
0 -rw-rw-r-- 1 james james 104857600 Apr 19 21:04 empty.file

Notice above that the reported size of the file has increased to 104857600 bytes, but the number of blocks (the first number) is still 0. That’s because this file still takes no space on the physical storage medium, despite reporting its size as 100 Megs. The file is just one big hole.
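
Here is a small sketch of both directions, shrinking and growing, assuming GNU truncate (gtruncate on OSX). The scratch file comes from mktemp, so nothing real is touched.

```shell
#!/bin/sh
# Shrinking with truncate discards bytes from the end of the file.
f=$(mktemp)
printf 'hello world' > "$f"     # 11 bytes

truncate -s 5 "$f"              # keep only the first 5 bytes
shrunk=$(cat "$f")
echo "$shrunk"                  # prints: hello

# Growing extends the file with null bytes (a hole, where the
# file system supports sparse files). 1M means 1024*1024 bytes.
truncate -s 1M "$f"
size=$(wc -c < "$f")
echo "$size"
rm -f "$f"
```

The script only looks at the apparent size. Whether the grown file occupies zero blocks depends on the file system, so check that with ls -las as shown above.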

If you are using OSX, you can run the truncate program by running the gtruncate command.

tsort

tsort is one of the stranger core utilities because it is so specialized. tsort will perform a topological sort (or topsort) on its input. A topological sort orders the vertices of a directed acyclic graph so that for every directed edge uv from vertex u to vertex v, u comes before v in the ordering. One use of this algorithm is to determine the order in which tasks should be performed to avoid any conflicts. Let’s take a look at a simple example.

[Figure: daily tasks graph]

Now we can enter each uv directed edge into tsort, and it will output the order in which to perform these tasks.

$ tsort <<-EOF
> take_a_shower make_shopping_list
> take_a_shower go_to_bank
> take_a_shower get_hair_cut
> go_to_bank go_to_store
> go_to_store buy_food
> buy_food cook_dinner
> go_to_bank get_car_wash
> go_to_bank get_hair_cut
> make_shopping_list buy_food
> EOF
take_a_shower
go_to_bank
make_shopping_list
get_hair_cut
get_car_wash
go_to_store
buy_food
cook_dinner
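
tsort reads from standard input just as happily from a pipe, and the edges can be listed in any order. The sketch below uses invented task names that form a single chain, so only one ordering is possible.

```shell
#!/bin/sh
# Edges may be listed in any order; tsort only cares about the pairs.
# The task names are invented for this example.
order=$(printf '%s\n' \
    'boil_water brew_tea' \
    'fill_kettle boil_water' \
    'brew_tea drink_tea' | tsort)
echo "$order"
```

This prints fill_kettle, boil_water, brew_tea and drink_tea, one per line, even though the edges were given out of order.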

OSX comes with its own version of tsort. The GNU version and the default version behave identically so you can use either. If you want to use the GNU version from the coreutils package, use the gtsort command.

Next time I will wrap up the series with a few more useful commands. Ever need to make it look like a file was created in the year 4000? Check back for part 3 to find out how!

