Sunday, December 11, 2011

Introduction to the core utilities, part 2.

It should come as no surprise that dozens of small utilities are included with every Linux distribution.  These are the core utilities, and this is a brief introduction to the displaying and formatting tools.  Part 2 covers some utilities that output parts of files:

head
head displays the first n lines (or n bytes) of a file (or standard input).  By default head displays the first 10 lines of a file, but that is easily changed:

Display the first 10 lines of /etc/passwd

    ~/projects/pt2> head /etc/passwd
    root:x:0:0:root:/root:/bin/bash
    daemon:x:1:1:daemon:/usr/sbin:/bin/sh
    bin:x:2:2:bin:/bin:/bin/sh
    sys:x:3:3:sys:/dev:/bin/sh
    sync:x:4:65534:sync:/bin:/bin/sync
    games:x:5:60:games:/usr/games:/bin/sh
    man:x:6:12:man:/var/cache/man:/bin/sh
    lp:x:7:7:lp:/var/spool/lpd:/bin/sh
    mail:x:8:8:mail:/var/mail:/bin/sh
    news:x:9:9:news:/var/spool/news:/bin/sh

The first 10 lines of /etc/passwd, passed to cut to list the usernames and their shells:

    ~/projects/pt2> head /etc/passwd | cut -d: -f1,7
    root:/bin/bash
    daemon:/bin/sh
    bin:/bin/sh
    sys:/bin/sh
    sync:/bin/sync
    games:/bin/sh
    man:/bin/sh
    lp:/bin/sh
    mail:/bin/sh
    news:/bin/sh

Display the first 5 lines of the output of dmesg:

    ~/projects/pt2> dmesg | head -n 5
    [    0.000000] Initializing cgroup subsys cpuset
    [    0.000000] Initializing cgroup subsys cpu
    [    0.000000] Linux version 2.6.32-5-686 (Debian 2.6.32-39) (dannf@debian.org) (gcc version 4.3.5 (Debian 4.3.5-4) ) #1 SMP Thu Nov 3 04:23:54 UTC 2011
    [    0.000000] KERNEL supported cpus:
    [    0.000000]   Intel GenuineIntel

Read the first 2 GiB of the first partition of the second hard drive and pass it to strings, which lists the runs of printable characters it finds:

    ~/projects/pt2> head -c 2G /dev/sdb1 | strings
    [output omitted for brevity]
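
As a quick way to see the difference between counting lines and counting bytes, the following sketch feeds head some generated input instead of a real file.  The use of seq and printf to produce sample data is an assumption made for illustration; any file or pipeline would do.

```shell
# Generate the numbers 1 to 100, one per line, and keep the first 3 lines.
seq 1 100 | head -n 3

# Keep only the first 5 bytes of the input, regardless of line boundaries.
printf 'hello world\n' | head -c 5
echo    # head -c stopped before the newline, so add one for tidy output

# With no options, head keeps the first 10 lines; count them with wc.
seq 1 100 | head | wc -l
```

The first command prints 1, 2 and 3, the second prints hello, and the last reports 10 lines.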

tail
tail displays the last n lines (or n bytes) of a file (or standard input).  Like head, it defaults to 10 lines.

Display the last 10 lines of syslog, then print each new line as it is appended to the file (useful for watching log files, for instance):

    ~/projects/pt2# tail -f /var/log/syslog
    Dec  2 09:16:27 valhalla dhclient: DHCPACK from 192.168.200.1
    Dec  2 09:16:27 valhalla dhclient: bound to 192.168.200.106 -- renewal in 42062 seconds.
    Dec  2 20:57:29 valhalla dhclient: DHCPREQUEST on eth0 to 192.168.200.1 port 67
    Dec  2 20:57:29 valhalla dhclient: DHCPACK from 192.168.200.1
    Dec  2 20:57:29 valhalla dhclient: bound to 192.168.200.106 -- renewal in 39735 seconds.
    Dec  3 07:30:01 valhalla anacron[31196]: Anacron 2.3 started on 2011-12-03
    Dec  3 07:30:01 valhalla anacron[31196]: Will run job `cron.daily' in 5 min.
    Dec  3 07:30:01 valhalla anacron[31196]: Jobs will be executed sequentially
    Dec  3 07:35:01 valhalla anacron[31196]: Job `cron.daily' started
    Dec  3 07:35:01 valhalla anacron[31202]: Updated timestamp for job `cron.daily' to 2011-12-03

Display lines 451-500 (or the last 50 lines of the file if it has fewer than 500 lines):

     ~/projects/pt2# head -n 500 /var/log/messages | tail -n 50
     [output omitted for brevity]
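
The arithmetic behind this idiom can be wrapped in a small helper.  This is only a sketch: print_range is a hypothetical function name, not a standard utility, and it assumes both arguments are positive integers with the first no larger than the second.

```shell
# print_range FIRST LAST: print lines FIRST..LAST of standard input.
# head keeps lines 1..LAST, then tail keeps the final (LAST - FIRST + 1)
# of those.  If the input has fewer than LAST lines, you get its last
# (LAST - FIRST + 1) lines instead.
print_range() {
    first=$1
    last=$2
    head -n "$last" | tail -n "$(( last - first + 1 ))"
}

# Sample input: the numbers 1 to 1000, one per line.
seq 1 1000 | print_range 451 500 | head -n 1    # first line of the range
seq 1 1000 | print_range 451 500 | tail -n 1    # last line of the range
```

Passing 451 and 500 reproduces the head -n 500 | tail -n 50 pipeline above, and the two sample commands print 451 and 500.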

GNU tail can also keep track of multiple files, output the last n bytes of a file, and be made to terminate automatically.

Monitor both /var/log/messages and /var/log/syslog:

    ~/projects/pt2> tail -f /var/log/{syslog,messages}
    ==> /var/log/syslog <==
    Dec  9 15:26:05 valhalla mpt-statusd: detected non-optimal RAID status
    Dec  9 15:36:05 valhalla mpt-statusd: detected non-optimal RAID status
    Dec  9 15:46:05 valhalla mpt-statusd: detected non-optimal RAID status
    Dec  9 15:56:05 valhalla mpt-statusd: detected non-optimal RAID status
    Dec  9 16:06:05 valhalla mpt-statusd: detected non-optimal RAID status
    Dec  9 16:16:05 valhalla mpt-statusd: detected non-optimal RAID status
    Dec  9 16:17:01 valhalla /USR/SBIN/CRON[18156]: (root) CMD (   cd / && run-parts --report /etc/cron.hourly)
    Dec  9 16:26:05 valhalla mpt-statusd: detected non-optimal RAID status
    Dec  9 16:36:05 valhalla mpt-statusd: detected non-optimal RAID status
    Dec  9 16:46:05 valhalla mpt-statusd: detected non-optimal RAID status

    ==> /var/log/messages <==
    Dec  9 15:16:05 valhalla mpt-statusd: detected non-optimal RAID status
    Dec  9 15:26:05 valhalla mpt-statusd: detected non-optimal RAID status
    Dec  9 15:36:05 valhalla mpt-statusd: detected non-optimal RAID status
    Dec  9 15:46:05 valhalla mpt-statusd: detected non-optimal RAID status
    Dec  9 15:56:05 valhalla mpt-statusd: detected non-optimal RAID status
    Dec  9 16:06:05 valhalla mpt-statusd: detected non-optimal RAID status
    Dec  9 16:16:05 valhalla mpt-statusd: detected non-optimal RAID status
    Dec  9 16:26:05 valhalla mpt-statusd: detected non-optimal RAID status
    Dec  9 16:36:05 valhalla mpt-statusd: detected non-optimal RAID status
    Dec  9 16:46:05 valhalla mpt-statusd: detected non-optimal RAID status
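
The other features mentioned above can be tried safely on throwaway files.  In this sketch the file names a.log and b.log and the temporary directory are made up for the example, and the commented-out line shows the --pid option of GNU tail with a placeholder process ID.

```shell
# Work in a scratch directory so no real logs are touched.
workdir=$(mktemp -d)
seq 1 20  > "$workdir/a.log"
seq 21 40 > "$workdir/b.log"

# With more than one file, tail prints a "==> name <==" header before each.
tail -n 2 "$workdir/a.log" "$workdir/b.log"

# -c counts bytes instead of lines: the last 3 bytes of a.log are "20"
# followed by a newline.
tail -c 3 "$workdir/a.log"

# GNU tail only: keep following until the given process exits.
# 1234 is a placeholder PID, so the line is left commented out.
# tail -f --pid=1234 "$workdir/a.log"

rm -rf "$workdir"
```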

split
split takes a file and splits it into several smaller files. For example, say you have an email limit where each attachment can be at most 1 MB in size, and you need to email a 5 MB JPEG.  You could split the file thusly:

     ~/projects/pt2> split -b 750000 Blinds.jpg Blinds.jpg.part.
     ~/projects/pt2> ls -gG Blinds*
     -rw-r--r-- 1 1157513 Dec  9 17:25 Blinds.jpg
     -rw-r--r-- 1  750000 Dec  9 17:26 Blinds.jpg.part.aa
     -rw-r--r-- 1  407513 Dec  9 17:26 Blinds.jpg.part.ab

And then email the parts one by one.  Why 750,000 bytes?  MIME (Base64) encoding will increase each part to about 133% of its starting size, and 750,000 × 1.33 = 997,500.  Against a 1 MiB (1,048,576-byte) limit, this leaves us some room for the email message and headers.
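
To check the estimate, the encoded size can be worked out exactly.  Base64, the encoding normally used for MIME attachments, turns every 3 input bytes into 4 output characters, so the factor is 4/3 rather than a flat 133%.  The helper name base64_size below is hypothetical, and the figure it prints ignores the line breaks and headers a mail program adds.

```shell
# base64_size N: number of Base64 characters needed to encode N bytes.
# Every group of 3 bytes (rounded up) becomes 4 characters.
base64_size() {
    echo $(( ($1 + 2) / 3 * 4 ))
}

base64_size 750000    # one full part from the split above

# Cross-check against the real encoder, with its line breaks stripped out.
head -c 750000 /dev/zero | base64 | tr -d '\n' | wc -c
```

Both commands print 1000000, slightly above the 997,500 estimate, and the mail program will add line breaks and headers on top of that.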

To join the files back into one piece, simply cat them together, like so:
     ~/projects/pt2> cat Blinds.jpg.part.* > Blinds.jpg
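
Before deleting anything, it is worth confirming that the rejoined file matches the original.  The following sketch does a full round trip on random data in a temporary directory; the file names original.bin and rejoined.bin are invented for the example, and the sizes simply mirror the ones used above.  It relies on the shell expanding the * glob in sorted order, which matches the aa, ab, ... suffixes that split generates.

```shell
workdir=$(mktemp -d)

# Create a test file the same size as Blinds.jpg in the example above.
head -c 1157513 /dev/urandom > "$workdir/original.bin"

# Split into 750,000-byte parts named original.bin.part.aa, .ab, ...
split -b 750000 "$workdir/original.bin" "$workdir/original.bin.part."

# Rejoin the parts under a new name so the original is not overwritten.
cat "$workdir"/original.bin.part.* > "$workdir/rejoined.bin"

# cmp is silent and returns success only if the files are byte-identical.
cmp "$workdir/original.bin" "$workdir/rejoined.bin" && echo "identical"

rm -rf "$workdir"
```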

If the system that you are joining them together on is Windows, you can join the split files with a command like this:

     C:\> copy /b Blinds.jpg.part.aa+Blinds.jpg.part.ab Blinds.jpg

split can also run a filter on the parts as it splits them.

Split the weather.log file into parts, compressing them as they go.
     ~/projects/pt2> split --filter 'gzip -9c > $FILE.gz' weather.log weather-part-
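
The --filter option is specific to GNU split.  As a sketch of how the compressed parts can be produced and later restored, the following uses a generated stand-in for weather.log and an explicit -l 1000, which is also split's default of 1000 lines per part.  The single quotes matter: they stop the calling shell from expanding $FILE, so that split can set it to each part's name when it runs the filter.

```shell
workdir=$(mktemp -d)

# Stand-in for weather.log: 2500 numbered lines.
seq 1 2500 > "$workdir/weather.log"

# Split into 1000-line parts, piping each part through gzip.
# split sets $FILE to the name of the part being written.
split -l 1000 --filter='gzip -9c > "$FILE.gz"' \
    "$workdir/weather.log" "$workdir/weather-part-"

# Three compressed parts (aa, ab, ac) now sit next to weather.log.
ls "$workdir"

# Decompress the parts in order and compare with the original.
gzip -dc "$workdir"/weather-part-*.gz | cmp - "$workdir/weather.log" \
    && echo "round trip OK"

rm -rf "$workdir"
```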

Next week we'll continue on with our tour of the core utilities, working with file hashes and various ways to sort files.
