head
head displays the first n lines (or n bytes) of a file (or standard input). By default head displays the first 10 lines of a file, but that is easily changed:
Display the first 10 lines of /etc/passwd
~/projects/pt2> head /etc/passwd
root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/bin/sh
bin:x:2:2:bin:/bin:/bin/sh
sys:x:3:3:sys:/dev:/bin/sh
sync:x:4:65534:sync:/bin:/bin/sync
games:x:5:60:games:/usr/games:/bin/sh
man:x:6:12:man:/var/cache/man:/bin/sh
lp:x:7:7:lp:/var/spool/lpd:/bin/sh
mail:x:8:8:mail:/var/mail:/bin/sh
news:x:9:9:news:/var/spool/news:/bin/sh
Pipe the first 10 lines of /etc/passwd to cut to list the usernames and their shells:
~/projects/pt2> head /etc/passwd | cut -d: -f1,7
root:/bin/bash
daemon:/bin/sh
bin:/bin/sh
sys:/bin/sh
sync:/bin/sync
games:/bin/sh
man:/bin/sh
lp:/bin/sh
mail:/bin/sh
news:/bin/sh
Display the first 5 lines of the output of dmesg
~/projects/pt2> dmesg | head -n 5
[ 0.000000] Initializing cgroup subsys cpuset
[ 0.000000] Initializing cgroup subsys cpu
[ 0.000000] Linux version 2.6.32-5-686 (Debian 2.6.32-39) (dannf@debian.org) (gcc version 4.3.5 (Debian 4.3.5-4) ) #1 SMP Thu Nov 3 04:23:54 UTC 2011
[ 0.000000] KERNEL supported cpus:
[ 0.000000] Intel GenuineIntel
Read the first 2 GiB (GNU head treats the G suffix as 1024^3 bytes) of the first partition of the second hard drive and pipe it to strings to list every sequence of printable characters
~/projects/pt2> head -c 2G /dev/sdb1 | strings
[output omitted for brevity]
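The options shown above can be tried safely on a throwaway file rather than on system files. What follows is only a minimal sketch: sample.txt is a hypothetical scratch file created in a temporary directory, and seq is assumed to be available to generate the test lines.

```shell
# Work in a temporary directory so nothing real is touched.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

# sample.txt is a hypothetical scratch file holding the numbers 1..20, one per line.
seq 1 20 > sample.txt

# No options: the first 10 lines.
default_count=$(head sample.txt | wc -l)

# -n 3: the first 3 lines only.
first_three=$(head -n 3 sample.txt)

# -c 4: the first 4 bytes ("1", newline, "2", newline).
first_bytes=$(head -c 4 sample.txt | wc -c)

echo "default line count: $((default_count))"
echo "first three lines: $(echo $first_three)"
echo "bytes read with -c 4: $((first_bytes))"
```

Counting with wc keeps the output short; drop the pipe to wc to see the lines themselves.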
tail
tail displays the last n lines (or n bytes) of a file (or standard input). By default tail displays the last 10 lines.
Display the last 10 lines of the syslog, then keep printing each new line as it is appended to the file (useful for watching log files, for instance)
~/projects/pt2# tail -f /var/log/syslog
Dec 2 09:16:27 valhalla dhclient: DHCPACK from 192.168.200.1
Dec 2 09:16:27 valhalla dhclient: bound to 192.168.200.106 -- renewal in 42062 seconds.
Dec 2 20:57:29 valhalla dhclient: DHCPREQUEST on eth0 to 192.168.200.1 port 67
Dec 2 20:57:29 valhalla dhclient: DHCPACK from 192.168.200.1
Dec 2 20:57:29 valhalla dhclient: bound to 192.168.200.106 -- renewal in 39735 seconds.
Dec 3 07:30:01 valhalla anacron[31196]: Anacron 2.3 started on 2011-12-03
Dec 3 07:30:01 valhalla anacron[31196]: Will run job `cron.daily' in 5 min.
Dec 3 07:30:01 valhalla anacron[31196]: Jobs will be executed sequentially
Dec 3 07:35:01 valhalla anacron[31196]: Job `cron.daily' started
Dec 3 07:35:01 valhalla anacron[31202]: Updated timestamp for job `cron.daily' to 2011-12-03
Display lines 451-500 (or the last 50 lines of the file if it has fewer than 500 lines)
~/projects/pt2# head -n 500 /var/log/messages | tail -n 50
[output omitted for brevity]
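To see exactly which lines this pipeline selects, it can be run on numbered input. This is a small sketch that uses seq (assumed to be installed) as a stand-in for a real log file, once with more than 500 lines and once with fewer.

```shell
# Generate 1000 numbered lines and keep what the head/tail pipeline selects.
long_range=$(seq 1 1000 | head -n 500 | tail -n 50)
echo "long input:  $(echo "$long_range" | head -n 1) to $(echo "$long_range" | tail -n 1)"

# With only 120 lines, head passes everything through and tail keeps the last 50.
short_range=$(seq 1 120 | head -n 500 | tail -n 50)
echo "short input: $(echo "$short_range" | head -n 1) to $(echo "$short_range" | tail -n 1)"
```

The first line reports 451 to 500 and the second reports 71 to 120.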
GNU tail can also follow multiple files at once, output the last n bytes of a file (with -c), and be made to terminate automatically (with --pid, when following).
Monitor both /var/log/messages and /var/log/syslog
~/projects/pt2> tail -f /var/log/{syslog,messages}
==> /var/log/syslog <==
Dec 9 15:26:05 valhalla mpt-statusd: detected non-optimal RAID status
Dec 9 15:36:05 valhalla mpt-statusd: detected non-optimal RAID status
Dec 9 15:46:05 valhalla mpt-statusd: detected non-optimal RAID status
Dec 9 15:56:05 valhalla mpt-statusd: detected non-optimal RAID status
Dec 9 16:06:05 valhalla mpt-statusd: detected non-optimal RAID status
Dec 9 16:16:05 valhalla mpt-statusd: detected non-optimal RAID status
Dec 9 16:17:01 valhalla /USR/SBIN/CRON[18156]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)
Dec 9 16:26:05 valhalla mpt-statusd: detected non-optimal RAID status
Dec 9 16:36:05 valhalla mpt-statusd: detected non-optimal RAID status
Dec 9 16:46:05 valhalla mpt-statusd: detected non-optimal RAID status
==> /var/log/messages <==
Dec 9 15:16:05 valhalla mpt-statusd: detected non-optimal RAID status
Dec 9 15:26:05 valhalla mpt-statusd: detected non-optimal RAID status
Dec 9 15:36:05 valhalla mpt-statusd: detected non-optimal RAID status
Dec 9 15:46:05 valhalla mpt-statusd: detected non-optimal RAID status
Dec 9 15:56:05 valhalla mpt-statusd: detected non-optimal RAID status
Dec 9 16:06:05 valhalla mpt-statusd: detected non-optimal RAID status
Dec 9 16:16:05 valhalla mpt-statusd: detected non-optimal RAID status
Dec 9 16:26:05 valhalla mpt-statusd: detected non-optimal RAID status
Dec 9 16:36:05 valhalla mpt-statusd: detected non-optimal RAID status
Dec 9 16:46:05 valhalla mpt-statusd: detected non-optimal RAID status
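The byte-count and auto-terminate features can be tried without touching the system logs. The sketch below assumes GNU tail (the --pid option is a GNU extension); demo.log is a hypothetical file, and a one-second sleep stands in for the program that would normally be writing to the log.

```shell
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

# demo.log is a hypothetical log file with three short entries.
printf 'alpha\nbravo\ncharlie\n' > demo.log

# -c 8: the last 8 bytes of the file ("charlie" plus its newline).
last_bytes=$(tail -c 8 demo.log)
echo "last 8 bytes: $last_bytes"

# --pid (GNU tail): follow the file, but exit once the given process is gone.
# A one-second sleep stands in for the program that would be writing the log.
sleep 1 &
writer_pid=$!
followed=$(tail -f --pid="$writer_pid" demo.log)
echo "tail exited after the writer finished; it printed $(($(echo "$followed" | wc -l))) lines"
```

Without --pid, tail -f keeps running until it is interrupted, for example with Ctrl-C.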
split
split takes a file and splits it into several smaller files. For example, say your email provider limits each attachment to 1 MB, and you need to send a JPEG that is larger than that. You could split the file like this:
~/projects/pt2> split -b 750000 Blinds.jpg Blinds.jpg.part.
~/projects/pt2> ls -gG Blinds*
-rw-r--r-- 1 1157513 Dec 9 17:25 Blinds.jpg
-rw-r--r-- 1 750000 Dec 9 17:26 Blinds.jpg.part.aa
-rw-r--r-- 1 407513 Dec 9 17:26 Blinds.jpg.part.ab
And then email the parts one by one. Why 750,000 bytes? MIME encoding the file will increase each part to about 133% of its starting size, and 750,000 × 1.33 = 997,500. This leaves us some room for the email message and header.
To join the files back into one piece, simply cat them together, like so:
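The size estimate above can be checked by measuring. This is a rough sketch, assuming the mail client uses Base64 (the usual MIME encoding for binary attachments) and that a base64 command is installed. Zero bytes from /dev/zero stand in for one part of the image, since the encoded size depends only on the input length.

```shell
# Encode 750,000 zero bytes (a stand-in for one part of the split file).
# Line breaks are stripped before counting so only the encoded characters are measured.
encoded=$(head -c 750000 /dev/zero | base64 | tr -d '\r\n' | wc -c)
echo "encoded characters: $((encoded))"

# The same figure from integer arithmetic: every 3 input bytes become 4 output characters.
echo "750000 * 4 / 3 = $((750000 * 4 / 3))"
```

Base64's exact ratio is 4/3, a little above 1.33, and the line breaks added in a real message increase the total by a few percent more, so when the limit is tight it is worth measuring rather than relying on the rough estimate.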
~/projects/pt2> cat Blinds.jpg.part.* > Blinds.jpg
If you are joining them on a Windows system, you can join the split files with a command like this:
C:\> copy /b Blinds.jpg.part.aa+Blinds.jpg.part.ab Blinds.jpg
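Before deleting the parts, it is worth confirming that the rejoined file is identical to the original. A minimal sketch of that round trip follows; original.bin and rejoined.bin are hypothetical names, and random data from /dev/urandom stands in for the image.

```shell
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

# original.bin is a hypothetical 2,000,000-byte test file filled with random data.
head -c 2000000 /dev/urandom > original.bin

# Split into 750,000-byte parts: original.bin.part.aa, .ab and .ac
split -b 750000 original.bin original.bin.part.

# The shell expands the glob in sorted order, so the parts are joined in sequence.
cat original.bin.part.* > rejoined.bin

# cmp prints nothing and returns success when the two files are byte-for-byte identical.
if cmp original.bin rejoined.bin; then
    echo "rejoined file matches the original"
fi
```

Writing the result to a new name, rather than over the original, keeps the two files available for comparison.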
split can also run a filter on the parts as it splits them.
Split the weather.log file into parts, compressing them as they go.
~/projects/pt2> split --filter 'gzip -9c > $FILE.gz' weather.log weather-part-
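The filtered parts can be put back together by decompressing them in order. The sketch below assumes GNU split (the --filter option is a GNU extension) and gzip. The file readings.log and the 1,000-line part size are made up for the illustration.

```shell
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

# readings.log is a hypothetical 2,500-line log file.
seq 1 2500 > readings.log

# -l 1000: 1,000 lines per part. split sets $FILE to each part's name,
# so the single quotes keep the shell from expanding it too early.
split -l 1000 --filter='gzip -9c > $FILE.gz' readings.log readings-part-

ls readings-part-*

# Decompress the parts in order and compare the result with the original.
if gzip -dc readings-part-*.gz | cmp - readings.log; then
    echo "decompressed parts match the original"
fi
```

Because the filter writes each part itself, only the .gz files appear on disk; split creates no uncompressed parts.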
Next week we'll continue on with our tour of the core utilities, working with file hashes and various ways to sort files.