Shell HOWTO: Remove Duplicate Elements from a Variable
If you are a seasoned Unix admin, you’ve been doing stuff like this for years:
cat $file | sort | uniq
This is a handy way to eliminate duplicate lines in a file, or a collection of files. uniq -c will even tell you how many times each line occurs, and you might even do:
cat $file | sort | uniq -c | sort -n
For example, you could run a command like this to see who is receiving the most mail on your system:
awk '{print $7}' < /var/log/maillog | grep ^to= | sort | uniq -c | sort -rn
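If you want to see what that last pipeline does without a real mail log handy, here is a small sketch that pushes some made-up recipient fields through the same sort, uniq -c, and sort -rn stages. The addresses and the recipients variable are invented for illustration; only the pipeline itself comes from the example above.

```shell
# Made-up sample data standing in for field 7 of a maillog (an assumption).
recipients='to=<alice@example.com>,
to=<bob@example.com>,
to=<alice@example.com>,
to=<carol@example.com>,
to=<alice@example.com>,
to=<bob@example.com>,'

# Count each distinct line, then list the most frequent first.
counts=$(printf '%s\n' "$recipients" | sort | uniq -c | sort -rn)
printf '%s\n' "$counts"
```

The busiest recipient ends up on the first line, with its count in front of it.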
Anyway, even if you are a seasoned Unix admin, you probably aren’t a big expert on shell scripting. It is something I like to do not only because automation makes my life better, but to prove that I’m not a Perl weenie. Today I have a script that copies a bunch of files into a directory hierarchy to set things up for a chroot()ed environment:
manifest="/usr/bin/scp /usr/libexec/sftp-server /usr/local/libexec/rssh_chroot_helper /bin/sh"

# Copy manifest
for file in $manifest; do
    mkdir -p $1/`dirname $file`
    cp -p $file $1/$file
    libs="$libs `ldd $file | awk '{print $3}'`"
done
Before you copy your file to $1, you have to ensure that the target directory exists, hence the dirname. But, in order for the target executable to run, you will also need your shared libraries, which I sniff out with ldd and awk. I could then copy the $libs in much the same way that I am copying $manifest, above. But there are surely duplicates, ya? So, the question is, how do I “uniq” a shell variable?
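A side note on the ldd and awk step: depending on the platform, ldd may print lines whose third field is not a path at all, such as a header naming the binary. The sketch below is one way to keep only fields that look like absolute paths. The sample output, the library versions in it, and the lib_paths name are all made up for illustration, since real ldd output varies from system to system.

```shell
# Illustrative ldd-style output; real output varies by platform.
sample_ldd_output='/usr/bin/scp:
    libcrypto.so.6 => /lib/libcrypto.so.6 (0x28095000)
    libz.so.4 => /lib/libz.so.4 (0x281e8000)
    libc.so.7 => /lib/libc.so.7 (0x281fa000)'

# Keep the third field only when the line has a "=>" and the field looks
# like an absolute path. This skips the "file:" header line and any entry
# without a resolved path.
lib_paths() {
    awk '$2 == "=>" && $3 ~ /^\// { print $3 }'
}

found=$(printf '%s\n' "$sample_ldd_output" | lib_paths | tr '\n' ' ')
echo $found
```

In the script itself, this would be used as something like ldd $file | lib_paths in place of the bare awk call.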
Twenty seconds of thinking later, and I have an idea, and it works. Very briefly, it is just a question of slicing the spaces into newlines so you can use your everyday Unix admin tools to finish the work:
0-10:00 djh@mito ~> sh
$ libs="a b c d"
$ libs="$libs b c d f g"
$ echo $libs
a b c d b c d f g
$ libs=`echo $libs | tr ' ' '\n' | sort | uniq`
$ echo $libs
a b c d f g
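If you find yourself doing this more than once, the same trick wraps up nicely in a function. This is only a sketch: the names uniq_words and uniq_words_keep_order are made up, sort -u stands in for sort | uniq, and the second variant swaps in a common awk idiom for when the original order matters. Both assume the words contain no glob characters such as *, because the list is deliberately left unquoted so the shell splits it.

```shell
# uniq_words: print the distinct whitespace-separated words of its
# arguments, sorted, on one line. (Hypothetical helper name.)
uniq_words() {
    # Unquoted $* lets the shell split on whitespace; printf emits one
    # word per line for sort -u, and echo rejoins the result with spaces.
    echo $(printf '%s\n' $* | sort -u)
}

# Same idea, but keeps the first occurrence of each word in its original
# position instead of sorting.
uniq_words_keep_order() {
    echo $(printf '%s\n' $* | awk '!seen[$0]++')
}

libs="a b c d"
libs="$libs b c d f g"
libs=$(uniq_words "$libs")
echo "$libs"    # prints: a b c d f g
```

For a list of libraries the order does not matter, so the sorted version is fine. The order-preserving one is there for lists like a search path where position means something.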
So, my script continues:
# Copy libs
libs=`echo $libs | tr ' ' '\n' | sort | uniq`
libs="$libs /libexec/ld-elf.so.1"
for lib in $libs; do
    mkdir -p $1/`dirname $lib`
    cp -p $lib $1/$lib
done
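Since the manifest loop and the libs loop do the same thing, the copy step could be pulled into one function. What follows is a sketch rather than what my script actually does: copy_into_root is a made-up name, it assumes every path is absolute, and it relies on mktemp -d being available to get a scratch directory for the demonstration. The extra quoting is there so that a path with a space in it does not break things.

```shell
# copy_into_root ROOT FILE...: copy each FILE to the same path under ROOT.
# (Hypothetical helper; assumes every FILE is an absolute path.)
copy_into_root() {
    root=$1
    shift
    for path in "$@"; do
        # Create the parent directory under ROOT, then copy the file,
        # preserving mode and timestamps where permitted.
        mkdir -p "$root$(dirname "$path")" || return 1
        cp -p "$path" "$root$path" || return 1
    done
}

# Try it against a throwaway directory rather than a real chroot.
root=$(mktemp -d)
copy_into_root "$root" /bin/sh
ls -l "$root/bin/sh"
```

With that in place, both loops would shrink to calls along the lines of copy_into_root "$1" $manifest and copy_into_root "$1" $libs, with the lists left unquoted so they split into separate words.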
And of course, I drop some science here, just in case someone gets stuck in a similar situation.