I have had bad luck trying to coax this out of Google, so here’s a Perl one-liner:
perl -pi -e 's/[\x80-\xFF]//g' file.txt
Where file.txt is a file you want to clean up.
This comes up because we have a web application that was set up to hit a MySQL database, which is incorrectly configured to store text as ASCII instead of UTF-8. The application assumes that all text is Unicode and that the database is correctly configured, and every week or two someone asks me why they are getting this weird gnarly error. Typically they are pasting in some weird UTF-8 whitespace character sent to us from Ukraine.
Eventually the database will be reloaded as UTF-8 and the problem will be solved. Until then, I can tell folks to use the Perl command above. It just looks for anything with the high bit set and strips it out.
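If you would rather filter a stream than edit a file in place, the same idea can be sketched with tr(1) instead of Perl. This is a swapped-in technique, not the one-liner above: the function name strip_high_bytes is made up for illustration, and it assumes a tr that honors octal ranges in the C locale.

```shell
# Delete every byte with the high bit set (0x80-0xFF, octal 200-377).
# LC_ALL=C makes tr treat its input as raw bytes rather than characters.
strip_high_bytes() {
    LC_ALL=C tr -d '\200-\377'
}

# Sample input: "caf" followed by a two-byte UTF-8 e-acute (octal 303 251).
printf 'caf\303\251 ok\n' | strip_high_bytes
# prints "caf ok"

# Typical use, writing a cleaned copy rather than editing in place:
#   strip_high_bytes < file.txt > file.clean.txt
```

Like the Perl version, this throws the offending characters away rather than transliterating them, so "café" becomes "caf", not "cafe".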
Old SysAdmin tip: keep your frequent-but-long-running cron jobs from running concurrently by adding some lightweight file locking to your cron entry. For example, if you have:
Read up on the lockf or flock man pages before you go putting this in. This can be a bit tricky because these can also be system calls. Try “man 1 lockf” or the like to nail it down to the manual for the user-executable command.
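As a rough sketch of the sort of wrapper I mean: the function below runs a command under a non-blocking lock, using flock(1) where it exists (util-linux on Linux) and lockf(1) otherwise (FreeBSD). The function name, the lock file path and the job path in the comments are all hypothetical; check the flags against your own man pages.

```shell
# Run a command only if nobody else holds the lock; otherwise give up at once.
run_locked() {
    lockfile=$1
    shift
    if command -v flock >/dev/null 2>&1; then
        # -n: fail immediately instead of waiting for the lock
        flock -n "$lockfile" "$@"
    elif command -v lockf >/dev/null 2>&1; then
        # -s: stay silent on failure, -t 0: do not wait for the lock
        lockf -s -t 0 "$lockfile" "$@"
    else
        echo "run_locked: neither flock nor lockf found" >&2
        return 127
    fi
}

# In a crontab you would normally call the utility directly, e.g.
# (hypothetical job and lock file):
#   */5 * * * * root flock -n /var/run/myjob.lock /usr/local/bin/myjob.sh
#   */5 * * * * root lockf -s -t 0 /var/run/myjob.lock /usr/local/bin/myjob.sh
```

If the previous run is still going when cron fires again, the new invocation exits without doing anything instead of piling up behind it.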
I have run into this a zillion times. You SSH to a Unix server, type your password, and then wait a minute or two before you get the initial shell prompt, after which everything is reasonably zippy.
The short answer is “probably, something is wrong with DNS . . . your server is trying to look up your client but it cannot, so it sits there for a couple of minutes until it times out, and then it lets you in.”
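A rough way to test the DNS theory is to pull the nameservers out of /etc/resolv.conf and ask each one for a reverse lookup of your client's address. The function names and the three-second timeout here are my own choices for illustration, and the second function assumes the host(1) utility is installed on the server.

```shell
# Print the nameserver addresses listed in a resolv.conf-style file.
list_nameservers() {
    awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}

# Ask each configured nameserver to reverse-resolve the client's IP address.
# A server that hangs or fails here is a likely culprit for the slow login.
check_reverse_dns() {
    client_ip=$1
    for ns in $(list_nameservers "${2:-/etc/resolv.conf}"); do
        if host -W 3 "$client_ip" "$ns" >/dev/null 2>&1; then
            echo "$ns: ok"
        else
            echo "$ns: FAILED or timed out"
        fi
    done
}

# Example (192.0.2.10 is a documentation address, substitute your client's IP):
#   check_reverse_dns 192.0.2.10
```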
Yesterday I was working with an artist who had a hosting account, and when he got in, I said:
sudo vim /etc/resolv.conf
He admitted that he had just copied the DNS configuration from his previous server. How to fix this? Well, he could check what nameservers are provided by his current hosting company . . . . or, I changed his file to read:
The other day I was working on a shell script to be run on several hundred machines at the same time. Since the script was going to download a file from a central server, and I did not want to overwhelm the central server with hundreds of simultaneous requests, I decided that I wanted to add a random wait time. But how do you conjure a random number within a specific range in a shell script?
Updated: Due to much feedback, I now know of three ways to do this . . .
1) On BSD systems, you can use jot(1): sleep `jot -r 1 1 900`
2) If you are scripting with bash, you can use $RANDOM: sleep `echo $RANDOM%900 | bc`
3) For portability, you can resort to my first solution:
# Sleep up to fifteen minutes
sleep `echo $$%900 | bc`
$$ is the process ID (PID), or “random seed” which on most systems is a value between 1 and 65,535. Fifteen minutes is 900 seconds. % is modulo, which is like division but it gives you the remainder. Thus, $$ % 900 will give you a result between 0 and 899. With bash, $RANDOM provides the same utility, except it is a different value whenever you reference it.
Updated yet again . . . says a friend:
nah it’s using `echo .. | bc` that bugs me, 2 fork+execs, let your shell do the math, it knows how
so $(( $$ % 900 )) should work in bsd sh
For efficiency, you could rewrite the latter two solutions:
2.1) sleep $(( $RANDOM % 900 ))
3.1) sleep $(( $$ % 900 ))
The revised solution will work in sh-derived shells: sh, bash, ksh. My original “portable” solution will also work if you’re scripting in csh or tcsh.
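Putting the sh-family variants together, here is a minimal sketch of a helper that uses $RANDOM when the shell provides it and falls back to the PID when it does not. The name random_delay is made up for illustration.

```shell
# Print a number between 0 and max-1 (default max: 900 seconds).
random_delay() {
    max=${1:-900}
    if [ -n "${RANDOM:-}" ]; then
        # bash, ksh, zsh: a fresh value on every reference
        echo $(( RANDOM % max ))
    else
        # plain sh: the PID is fixed for the life of the script
        echo $(( $$ % max ))
    fi
}

# Usage:
#   sleep "$(random_delay 900)"
```

Keep in mind that the PID fallback yields the same number every time within one script run, which is fine for staggering hundreds of machines but no good if you need several different delays in the same script.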
I wanted to know what time it was in UTC, but I forgot my local offset. (It changes twice a year!) I figured I could look in the date man page, but I came up with an “easier” solution. Simply fudge the time zone and then ask.
0-20:57 djh@noneedto ~$ env TZ=UTC date
Tue May 6 03:57:07 UTC 2008
The env bit is not needed in bash, but it makes tcsh happy.
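For the record, the date man page does have a direct answer: the -u flag prints UTC. A small sketch of both approaches, where the function name utc_now is just for illustration:

```shell
# Fudge the time zone for a single command; extra arguments go to date.
utc_now() {
    TZ=UTC date "$@"
}

utc_now            # same output as: env TZ=UTC date
date -u            # the flag that does this directly
utc_now +%H:%M     # only the UTC hour and minute
```

The TZ trick remains handy because it works for any zone your system has zoneinfo data for, not only UTC.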
I have been playing with Google Trends, which will be happy to generate a pretty graph of keyword frequency over time. It is a rough gauge of the relative popularity of various things. This evening, I was riffing off a post from the Royal Pingdom, regarding the relative popularity of Ubuntu and Vista, among other things.
This is just a note which I contributed to a thread on sage-members, to get something off my chest, as to where people should maintain their crontab entries. I sincerely doubt that reading what I have to say will bring you any great illumination.
I’d say, any reasonable SysAdmin should default to /etc/crontab because every other reasonable SysAdmin already knows where it is. If anything is used in addition to /etc/crontab, leave a note in /etc/crontab advising the new guy who just got paged at 3:45am where else to look for crons.
For production systems, I strongly object to the use of per-user crontabs. I’m glad to hear I’m not alone. One of the first things I tend to do in a new environment is write a script that will sniff out all the cron entries.
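A minimal sketch of such a sniffing script follows. It is not the script I actually used: the function names are invented, the /etc/cron.d directory is an assumption that holds on many but not all systems, and reading other users' crontabs needs root.

```shell
# Print the non-comment, non-blank lines of one crontab file,
# each prefixed with the file name. Unreadable files are skipped.
show_entries() {
    [ -r "$1" ] || return 0
    grep -v -e '^[[:space:]]*#' -e '^[[:space:]]*$' "$1" | sed "s|^|$1: |"
}

# Walk the system crontab, any drop-in files, and every user's crontab.
sniff_crons() {
    show_entries /etc/crontab
    for f in /etc/cron.d/*; do
        show_entries "$f"
    done
    cut -d: -f1 /etc/passwd | while read -r user; do
        crontab -u "$user" -l 2>/dev/null |
            grep -v -e '^[[:space:]]*#' -e '^[[:space:]]*$' |
            sed "s|^|user $user: |"
    done
}

# Usage (as root):
#   sniff_crons
```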
And then there was the shop that used /etc/crontab, user crons, and fcron to keep crons from running over each other. This frustrated me enough that I did a poor job of explaining that job concurrency could easily be ensured by executing a command through (something like) the lockf utility, instead of adding a new layer of system complexity.
So, assuming you are a SysAdmin, you really want to get a basic understanding of public key cryptography and the rest. But then, there’s a lot of stuff you need to learn and sometimes you just need to apply a patch, and would like some decent assurance that the patch hasn’t been compromised.
Today, I am patching–a few weeks too late–a FreeBSD system to reflect recent legislative changes to Daylight Saving Time. The procedure is very simple, and covered in FreeBSD Security Advisory FreeBSD-EN-07:04.zoneinfo. It starts:
a) Download the relevant patch from the location below, and verify the detached PGP signature using your PGP utility.
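For the verification step, a minimal sketch with GnuPG might look like the following. The function name and file names are hypothetical, the download location is whatever the advisory lists, and it assumes you have already imported the signing key (for FreeBSD, the Security Officer's key) into your keyring.

```shell
# Verify a detached signature against a downloaded patch.
# By default, the signature is expected next to the patch as <patch>.asc.
verify_patch() {
    patch_file=$1
    sig_file=${2:-$1.asc}
    if [ ! -r "$patch_file" ] || [ ! -r "$sig_file" ]; then
        echo "verify_patch: need both $patch_file and $sig_file" >&2
        return 2
    fi
    # gpg exits non-zero if the signature is bad or the key is unknown
    gpg --verify "$sig_file" "$patch_file"
}

# Usage (hypothetical file name):
#   verify_patch zoneinfo.patch && echo "signature checks out"
```

A "Good signature" message only tells you the file matches the key you have, so it is worth checking the key's fingerprint against a copy you trust.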
For many years I have used FreeBSD nearly exclusively. In the BSD tradition, root is pretty well protected: root cannot log in from remote unless you put some effort into hooking that up, and local users can only run su if they are members of the wheel group. Because of the nifty sudo tool and my own disinterest in memorizing any more passwords than necessary, I have tended to remain unconcerned with the root password, setting it and storing the thing somewhere, which is a pain, or setting it to something dumb, or just not setting it, depending on the security needs of a given system.
I recently learned a painful lesson from Fedora: not all unices are as protective of the root user. Sure, I knew that in Linux any local user can run su, but OpenSSH isn’t going to allow people to log in as root, right? Wrong!
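One quick way to see where a given box stands is to look for the PermitRootLogin directive in sshd_config. This is only a sketch: the function name is made up, the config path is the usual default and may differ on your system, and it ignores Match blocks and included files. As root, sshd -T will print the effective configuration if you want the authoritative answer.

```shell
# Print the first PermitRootLogin value in an sshd_config file,
# or "unset" if the file does not set it (sshd's built-in default applies).
root_login_setting() {
    cfg=${1:-/etc/ssh/sshd_config}
    awk 'tolower($1) == "permitrootlogin" { print $2; found = 1; exit }
         END { if (!found) print "unset" }' "$cfg"
}

# Usage:
#   root_login_setting
#   root_login_setting /usr/local/etc/ssh/sshd_config
```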
If you’re like me, you run Firefox on FreeBSD, or maybe Linux. And you use a classy nice window environment like fvwm2. And every time you start Firefox it asks can it be the default browser, and you say yes … like you use anything else? (MSIE4-Solaris, anyone?) And every time you start, it asks again . . . stupid!
I just saw this solution posted to FreeBSD-questions: (more…)
I recently had a need for two quick temperature conversion algorithms in a Perl script. I asked Google, but did not immediately get a great answer, so here’s my answer:
# Two quick helper functions: CtoF and FtoC
sub CtoF { my $c = shift; $c =~ s/[^\d.\-]//g; return (9/5)*$c+32; }
sub FtoC { my $f = shift; $f =~ s/[^\d.\-]//g; return (5/9)*($f-32); }
The regex is to untaint the input datum, and could be eliminated if you know that your variable is clean. This code has been incorporated into a systems health and data trend monitoring script for FreeBSD. For the vaguely interested, here’s today’s perldoc: (more…)
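Separately from the perldoc, here is a quick sanity check of the conversion arithmetic from the shell, using awk. The function names c_to_f and f_to_c are invented for this sketch, which only exercises the standard formulas F = C * 9/5 + 32 and C = (F - 32) * 5/9 against well-known fixed points.

```shell
# Celsius to Fahrenheit, one decimal place.
c_to_f() {
    awk -v c="$1" 'BEGIN { printf "%.1f\n", c * 9 / 5 + 32 }'
}

# Fahrenheit to Celsius, one decimal place.
f_to_c() {
    awk -v f="$1" 'BEGIN { printf "%.1f\n", (f - 32) * 5 / 9 }'
}

c_to_f 100    # prints 212.0 (boiling point of water)
f_to_c 32     # prints 0.0 (freezing point of water)
c_to_f -40    # prints -40.0 (the two scales cross here)
```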
Q: How do you measure swap utilization in FreeBSD? (Assuming you are writing a script to gather performance metrics.)
A: If you are writing a C program, check kvm_getswapinfo(3) and maybe take a gander at the bottom of /usr/src/usr.bin/top/machine.c.
A: If you are writing a Perl script:
Measure swap activity: sysctl vm.stats.vm.v_swapin vm.stats.vm.v_swapout vm.stats.vm.v_swappgsin vm.stats.vm.v_swappgsout
(I believe these results are COUNTER type values, like you get from netstat -inb. You could establish “swap activity” by plotting changes in this value.)
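If they are indeed counters, "swap activity" is the difference between two samples. A minimal sketch of taking that difference follows; the function names and the ten-second default interval are my own, and it does not handle a counter wrapping around between samples.

```shell
# Read one sysctl value as a bare number.
read_counter() {
    sysctl -n "$1"
}

# Sample a counter twice, interval seconds apart, and print the change.
counter_delta() {
    name=$1
    interval=${2:-10}
    before=$(read_counter "$name") || return 1
    sleep "$interval"
    after=$(read_counter "$name") || return 1
    echo $(( after - before ))
}

# Usage on FreeBSD:
#   counter_delta vm.stats.vm.v_swapin 60
#   counter_delta vm.stats.vm.v_swappgsout 60
```

A monitoring script would more likely store the previous sample between runs and let the graphing tool compute the rate, but the arithmetic is the same.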