Enable caching in collectd!

23 February 2010

If you decide to run collectd on your system, don’t do what I did: I ran it for 8 months without caching enabled. Basically, that means that not only was collectd monitoring load, it was generating a huge amount of load itself — way more than anything else running on the system.

Case in point:

collectd load graph. "No caching" shows high iowait, while "CacheTimeout" and "CacheFlush" show the CPU load more than halved.

The “no caching” bit is what it’s been running like for 8 months, using this config in /etc/collectd/collectd.conf:

<Plugin rrdtool>
        DataDir "/var/lib/collectd/rrd"
</Plugin>

To more than halve my idle CPU usage, and get my load averages back down to near 0.00, all I had to do was add:

<Plugin rrdtool>
        DataDir "/var/lib/collectd/rrd"
        CacheTimeout 120
        CacheFlush 900
</Plugin>
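
For a rough feel of why this helps, here is a back-of-the-envelope sketch. It assumes that without caching every collected value becomes its own RRD file update, and that with CacheTimeout set, values for a file are held in memory and written out in one go roughly every CacheTimeout seconds. The 200 RRD files and the 10-second collection interval are made-up numbers for illustration, not measurements from this server.

```python
def rrd_writes(num_files, interval_s, duration_s, cache_timeout_s=0):
    """Estimate how many RRD file updates reach the disk in duration_s seconds.

    Toy model only: one update per file per interval without caching,
    one batched update per file per cache_timeout_s with caching.
    """
    if cache_timeout_s <= interval_s:
        # No useful caching: every collected value is written immediately.
        return num_files * (duration_s // interval_s)
    # Cached values are written together once they exceed the timeout.
    return num_files * (duration_s // cache_timeout_s)


uncached = rrd_writes(num_files=200, interval_s=10, duration_s=3600)
cached = rrd_writes(num_files=200, interval_s=10, duration_s=3600,
                    cache_timeout_s=120)
print(uncached, cached)
```

Under those assumptions the hourly update count drops by a factor of CacheTimeout divided by the interval, which is in line with the iowait falling away in the graph.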

I don’t get why they don’t ship it like that by default. Enable caching. Your server will thank you. Thanks to the folks on #slug who held my hand while fixing this. :)

SCLUG site redesigned

21 February 2010

The South Coast Linux Users Group (SCLUG) website has had a lick of paint by yours truly.

The old site didn’t look bad (in particular, I liked the penguin header image), but was getting a little long in the tooth, had a few issues (such as broken category styles, and images being linked to the wrong domain), and didn’t really reflect much about Wollongong, which is where we are based.

So a little Inkscaping later, I came up with this:

SCLUG site in Wollongong livery

Check it out. No, like, seriously. Check it out. A bit more polish to come, but the idea is there.

Source IP weirdities with irssi and IPv6

28 January 2010

I’m having a weird problem with irssi and IPv6. The long and the short of it is that irssi is trying to connect to an IRC server on the Internet with a source IP address of ::1, which is incorrect, as ::1 is the loopback address.

My server, glenstorm, is the IPv6 router, which contains the ppp0 interface that connects it to the IPv6 Internet. I am also running irssi on the same machine. It’s a router, so /proc/sys/net/ipv6/conf/all/forwarding is 1.

So, basically, when I fire up irssi, and type “/connect irc.ipv6.freenode.net“, it hangs when connecting. And for good reason: here’s the (edited for clarity) tcpdump output:

IP6 ::1.34823 > 2001:19f0:feee::dead:beef:cafe.6667
IP6 ::1.34823 > 2001:19f0:feee::dead:beef:cafe.6667
IP6 ::1.34823 > 2001:19f0:feee::dead:beef:cafe.6667

So obviously that’s wrong. And in violation of RFC 4291, I might add (“The loopback address must not be used as the source address in IPv6 packets that are sent outside of a single node.”).
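
The rule quoted from the RFC is easy to express as a check. This is only a sketch using Python's standard ipaddress module; the function name is made up for illustration, and it says nothing about how the kernel or irssi actually validate addresses.

```python
import ipaddress


def valid_offnode_source(src, dst):
    """Return False if a loopback source is paired with a non-loopback destination."""
    src_ip = ipaddress.ip_address(src)
    dst_ip = ipaddress.ip_address(dst)
    # Loopback-to-loopback traffic stays on the node, so it is fine.
    if src_ip.is_loopback and not dst_ip.is_loopback:
        return False
    return True


# The pair seen in the tcpdump output, then the pair used by the workaround.
print(valid_offnode_source("::1", "2001:19f0:feee::dead:beef:cafe"))
print(valid_offnode_source("2001:44b8:7df3:b970::14",
                           "2001:19f0:feee::dead:beef:cafe"))
```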

I can hack around it by typing “/connect -host 2001:44b8:7df3:b970::14 irc.ipv6.freenode.net” into irssi, which forces it to use the source IP that I specified. But that’s just a hack — I’d like to get to the bottom of what actually causes it.

Update: Finally solved this. It’s because in my irssi config, I had the following directive:

core = {
    real_name = "Jeremy Visser";
    user_name = "jeremy";
    nick = "jayvee_zzZZ";
    resolve_prefer_ipv6 = "yes";
    host = "glenstorm";
};

It was being told to use “glenstorm” as the “host”, which translates to “resolve the IP address of glenstorm and use that as the source IP address” (I think I misunderstood the meaning of the directive when I put that configuration flag in).

Of course, in /etc/hosts, I had the following entry:

::1 glenstorm

So, naturally, irssi decided to use ::1 as the source IP address, and removing the “host” line from the irssi config fixed the problem. I’m sure that, going by the aforementioned RFC, packets with a source address of ::1 should never have left the machine in the first place, but at the end of the day, it was simply Unix allowing me to shoot myself in the foot.
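
A quick way to spot this kind of trap is to check whether a hostname maps to a loopback address in hosts-format text. The sketch below is a hypothetical helper, not anything irssi does; it only reads text in the /etc/hosts layout and ignores DNS and the rest of the resolver configuration.

```python
import ipaddress


def maps_to_loopback(hosts_text, name):
    """Return True if hosts-format text maps `name` to a loopback address."""
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        address, *hostnames = line.split()
        try:
            ip = ipaddress.ip_address(address)
        except ValueError:
            continue  # skip anything that is not a plain IP address
        if name in hostnames and ip.is_loopback:
            return True
    return False


print(maps_to_loopback("::1 glenstorm\n", "glenstorm"))
```

To run it against a real system, pass in the contents of /etc/hosts and the value of the “host” setting.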

No DRI on X.Org with a Radeon? Check your Virtual size.

28 October 2009

After I installed Fedora Rawhide on the eMac this week, I fired up X.Org, only to discover that…

(II) AIGLX: Screen 0 is not DRI2 capable
(II) AIGLX: Screen 0 is not DRI capable

So it had fallen back to a software 3D renderer, which is pretty crap. So to make a long story short, it was because my ‘Virtual’ screen size was too big. I typed xrandr, and got the following:

$ xrandr
Screen 0: minimum 320 x 200, current 1280 x 960, maximum 2048 x 2048

For various technical reasons, when the Virtual size is too big (which, evidently, 2048×2048 is), DRI gets disabled. So, to re-enable it, I put this into my xorg.conf:

Section "Screen"
        Identifier "Main Screen"
        Device "Radeon 7500"
        Monitor "eMac CRT"
        SubSection "Display"
                Virtual 1280 960 # put the highest resolution you intend to use here
        EndSubSection
EndSection

Obviously, edit the values to suit.
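
To compare before and after, the Screen line printed by xrandr can be pulled apart with a small parser. This is a sketch that assumes the line has exactly the layout shown above and, as this post implies, that the reported maximum reflects the Virtual size; the function name is made up.

```python
import re

SCREEN_RE = re.compile(
    r"Screen (\d+): minimum (\d+) x (\d+), "
    r"current (\d+) x (\d+), maximum (\d+) x (\d+)"
)


def parse_screen_line(line):
    """Parse an xrandr 'Screen N: ...' line into a dict, or return None."""
    match = SCREEN_RE.match(line.strip())
    if match is None:
        return None
    n = [int(group) for group in match.groups()]
    return {
        "screen": n[0],
        "minimum": (n[1], n[2]),
        "current": (n[3], n[4]),
        "maximum": (n[5], n[6]),
    }


info = parse_screen_line(
    "Screen 0: minimum 320 x 200, current 1280 x 960, maximum 2048 x 2048"
)
print(info["maximum"])
```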

How terminals work

18 October 2009

Via @toros on Identi.ca, I stumbled across this awesome explanation by Scott James Remnant on how terminals (real terminals, virtual terminals, and pseudo terminals) in Unix work.

Because it is pasted from an IRC conversation, it is a little hard to follow, so I present it to you reformatted to be more readable. (I have tried to remain as faithful as possible to the original.)

How Terminals Work

By Scott James Remnant

In Linux, we have consoles, but really we mean Virtual Terminals (VTs), TTYs, and Pseudo-Terminals (PTYs). It’s all a bit of the kind of jumble sale you get after 40 years of different solutions to different problems. We can lump them together under the description “terminals” for the next bit.

We also have processes. Now, processes have a lot of odd little details: they have a parent, and they have a session. A session has a foreground process group and a background process group. (Stevens devotes an entire chapter to this and nobody but me apparently understands it. And hopefully the guy who maintains the kernel side. ;) )

So, each process is part of a session — a process may begin a new session by calling setsid(). Every process that init creates is in its own session. The process is also then the leader of the foreground process group of that session. New processes are also in that session, and in that process group, unless otherwise placed into a new process group (setpgrp()). Any new process group is a background process group.

So now, you have a bunch of sessions. Each session has a bunch of process groups, one of which is the foreground process group. Each process group has a bunch of processes, one of which is the leader.
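
The session and process group relationships can be poked at from Python's os module. The sketch below is not part of the original explanation: it forks a child, has the child call setsid(), and reports the child's process ID, process group ID and session ID back over a pipe. The function name is made up.

```python
import os


def child_after_setsid():
    """Fork a child that calls setsid() and return its (pid, pgid, sid)."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        try:
            os.close(read_fd)
            # A freshly forked child is never a process group leader,
            # so it is allowed to start a new session.
            os.setsid()
            ids = (os.getpid(), os.getpgid(0), os.getsid(0))
            os.write(write_fd, ("%d %d %d" % ids).encode())
        finally:
            os._exit(0)
    os.close(write_fd)
    with os.fdopen(read_fd) as reader:
        data = reader.read()
    os.waitpid(pid, 0)
    return tuple(int(field) for field in data.split())


pid, pgid, sid = child_after_setsid()
print(pid == pgid == sid, sid != os.getsid(0))
```

After setsid() the child is the leader of both a new process group and a new session, so all three IDs match, and its session differs from the parent's.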

So this all has to do, fundamentally, with terminals, and who gets the signals.

When the leader of the foreground process group of a session opens a terminal device (without O_NOCTTY), that becomes the controlling terminal of that process group and session. The terminal and the session become bound to each other.

You can fake this another way by opening a terminal device with O_NOCTTY (or having one passed to you) and then calling the TIOCSCTTY ioctl().
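
Here is a sketch of the TIOCSCTTY route, under the assumption that the system has pseudo-terminals available (Linux with /dev/ptmx). A child starts a new session, which leaves it with no controlling terminal, then adopts a pseudo-terminal that was passed to it. Whether /dev/tty can be opened is used as the test for having a controlling terminal. The function names are made up.

```python
import fcntl
import os
import termios


def can_open_dev_tty():
    """Return True if this process has a controlling terminal to open."""
    try:
        os.close(os.open("/dev/tty", os.O_RDWR | os.O_NOCTTY))
        return True
    except OSError:
        return False


def adopt_terminal_demo():
    """Return (had_tty_before, had_tty_after) as seen by a setsid() child."""
    master_fd, slave_fd = os.openpty()
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        try:
            os.close(read_fd)
            os.setsid()                      # new session, no terminal yet
            before = can_open_dev_tty()
            fcntl.ioctl(slave_fd, termios.TIOCSCTTY, 0)
            after = can_open_dev_tty()
            os.write(write_fd, b"%d%d" % (before, after))
        finally:
            os._exit(0)
    os.close(write_fd)
    os.close(slave_fd)
    with os.fdopen(read_fd, "rb") as reader:
        data = reader.read()
    os.waitpid(pid, 0)
    os.close(master_fd)  # keep the master open until the child is done
    return data[0:1] == b"1", data[1:2] == b"1"


print(adopt_terminal_demo())
```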

Okay, so: terminals, controlling terminals and processes — here’s where this gets fun.

If the controlling terminal is hung up, SIGHUP is sent to the foreground process group. If ^C is pressed on the controlling terminal, SIGINT is sent to the foreground process group. If ^Z is pressed on the controlling terminal, SIGTSTP is sent to the foreground process group. And so on and so forth.
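
The "sent to the process group" part can be imitated without a terminal. In this sketch, which is not from the original conversation, os.killpg() stands in for what the terminal does when ^C is pressed: a child puts itself into its own process group, and the parent signals that whole group with SIGINT. The function name is made up.

```python
import os
import signal


def interrupt_group():
    """Send SIGINT to a child's process group; return True if it killed the child."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        try:
            os.close(read_fd)
            signal.signal(signal.SIGINT, signal.SIG_DFL)  # die on SIGINT
            os.setpgid(0, 0)          # become leader of a new process group
            os.write(write_fd, b"x")  # tell the parent the group exists
            signal.pause()            # wait for a signal
        finally:
            os._exit(1)
    os.close(write_fd)
    os.read(read_fd, 1)               # wait until the child is ready
    os.close(read_fd)
    os.killpg(pid, signal.SIGINT)     # signal every process in the group
    _, status = os.waitpid(pid, 0)
    return os.WIFSIGNALED(status) and os.WTERMSIG(status) == signal.SIGINT


print(interrupt_group())
```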

So this is how the relationship between magic key presses and signals gets established. Shells care about this a lot (and yes, when you use command & that becomes a background process group, and when you use command | command they are all in the same process group).

Now, this controlling terminal business applies to all terminals, whether they be true terminals (which Linux doesn’t have), virtual terminals, or pseudo terminals. So this is as true for your SSH login as your VT1.

You can always access your controlling terminal using /dev/tty (it’s a badly named device node); it may also be called /dev/ttyS0 or /dev/pts/4, etc.

So, Linux has a bunch of virtual terminals. These are the things we think of when we say “console” but we’re using that wrongly. Virtual terminals behave just like ordinary terminals: they can be the controlling terminal for a process group, but unfortunately, stacked on top is the Linux VT API, which they didn’t think to make separate.

So the stuff to set fonts, place it in raw or graphics mode, create new VTs, switch VTs, etc. is all loaded into the TTY API. So in order for X to function, it needs a VT, and X needs access to that VT in interesting and familiar ways to place it into raw and graphics mode, and so on.

X also needs to know if the current VT is switched, so it opens the VT device it wants (/dev/tty7), and that becomes its controlling terminal. So if you were to delete VT7, X would get SIGHUP. :)

Now, on VT1-6 you have getty, and on VT8 you have usplash, and so on. This is all fine and dandy, except there’s this last mystical piece: /dev/console. /dev/console is, like /dev/tty, a fake device: it points at the currently active VT, whatever that is. But it behaves like a terminal in its own right (whereas /dev/tty just behaves as a proxy for the underlying terminal).

Now, Upstart has a few knobs to customise the standard input, output and error file descriptors. Normally it just starts all jobs with them as /dev/null, but for emulation of sysvinit, it has two other options:

  1. Set them to /dev/console (console output)
  2. Set them to /dev/console and issue the TIOCSCTTY ioctl() (console owner)

Now, if your current VT is 7 (X), and you start a job that has console owner in it, the new process will take the terminal away from X! X gets SIGHUP, and either hits 100% CPU or crashes. Solving this problem for good requires jobs that need user input to be rewritten.

So this is clearly bad. The problem really is that things need a “console” at all — jobs that require interaction should do it themselves: they should open a VT, switch to it, and ask there. Or they should use usplash to do it — or they should use X.

Link to original source