Standalone Sysadmin

A blog for IT Admins who do everything by an IT Admin who does everything

Intermittent Problems Suck (your time)

Thu, 09/02/2010 - 11:51am

For the past few days, our NYC office has had incredibly irritating problems with the internet connection. We’ve got service through a local Metro-E provider, but they’re a CLEC, which means they don’t own the lines, they just lease them from the ILEC, which in this case is Verizon.

The root of the issue is that the wiring at the building we’re in is crap. It’s a small 5 story building that used to be apartments and has been converted to offices, and the wiring is just not up for the job. We went through several copper pairs looking for one that was good enough to carry the Metro-E signal, and it was all we could do to find one. Before Metro-E, we had DSL, where we capped out at just over 1Mb/s…and this is in Manhattan.

Unfortunately, the circuit is currently in the middle of dying, so it’s working sometimes and failing others. I first opened this ticket on Monday, and have exchanged emails with our provider a dozen times or so. They’ll see the issue, but the symptoms are vague as to whether it’s their equipment, our equipment, the line running between the two, or (what I’m fairly sure the problem is) the lines entering the building from Verizon.

It wasn’t until last night that they finally saw enough errors on the bridge to get Verizon to commit to a service call tomorrow evening to add a loop. Every other time, everything on the line was hunky-dory. This is why intermittent problems take so long to solve…because all the stakeholders have to be monitoring at exactly the right time for anything to get done.
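One way to avoid depending on everyone watching at the right moment is to record the failures yourself and hand the provider exact timestamps. What follows is only a minimal sketch, not anything we actually ran: the function names, the log file name, and the choice of a plain TCP connect as the health check are all assumptions made for illustration.

```python
import socket
import time
from datetime import datetime, timezone


def probe(host, port=80, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def outage_windows(samples):
    """Collapse (timestamp, ok) samples into (first_fail, last_fail) windows."""
    windows = []
    start = None
    last_fail = None
    for ts, ok in samples:
        if not ok:
            if start is None:
                start = ts
            last_fail = ts
        elif start is not None:
            # A success closes the current run of failures.
            windows.append((start, last_fail))
            start = None
    if start is not None:
        # Still failing when the samples ended.
        windows.append((start, last_fail))
    return windows


def watch(host, port=80, interval=30, logfile="circuit.log"):
    """Probe forever, appending one timestamped line per sample (hypothetical log format)."""
    with open(logfile, "a") as log:
        while True:
            ok = probe(host, port)
            stamp = datetime.now(timezone.utc).isoformat()
            log.write(f"{stamp} {host}:{port} {'up' if ok else 'DOWN'}\n")
            log.flush()
            time.sleep(interval)
```

Run from a machine inside the office against a host outside it, the log gives you something concrete to attach to the ticket, and outage_windows() turns the raw samples into "down from X to Y" ranges.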

Meanwhile, I’ve been having to apologize to my users, and give them instructions on how to forward their desk phones to their cells.

Even though the problem isn’t actually with my provider, I would love to get a secondary network connection, because the lines here are just too unreliable. No cable companies will give us service, no fiber companies will touch the building…it’s pretty much just Verizon and their CLECs at this point.

I think we’ve only got 2 more years on the lease?


SysAdmin Spirit Animal?

Wed, 09/01/2010 - 11:30am

There’s an amusing thread on the LOPSA Discuss list going on right now. It’s called “What Animal is a System Administrator”.

I was leaning toward the beaver until I saw the post by Paul Graydon, who recommends the Pooka, aka the Púca:

The púca has the power of human speech, and has been known to give
advice and lead people away from harm. Though the púca enjoys
confusing and often terrifying humans, it is considered to be
benevolent.

It’s like I’m looking in a mirror.


Linux machines with no rebooting…? Is this what we want?

Wed, 09/01/2010 - 8:14am

The other day, I caught a message that KSplice was available for Fedora. I thought I’d be a wiseguy and I replied “Yeah, great. Call me in 20 years when it’s available for RHEL”. Well, as several people pointed out, it turns out the joke is on me.

As you can see, it’s actually available for many Linux-based OSes at various prices. I suppose my confusion stemmed from the fact that I misunderstood what ksplice was.

My impression from a long time ago, when it first came out on Ubuntu, was that it was essentially a kernel patch that dynamically loaded patches and provided the ability to rebootstrap a kernel that was already loaded. As it turns out, it’s a commercial product that offers the ability to not have to reboot your machine to update the kernel. Let me be frank: I’m all about that.

The part that I kind of object to is in the press release, of all things. It’s the opening line of the company profile:

Ksplice is an enterprise software company making reboots a thing of the past.

Please, let’s be honest. Reboots are inevitable. Using this product as a stop-gap for untimely reboots may be handy (at the low low price of $50 per year per server), but it can’t (and shouldn’t!) replace regular reboots.

The reasons for scheduled rebooting of machines are numerous. The primary one is that regular reboots assure that the machine is configured to boot correctly. If you’ve got a machine that’s got over 100 days of uptime, how do you know it will start correctly? You last booted it last quarter…what has happened to that machine since then? Changes in installed services, mountpoints, etc…it’s hard to tell if it’s going to be in a known-good state when it comes back up after a power failure.
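You can get a rough idea of how far a long-running machine has drifted from its boot-time configuration without rebooting it. Here is a minimal sketch, assuming a Linux box with the usual /etc/fstab and /proc/mounts layouts; the function names and the list of ignored pseudo-filesystems are my own choices, and it only looks at mountpoints, not services.

```python
# Filesystem types that the kernel or init mounts on its own (illustrative list).
PSEUDO_TYPES = {"proc", "sysfs", "devpts", "tmpfs", "devtmpfs", "rootfs",
                "usbfs", "binfmt_misc", "rpc_pipefs", "cgroup"}


def fstab_mountpoints(fstab_text):
    """Mountpoints that fstab will mount automatically at boot."""
    points = set()
    for line in fstab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        # Swap entries use "swap" or "none" as the mountpoint, so skip non-paths.
        if len(fields) < 3 or not fields[1].startswith("/"):
            continue
        options = fields[3].split(",") if len(fields) > 3 else []
        if "noauto" not in options:
            points.add(fields[1])
    return points


def live_mountpoints(mounts_text, ignore_types=PSEUDO_TYPES):
    """Mountpoints currently mounted, minus pseudo-filesystems."""
    points = set()
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[2] not in ignore_types:
            points.add(fields[1])
    return points


def boot_drift(fstab_text, mounts_text):
    """Compare what is mounted now with what a reboot would mount."""
    configured = fstab_mountpoints(fstab_text)
    live = live_mountpoints(mounts_text)
    return {
        "mounted_but_not_in_fstab": sorted(live - configured),
        "in_fstab_but_not_mounted": sorted(configured - live),
    }


if __name__ == "__main__":
    with open("/etc/fstab") as f, open("/proc/mounts") as m:
        print(boot_drift(f.read(), m.read()))
```

Anything in the first list will quietly vanish after a power failure, and anything in the second may hang or fail the next boot. It is no substitute for actually rebooting, but it catches the obvious surprises.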

Another reason to reboot occasionally is to clean up the running state of the machine. What’s that you say? Your machine is running fine? Well, sure, it may be, but how much cruft is left hanging that isn’t obvious? Have you ever used kill -9? Do you know for sure that there aren’t any memory leaks in your running services? Has any process hung while waiting on I/O, and is it now stuck in uninterruptible sleep?
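The uninterruptible sleep question, at least, is easy to check. This is a small sketch that assumes the Linux /proc/PID/stat layout, where the process state follows the parenthesized command name; the function names are mine. It is roughly what you would get by looking for "D" in the STAT column of ps.

```python
import os


def parse_stat(stat_text):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or parentheses, so split on the first "(" and the last ")".
    open_paren = stat_text.index("(")
    close_paren = stat_text.rindex(")")
    pid = int(stat_text[:open_paren])
    comm = stat_text[open_paren + 1:close_paren]
    state = stat_text[close_paren + 1:].split()[0]
    return pid, comm, state


def uninterruptible(proc_root="/proc"):
    """List (pid, comm) for every process currently in state D."""
    stuck = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError):
            continue  # the process exited while we were looking
        if state == "D":
            stuck.append((pid, comm))
    return stuck


if __name__ == "__main__":
    for pid, comm in uninterruptible():
        print(pid, comm)
```

A process that shows up once is probably just waiting on a disk. One that shows up every time you run this over several minutes is the kind of cruft only a reboot (or fixing the underlying I/O) will clear.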

Yes, there are lots of things that happen to servers over the course of doing their jobs. A reboot fixes many of them. The only argument against it is uptime.

I’ve written about uptime before, and I still feel the same way. Modern system administration has advanced beyond a single server providing a service. Uptime needs to be measured from the outside in, and according to the availability of the service, not the individual servers comprising that pool.
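To make the distinction concrete, here is a minimal sketch of measuring availability for the service rather than for the servers. It assumes you already have per-server health samples taken at the same intervals from outside the pool; the server names and the min_healthy parameter are hypothetical.

```python
def availability(samples):
    """Fraction of samples that were healthy (0.0 for an empty list)."""
    if not samples:
        return 0.0
    return sum(1 for ok in samples if ok) / len(samples)


def service_availability(per_server, min_healthy=1):
    """Availability of the pool as a whole.

    per_server maps a server name to a list of booleans, one per probe
    interval. The service counts as up in an interval when at least
    min_healthy servers answered.
    """
    intervals = zip(*per_server.values())
    service = [sum(1 for ok in interval if ok) >= min_healthy
               for interval in intervals]
    return availability(service)


if __name__ == "__main__":
    pool = {
        "web1": [True, False, True, True],   # rebooted during interval 2
        "web2": [True, True, False, True],   # rebooted during interval 3
    }
    for name, samples in pool.items():
        print(name, availability(samples))
    print("service", service_availability(pool))
```

In that example each server was only up 75% of the time, yet the service never went away, because the reboots were staggered. That second number is the one your users care about.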

Feel free to disagree. Let me know if you’ve got an uptime of a year plus and you’re proud of it, or if you would be ashamed to be in that position.

Edit
This entry is causing quite a stir on Reddit. Cxunix from twitter also weighed in on his blog, servermanaged.it (link is in Italian, English translation here).


Conference News (LISA and PICC and more!)

Mon, 08/30/2010 - 11:10am

This is apparently the “time to schedule your conference trips” part of the year, because there is news on the SysAdmin conference front.


First, and most pressing, the LISA10 conference schedule has been released! I’ve got to say, I’m digging the theme of the website, too. More important, though, is the content. Interestingly, all sessions and tutorials are available in half-day increments this year. This means that you can attend the first half of one session then migrate to another session after lunch. I’ve got mixed feelings about this, but I’m interested in how it will pan out. More flexibility is nice, though, and sometimes the first half of a session is really review (though there are a lot of arguments against that, too).

As always, there are discounts available for certain groups, and you do get a lower admission price if you’re a member of LOPSA, USENIX, or SAGE.

Check out the registration page for the fees. There’s an early-bird special going on until October 18th, so make sure you register soon. The return on investment for this conference is amazing.

I’m going to be there as a conference blogger, along with Matthew Sacks, Ben Cotton, and Marius Ducea. We’ll be publishing entries on the USENIX blog (which I’ll be linking to from here as well, of course).

Come to LISA and have a great time. And if you do decide to come, find me and say hello. I always love meeting readers.

Shifting gears a little bit, I’m sure you remember the PICC conference that LOPSA-NJ hosted. Well, we had a blast, and last year’s conference chair, William Bilancio, did an amazing job. It’s a bit much to do that twice in a row, though, so he was looking for someone to take the responsibility for this year’s conference, and after running it through my head a while, I decided that I’d take the job if he thought I’d do alright. Here’s his email announcing it:

It is with a great sigh of relief that Matt Simmons has decided to be
the Program Chair for PICC ‘11.

Last year Matt was the head of the marketing team and did a great job
at getting the word out about the conference and was a key person in
making last years conference a success.

Tom and I feel that he will do a great job as the Program Chair and
will make PICC ‘11 a great conference.

In other news I will be getting in contact with the hotel and get the
date locked in, in the next few weeks and then we can start really
working on the conference.

Please start thinking about sponsor ideas as well as any new people
you think will be able to help make PICC ‘11 another great conference.

Again thank you Matt for taking PICC ‘11 Program Chair job and good luck.

William

I want to thank William and everyone who was involved with last year’s conference. Everyone I’ve talked to had a great time and has been looking forward to this coming year. I’m going to work hard to try to improve on William’s example, and really grow the community of system administrators in New Jersey and the rest of the northeast. I’m going to need help, though, so if you helped out last year, I’ll be calling on you now. If you weren’t involved last year, now is a great time. Drop me an email or comment on this story to let me know that you’re interested in volunteering. We can definitely use the help.

In addition, I was talking to Lee Damon, who let me know about a SysAdmin conference called “Cascadia IT Conference” (aka “CasITConf”), and it’s happening in the Pacific Northwest. It’s being put on by SASAG, the Seattle-Area System Administrators’ Guild.

So there you go. Three sysadmin conferences in one post. It’s going to be a busy year for everyone, so get involved and lend a hand to someone in your area!


On the road again…

Fri, 08/27/2010 - 7:06am


My datacenter migration (or renovation, as I’m referring to it) includes a fair amount of added virtualization. We’ll be maxing out the memory and processor power of three machines at each site, and those will act as a VMware HA cluster (we’re buying the vSphere Essentials Plus license kit for each site).

Of course, I’ve got to have some VMs to run. I could reinstall all of my machines using cobbler (which would invoke the gods of trial and error, not to mention incur Murphy’s Wrath), or I could convert the machines that already exist from physical to virtual (p2v). That second option sounds much less error prone.

That being said, converting a physical machine to a VM isn’t exactly a fast process. Hoping to get it done the weekend of the move would be foolish, so I need to get it done beforehand. That’s why I’m driving to Philadelphia today.

Last week, I threw a couple of terabyte SATA drives into a spare PowerEdge 1950 server, upped the RAM a bit, and installed a freshly minted copy of vSphere Hypervisor 4.1 (formerly known as ESXi). I’m trucking this machine down to our secondary data site today so that I can begin the p2v conversion process. I’ve got enough disk space that I won’t run out (I’m only putting the root partitions in the VMs, since all the data is stored on the SAN), and I don’t need to actually run the machines, so RAM won’t be a problem. This will just be a holding tank until I get the VM hosts set up during the conversion weekend.

The actual conversion will be done using VMware Converter, a free tool by VMware that I’ve been really impressed with. It does want an ESXi…err..vSphere Hypervisor server to connect to, but that’s free too.

Once this is down there, I’ve got some decisions to make. Namely, I need to decide how long to wait until I do the conversion. Not a lot of data changes on the root partition. It’s going to be limited to logs, really (since I haven’t gotten a centralized syslog server running yet). The exception to this rule is the domain controller at that site. That needs to be the absolutely last machine I convert, and once I do it, I’ve got to turn off the source, because if the image becomes too far out of sync, well…that’s sort of like crossing the streams.

So, has anyone else pre-converted VMs like this in preparation for a move? Any advice or caveats to watch for?

Edit
Fixed the mistaken Ghostbusters quote. Did I seriously say “crossing the beams”? I am disappoint.


My take on DevOps

Thu, 08/26/2010 - 8:27am

Alright, several people have asked me why I haven’t weighed in on the current “devops” movement. Mostly it’s because no two people can absolutely agree on what DevOps is. I’m outside of that particular community, although I read a lot of the blogs of the key members, so maybe I’m in a good position to comment on my perspective.

First, let’s define DevOps. If you strip away all of the touchy-feely stuff that gets associated with the name, DevOps is, at its core, an increased interaction and interdependency between developers and operations staff, whether that operations staff is specifically system administrators or whatever.

This means that the people who develop code no longer have willful ignorance of operational environments, and the people who operate the environments can’t do so in a vacuum of knowledge about the software itself. This increased communication and reliance IS DevOps. That’s it. Nothing more. It’s a methodology. It’s not a panacea and it’s not for everyone. How can you tell if it’s for you?

Let’s answer some questions…

  1. Does your organization have programmers?
     Developers are necessary for the DevOps relationship…otherwise you’ve just got Ops.

  2. Do you provide Software as a Service?
     DevOps grew up in the web world, around places like Flickr, who provide applications over the web. Other people may just think of them like websites, but in actuality, they’re applications with incredibly large code bases. Since a solid application depends on well-developed code running in a known stable environment, it’s natural that this kind of biosphere would produce methods like DevOps.

  3. Do you release software updates frequently?
     If you’re in an environment where something is broken and gets fixed immediately, then you can say yes here, but it’s not just bug fixes. Features get rolled out, pulled in, and switched around. Agility of this nature isn’t possible without everyone working from the same playbook. It’s also not possible with an environment that can’t change rapidly to match the code.

If you’re one of the 90% of companies out there without that particular environment, then you probably aren’t using DevOps, and that’s fine, because there’s almost nothing it can do for you. Especially if you don’t have programmers. Because hey, no dev, right?

You’ll notice that nowhere in the preceding text did I mention the tools that DevOps uses. That’s because the tools are completely separate. Using “puppet” doesn’t mean you subscribe to the DevOps methods (or even the mentality), and although DevOps may not be necessary for your environment, you might find puppet extremely useful. Let me say that again: using the same tools as DevOps shops use does not tie you to the DevOps methodology.

As alluded to in the last answer up above, the shops that run DevOps need environments that can change quickly and absolutely. They needed tools that could do it, because you can’t manually change hundreds of application servers. Because of their need to change that many machines, and have it happen nearly instantaneously, tools to automate this kind of change were developed and implemented.

Other technologies that get lumped into DevOps, cloud computing and virtualization, are also natural off-shoots of the type of environment where you have hundreds of application servers. Of course that kind of environment is going to be heavily into virtualization (if they’ve got an existing large infrastructure) or cloud computing (if they don’t).

Again, DevOps doesn’t “own” these technologies. They just use them (and advance them by writing tools to improve them, in many cases).

So there, that’s my take. For the people who can use it, DevOps is developing into an exciting methodology to ensure increased availability and stability of IT resources.

It’s not for everyone, but you owe it to yourself to take a look at the tools that too many people have misbranded “DevOps”. There’s a lot of functionality there, and it can decrease the amount of time you spend slogging through administrative tasks.

Edit
It looks like I’m not the only one who’s been thinking about this. Benjamin Smith wrote his take as well, and it seems like we agree quite a bit.


Ohio Linux Fest is coming up in Columbus, OH!

Mon, 08/23/2010 - 10:24am


I have been to the Ohio Linux Festival once, four years ago. I had a really great time, met interesting people, and made plans to come back the next year. Then I got married the next year on the same weekend that OLF was being held. As much fun as I had at OLF06, I couldn’t really choose it over my own wedding (though frankly, I’m surprised some of our guests picked us over the show). The next year, they had the nerve to schedule it on my anniversary. Jeez!

This year, though, it’s scheduled earlier in September (from the 10th to the 12th), which means I can go! Except that I can’t. I’ve had other stuff scheduled for that weekend for almost a year. Ugh!

On the other hand, you CAN go (and I’m jealous). The schedule looks great, and I’ll let you in on a little tip. It’s directly across the street from Barley’s, home to some of the best burgers and beer in the city. That alone might be worth the trip!

As for the Linux Fest itself, it’s free admission, but if you’re coming (and you are, right?), you should really consider some of the OLFU classes, which are available for a fee. OLFU is the Ohio LinuxFest University, and it’s a day of training put on by LOPSA, the League of Professional System Administrators. I’m a (too-often absentee) member of the committee that is responsible for the classes, and I want to tell you that I’m very excited to see some of the things we’ve got lined up.

The thing that I’m most thrilled about is a class called Datacenters: Planning, Expanding, and Migrating. Finally, a physical infrastructure class! Holy Cow! If I could make it, I would sign up for this class in a heartbeat. How many times have you needed to make changes to the infrastructure, and been told, “Sorry, we can’t have any downtime”? I’m doing a big migration soon myself, and I would love to be part of this class. This alone may be worth the trip.

A course that sounds intriguing is Black Magic: Linux Troubleshooting and System Administration. I’ve talked to the instructor, John Billings, on IRC, and he really knows his stuff. I’m hoping that there are some notes or slides from this class (or maybe you could write a guest blog / review of the class, and I could post it here).

There are a ton of other courses as well. Check out the course list and decide what you want to take. As always with these things, the hard part is narrowing it down.

So go to Linux Fest and have a good time for me. Make sure to bring back all kinds of stories and let me know how it goes. Oh, and LOPSA is looking for volunteers to help them man the booth there, so if you want to volunteer some time, comment on this entry (or drop me an email) and let me know. We appreciate any help we can get!


Great tips on server rack filling

Mon, 08/09/2010 - 10:19am

Greg Ferro at the Ethereal Mind blog has some great tips up today on filling a server rack. Definitely check it out. It’s great to see someone mentioning physical infrastructure!

If you’re interested in this stuff, way back in 2002, I wrote an entry on Racks and Rackmounting and a piece on Server Cable Management that you may enjoy.


Cobbler or just straight kickstart for VMware ESXi?

Mon, 08/09/2010 - 8:55am

I’m working on automating some installs that are going to happen during the infrastructure upgrade, and I need to decide what I want to use for automation.

I have used Kickstart before, and it’s essentially a single file that contains instructions for the Red Hat installer (although Debian is in on that action, too). The idea with Kickstart is that your “normal” installation (whether that be through DVD, USB key, PXE, or whatever) points to the kickstart file, and the installation proceeds according to those instructions.

Cobbler goes the extra steps and becomes the installation server, PXE/DHCP boot provider, etc etc, in addition to working with kickstart files. In fact, it can even do crazy kickstart templating. It certainly seems full featured, and I’ve heard people recommend it before.
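To give a feel for what kickstart templating means, here is a rough sketch that fills in per-host values in a kickstart file. It is not how Cobbler does it (Cobbler has its own template engine); this uses Python’s standard string.Template as a stand-in. The hostname, install URL, and password hash are made-up placeholders, and the exact set of directives varies between releases, so treat the file contents as illustrative only.

```python
from string import Template

# A deliberately small kickstart skeleton. $name markers are filled in per host.
KICKSTART = Template("""\
url --url=$install_url
lang en_US.UTF-8
keyboard us
network --bootproto=dhcp --hostname=$hostname
rootpw --iscrypted $rootpw_hash
timezone --utc $timezone
bootloader --location=mbr
clearpart --all --initlabel
autopart
reboot

%packages
@core
%end
""")


def render_kickstart(hostname, install_url, rootpw_hash,
                     timezone="America/New_York"):
    """Return kickstart text for one host; raises KeyError if a value is missing."""
    return KICKSTART.substitute(
        hostname=hostname,
        install_url=install_url,
        rootpw_hash=rootpw_hash,
        timezone=timezone,
    )


if __name__ == "__main__":
    # All three values below are hypothetical examples.
    print(render_kickstart(
        hostname="web01.example.com",
        install_url="http://mirror.example.com/os/x86_64",
        rootpw_hash="$1$examplesalt$examplehash",
    ))
```

One template plus a small table of per-host values replaces a directory full of nearly identical kickstart files, which is the main appeal whether or not an install server sits in front of it.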

One of the coolest things I’ve seen it be able to do is automate new virtual machines. As I understand it, you basically hit a button and a VM is created, powered on, and installed according to the kickstart templates. That’s slick.

Unfortunately, the best support for Koan (the Cobbler client) is on Qemu/KVM. The site mentions support for VMware Server, but that’s anathema. There doesn’t appear to be support for ESXi (certainly not 4.1, which was just released last month), but I was hoping for something more recent than a question on VMware Communities from 2007.

So I come to you. If you’ve got an ESXi infrastructure, do you automate rollouts? Am I just doing this wrong? I’m leaning toward manually spinning up machines and using Cobbler / Kickstart to perform the installs (maybe with customized boot media, in the case of just kickstart). What do you suggest?


Xenophobia and Elitism in the Community

Thu, 08/05/2010 - 9:59pm

The reason that I started this blog was to share what little information I had, learn from others, and build a community of system administrators who were interested in improving themselves and their peers.

It was for identical reasons that I joined the League of Professional System Administrators. It is very important to me that I contribute, in some way, to the betterment of our profession, and I’ve tried to do just that in every effort on this blog, as a member of LOPSA, and in my interactions with sysadmins of all types.

The way that I’ve found that seems to work best for me in understanding other people, and having them understand me, is to put myself in their place, and consider the situation from their perspective. Doing this requires humility, because it supposes that my way may not be the best or only way. This is difficult, because I have an ego and admitting I may not be right requires swallowing my pride. But I do it, because to work together for mutual improvement, it’s necessary to maintain an open mind.

Not everyone makes this effort, however. There is, and probably always will be, a minority of people who are xenophobic. That is, they are afraid of things and ideas that are different from theirs. As I was explaining to someone on twitter earlier, this xenophobia manifests itself in a complete antipathy toward methods and opinions which differ from their own. You’ve seen it. We’ve all seen it. You see it whenever someone decides an idea is wrong because it belongs to someone else, and if you challenge that stance, the person attacks you.

Let me be crystal clear. This condition is harmful. It’s harmful to the administrator who holds it, and to all of the systems that they deal with. Someone blindly refusing to implement the right solution because they’re prejudiced and superstitious is like a parent who refuses to inoculate their child. The solution doesn’t get applied because of irrational fear and mistrust.

This xenophobia, if left unchecked, advances to elitism. Let me say it again. Elitism is really just an advanced stage of xenophobia. Not only is your solution right, your solution is the best. In fact, anyone who doesn’t use your solution is inferior, obviously, and deserves derision, or at best, sympathy. You’re using $X? Oh, I’m sorry…

Elitism is what happens when your opinions are not only held for a long period of time, but encouraged by the people around you. If there aren’t dissenting voices, then obviously you’re not doing anything wrong, right? These kinds of questions fade away eventually into the assumption of correctness. By default, I’m right unless proven wrong. Younger members of the community see elders take these airs, but they mistake it for competency, which eventually produces more elders who feel the same wrong sense of entitlement, the same biases, the same assumptions, and the same elitism.

It’s very fortunate that not all, or even most, of the community is like this. In fact, even the ones who are aren’t usually this bad. As with all humans, we’re not black or white, we’re shades of grey. Even better, because we’re humans, eventually we can change, improve ourselves, and get over these petty biases which hold us back and weaken our communities.

I urge you. Take this as a charge to evaluate yourself for your biases, your own little pockets of xenophobia. They’re there, trust me. I have them too. Examine them, and make yourself aware of them, and then when you recognize the urge to respond with them, just stop and critically evaluate your position. You may be right, but it’s possible you’re wrong too. None of us will probably ever get rid of them entirely, but each one we eradicate will stop us from making the wrong decision at one point or another, or stop us from needlessly tearing down relationships that we worked so hard to build.

Take a moment to dig in deep and think of a couple biases you have, and that you should get rid of. It takes courage to admit that you have them, but it’s worth it, and by sharing it, you are forced to admit it, which will make it easier to get rid of.