Saturday, December 27, 2008

Solaris Virtualization

I'm doing quite a lot with virtualization at work and have recently been looking at virtualization of our Sun Sparc servers. Sun has a very good story here: it has positioned itself to cover the entire virtualization market, with some very good software. Sun's strategy, which in fairness is similar to that of many other vendors, e.g. VMware & Microsoft, has been to make the hypervisor or "VM player" free and to charge for the management software.

Sun's virtualization software encompasses:
  • LDoms
  • Zones/Containers
  • VirtualBox

LDoms are similar to VMs under VMware ESX.
Zones/Containers are similar to VMs on VMware Server.
VirtualBox is, for me, more intuitive than VMware Player/Workstation. At least in the current version.

There is a lot of information out there on the web about Sun's virtualization. The OpenSolaris and BigAdmin sites are good starting points.

Over the next few weeks I'll be creating a cookbook of procedures for using Sun's virtualization. Some of these I found on the Internet and am reproducing for my own benefit to have in a single place. Others are the result of my own perspiration.
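As a taster for the cookbook, here is roughly what creating and booting a basic zone looks like (a hedged sketch from memory: the zone name, zonepath, interface and address are all illustrative):

# zonecfg -z web01 'create; set zonepath=/zones/web01; add net; set physical=e1000g0; set address=192.168.0.50; end'
# zoneadm -z web01 install
# zoneadm -z web01 boot
# zlogin -C web01

The final zlogin attaches to the zone console so you can answer the first-boot configuration questions.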

Darik's Boot and Nuke

I came across Darik's Boot And Nuke (DBAN) a couple of months ago. I'm probably late to the party here, but I just wanted to say it is fantastic. It does exactly what it says on the tin!!

All I had to do was:
  • Download the iso image.
  • Burn the disk.
  • Put the disk in a PC.
  • Boot the PC.
  • Accept the defaults.

Result: a wiped PC

It fulfilled my requirement and best of all it was free.

Thursday, December 25, 2008

Linux Magazines

For the longest time I had preferred Linux Format. But the latest issue of Linux Magazine was really useful. At least for this month's edition I prefer it!

Linux Magazine introduced me to new software that I had previously not heard of. Also, it seems to have caught up with the latest distributions. Previously it always seemed to be two or three months behind the times, but this month it has Ubuntu 8.10 at the same time as Linux Format!

The review of virtual machine management software was timely. I'm heavily involved in virtualization of Linux, Windows and Solaris (both x86 & Sparc) systems, and I had been unaware of the software reviewed. As I've used UNIX for approx 25 years, the command-line nature of the most interesting tool, MLN, won't worry me. But for SysAdmins who have only ever used the Windows interface of VMware VirtualCenter Server or the Virtual Infrastructure client (the "Click Next to continue" generation - or, to be ruder, let's call them SysAdmin Lites!) it will probably seem quite daunting.

The review of the O'Reilly book "Python for Unix and Linux System Administrators" was interesting. Although why anyone would want to use Python, when Perl is not only available but frequently seems designed for the job, is a mystery to me. That's probably 13-ish years of prejudice coming to the fore!

The regular SysAdmin section is also usually very useful. This month's feature on Siege was particularly interesting. I'll be looking into whether I can use that.

The websites for both magazines closely reflect the characters of the magazines themselves. The site of Linux Format is flashier, but the content of Linux Magazine is more detailed. Which sort of reminds me of a report I heard about 15 or so years ago. Professors at a university examined the quality of the theses produced by their students. The best work from an academic point of view was produced by PC users. Mac users spent too much time choosing just the right font or font size, or slightly tweaking the layout, and so didn't have enough time for "thinking".

Wednesday, December 24, 2008

Software configuration management software

I've been using ClearCase for approx. 13 years. I've been there, done that, and got quite a few Atria, Pure Atria, Rational and IBM T-Shirts over the years. Not to mention a few conference rucksacks and even a pint glass! Just recently I was reading Linux Magazine and read about another new software configuration management (SCM) tool, PureCM. Which interested me because...

I was at an IBM seminar recently and saw some presentations about Rational Team Concert (RTC) for the first time. I was reminded of a bunch of presentations given 6, 7 or 8 years ago by Rational/IBM - my account manager at the time had told me that they were going to consolidate their tools so that there would only be a couple of tools left. Everything went quiet for such a long time that I thought the idea had been shelved, or that someone had realized that having half a dozen to a dozen point tools at a slightly lower price would generate more revenue than a couple of more expensive tools. So it does now look like ClearCase, ClearQuest, ReqPro and a few other tools have effectively been superseded by RTC. Although you can plug ClearCase, ClearQuest and ReqPro into RTC, the best reason I can think of for doing so is to ease the migration to the new tool.

So given that you are going to start again with a new tool, why not play the field and see what else is out there?

If you are going to pay for the tools, then the two SCM tools that I would consider in addition to Team Concert would be Accurev and PlasticSCM.

I've long wanted to put Accurev through its paces in anger. Its implementation of streams is far superior to that of ClearCase UCM, which can be considered some sort of half-*rsed after-thought, although UCM has gotten smoother recently. From friends who have used Accurev in anger, I've heard that it does enable multi-site use of a single branch without resorting to some of the nasty hacks that have become second nature to ClearCase old-timers. The resulting reduction in merges and regression-testing builds might well be sufficient reason to investigate a migration.

The feature of PlasticSCM that first caught my attention was the security model, which again is far superior to that of ClearCase. ClearCase, and DSEE before it, were designed by old-time UNIX engineers and so utilize the octal user/group/other builtin permissions. PlasticSCM utilizes ACLs, like you'll find in Active Directory, which can be applied to almost any item: elements; streams/branches; individual versions; labels; etc. So much more flexibility.

But looking at the documentation for PureCM, it looks like it might be an acceptable alternative. Given that the tool apparently runs on Apple's OS X, Linux, Solaris and Windows, it is interesting that the feature comparison they chose to make was with VSS. I suspect that Windows may have been their most successful platform to date. The changeset functionality seems similar to that of Perforce. The builtin issue tracker is similar to Trac. The plugins for CruiseControl and FinalBuilder (a new tool for me - one I'm going to have to take a look at) are also a step in the right direction, as RTC has built-in build management - actually rather sophisticated build management.

I do not have a conclusion. There isn't a prescription that all can take. Which is one reason there are so many SCM tools out there. This article is really another starting point for further blogs I'll be writing.

Saturday, December 20, 2008

Dimdim

Very impressive software. Shame about the name. I mean, you go to the CEO and tell him to use dimdim! In the whole Web 2.0 naming scheme of things, Dimdim is one of the oddest.

Anyhow, I downloaded the free VMware version of the software, which had been created for VM player, so I had to recreate the disk for ESX. There is very little configuration that needs to, or even can, be done. Just the name of an SMTP server, which is necessary to enable meeting invitations to be emailed out! Of course, you have to disappear deep into the directory structure to find the dimdim.properties file and <FX: Shock, Horror> open an editor to change the value. </FX>

My company's requirement for collaboration software, which Dimdim came very close to satisfying, will almost certainly be mirrored by many other companies. In the current economic climate, travel budgets are restricted, and the ability to "meet virtually" over the internet/internal WAN is regarded as valuable: a way to try and save money whilst still enabling staff in disparate geographies to communicate face-to-face.

How did Dimdim fail, then? Simply put, it was insufficiently configurable. Perhaps this is a fault of the free version; if it is, it isn't demonstrated elsewhere on their website. There is no integration with LDAP or any other naming service. So your CEO (or more likely his PA) has to remember the email addresses of everyone he wants to invite.

Another fault is perhaps the lack of adequate documentation.

The purpose of downloading the free version was to assess whether the company should consider the Enterprise version of the software. The inability to integrate Dimdim into the company's infrastructure really did for it.

Which in a way I guess also goes to prove the point made in this InfoWorld article. A couple of years ago, I was certain that "cloud computing"/"software as a service" was about to take over the IT world. Two years later, I can still see its potential to be a real game changer. However, the rate of change seems to have slowed considerably. Google's roll-out of new features seems to have slowed. Perhaps they are concentrating these days on reliability, availability, security, uptime and scalability. Which can be no bad thing.

From the point of view of the company I work with, there is a distrust of hosting critical systems externally. My company has a large Chinese subsidiary. Everything, all transfers between China and anywhere else, has to be approved by Trade & Compliance and IT Security. The Chinese R&D department is on a completely isolated network - no access to the Internet at all. Consequently, despite the attractions of some of the cloud computing applications available, my company would almost certainly not be able to deploy them. Dimdim included.

Thursday, December 18, 2008

Sendmail Relaying and Masquerading

The requirement was simple. Relay email from the CentOS web server in the DMZ back through the firewall to the European corporate SMTP server for onward relay out to external customers.

Simple, huh?
Pah!

The default sendmail configuration that comes with CentOS is pretty good. But whilst the firewall on the server would allow SMTP out, the firewall controlling the DMZ would only allow that SMTP traffic back through the internal-facing firewall to one specific internal SMTP server. Also, since the server is out in the DMZ, not only must the company firewall "whitelist" every allowed port on each server, but each server must also whitelist only the bare minimum of ports required to function properly. So this server is only listening on http, https, SMTP and SSH.
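For illustration only, a minimal iptables policy of that shape looks something like this (a sketch, not our actual ruleset):

iptables -P INPUT DROP
iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
# SSH, SMTP, http and https only
for port in 22 25 80 443; do
    iptables -A INPUT -p tcp --dport $port -j ACCEPT
done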

With all that in mind, I sent a test email:
# echo "Hello, World" | mailx -s "Test" me@company.com

And that worked.

However
# echo "Is there anybody out there?" | mailx -s "Test" me@gmail.com
didn't work.

Although I wasn't allowing DNS through the firewall, /etc/resolv.conf contained
search emea.company.com
nameserver 10.10.10.10

(All domain names and IPs are fictitious.)

Replacing the resolv.conf file with an empty file will cause the email to me@company.com to fail to relay; it will simply be queued locally. Sendmail is trying to use DNS to look up the MX records of the email recipients. As this server is in the DMZ and we employ a split-horizon DNS, this situation can't be resolved by just opening up port 53 on the server firewall to talk to the external-facing DNS server. This server isn't allowed to send email directly to the internet, and it wouldn't be able to relay email to the company's main SMTP servers as they are in a different DMZ and the network routing between the two DMZs is internal.

The DNS lookup needs to be turned off. Reading the documentation, you might think that just defining a SMART_HOST in the sendmail.mc, regenerating sendmail.cf and restarting the sendmail service would be sufficient. But it is not. DNS would still rear its ugly head.

In addition to adding
define(`SMART_HOST',`mailhost.emea.company.com') dnl
to sendmail.mc (and adding an entry for mailhost into /etc/hosts), it is also necessary to add
FEATURE(`accept_unresolvable_domains')dnl
FEATURE(`nocanonify')dnl

These two directives tell sendmail to accept email for domains that it cannot resolve and not to canonify the email addresses provided.

Command to generate the sendmail.cf file

m4 /etc/mail/sendmail.mc > /etc/mail/sendmail.cf

Command to restart Linux Sendmail service

service sendmail restart

Debugging

It is very useful to increase the log level temporarily for debugging purposes. This can be changed in sendmail.mc by changing the value of the following definition
define(`confLOG_LEVEL', `15')dnl
The default value is 9. The documentation lists 15 as the maximum useful for administration, with values from 16 up to 99 being of interest only to developers.

The logfile location is /var/log/maillog
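With the log level raised, a quick smoke test from the shell shows whether the relay is behaving (the addresses are the fictitious ones from above):

# echo "Is there anybody out there?" | mailx -s "Relay test" me@gmail.com
# mailq
# tail -f /var/log/maillog

The queue shown by mailq should drain once the smart host accepts the message, and the maillog entry should show stat=Sent against relay=mailhost.emea.company.com.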

Masquerading

There was an additional problem. An upstream SMTP server at our data centre provider was performing a reverse lookup on the originating relay server. Our SLA with the external company only allowed us to utilize specific sub-domains, and emea.company.com wasn't one of them. It was necessary to configure masquerading, too.

The following settings were added to sendmail.mc:

FEATURE(always_add_domain)dnl
MASQUERADE_AS(`company.com')dnl
MASQUERADE_DOMAIN(`company.com')dnl
FEATURE(masquerade_envelope)dnl
FEATURE(masquerade_entire_domain)dnl
FEATURE(`allmasquerade')dnl


The following feature was also commented out.
dnl EXPOSED_USER(`root')dnl
I was logged in as root when testing! D'Oh!

Resources

The following link provides a good description of sendmail on CentOS 5, but you really have to know a little bit about what you are doing first, otherwise it is confusing: linuxtopia
Another closely related link.

sendmail.org is also a good source of detail, especially on what all those options/FEATURES in the sendmail.mc file are for, and for Masquerading & Relaying.


An excellent HP website on how Sendmail works.

Sendmail nullclient configuration on CentOS v5.2

Sendmail is the work of the devil.

Here, however, is how to set up a nullclient, which will enable all mail from a server to be forwarded to a central mail hub.

[root@server1 mail]# rpm -qa | grep sendmail
sendmail-cf-8.13.8-2
sendmail-8.13.8-2
[root@server1 mail]# cat /etc/mail/sendmail.mc
divert(-1)dnl
dnl #
dnl # This is the sendmail macro config file for m4. If you make changes to
dnl # /etc/mail/sendmail.mc, you will need to regenerate the
dnl # /etc/mail/sendmail.cf file by confirming that the sendmail-cf package is
dnl # installed and then performing a
dnl #
dnl # make -C /etc/mail
dnl #
include(`/usr/share/sendmail-cf/m4/cf.m4')dnl
VERSIONID(`Nullclient for Linux')dnl
OSTYPE(`linux')dnl
DOMAIN(`generic')dnl
FEATURE(`nullclient',`example.com')dnl
undefine(`ALIAS_FILE')dnl

[root@server1 mail]# cat /etc/hosts
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1 localhost.localdomain localhost
192.168.0.1 server1.example.com server1

[root@server1 mail]# make -C /etc/mail
make: Entering directory `/etc/mail'
make: Leaving directory `/etc/mail'
[root@server1 mail]# service sendmail restart
Shutting down sm-client: [ OK ]
Shutting down sendmail: [ OK ]
Starting sendmail: [ OK ]
Starting sm-client: [ OK ]
[root@server1 mail]#

Obviously, you could always add those lines into a file called something like null.mc and then create your sendmail.cf file with a command line like:

[root@server1 mail]# m4 null.mc > sendmail.cf


Just discovered that much of this is covered over at faqs.org.

Vista as a Virus #1

Among many other duties and responsibilities, I am also a Domain Admin of my company's Active Directory. Despite having a normal user account, I must confess to frequently logging into my desktop with my Domain Admin account. On one such occasion, I was trying to track down a DNS issue that our Sydney office was suffering, when I realised that I needed to flush my local DNS resolver cache. Pretty straightforward? Just open a Command Prompt:

C:\>"ipconfig /flushdns"

The requested operation requires elevation

C:\>

Oh! That didn't work! What the heck is "elevation"? Other than sounding like a U2 song!

Well, having googled around and found this thread on a Microsoft site, it appears that as an Active Directory Domain Admin I was insufficiently privileged on my desktop to perform that operation from a Command Prompt!

To be able to perform that sort of operation in a Command Prompt I should have started the Command Prompt with "Run as Administrator".

Some may argue that this is merely improving security, but I would not be one of them. Since then, I started up FileZilla, which informed me there was an update available and asked whether I wanted to install it. I said yes. FileZilla downloaded the file successfully, and then failed. Guess what! Actually running the install program was an operation which required elevation. Grrr!

Tuesday, December 16, 2008

RVTools

I've just used RVTools for the first time.

What an absolutely excellent tool. It isn't graphical, but its tabular presentation reveals information that I would otherwise have had to drill down into each VM's data to find. I thoroughly recommend it to anyone using VMware ESX.

And best of all, it's free!

Wednesday, December 3, 2008

Suppliers' Websites

If ever an application crashes on Windows, I never hit the button to send information about it to Microsoft. I guess I was conditioned in the futility of attempting to engage Microsoft Support over 15 years ago. Ever since, I haven't bothered with them. There is only so much hitting your head against a brick wall that is good for you, after all. That said, in this age of the Internet, the resources provided on Microsoft's website are pretty good. Even the things that I might want to see might be there. If only I could find them.

I remember, possibly 10 years ago, a colleague slamming the phone receiver down in frustration after talking to IBM when trying to get a licence for some software we had purchased. I took over and finally got a licence. I wasn't completely sure it was "our" licence, but it was a licence, it worked, and we were able to move on. Even after its recent re-vamp - actually I'm sure that there is a continuous process in play here - IBM's website is still damn difficult to find what you really want on. Try using the IBM search for the BIOS update for a specific server, e.g. an x346. The results will list just about every IBM server.

The solution?

Just use Google. We all know it makes sense. I just wish for a higher signal-to-noise ratio. But no matter how bad it is, it's still better than trying to use these two vendors' own search engines.

Tuesday, December 2, 2008

VMware VDM Agent - Access is Denied

When trying to RDP onto the VM that had been set up, the RDP screen comes up and then a box with a red cross saying "VMware VDM Agent - Access is Denied".

What's Up?

By default, VDM 2.1 blocks non-VDM RDP connections. This can be disabled by Group Policy or a registry setting on the VMs.

The group policy file is included on the VDM connection server install under the ADM subfolder.

The registry value that should be set is "AllowDirectRDP"="true", which can live in either of the following two locations:
HKEY_LOCAL_MACHINE\SOFTWARE\Policies\VMware, Inc\VMware VDM\Agent\Configuration
or
HKEY_LOCAL_MACHINE\SOFTWARE\VMware, Inc\VMware VDM\Agent\Configuration

If there is no VMware, Inc tree under Policies, or if the value does not exist under the latter subtree, just create it under the latter tree.
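If you prefer the command line to regedit, something like the following should create the value (all one line; the data is the string "true", as above):

C:\>reg add "HKLM\SOFTWARE\VMware, Inc\VMware VDM\Agent\Configuration" /v AllowDirectRDP /t REG_SZ /d true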

Monday, September 1, 2008

Deskilling

13 or so years ago, when I first started at my current company, I would cheerfully jumpstart a server to an appropriate Solaris OS revision, download the source of an application, build it, install it and configure it as needed.

Then sites like sunfreeware came online. And the order of business became, jumpstart the server, download the application, install it, and then configure as needed.

Recently, it has been a case of "I wonder if there is a VM to do that?" Let's download that and configure it.

Saturday, August 16, 2008

Modifying NIS+ entries

A lot of web pages will provide you with the man page for nistbladm or tell you how to add entries into NIS+ tables. None that I've come across provides examples of how to modify entries in NIS+ tables.

First, know the format of the tables you're dealing with:
# niscat -o ethers.org_dir
Object Name : "ethers"
Directory : "org_dir.example.com."
Owner : "server1.example.com."
Group : "admin.example.com."
Access Rights : r---rmcdrmcdr---
Time to Live : 12:0:0
Creation Time : Mon Sep 9 18:09:55 1996
Mod. Time : Fri Sep 10 17:16:44 1999
Object Type : TABLE
Table Type : ethers_tbl
Number of Columns : 3
Character Separator :
Search Path :
Columns :
[0] Name : addr
Attributes : (SEARCHABLE, TEXTUAL DATA, CASE INSENSITIVE)
Access Rights : ----------------
[1] Name : name
Attributes : (SEARCHABLE, TEXTUAL DATA, CASE INSENSITIVE)
Access Rights : ----------------
[2] Name : comment
Attributes : (TEXTUAL DATA)
Access Rights : ----------------
#

Find the entry you wish to modify:
# niscat ethers.org_dir | grep client2
8:0:20:a5:e:f client2
#
or
# nisgrep name=client2 ethers.org_dir
8:0:20:a5:e:f client2
#
or
# nismatch name=client2 ethers.org_dir
8:0:20:a5:e:f client2
#

Now modify the table entry:
# nistbladm -m name=client1 '[addr=8:0:20:a5:e:f]',ethers.org_dir
#

Force an update:
# /usr/lib/nis/nisping org_dir
Pinging replicas serving "directory org_dir.example.com." :
Master server is "server1.example.com."
Last update occurred at Tue Aug 12 10:51:30 2008

Replica server is "server2.example.com."
Last Update seen was Tue Aug 12 10:47:40 2008

Pinging ... "server2.example.com."
#

Check that the update worked:
# niscat ethers.org_dir | grep client1
8:0:20:a5:e:f client1
#

Thursday, August 14, 2008

Software Patents

Patents can protect a company's ability to be in or stay in a market. They can also keep out a whole bunch of new starter competitors, which might be equally important.

I've recently come across a software patent applied for and granted to a former work-colleague. Not from my current company, I hasten to add. No names. No pack drill!

I read the patent application. And I've thought about it for a while.

Was it obvious? Completely!

Is there oodles of prior art? I'm sure of it!

Is it enforceable? Given the previous two answers, almost certainly not!

Are software patents a complete waste of everyone's time? I think they probably ought to be. Unfortunately, I suspect they will still be around for some time.

Error adding an ESX host to Virtual Center Server

A colleague set up a new ESX host we are going to host in our DMZ. He claimed to have followed the Work Instruction I created.

This was our first VMware server in the DMZ. I was expecting some problems because of the requirement to punch some holes in the firewall. The Server Configuration Guide is an excellent source of information on how to manage an ESX server through a firewall. It details the ports that need to be opened, the protocols those ports will be using, and the reasons why they need to be opened.

In the VI client attached to the VirtualCenter Server, I selected the Datacenter and selected Add Host from the menu. After resolving some firewall problems, of which more in a later blog, I entered the server name, admin id and password, checked the returned information and clicked through the next three pages.

Only to receive the pop-up error message "Failed to install the VirtualCenter Agent Service"!

I googled online for other people's experiences. Most people encounter this sort of problem after an upgrade. In those instances, the problems occur when the VirtualCenter agent service on the ESX host hasn't been upgraded for some reason.

There were some suggestions that this could occur if /tmp/vmware-root doesn't exist. It did on my server.

Others suggest that a simple restart of the mgmt-vmware service would resolve the problem. It didn't on my server.

Others again suggested restarting the vmware-vpxa service. My server had no such service. Aha!
rpm -qa | grep -i vpxa
returned nothing.

A chap called Rene has a blog where he describes how to perform a manual upgrade of the vmware-vpxa agent. Unfortunately, it is for an earlier version of VMware. The path referenced is slightly different on my VirtualCenter server. The version number of the file is distinctly different. The real scoop on manually upgrading/installing the vmware-vpxa service can be found in this thread from the VMware Communities website.

N.B. The correct path on my VirtualCenter server is C:\program files\vmware\infrastructure\virtualcenter server\upgrade. Check the bundleversion.xml file for the correct file to copy across to the server. For my VirtualCenter server v2.5 and ESX server v3.5 it was vpx-upgrade-esx-7-linux-64192.

So I sftp-ed the file to the server. I ran the shell script. The service still wasn't installed! The rpm was still not installed! So I tried to install the rpm from the command line with rpm -ivh. It failed again, but at least it provided me with a useful error.

The /opt partition did not have sufficient space left! Which was odd, as it should have been 1 GB in size. I checked. It was 24 MB. And already half used! Oh dear!

I created a /optn directory - there was plenty of space left on the / partition; tar-ed the contents of /opt to /optn; umount-ed /opt; renamed /optn to /opt; and commented out the /opt entry from the /etc/fstab file. I ran the install script again. Success! Huzzah!
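For the record, the sequence was roughly this (a sketch from memory: the tar pipe preserves ownership and permissions, and the rmdir of the now-empty mount point is needed before the rename; the final step is commenting out the old /opt line in /etc/fstab):

# mkdir /optn
# (cd /opt && tar cf - .) | (cd /optn && tar xf -)
# umount /opt
# rmdir /opt
# mv /optn /opt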

Well! That's that!

Tuesday, August 5, 2008

Strange ESX Console Behaviour

I usually use a web interface to our KVM system. It can be a little bit clunky with ESX Server, especially since v3.5, but generally does the job.

A colleague built a new ESX server out in our DMZ and we were having some problems accessing it. The firewall was being just a tad too restrictive. Strike that. It was being completely restrictive.

Anyhow, I was using the Web interface, but the ESX login seemed to be frozen. Alt-F1 & Alt-F11 worked, but nothing else.

I walked into the server room and used the console directly attached to the KVM system in there. Same problem!

I attached a keyboard and monitor directly to the server!! A complete PITA. Same Problem!
My colleague was beginning to get worried he'd built the server incorrectly.

I added a USB keyboard to the system. Same Problem!

Whilst pondering what to do, I randomly turned the Scroll, Num and Function Lock keys off and hit Enter. Eureka!! It worked!!

After some experimenting, it turns out that the Scroll Lock was the problem. Which is a slight issue when a double-click on Scroll Lock is the action that brings up the KVM server list!

Back at my desk, I googled for ESX and Scroll Lock and found two interesting posts. They both came from VMware's own community pages, with the more interesting being spot on! Although the other perhaps provides some explanation?!

Well! That's that!

Monday, August 4, 2008

Ehwotay!

After a refreshing few days off on holiday, it's back to the grindstone.

Just about the first thing that happened after I had sat down at my desk was that I received a phone call from a vendor we had spoken to just over a year ago. Tideway have an application dependency mapping product that looks like it might be quite useful in those situations where your development team has largely left and you're sitting looking at your infrastructure wondering what the heck all that smoke and mirrors is doing!

Tideway came in a year ago and spoke to us about their software then, but the requirement we thought we had wasn't really there on further inspection. Apparently their software has come on leaps and bounds and is even better and whizzier!

From the conversation, I think the biggest problem will be that it seems to overlap a great deal on systems that we already have: CMDB; performance monitoring (ish); etc. I’m not sure that it makes sense to consider linking them all, so it seems to me that if we were to use Tideway’s software correctly it means we’d have to consider giving up a number of pre-existing internal systems. And not using it properly would be a waste.

I guess a good question is: what is a CMDB? It seems to be one of those useful marketing terms used to imply a good thing, nebulous yet desirable. Such a wide diversity of tools claims to be or to have CMDB capability that the question is very valid.

You can always have a look at the wikipedia page on CMDB, although other than placing the blame for the term fairly and squarely at the feet of the UK's OGC, the most remarkable aspect of this page is the reference to the excellent, cynical but completely realistic IT Skeptic site. I agree with a lot that this site outlines. The writer obviously bears a number of scars from having implemented CMDBs and other aspects of ITIL.

Having attained ITIL Foundation certification a couple of years ago, I have queried in private how practical some aspects are. It is so reassuring to discover that others are doing so too.

So, having said all that, I claimed my company had a CMDB.

What do we actually have? In a lot of ways it is very similar to a number of the products out there. Essentially it is a home-brew system built as a Lotus Notes DB, in which every server should be listed, including:

General Info:

  • Name
  • Region
  • Country
  • Site
Equipment Inventory:
  • Manufacturer/Model
  • Server Serial
  • OS
  • Database software (if appropriate)
  • Application
Purchase Details:
  • PO Number
  • PO Cost
  • Purchase Date
  • Asset Management record Number
  • Supplier Details
  • Asset Number
  • OS Licence
  • Application licence
Server Details:
  • Manufacturer Type
  • System Number
  • BootProm rev
  • IP Address
  • RAM
  • Server Type
Sub-System details:
  • Type
  • Make
  • Model
  • Model No
  • S/N
Storage Configuration details:
  • Disks, i.e. number & size
  • Total Disk size
  • Disk Partitions
  • Disk Partition size
  • Config i.e. RAID 0, 1, 0+1, 1+0, 4, 5, 5E, 5EE, etc
  • Total Storage (GB) after Config
  • Drive No
  • Model
  • Drive Capacity
  • Drive Location
  • Drive Mount Point
Graphics Device
  • Make
  • Model
  • S/N
Operating System
  • Make
  • Type
  • Patches
Network Cards
  • Identifier
  • Make
  • Model
  • MAC Address
  • IP Address
  • IPX Addresses
Applications
  • Make
  • Name
  • Version
  • Type
Maintenance
  • Provider
  • Start
  • End
  • Agreement
Support Record
Any Additional Information
  • a free text area
  • space for attachments
  • Primary Contact Name
  • Escalation Contact
  • Support Queue
System Accounts - secured by a separate key
  • Name
  • Usage
  • Password

Author
Date
Status: Active/Retired
  • if retired, when

Of course, the majority of these fields are rarely documented!

What do we not have:
  • sufficient application configuration information
  • cross-server relationship information
  • cross-application relationship information
  • auto-discovery
  • ...
So it's really an asset database!

So it goes!

Friday, August 1, 2008

Integrating Wikis

I've recently created a Wiki for the whole company. However, a number of departments have already been "testing" Wikis.

I was quite nervous about this, about integrating the Wikis.

Unnecessarily nervous, though.

I was able to use Special Pages:Export Pages to export all the pages listed on the Special Pages:All Pages page. Although you have to type them all in.

However, some of those pages reference uploaded files. Jpegs actually. There doesn't appear to be a Special Page to export all uploaded files. I had to log on to the server and copy them onto another server from whence I was able to ftp them to my desktop. From my Desktop, I logged into the Company Wiki and uploaded the files one by one. Not fun, but the file upload mechanism is surprisingly well written. It remembers your previous directory - at least whilst you are logged in! If you mistakenly try to upload a file twice an error message is displayed with the text "A file with this name exists already, please check Image:Example24.jpg if you are not sure if you want to change it." And you have the choice of replacing it or choosing a new file. Some thought has been used here! Much kudos is deserved by the developers. Regrettably, you do not often see that much thought.

To import the pages I used the Special Pages:Import Pages page and uploaded the exported xml file from the "test" Wiki. It just loads the pages.

I checked out the uploaded pages. There were a couple of instances where uploaded images had slightly different names, as the references are case sensitive. It was trivial to change those and then everything looked perfect.
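In hindsight, Mediawiki's maintenance scripts could probably have saved the typing-in of page names, assuming shell access to both wikis (script names as of Mediawiki v1.12; the uploaded images would still need copying by hand):

On the departmental "test" wiki:
php maintenance/dumpBackup.php --current > testwiki-pages.xml
On the company wiki:
php maintenance/importDump.php testwiki-pages.xml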

Despite the ease with which this went, it could have been easier still:
  • the export pages page could list all the pages with tick-box selection boxes
  • the export pages function could also export referenced images
  • the import pages function could also upload the images with the correct case
Well! That's that!

Thursday, July 24, 2008

Aarrgghh! The sky is falling!!

We had an old Windows 2000 Active Directory domain - formerly an NT 4.0 domain. It had been limping on for quite a while past its sell by date.

Finally something had to go.
And it did.
Big time!

We had only kept it for a bunch of developers who had been very resistant to change. Through it they accessed ClearCase VOBs resident on a Solaris server. We were lucky we had this architecture.

The Domain Controllers stopped replicating with each other. And nothing, no how was going to get them back to being happy with each other. Perhaps it sounds like I'm making light of the situation, but a couple of days ago everything seemed like a source of stress.

Because it was only a small group using this domain, we had a solution that could be quite quickly and easily rolled out.

Essentially, these developers stopped logging into the domain and started using local accounts on their PCs. This is how we set things up.

For each developer's PCs:
  1. create a local user for clearcase_albd

  2. create a local clearcase group

  3. add clearcase_albd to clearcase group

  4. create a local user for the engineer

  5. create a local group for the engineer to match their UNIX group

  6. change the Atria Location Broker service to use local clearcase_albd account

  7. edit the HKEY_LOCAL_MACHINE -> SOFTWARE -> Atria -> ClearCase -> CurrentVersion -> ClearCaseGroupName registry value to point to the local clearcase group

  8. logon as clearcase_albd and set CLEARCASE_PRIMARY_GROUP EV to clearcase

  9. logon using engineer's local user and set CLEARCASE_PRIMARY_GROUP EV to the new local group matching the UNIX group

  10. Load client for NFS from SFU v3.5

  11. Configure client for NFS to map local user to UNIX user and to mount the VOB storage partition automatically.

  12. Create new views or fix_prot the old views.

As views are meant to be temporary structures, even where views were migrated with fix_prot, those views were only actually used to check objects in and then removed. New views were created for on-going work.
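For anyone following along, a fix_prot run is along these lines (hedged: the exact options vary by ClearCase release, so check fix_prot -help first; the account, group and view storage path here are illustrative):

C:\>cd "C:\Program Files\Rational\ClearCase\etc\utils"
C:\>fix_prot -force -r -chown DEVPC1\jbloggs -chgrp DEVPC1\devgroup C:\views\jbloggs.vws
C:\>fix_prot -force -root -chown DEVPC1\jbloggs -chgrp DEVPC1\devgroup C:\views\jbloggs.vws

The first pass re-protects the view storage contents recursively; the second fixes the storage directory root itself.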

Longer term, this team is moving into the Windows 2003 Active Directory Domain that is used by the rest of the development teams.

Monday, July 21, 2008

ClearCase replica naming and replication

Over the years I've seen various conventions used for the naming of ClearCase VOB replicas. Some of the more popular, especially from when I started at my current company some 12 or so years ago, were:
  • <site name>
  • <division name>
  • <division name>_<site name>

Recently, the admins at a remote site I've been working with have adopted a naming convention for their VOB replicas of <vob tag>_<site name>! I couldn't work out why you would want to include the vob tag as part of a replica name until I realized that they were using the supplied ClearCase MultiSite replication scripts.

Eleven or twelve years ago, when using ClearCase v2.2 or possibly v3.0.1, I had written my own Perl script to manage replication. I used an earlier shell script as a starting point. Over the next 4 or 5 years I slowly modified the script to take advantage of new features, i.e. [ls|ch]replica -actual. I haven't touched it since.

This script automatically produced a filename: sync_<vob tag>_<source replica>_<target replica>_<Time stamp>
e.g. if a VOB had a vob tag of /vobs/src and replicas of Austin and Healey, then a packet sent from Austin to Healey might have a packet name of sync_vobs_src_Austin_Healey_21072008-101010, with a suffix _<num> if more than one packet was created.
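With today's multitool you could get the same effect by naming the packet yourself (a sketch; replica selector syntax quoted from memory, and the packet path is illustrative):

# multitool syncreplica -export -out /var/tmp/sync_vobs_src_Austin_Healey_21072008-101010 replica:Healey@/vobs/src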

But to get the vob tag to appear with the supplied scripts, you need to include the vob tag as part of the replica names. Even then, you only see the source replica listed, not the target.

Tuesday, July 15, 2008

Changing the Domain Name of a WordPress Mu Site

I downloaded the Multi-Site Manager plugin from http://wpmudev.org/project/Multi-Site-Manager

This plugin makes it incredibly easy to create a new site. You can clone the original site to a new site, and then transfer blogs between those sites.

Unfortunately, in my experience, it didn't quite work perfectly. Probably a rookie mistake on my part. It is probably best not to transfer the primary blog from the original site. That caused some very strange behaviour from WordPress.

Anyway, I ended up hacking through the MySQL database, manually editing all the URL references.

Aarrgghh!!

This was made worse by the extra half dozen or more users who had appeared overnight. Each user causes an additional 8 tables to be created in the MySQL database. Truly, everyone has their own playpen!

137 tables later all the references had been updated.
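The edits were all variations on this theme (a hedged sketch: WordPress Mu keeps per-blog tables such as wp_N_options, so the same UPDATE has to be repeated for each blog, and the database and domain names here are illustrative):

mysql -u root -p wordpress <<'SQL'
UPDATE wp_options SET option_value =
  REPLACE(option_value, 'blogs.oldcompany.com', 'blogs.newcompany.com')
  WHERE option_name IN ('siteurl', 'home');
UPDATE wp_site SET domain = 'blogs.newcompany.com' WHERE domain = 'blogs.oldcompany.com';
UPDATE wp_blogs SET domain = 'blogs.newcompany.com' WHERE domain = 'blogs.oldcompany.com';
SQL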

There was a problem with tags and categories, or more specifically permalinks. I found I had to go in and reselect the permalink options. As none of the existing users have complained about tags and categories, I'm presuming that they have successfully followed the instructions I emailed through to them or haven't yet re-logged in.


Monday, July 14, 2008

ldap services

One of the prime requirements of any service that is added within my company is authentication.

Within my company, there are two main sources of authentication:

  • Microsoft Active Directory
  • Lotus Notes ldap service

Now it is possible with varying degrees of difficulty to "persuade" most tools/services to use LDAP as an authentication source. However, there are assumptions written into most of these tools that if you are seeking to use LDAP you are either using OpenLDAP (or similar) or Microsoft's Active Directory.

In some ways it is quite encouraging to see how many other people are looking to authenticate against Active Directory. In other ways, it is deeply depressing that with so many years head start, the various UNIX vendors couldn't agree upon a common naming services standard that would be an improvement upon Active Directory.

I suspect that some will point to Kerberos and LDAP themselves as collaborative triumphs, which Microsoft had to use within Active Directory itself. However, whilst those are compelling technologies, they are not themselves individually a compelling solution. Collectively, they can be induced to become a solution, but depending upon the implementor, they may not be a compelling solution.

Restore Vizioncore vRanger backups via the OS

The Development VMware ESX server at my company is an IBM 366 with a SCSI attached EXP400 external disk pack. This system arrived in the UK from a company site on the west coast of the US via a stop at the company HQ on the east coast of the US. To say that the hardware had been shunted from pillar to post would be a minor understatement.

We are using Vizioncore vRanger to back up the VMs on both this Development box and the Production VMware ESX server.

Just recently, the RAID 5 array on the EXP400 dropped 6 disks (out of 9)! Why it did this is a different story. Here I'll recount how we recovered from this.

There were a couple of VMs we had to get back online quickly. No problem, we had vRanger backing them up.

Ah! Now, well, there was a problem. The backups were very successful, but for some reason we're still investigating, vRanger refused point blank to recover to the Production system.

However, we had access to the backup via the Windows OS. A quick google discovered this thread on the Vizioncore support site.

Restoring a VMware machine from the '.tvzc' files of vRanger:
1.) Download FileZipper : http://www.vizioncore.com/Downloads/ProductSupport/vcbrestore.zip
2.) Download BSDTar: http://www.vizioncore.com/Downloads/ProductSupport/bsdtar.zip
3.) Install BSDTar (the zipfile contains an installable .exe)
4.) Extract the desired files :
FileZipper.exe -D -I "filename.tvzc" -O - | "c:\Program Files\GnuWin32\bin\bsdtar.exe" xzvf -
5.) Ensure file permissions are correct - I have cygwin installed on my PC, which can be invaluable!
6.) Remove the .vzsnp extension from the end of the files.
7) Within Virtual Center Server:
  • use Browse datastore on the Production Server's storage
  • Upload the restored files to the Production ESX server
  • select the .vmx file and Add to Inventory
8.) Start up the VM.

Bob's your parental sibling, of the usually male variety.


Thursday, July 10, 2008

Blogging needs Wordpress not Wiki kludges

So our COO decided that we were going to provide a blogging solution in addition to a wiki. The "My blog" add-on to Mediawiki doesn't really cut it, although as a quick and dirty workaround it has a place.

I downloaded WordPress as a Jumpbox appliance. Quick, easy, restrictive. For a small company, it would be a really good solution. For a larger company with an infrastructure to tie into, it is lacking. However, I'm really only talking about the free download version. I briefly considered registering the appliance, but didn't want any delay. So perhaps I am being slightly unfair. But hey, it's my blog!

I downloaded v2.5.1 of the WordPress application, created a CentOS v5.2 Linux VM configured as a web & MySQL server and rolled my own! As a standalone application that you can install plugins into, it's pretty straightforward and looks pretty good too.

I needed the Ldap plugin to enable integration with the Company's Lotus Notes LDAP service. This was actually a bit tricky to set up. I remember it taking a number of hours to accomplish. Events since have wiped out quite a bit of my recollection of the event. It was quite cool after I had configured everything properly, though.

At this point, I realized that what the COO really wanted wasn't a single blog, but the ability for many VPs to have a blog.

Back to the drawing board?

Not completely. At this point, I downloaded v1.5.1 of the WordPress MultiUser software. A default installation is just as simple as the single user version of the application.

Again I needed the Ldap plugin to enable integration with the Company's Lotus Notes LDAP service. This was actually very tricky to set up.

If you follow the above links to the Ldap plugins you'll discover that they are completely different. The wpmu-ldap plugin is different from the WordPress ldap plugin, written by different people.

The writer of the ldap plugin for WordPress MU has a blog here where he announced the release of the latest version. The mailing lists Aaron refers to at the bottom of his blog are an invaluable source of information, because to say the documentation is sparse is like saying that I'm an overweight bearded slaphead, i.e. a completely accurate and unbiased statement of fact.


Things I discovered whilst deploying WordPress MU and the ldap plugin are:
All the ldap files have to be owned by the httpd/apache/web server process owner. Otherwise the plugin isn't even seen. This is a file permissions problem, so not serious, but it can take an embarrassingly long time to track down. Or at least it did for me.

If, after the WPMU ldap plug-in is enabled, one of the files is edited by the root user and becomes owned by root, then the result is the infamous "White Screen of Death". Again, not something I immediately recognized. It took a seemingly endless hour to work it out!

The most obvious difference is that the single-user WordPress plugin lets you specify the attribute to filter against, whereas the multiuser plugin lets you choose between Linux LDAP and Windows LDAP. Now, the wpmu-ldap plugin maps Linux to uid and Windows to sAMAccountName. I was authenticating against Lotus Notes and needed cn! My only immediate option was to hack the source code.

Now it's working, it's quite cool, but I did pick up some scars and a few more white hairs.

phpBB installation

This is yet another case where my mileage hasn't actually varied. But I had to write about the phpBB installation as it is just so damn slick.

It must be roughly 6 or 7 years ago that I first installed and configured a phpBB site on a Solaris 8 server with MySQL v3.23.42, Apache v1.3.26 and php v4.0.6. Even then the install was pretty good, although it left enough techie stuff to be done that you felt you'd undertaken a "real man's job"! After all, I had to compile the Apache, MySQL and PHP distributions.

In this case, I created:
  • a new VM on my development ESX server,
  • loaded up CentOS v5.2 configured as a Web & MySQL server
  • loaded up some additional php libraries
  • started the httpd & mysqld services
  • downloaded and installed the latest phpMyAdmin
  • created a DB
  • created a DB User with appropriate privileges (see the sketch after this list)
  • started and finished the phpBB configuration very quickly
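The DB and DB user steps, for instance, boil down to a few lines fed to the mysql client (names and password illustrative; this is the MySQL v5.0-era GRANT syntax):

mysql -u root -p <<'SQL'
CREATE DATABASE phpbb;
GRANT ALL PRIVILEGES ON phpbb.* TO 'phpbb_user'@'localhost' IDENTIFIED BY 'ChangeMe';
FLUSH PRIVILEGES;
SQL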
The phpBB configuration is performed via a web page. It recognises the current state of the installation and just steps you through it. When it has acquired all the relevant configuration details, it creates the tables in the database and sets up the initial admin account. And then you are in a position to start using the system.

Bob's your parental sibling, usually of the male variety!


This does slightly simplify the process, but only in terms of creating/deploying a new machine. I had to add a new server into the company QIP (now called VitalQIP) system and push that out to DNS.

After deploying a server, there's also planning that has to be undertaken for administration, usage policies, backup and restoration for the system.

Tuesday, July 8, 2008

VMware Training

After 18 months of using ESX, starting with v2.5.4 and upgrading through v3.0.2 to v3.5, where the only training I'd had was to watch the training DVD "Virtualize it With: VMware ESX Server 3.0" from the Elias Khnaser company, I've finally taken the VMware training course "VMware Infrastructure 3: Install and Configure".

After such a long time, was it worth it?

Absolutely!

Didn't I know most of it?

Yes, perhaps 85%. I must admit to gaining a certain satisfaction at realizing how much I already knew.

But then again, it's hard to really know DRS and HA from the manuals and training DVD when your environment is two sites, each with a development (sort of - some of those VMs seem to be in production!) server and a single production server. So four ESX servers in all! I have more servers coming, so the timing made sense. I'll soon be running my production servers in a DRS cluster on each site. I'll be considering making them HA clusters too, although that isn't so clear cut.

The ability to discuss issues and ask questions in a class-room environment, where the answer can be "I do not know, but lets just try that..." and there is no fall out in terms of production system downtime, can be really useful.

Also you pick up the odd bit of wisdom, such as: soak testing your memory beyond the checks performed at BIOS startup is well worth the effort. Unless you are under-utilizing your server, ESX will exercise all your RAM in a way that most other OSes simply will not. So a memory fault that might never have been discovered by another OS can be exposed in very short order. Much better to perform a thorough soak test before deployment than have your VMs do it.


I suppose the other reason for undertaking the training is that it is a requirement for the VCP exam.

Tricksy, that!

Although in VMware's position I would have done the same.

I have taken the VCP mock exam of 20 questions and passed with 80%. I'm still really annoyed by the 4 questions that I answered incorrectly. In my defence, two of the questions weren't particularly practical, but in the actual exam that won't cut it.

Monday, July 7, 2008

Vizioncore vRanger Configuration Take 2

You should be thinking about backup even before you start creating Virtual Machines. This is perhaps obvious, although it is still possible to defer that decision by using traditional "in VM" solutions.

One of the features of ESX is that you can have a display name for a VM in the GUI which bears no relation to the names of the files. Now, the names of a VM's files are taken at the time of creation from the XXX form. By default this is also the display name used within the GUI. The display name can be changed later. To change the filenames used requires that the VM is stopped and all the files and the directory used are modified. And modified correctly!
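"Modified correctly" being the operative phrase. A hedged sketch of the rename chore on the service console, with the VM powered off and removed from inventory first (datastore and names illustrative; vmkfstools -E renames a virtual disk together with its flat file):

cd /vmfs/volumes/datastore1
mv OldName NewName
cd NewName
vmkfstools -E OldName.vmdk NewName.vmdk
mv OldName.vmx NewName.vmx
sed -i 's/OldName/NewName/g' NewName.vmx

Then browse the datastore in the VI client and add NewName.vmx back into the inventory.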

Now it is possible to use a wide range of characters in the name of a VM, e.g. this is legal:


Legal, but not sensible.

Whilst ESX has no problem with filenames containing non-alphanumeric characters, both vRanger and VCB do. They will both fail to back up a VM with a name like that indicated. That may suggest something about how both utilities are architected, or perhaps the APIs they are utilizing. It doesn't matter. You have to deal with it.

When you create a VM, give it a sensible, simple but meaningful alphanumeric name. Afterwards you can choose rename from the right mouse button menu and change the display name to include whitespace and other characters.


Ensure that you have modified the System Resource Reservation parameters, which reserves resources for the backup process to utilize.


For each ESX Server:
On the Configuration -> System Resource Reservation -> Simple tab, set
CPU : 1500 MHz
Memory : 800 MB


The Simple setting equates to the host->system setting under the Advanced tab!

After changing these settings, it is necessary to reboot the ESX server before they take effect. Consequently, if you can, it is sensible to set this all up before you start serving Virtual Machines.

Then
  1. Verify that the ssh client service has been enabled on the ESX hosts to be backed up (see the sketch after this list).
  2. Enter all ESX hosts into Ranger by IP or FQDN.
  3. Create a backup user on the ESX hosts. N.B. root ssh access is required for vmfs --> vmfs backups & restores.
  4. To verify correct configuration, it is recommended that initial attempts should be undertaken using Ranger's legacy mode.
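Step 1 can be checked and fixed from the service console (as I remember the ESX 3.x esxcfg-firewall syntax; the service name may differ by release):

esxcfg-firewall -q sshClient
esxcfg-firewall -e sshClient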

Tuesday, July 1, 2008

CD-ROMs in VMware

In my experience, after you have added enough NICs to a VMware ESX machine to be useful, i.e. at least 6 to 8, you start to get into a situation whereby ESX is unable to identify a physical CD-ROM that might be attached.

At that point, whenever you try to start a Virtual Machine which is configured to try to attach to the physical CD-ROM, it will take an inordinate time to boot, essentially hanging on the way up. And even after the machine has fully booted, it seems as though the VM is only getting 5 to 10 seconds of CPU every 2 minutes or so.

This can be immensely frustrating.

I know. Before I worked out what was happening, I became quite impatient.

The fix is quite straightforward. Simply change the CD-ROM over to a client device. More often than not, that is the most useful setting. It is still straightforward to map an iso to the drive as well, should you need that.

Sunday, June 29, 2008

Mediawiki extensions

As I have written elsewhere, a good deal of the power and pain of implementing a Wiki with Mediawiki is in the use and deployment of extensions.

At the moment I am suffering from the interaction of two extensions.

I am using the Ldap authentication extension to authenticate the users against the company's Microsoft Windows 2003 Active Directory. This is working very well and generated a great deal of kudos when it was deployed. Not actually single sign-on, but a small step in that direction for us. Actually, after slogging through the documentation, it wasn't so difficult to install and configure. Notwithstanding anything else I may have written elsewhere on this blog.

I am using the "My Blog" extension to allow the users to create simple blog pages. The blog pages are simply ordinary wiki pages, which are aggregated very much in the manner of templates.

Although users log in successfully, they receive a message suggesting that they have not, because their cookie settings are supposedly incorrect, and they are presented with another login box. However, if they just navigate away from the login page, everything is fine. But everyone had to be told that they could.


OK, I was wrong above. After further investigation, it appears that the problem originates in the PasswordProtected extension. Having checked and re-checked the source code, it is far from clear exactly why it should be causing the problem. Luckily, having reviewed the functionality, it doesn't work in quite the way we'd like, so I was able to remove the extension from the LocalSettings.php file.

No sooner was I congratulating myself on the implementation of the Ldap authentication extension against 3 internal Microsoft Active Directory Domains, than I was asked to add authentication against the Ldap service of our Lotus Notes installation. Having already configured the MS AD Domain authentication, this was actually quite simple. In fact, having been able to compare the two, the ldap authentication against Lotus Notes is simpler than against MS AD Domains! I've re-ordered the Domain login list so that Lotus Notes is the first option, and left the MS AD domain login option for only a restricted group of users.

There was one thing that surprised me about the user management side of Mediawiki. And this may be a result of our using the Ldap authentication plugin, especially against multiple domains. When A N User from Domain A logged in, a local user called A n user was created. If later on A N User from Domain B logged in, then the Domain B user would be mapped to the same local A n user account as the Domain A user.

Now, in my environment, this is exactly what I want. In fact, if it didn't work that way, I'd probably have to be scrabbling through the source code to try and mangle the usernames to achieve that result.


Wednesday, June 25, 2008

Mediawiki VMware appliances

In general I really like downloading appliances from the VMware Community website. However, when I had to provision a Wiki for the entire company I quickly moved away from the appliances you can download.

The Jumpbox appliance is wonderful if you do not need or want anything further.

The rPath appliance is a bit more functional for an enterprise deployment. There is, after all, a console that can be logged into. However, the built-in OS upgrade didn't work. Via the website, it just hung. Via the command line, it ended up in an inconsistent state from which it could no longer be updated. D'Oh!

In the end I rolled my own from a CentOS Linux v5.1 Virtual Machine. I installed the latest MySQL, Apache, PHP (& loads of php libraries) and Mediawiki v1.12.0 - much further on than the v1.6.x version that the rPath appliance provided.

Part of the power and frustration of Mediawiki is the ability to extend the functionality using extensions.

I added a number of extensions to the installation:
  • Ldap Authentication (to the company MS 2003 AD)
  • ImageMap
  • WhosOnline
  • etc

However, I battled the Ldap Authentication extension for most of a day and the ImageMap extension for most of an afternoon. The slightest mistake in the LocalSettings.php configuration file for Mediawiki or a missing library or misspelled filename would cause everything to fall apart. In contrast, WhosOnline was a dream to install. Its special page just loads and just works. Huzzah!

Trying to make the result pretty, like the Wikipedia homepage, is a whole different story. There is still knowledge required of DHTML, CSS, etc.

Tuesday, June 24, 2008

The Initial Cost of VMware

I've written elsewhere abut the incremental cost of a VMware licence at the firm at which I work. But what about the initial cost?

There is the cost of:
  • the new hardware
  • training
  • VMware software
  • OS Licences
  • additional Application Licences
  • Administrator time!?!?
I try and use CentOS Linux for everything these days, but sometimes you need to use Windows. It takes just 5 to 10 minutes to create a new Windows VM from a template. However, just because you can doesn't mean you should. Each one of those Windows VMs will need a licence.

How do you factor in the cost of the mistakes you'll make with new technologies?

Some of these mistakes you will hope to avoid. Who makes mistakes? Or admits to them anyway?


Administrator time is a constant surely? I'm joking, but sometimes it does seem like there is a belief that a finite group can undertake infinite work.

There is also the opportunity cost: time your Administrators spend on the new technology is time not spent improving and optimizing your existing infrastructure.

Training!? Choosing the right training for a new technology first time out is difficult. For VMware, I'd recommend Elias Khnaser's Training DVDs. Having used VMware for nearly two years now, I can see there are holes in his coverage. However, he also covers some topics I've yet to need to get involved with.

And new hardware! Well, that wasn't a road we went down. To start with, our development box was an IBM 366 with an EXP400 that had been forklifted from site to site to site. The downside, which we've had to explore extensively, is that as it originated in California, IBM are unable to supply a replacement motherboard in this country - a 366 in this country has different part numbers! In the long run, completely new hardware might have been the cheaper option, simply in Administrator time. Whilst we were getting the server back up, we weren't doing other, more productive work!

So it goes.

Wednesday, June 18, 2008

Vizioncore vRanger configuration

Ensure that you have modified the System Resource Reservation parameters, which reserve resources for the backup process to utilize.


For each ESX Server:
On the Configuration -> System Resource Reservation -> Simple tab, set
CPU : 1500MHz
Memory : 800MB


The Simple setting equates to the host->system setting under the Advanced tab!

After changing these settings, it is necessary to reboot the ESX server, before they take effect. Consequently, it is a good idea to set this all up before you start serving Virtual Machines.

Then
  1. Verify that the ssh client service has been enabled on the ESX hosts to be backed up (see the sketch after this list).
  2. Enter all ESX hosts into Ranger by IP or FQDN.
  3. Create a backup user on the ESX hosts. N.B. root ssh access is required for vmfs --> vmfs backups & restores.
  4. To verify correct configuration, it is recommended that initial attempts should be undertaken using Ranger's legacy mode.
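For step 1, on ESX 3.x the outgoing ssh client is enabled through the service console firewall. A sketch, assuming the standard esxcfg-firewall service name:

# esxcfg-firewall -q sshClient
# esxcfg-firewall -e sshClient

The first command queries the current state; the second enables the outgoing ssh client.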

Tuesday, June 17, 2008

The Incremental Cost of a VMware Licence

A VMware Infrastructure 3 Enterprise licence for 2 processors costs £2948.00.
Gold Support for the same costs £619.46.

However, at least at my company, there is additional software used in the VMware deployment. So the incremental cost of a VMware licence isn't just £3567.46.

Veeam Reporter is used for Infrastructure Reporting; it costs $150, which includes the first year's support charge. An additional year's support is $25.

Vizioncore vCharter is used for Consolidated Performance Monitoring. vCharter costs £169.00 per CPU/year incl. 1 year's support.

VMware Consolidated Backup & Vizioncore vRanger Pro are used for Virtual Machine Backup. VCB is part and parcel of the VI 3 Enterprise licence. vRanger costs £279.00 per CPU.

All that comes to £523.00 (taking the $150 for Veeam Reporter as roughly £75 at the then-current exchange rate, plus the £169.00 and £279.00), making a grand total of £4090.46.

Obviously, specific prices are only correct at the time of going to press and, where foreign companies are involved, probably subject to the vagaries of exchange rate movements! However, the general point remains: your costs are not limited to the cost of the VMware licence alone.

N.B. all prices in this blog are quoted without VAT - UK Sales tax.

Monday, June 16, 2008

keyboard setting on rPath VMware appliances

rPath VMware appliances are set up for a US audience. Perhaps unsurprisingly.

So their keyboard settings are always for US keyboards. To change that for a UK keyboard, just change the contents of the /etc/sysconfig/keyboard file from

KEYBOARDTYPE="pc"
KEYTABLE="us"

to

KEYBOARDTYPE="pc"
KEYTABLE="uk"
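The new setting takes effect at the next reboot. To switch the console keymap immediately, the standard loadkeys utility should do it, assuming the kbd package is present on the appliance:

# loadkeys uk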

Friday, June 13, 2008

Lotus Notes install on CentOS Linux

Actually this is a case where my mileage didn't vary at all.

I needed to install IBM's Lotus Domino server software onto a Linux machine - for once these days real iron not a virtual machine.

Obviously, I started with the latest version of CentOS - v5.1. I had no choice on the version of Lotus Domino - it had to be v6.5.5 to match our Windows servers.

I ran into the problem of libstdc++-libc6.1.1.so.2 being missing, which I resolved by loading the compat-libstdc++-296.2.96 rpm and linking the missing library name to the later library provided by that rpm.
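A sketch of that fix - the exact library filename shipped by the compat rpm is best checked with rpm -ql rather than trusted to my memory:

# yum install -y compat-libstdc++-296
# rpm -ql compat-libstdc++-296
# ln -s /usr/lib/<library-listed-by-rpm> /usr/lib/libstdc++-libc6.1.1.so.2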

This still didn't resolve the problem. I still had a missing library: libXp.so! A quick yum install -y libXp resolved that. However, the Lotus Domino Java installation program now just hung trying to create a session.

I ran out of time. A swift look at the IBM website for the OS requirements for Lotus Domino installation and I conceded defeat. Next step: I downloaded CentOS v2.1!! That was another 20 to 30 minutes to install and then it was back to the Lotus Domino install.

The only alternative would have been to use version 8 of the Lotus Domino software. Something that just wasn't possible. Sigh!

Saturday, June 7, 2008

An additional thought about whitespace and LDAP

I will have to test it, but it could be that the problem I have with using

ldapclient -v manual \
-a credentialLevel=self \
-a authenticationMethod=sasl/gssapi \
...

is that all the users are in Active Directory OUs named things like "EMEA ENG" or "APR ENG".

As I reported earlier, I had a number of problems with whitespace in the ldapclient command line. It could be that this was another problem with whitespace. However, this problem was masked because it occurred at a stage where everything appeared to be working successfully.

Hopefully I can rename the OUs to be EMEA, APR, etc and resolve the issue.

Saturday, May 31, 2008

Whitespace in Solaris 10 LDAP configuration

Spaces are allowed in the ldapclient command line if the attribute is surrounded by double quotes, e.g.
-a "proxyDN=cn=admin,cn=emea users,dc=example,dc=com"
or
-a "defaultServerList=123.123.123.1 123.123.123.2"

Failing to quote either attribute statement will cause the ldapclient command line to fail with a parsing error.

However, there are instances where quoting the attribute definition will allow the command line to parse and the command to succeed, but ldap lookups can still fail:
May 22 18:09:44 server1 nscd[4012]: [id 293258 user.error] libsldap: Status: 49 Mesg: openConnection: simple bind failed - Invalid credentials

The problem is resolved when the proxy user is replaced with another from an OU which doesn't contain a space, e.g.
-a "proxyDN=cn=admin,cn=users,dc=example,dc=com"

Friday, May 30, 2008

Netmask settings for Solaris 10 Zones

When the zone is created, a large number of files are copied into the new zone. However, the netmasks file seems to be generated as an empty file.

Consequently, when you enter ifconfig you end up seeing something like:
...
eri0:2: flags=1000843 mtu 1500 index 2
zone zone2
inet 123.123.123.11 netmask ffff0000 broadcast 123.123.123.255
...

Now, obviously, you can use ifconfig in the global zone to change the netmask setting interactively. That works. However, a reboot wipes that out. So you have to log in to the console of the zone and amend the copied netmasks file so that it contains a line like:
123.123.123.0 255.255.255.0

Alternatively, you can access the netmasks file via the global zone filesystem.
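A sketch of that second route, assuming (hypothetically) that the zone's zonepath is /zones/zone2:

# echo "123.123.123.0 255.255.255.0" >> /zones/zone2/root/etc/inet/netmasks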

Legato Networker Configuration Duplication

I made a mistake whilst installing Legato Networker on a Solaris 10 box. The mistake prevented the backup server from backing up the Solaris 10 client.

I support multiple DNS domains. The backup server appears in many of those domains, and as the backup server is multihomed it appears in those domains with different IP addresses.

The Solaris 10 client was in Domain dom1.example.com, one of the few sub-domains the backup server actually isn't in!

When I installed NetWorker, I specified the backup server as backupserver.dom2.example.com. The backup server was actually trying to communicate with the Solaris 10 client over the network where it was known as backupserver.dom3.example.com.

Networker didn't like the loop effect of this arrangement and the backup was failing.

In fact, I should have just specified a simple server name, unqualified by a domain name, and added the backup server to the hosts file.

D'Oh!

However, it took a little longer to resolve than I had expected, because NetWorker not only records the backup server in the /nsr/res/servers file but also records it several times in the /nsr/res/nsrla.res file.

As I said above, I would have saved myself some confusion by entering the unqualified name of the backup server upon installation.
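The eventual tidy-up looked something like this sketch - the IP address is a placeholder, and I'd stop the NetWorker client daemons before editing the res files:

# echo "123.123.123.5   backupserver" >> /etc/hosts
# vi /nsr/res/servers
# vi /nsr/res/nsrla.res

In /nsr/res/servers there is one server name per line; in /nsr/res/nsrla.res the fully qualified name appears in several attributes, and each occurrence needs amending.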

*nix & Windows integration software I'd like to use

Vintela have some pretty cool software for integrating Solaris & Linux systems into an AD environment.

Perhaps because it has to compete against Samba and other freeware solutions, their products aren't ridiculously expensive either. Quite a refreshing experience. It's a shame that the normal mode of operation for most software houses is to seek to soak their customers. No names! No pack drill! But we all know who I mean.

Perhaps the coolest feature is the ability to apply GPOs to *nix clients. Add in the inexpensive nature of the software, and it seems pretty compelling.

What's the downside?

For my company, it's the fact that our AD Servers aren't running Windows 2003 R2, which is a requirement of the solution.

Thursday, May 29, 2008

Teamcity and ClearCase

Teamcity has a known problem with requiring sequential version numbering for the files and directories it tracks.

The problem was exposed here when a developer removed the latest version of a directory and created a new version. The developer removed version 4. The recreated version became version 5! Teamcity knew the latest version of the directory ought to be version 4 and was very upset that it was missing. The continuous integration suddenly wasn't continuously integrating.

Now, some might argue the problem was with ClearCase for not assigning the new version the same version number as the removed version. However, I do not. ClearCase is using the same mechanism when the removed version is the latest version as when the version being removed is in the middle of the version tree.

Anyway, the resolution was pretty straightforward, if not pretty.

  1. checkout the parent directory
  2. checkout the "broken" directory, which we'll call dirent1
  3. move all the files out of dirent1, i.e. cleartool mv * ..
  4. checkin dirent1
  5. cleartool rmname dirent1
  6. create a new directory with the same name as dirent1
  7. move the files needed for version 1 into dirent1, then checkin dirent1
  8. checkout dirent1, move the files needed for version 2 into dirent1, then checkin dirent1
  9. checkout dirent1, move the files needed for version 3 into dirent1, then checkin dirent1
  10. checkout dirent1, move all the remaining files into dirent1, then checkin dirent1
  11. checkin the parent directory
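In cleartool terms the sequence looks something like the sketch below, assuming a dynamic view with the vob mounted (-nc just suppresses the comment prompts):

cleartool checkout -nc .
cleartool checkout -nc dirent1
cleartool mv dirent1/* .
cleartool checkin -nc dirent1
cleartool rmname -nc dirent1
cleartool mkdir -nc dirent1
(for each version to recreate: checkout dirent1, cleartool mv the relevant files back in, checkin dirent1)
cleartool checkin -nc .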

We ended up with a version 3 on the parent directory and a dirent1 directory with 4 versions.

Some build labels also had to be manually re-assigned to appropriate versions, but luckily not that many.

VMware ESX re-signaturing of the SAN Config

After a slightly strange power outage in the server room at work - the UPS stayed up, everything else in the server room went down!! - I came across the situation that an ESX server had lost the primary connection to its SAN through the multipath fibre channel switch fabric.

Cue: extreme nervousness. To put it mildly.

There were a number of messages on the ESX Server console of the form:
cpu2:1034)LVM: ProbeDeviceInt:4903: vmhba1:1:0:1 may be snapshot: disabling access. See resignaturing section in SAN config guide

Actually, the last part of the error message is very good advice. A good read of the SAN configuration guide is well worth the time and effort.

Somewhere along the line the ESX host lost this VMFS3 volume and picked it up on a different path, vmhba1:1:0:1. When the host came back up, it picked up the VMFS3 volume on the different path, but importantly kept information about this partition at its previous path. This is why it decided it was looking at a snapshot, and responded in this manner.

So go into the console, click on the Configuration tab and select "Advanced Settings"

Expand the LVM section and set LVM.EnableResignature to 1
Then click OK to apply settings.

Select the "storage adapters" link under the configuration tab and click the "rescan" button (upper right).
Right click the vmhba (under the controller adapter for your machine) and click "rescan".

Then when you go to the summary tab, right click and select "refresh" and you should see your storage volume.
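As an aside, if you prefer the service console to the VI client, the same toggle and rescan can, I believe, be driven with the esxcfg utilities - a sketch, so check the flags on your ESX version:

# esxcfg-advcfg -s 1 /LVM/EnableResignature
# esxcfg-rescan vmhba1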

At least, that is what the manual would have you believe. My experience was rather different.

ESX was perfectly happy to see the volume on its new path as a new volume. Consequently, I had to remove all my inaccessible VMs and re-register them from the "new" volume. I may have had other options. This seemed to be the quickest at the time.

After all that, all the VMs started up without error, and other than a delay restarting the VMs, the users were unaware of the problem.

Solaris 10 authenticating against Active Directory

There are a number of good blogs discussing this subject. I'd recommend Scott's and the OpenSolaris blogs.

I used Scott Lowe's blog for the instructions on how to get CentOS Linux machines to authenticate against Active Directory. It was right on the money. Especially as I needed my machines to run Samba to create an interoperability solution for a number of software development teams who use IBM Rational ClearCase.

However, his instructions for Solaris 10 servers never worked in my environment.

My environment is Windows 2003 Active Directory with all the servers patched with Service Pack 2 and the latest monthly patches. The Server for NIS and Password Synchronization modules of Services For Unix v3.5 have also been installed, which obviously has extended the schema. As Service Pack 2 had been installed, the hotfix that fixes password sync after the "upgrade" has also been applied. N.B. the servers are not running Windows 2003 R2 - that would make a big difference and, from all accounts, would be much easier to interoperate with.

There is an article on BigAdmin on this subject. The method described almost worked for me - I'd say it went 90% of the way. The part that didn't work was the ldapclient command. Specifically, it was trying to use credentialLevel=self with authenticationMethod=sasl/gssapi, i.e.

ldapclient -v manual \
-a credentialLevel=self \
-a authenticationMethod=sasl/gssapi \
...


I was able to get around this by changing the ldapclient command to:

ldapclient -v manual \
-a credentialLevel=proxy \
-a authenticationMethod=simple \
-a proxyDN=cn=proxy_user,cn=users,dc=example,dc=com \
-a proxyPassword=password \
...


I also had to change the serviceSearchDescriptor attributes from

-a serviceSearchDescriptor=passwd:cn=users,dc=example,dc=com?one \
-a serviceSearchDescriptor=group:cn=users,dc=example,dc=com?one

to

-a serviceSearchDescriptor=passwd:dc=example,dc=com?sub \
-a serviceSearchDescriptor=group:dc=example,dc=com?sub


That done, Bob was my parental sibling of the usually male variety!

gcc 64bit compilation on Solaris 10

By default, if you were to build a shared library with gcc, you'd enter commands similar to
# gcc -fPIC -c file.c
# gcc -shared -o file.so file.o
The resultant shared library would be 32 bit, i.e.
# file file.so
file.so: ELF 32-bit MSB dynamic lib SPARC Version 1, dynamically linked, not stripped, no debugging information available
#

If you wish to build a 64 bit library you should amend the command above as follows:
# gcc -fPIC -m64 -mcpu=ultrasparc -c file.c
# gcc -shared -m64 -mcpu=ultrasparc -o file.so file.o

With the result:
# file file.so
file.so: ELF 64-bit MSB dynamic lib SPARCV9 Version 1, UltraSPARC1 Extensions Required, dynamically linked, not stripped, no debugging information available
#
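Incidentally, before bothering with 64-bit builds it is worth confirming that the host is actually running a 64-bit kernel. Solaris's isainfo utility reports this:

# isainfo -kv
64-bit sparcv9 kernel modules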

Tuesday, May 27, 2008

Changing hostids for Solaris 10 Zones

I found 3 resources on the Internet which discuss changing the hostid of a Solaris instance.

Only two are specifically related to the case of Solaris Zones. The other is a more general "You can change the hostid..." type of resource.

In my experience with Solaris 10 Zones, only one of these methods succeeded.

First I attempted the "Dynamic Library Interposition" method described by Julien Gabel on his Blog'o thnet. Initially, this held promise.

  1. I compiled the code
  2. I set the environment variable
  3. I ran the code
Success!!
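The general shape of such an interposer - a sketch of the idea only, not Julien's actual code, and the SPOOFED_HOSTID variable name is my own invention - is a small shared library that overrides gethostid(3C):

/* hostid_spoof.c - return a fake hostid, read from an environment variable */
#include <stdlib.h>
#include <unistd.h>

long gethostid(void)
{
    char *v = getenv("SPOOFED_HOSTID");
    return v ? strtol(v, NULL, 16) : 0x12345678;  /* fallback value */
}

Built and used along these lines:

# gcc -fPIC -shared -o hostid_spoof.so hostid_spoof.c
# LD_PRELOAD=./hostid_spoof.so SPOOFED_HOSTID=80cafe01 hostid

Whether a given program is fooled depends on it calling gethostid(3C) via the dynamic linker rather than trapping straight into the kernel.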

I added the environment variable to an existing startup script. It failed!

The error indicated the library was the wrong type! As I had installed the 64-bit version of Solaris 10, I re-built the dynamic library as 64-bit. I received the same error!


At this point, I actually created my zones, following the outline provided by this Zones Tutorial. The tutorial on how to create a Solaris 8 Zone on a Solaris 10 server describes how to set the hostid as an attribute of the Zone. Perhaps this works when you really do have a Solaris 8 Zone on your Solaris 10 server, but it didn't work for my Solaris 10 zones on a Solaris 10 server.


Finally, I resorted to the other mechanism for altering hostids described on Julien Gabel's Blog'o thnet - daemonizing a DTrace script. This did work. In fact, it works very well. Much kudos should be directed towards Brendan Gregg.