Whilst sitting here in the 6th hour of this outage call, I decided to update the theme for my site. While the old one was pretty simple, I was beginning to get bored of it. So I browsed the WordPress theme site for a bit, until I stumbled on this one. I’m not a great fan of 3 columns, so I tweaked it a little to go back to 2 columns and fit my “style” a little more. So far I like it: clean, looks good (to me at least), and quite functional. We’ll see how it goes.
Track-It SmartServer issues after server rename
A while ago we had deployed a virtual server to use with Track-It. As we were using a standard template, with sysprep, to build our virtual hosts, this went well. During a regular audit of the event log, I noticed that errors were being generated by SmaRTIndexServer-Importer and SmaRTIndexServer-Indexer. So I decided to check up on them. The importer service was generating an error about not being able to find the SQL server:
Error loading Import Specifications: [DBNETLIB][ConnectionOpen (Connect()).]Specified SQL server not found. -SELECT cmd_str, field_map, data_source, smi.area_id, poll_freq,import_id,attrs,last_run,update_cmd, import_type_id FROM smImports smi ORDER BY poll_freq
The indexer was generating a slightly different error message:
An error occurred when indexing answers: [DBNETLIB][ConnectionOpen (Connect()).]Specified SQL server not found. -Unspecified error
Both suggested a failure to connect to the SQL server, so I checked up on it. I knew it was running, as I had been using it earlier this morning for some inventory stuff. Knowing that was working, I started to look through Google. There was no help there at all. So I turned to the knowledgebase for Track-It. Doing a search for “Error loading Import Specifications” found this gem.
A few registry changes later, and a restart of the service, and no more errors.
Dallas Zoo
Managed to get out over the weekend to the Dallas Zoo. It was quite busy due to Radio Disney putting on a bit of a show, but I managed to get some pictures taken. This is one of the lionesses at the zoo. I have more in my gallery.
What a Shocker...
Microsoft’s March announcement for updates is a shocker.
Microsoft has not released any security bulletins on March 13, 2007.
That is, despite several zero day vulnerabilities being investigated.
DST Rollover...
Well, I stayed up last night until 2 am, which shortly became 3 am, then on to 4 am. The new DST time period rolled over great for all my Linux boxes. My Windows boxes, on the other hand, caused a bit of an issue. All the Windows 2003 servers rolled over fine. As I’d mentioned here, Windows XP and Windows 2003 had a patch, whilst Windows 2000 had a registry hack. This hack didn’t work for me, despite running it several times and rebooting numerous times too. In the end, I went with “method 2” as listed on Microsoft’s page. It tells you to download TZEdit, which will allow you to edit a time zone. When I pulled the time zones up, they were all as they should be, but I opened them and saved them again anyway. It then tells you to change the time zone and change it back for the changes to kick in. After doing that, the time was correct. Application testing showed there weren’t really any concerns, as we’d previously tested the applications without issues. I’ve not received any calls today about the services, so I’m guessing that we should be just fine.
Rant...
I am, once again, sitting on a 0200 conference call for an emergency change request to push a bug fix to production. I’m bubbling with frustration right now because, once again, we spent 40 minutes unsure of:
- How to test the bug
- Whether the bug was what we thought it was
- Who should be testing
- What the successful result should have been
The problem with tonight’s update was pretty simple. A small stored procedure change, executed in a few seconds, by the corporate DBAs. The first few minutes of testing were filled with… “well I ran it, but I’m not sure, so get the other guy” statements. When he jumped on, he did the same thing, announced to the world it worked, but didn’t believe the code change had been run before the first guy executed the code. This of course was not true, but he was pretty adamant about this.
Then the first guy proceeded to ask if the VPN would impact it, to which the second guy said it might do, and you should VPN into the corporate network. I proceeded to explain that access to the production network from the corporate office would be identical to access from their home connection, with the exception of the source IP (I didn’t explain that bit as I feared confusing the poor souls). The second guy agreed, then proceeded to tell the first guy to use the VPN.
Then the question/statement came up that we weren’t sure it was fixed because one of the QA guys testing the issue from another point reported an error. Here is where I nearly lost it, and yelled down the phone, but decided on the mute button instead. The second developer guy proceeded to tell us that dev guy 1 got the error because his clock was wrong, and that because the clock was wrong, he didn’t see the code changes yet. At this point, I decided to mute the phone, and go find a drink. I was thinking of something incredibly strong, but I have more testing to do tonight as they are also doing load balancer updates.
Right now, I’m trying to figure out if dev guy 2 was joking or not. He’s not the kind of guy to joke about stuff like this, so I’m really worried… I mean really worried.
Corporate Policies, Symantec Firewall, and deploying standard policies
One of the corporate policies that was sent down from up high during the last audit was desktop firewalls. We originally had it set so when on the corporate network, the Windows firewall was off, when off the network, it was on. We then tweaked that, and set it to optional when off the network (with default being on), and off when on the network. Corporate security didn’t like that, and said we needed to enable it both off and on the network, and that they also recommended a second firewall. Their recommended product hadn’t been released or updated in 3 years as it had been incorporated into “Symantec Client Security”. This isn’t bad, we just got an add-on for our Symantec Anti-Virus to have the firewall included.
Symantec’s recommended method of deploying the firewall service is to start off with a clean desktop. Install all the usual utilities that you will be requiring as part of every day operations for the company, run the firewall, and applications, permitting what is required. When a suitable time has passed, a day or two, take the firewall administrator, and export the firewall policy. This can then be used as a template to push to the remote computers. This is pretty easy, and is documented fairly well here. Documented pretty well, right up until it comes to deploying it. It appears to stop right at the exporting part, which makes figuring out deployment fairly difficult. I figured you could just deploy it by right clicking on a guest in the
I began to wonder if I’d have to force all the computers into a single group to deploy this, but then I thought better of it. I right clicked the server, and the same option was there too. So in theory, I could deploy to the whole server group without any issue. Well, that’s where the better half of my brain kicked in, and reminded me of the users, application base, and possible impacts. I have created a handful of groups that we can use for the deployment of the new policies. Why multiple groups? Well, imagine if you deployed your shiny new policy to 150 people all at once, and something was wrong, and you killed access to the internet. Would you want to be on the phone with that hiccup? I didn’t think so. What I’ve gone with is a role-based group setup. Each department has its own group, and each computer will be moved to the associated group. This will allow me to build policies that meet the needs of each department, without having to worry about crashing the entire network.
Cisco, Dell, and DST
I recently wrote about the 2007 DST changes that are coming up very rapidly (this weekend in fact), and all the changes we’re having to go through. One of the things that keeps slipping my mind, because it works so well, is the infrastructure. I use it every day, all day, and rely on it heavily. While the DST stuff isn’t going to make a huge difference to it, it’s nice to have logs reporting any issues with the right time. So I went to check on the availability of updates for the Cisco and Dell switching infrastructure we have. There weren’t any. In fact, there wasn’t really any mention of the possible impact at all. This is where I started to get a little concerned, until I started playing around in the command line. There is a command, on both the Dell and Cisco switches, that will let you pre-define the DST dates and times each year.
pix > ena
pix #> config t
pix (config)> clock summer-time CST date Mar 11 2007 02:00 Nov 4 2007 02:00
This sets the summer time shift on the new DST dates for this year. The command is almost identical on the Dell switches, the Cisco switches, and the Cisco routers.
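Since the command has to be re-entered with new dates each year, here is a rough Python sketch for working them out. It assumes the 2007 US rule (DST starts on the second Sunday in March and ends on the first Sunday in November, both at 02:00). The function names are made up for illustration, and the zone label is just carried over from the command above.

```python
import calendar
from datetime import date


def nth_sunday(year: int, month: int, n: int) -> date:
    """Return the n-th Sunday (1-based) of the given month."""
    # monthcalendar() returns weeks as lists of day numbers (Monday first);
    # a 0 means that weekday falls outside the month.
    sundays = [
        week[calendar.SUNDAY]
        for week in calendar.monthcalendar(year, month)
        if week[calendar.SUNDAY] != 0
    ]
    return date(year, month, sundays[n - 1])


def summer_time_command(year: int, zone: str = "CST") -> str:
    """Build the clock summer-time line for one year (assumes 2007 US rules)."""
    start = nth_sunday(year, 3, 2)   # second Sunday in March
    end = nth_sunday(year, 11, 1)    # first Sunday in November
    return (
        f"clock summer-time {zone} date "
        f"Mar {start.day} {year} 02:00 Nov {end.day} {year} 02:00"
    )


if __name__ == "__main__":
    print(summer_time_command(2007))
```

For 2007 this reproduces the dates used in the command above (Mar 11 and Nov 4). It only builds the text; pasting it into the device is still a manual step.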
Are you secure from the Trojan Horse?
I had to chuckle at this one. Probably everybody knows the story of the Trojan Horse, and how it was used to smuggle soldiers in. This link provided by Dark Reading’s “Firewalled” suggests that history really doesn’t teach us much.
The "gotchas" of upgrades
On Sunday I wrote up a “brief” little record of my upgrade nightmare. Today, I got an email from Bill reporting his anti-spam comment plugin was broken. So I decided to check in on it. I had assumed there was an issue with the PHP build I had created after the installs. I had to rebuild it 3 times yesterday because the XML functionality was broken, which meant little things like w.bloggar were broken. So that was my first stop. Everything looked okay. I’d assumed it’d probably be using the GD libraries, but I was wrong there.
Looking at the code for SecureImage, it seemed to be calling the ImageMagick convert application. This seemed like the next good point to check on. I don’t remember rebuilding anything that would have affected that application, so I attempted to call it using the same arguments. This turned out harder than I thought as it called for the use of STDIN and STDOUT, both of which I couldn’t really emulate quickly enough, but it wasn’t generating any basic errors like libraries being missing, so I moved on, keeping it in the back of my mind.
Next step was log files. I checked the Apache logs, nothing obvious there, then the PHP logs, nothing again. Last one to try: the MySQL logs. Checking in /var/log/mysql/mysqld.err gave me a bit of insight into what was wrong…
[Warning] './wp_secureimage' had no or invalid character set, and default character set is multi-byte, so character column sizes may have changed.
This reminded me of some mentions of character set changes in the MySQL upgrade “gotchas”. About a quarter of the way down the page is this little number:
Incompatible change: MySQL interprets length specifications in character column definitions in characters. (Earlier versions interpret them in bytes.) For example, CHAR(N) means N characters, not N bytes.
What does that mean? It doesn’t mean a whole lot for single-byte character sets. However, if you use multi-byte characters (ujis for example), it means a whole lot. This is where the whole issue was… MySQL had changed the length of one field from 32 characters down to 10. Loading the captcha image from the database then failed, because the md5 string for the filename had been truncated to 10 characters and so was never found. This was easily resolved using the following statement in SQL:
alter table wp_secureimage modify img_name varchar(32);
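To see why the column has to be 32 wide, here is a small Python sketch of the failure. An md5 hex digest is always 32 characters, so a 10-character column can never hold the full key. How the plugin actually derives its filename isn't shown here, so the seed string and the helper names are purely hypothetical.

```python
import hashlib


def image_key(seed: str) -> str:
    """Return the md5 hex digest of a seed string (always 32 hex characters)."""
    return hashlib.md5(seed.encode("utf-8")).hexdigest()


def store(value: str, column_width: int) -> str:
    """Emulate a character column that silently truncates to its width."""
    return value[:column_width]


key = image_key("example-captcha")  # hypothetical seed, not the plugin's own
stored_narrow = store(key, 10)      # what the shrunken column kept
stored_wide = store(key, 32)        # what varchar(32) keeps

if __name__ == "__main__":
    print(len(key))              # length of the full digest
    print(stored_narrow == key)  # lookup by full key against the narrow column
    print(stored_wide == key)    # lookup by full key against the wide column
```

The narrow column only ever holds a prefix of the digest, so an exact-match lookup on the full filename comes back empty.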
A quick refresh of Firefox, and his random images are back up and running again. I’ve now got to go through the MySQL log files and adjust all the others… weeeee
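For the log trawl, something along these lines would save some squinting. It is a minimal Python sketch that assumes every affected table shows up in the error log in the same shape as the warning quoted earlier. The regex and function name are my own, and the column widths still have to be checked by hand afterwards.

```python
import re

# Matches the warning format quoted above and captures the name after "./"
WARNING_RE = re.compile(
    r"\[Warning\] '\./(?P<name>[^']+)' had no or invalid character set"
)


def affected_tables(log_lines):
    """Return the unique names flagged by the character set warning, in order."""
    seen = []
    for line in log_lines:
        match = WARNING_RE.search(line)
        if match and match.group("name") not in seen:
            seen.append(match.group("name"))
    return seen


SAMPLE = (
    "[Warning] './wp_secureimage' had no or invalid character set, "
    "and default character set is multi-byte, "
    "so character column sizes may have changed."
)

if __name__ == "__main__":
    # To scan the real log, pass an open file instead of the sample list, e.g.
    #   with open("/var/log/mysql/mysqld.err") as fh: print(affected_tables(fh))
    print(affected_tables([SAMPLE, SAMPLE, "unrelated line"]))
```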