The Fluffshack

Unraveling the world one sock at a time

When you touch that server you touch me

They turned off the HPSIM and general management and alerting server this morning, or at least unplugged it, because it was causing a huge network spike at a remote site.

I know for a fact that no one besides me knows exactly what that machine does, as it's only useful to me and what I do.

That doesn't mean it isn't explained in the server list in Sharepoint that I made and painstakingly try to keep up to date, and that no one ever bothers to look at.

And of course no one bothered to ask during the day what exactly the impact of unplugging the server would be.

I mean, who cares about hardware and remote monitoring of servers anyway. It is, after all, only the most basic part of my job.

That made me feel really appreciated.

HPSIM was reinstalled a few weeks ago by one of my colleagues. When I explained it took me 2 days to set it up last time I installed it, he was surprised.

I will admit, it doesn't need to take that long. But it was new software to me at the time, and I was careful, and ran into some awkward service account issues.

It's a very messy collection of software, basically, so you need to be careful and precise.

I read the manuals first.

I ended up needing 3 different service accounts, with different levels of rights and access.

He reinstalled HPSIM in about 1 hour. It's his way, he loves to impress with how fast he can do things.

I haven't logged on to it in the meantime, because my time was needed elsewhere for the last few weeks. Build activities that go first. Project. Bids. Money.

I warned them in a long email 2 weeks ago that no one was now doing any active systems administration. No one was keeping an eye on things. No one was cutting the grass.

Fast forward to this morning...

So, I can't dispute that HPSIM or something on that server killed that site's 2mbit WAN line for an hour, daily, between 10 and 11.
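
To put that in perspective, here is a rough back-of-the-envelope sketch of how much data it takes to keep a 2 Mbit line saturated for an hour. It assumes the line really was fully used for the whole hour and ignores protocol overhead; the function name is just for illustration.

```python
def max_transfer_megabytes(link_mbit_per_s, seconds):
    """Upper bound on data moved over a fully saturated link, in decimal MB."""
    bits = link_mbit_per_s * 1_000_000 * seconds  # 1 Mbit = 1,000,000 bits
    return bits / 8 / 1_000_000                   # bits -> bytes -> MB


# A 2 Mbit/s WAN line, saturated from 10:00 to 11:00
print(max_transfer_megabytes(2, 60 * 60))  # 900.0
```

So roughly 900 MB a day would be enough to flatten that line for the hour.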

I went in over the iLO to have a look, after I asked them to at least plug -that- back in.

HPSIM service wouldn't start, as it couldn't authenticate its domain service account, because it had no network. This was expected.

What wasn't expected was the fact that it was using this colleague's domain admin account to start.

And so was the OpenSSH service.

And so was the Software update repository service.
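
A minimal sketch of the kind of check that would have flagged this, assuming you already have a list of services and the accounts they log on as (how you collect that list from Windows is not shown here). The service list and the account name `EXAMPLE\colleague-admin` are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Service:
    name: str
    logon_account: str


def flag_privileged_services(services, privileged_accounts):
    """Return the names of services that log on with a privileged account."""
    # Windows account names are case-insensitive, so compare in lowercase.
    privileged = {account.lower() for account in privileged_accounts}
    return [s.name for s in services if s.logon_account.lower() in privileged]


# Hypothetical inventory, loosely modelled on the three services above.
SERVICES = [
    Service("HP Systems Insight Manager", r"EXAMPLE\colleague-admin"),
    Service("OpenSSH", r"EXAMPLE\colleague-admin"),
    Service("Software update repository", r"EXAMPLE\colleague-admin"),
    Service("Print Spooler", "LocalSystem"),
]

for name in flag_privileged_services(SERVICES, [r"EXAMPLE\colleague-admin"]):
    print("runs as domain admin:", name)
```

Each service should run under its own dedicated service account, so anything this prints is worth a second look.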

I curse myself for not having reinstalled it myself, for one. And I curse myself for not having managed that server myself the past few weeks.

They ask me now, wtf was that server doing? I honestly don't know. I haven't managed it for the past few weeks, due to me being allocated to build activities, as they well know.

I hate it. I hate the fact that I don't know.

Even though I have no need to feel responsible, I so very much do. This server was mine, it did this on my watch, at least that is how it feels.

I can't be sure what caused the network spike, and I will never know because they won't let me plug the server back into the network.

This weekend I will reinstall HPSIM on a different server. A server that I had racked as spare, for this exact kind of scenario.

It will be reinstalled slowly, carefully, with the appropriate documentation at hand, as I did last time.

It will be stable. It will be secure. It will be managed.

It will be beautiful.

And I am not gonna let anyone else on that server. If it ever misbehaves again, they can hold me personally accountable, I want them to, god knows I want them to.

There is only one person in my department with a sense of responsibility for our environment.

There is only one person in my department who actually cares things are done correctly.

Every time I place my trust in another technical person, I am disappointed.

No one else is touching that server from now on.

Happy Sysadmin day.

July 25th, 2008 Posted by Jemimus | Sysadmin | no comments

Of Mice and … keyboard

My Microsoft Elite keyboard/mouse setup has served me well for a number of years, but recently the keyboard started becoming a bit erratic.

I decided to pimp my system with some new hardware.

Logitech G15 Gaming Keyboard

I love the display, which has some pre-customized WoW settings in there. I also downloaded a G15 Teamspeak addon that shows who is speaking on Teamspeak, very cool. The keys are very pleasant on the fingers, and the keyboard has some special macro buttons down the side, which come with quite powerful macro/scripting software.

The backlighting is very nice, though I prefer the blue that the G11 model has.

 

For a new mouse, I got the Microsoft Sidewinder.

It’s very nice, but after playing around with Lia’s new MX1000, I think I might actually prefer that for the feel.
The Sidewinder is pretty good, though it is a bit bulky. It comes with weights that you can insert into the side, which is kinda smart. I went for the max load. It also comes with a cable block to keep the cable at the right manageable length. You can customise the buttons, but not the buttons at the top, which I thought was a bit weak. The buttons at the top control on-the-fly DPI switching, which can be useful; the little display shows what DPI the mouse is set to.

Because I am rather jealous of Lia’s new MX1000, I decided to get the MX Revolution for my work.


Logitech MX Revolution

http://www.digitgeek.com/wp-content/uploads/2008/02/logitechmx.jpg

Now the MX Revolution is, I guess, the successor to Lia’s MX1000.

http://www.exe.com.tr/urunresimleri/Urunler/46146mx1000.jpg

Logitech MX1000

Now while that is true, let me say that these are in fact two very different beasts. Lia doesn’t like the second scroll wheel for example, and while I don’t mind it, I can’t blame her.

Aesthetically they are both gorgeous and very well designed, and fit in your hand totally naturally, supporting the thumb very well.

The Revolution does a funny thing with the main scroll wheel: it actually adapts the way it rolls during scrolling, on the fly, depending on the kind of page you are on. You feel it physically switch between the common “clicks” and the “smooth” mode that some Microsoft mice have as standard.

Anyhow, to conclude, I am very happy with my new purchases, and am sure they are gonna serve me very well. And at work, I am totally pimpin of course ;)

 

July 4th, 2008 Posted by Jemimus | Gadgets, Tech | one comment

Doctor Horrible Teaser Video: Doctor Horrible’s Sing-Along Blog

June 25th, 2008 Posted by Jemimus | Uncategorized | no comments

Rise of the Data Center Engineer (Data Center Knowledge)

June 17th, 2008 Posted by Jemimus | Uncategorized | no comments

seesmic video test

June 16th, 2008 Posted by Jemimus | Video | 2 comments

Got Logitech Quickcam Orbit AF

I already had the Orbit MP, which I was quite happy with.
But I recently discovered that Skype supports what they are calling “High Quality Video”. Now this only works on select Logitech hardware, and not on the MP version of the Orbit.

I am kinda drooling over the possible picture quality this offers though, and considering that Lia’s and my relationship is currently very heavily reliant on Skype, it makes sense that we get the most possible out of it.

Review of Skype HQ Video 1
Review of Skype HQ Video 2

So as a present to myself for my recent promotion, I bought the AF version on Saturday. Here they are side by side:



Left is the MP, right is the AF version.

AF stands for Auto Focus, and the feature works quite well. It’s got a far better lens, made by Carl Zeiss, which is another cool branding victory for them.

 Logitech Quickcam Sphere AF

 

Now I was gonna give my MP to Lia, but we actually want to do High Quality Video both ways, so she is now considering getting the Pro 9000.

http://www.connectreviews.com/images/logitech_quickcampro9000_1.jpg

Anyhow, I am very happy with it so far. It’s capable of very high quality streaming video, and although it’s only a 2MP sensor, it can interpolate up to 8MP resolution.

 


Zoom in to full res: 3264px × 2448px
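
A quick sanity check on those numbers, as a sketch. The full-res still size is the one quoted above; the 1600 × 1200 native sensor size is an assumption for a "2MP" sensor, not something taken from the spec sheet.

```python
def megapixels(width, height):
    """Pixel count of an image, in millions of pixels."""
    return width * height / 1_000_000


full_res = megapixels(3264, 2448)  # size of the full-res still
native = megapixels(1600, 1200)    # assumed native size of a 2MP sensor

print(round(full_res, 2))           # 7.99
print(round(full_res / native, 2))  # 4.16
```

So the "8MP" still has roughly four times as many pixels as the sensor actually captures; the rest is filled in by software.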

High-Def test video of me:

http://blip.tv/file/get/Jemimus-HiDefTestVideoWithLogitechOrbitAF246.wmv?referrer=blip.tv&source=1

 

 

 

 

June 9th, 2008 Posted by Jemimus | Gadgets, Geek, Tech | one comment

Datacenter Move post 4

YouTube videos are worth a thousand words, so I will let them do the talking.
Progress TCR Move May27 part 1 (the old location)

Progress TCR Move May27 part 2 (the new location)

What Mustafa thinks of IBM rackmount kits:


And some pics from the last 3 days: (Click for larger versions)

IBM xseries 336, about 3 years old now

IMG_3452 ServeRAID 6M controller (in the PCI slot bay). It's IBM branded but it's basically an Adaptec. This comes out of one of the 2 xseries 336 servers that, together with the EXP400 shelf, served as a Windows 2003 cluster. The ServeRAID controllers are needed to provide failover control of the shared disk shelf.

IMG_3453 Mainboard of an xseries 336, with the PCI card bay/thing removed. The blue bracket at the top is where an RSA-II management card would usually be sitting, but this one doesn't have one :(

IMG_3454 ServeRAID 6M removed from PCI bay of the 336 server

IMG_3456 The rack is slowly emptying. I remember when I was building it all up, 3 years ago! Check it out

IMG_3459 Installing Windows on an IBM xseries 336. It's been a while. I noticed the IBM ServerGuide CD has a few more options now.

IMG_3460 Picture I needed to have to illustrate where to connect everything.

IMG_3461 Ready to move to new location

IMG_3464

IMG_3465 Not the most ideal way of moving servers, but it's better than nothing. At least they are softer here than in the back of the car.

IMG_3467 Richard, our project manager, trying to get more work in.

IMG_3469 Temporary cabling ;)

IMG_3472 It's slowly growing

IMG_3473 They are not done with the rack interconnects, damnit. I can't finish my patching like this.

IMG_3474 Our new firewall cluster

IMG_3475 I love the blue glow of the console. Kinda weird to have that out-of-place IBM in there, squashed between the HPs. The cable rail for the server is different from HP's as well, so that will be fun to cable.

IMG_3476 Moved some servers around, getting to our final config now.

IMG_3477 WAN comms rack in the new location

IMG_3478 LAN comms rack in the new location

IMG_3480 Khalid on servicedesk duty

IMG_3481 Gertjan is helping us decommission

IMG_3482

IMG_3483 Mustafa hard at work decommissioning servers

IMG_3486 A lot of servers were decommissioned today, these are all basically being scrapped.

May 28th, 2008 Posted by Jemimus | In The Trenches, Sysadmin, TCR Move 2008, Tech, Video | no comments

The Vegas SuperNAP

Datacenter Knowledge has 2 posts up about Switch Communications' new datacenter in Las Vegas, which they are claiming is the highest-density datacenter in the world.

http://www.datacenterknowledge.com/archives/2008/May/27/the_vegas_supernap_a_data_center_revolution.html

http://www.datacenterknowledge.com/archives/2008/May/27/1500_watts_a_square_foot_a_look_at_tscif.html


Switch Communications says it is successfully cooling a section of its Las Vegas data center running at nearly 1,500 watts per square foot using air cooling. How are they accomplishing this?

The key to Switch's high-density cooling is a design known as Thermal Separate Compartment in Facility (TSCIF), according to company co-founder Rob Roy. The ingredients in this approach include high-capacity AC units placed outside the data center area, and a tightly integrated hot aisle containment system for the racks. Here's an overview:

    * The cabinets are set on a slab, with no raised floor.

    * Chilled air is delivered into the cold aisle near the ceiling rather than through the floor, and enters the cabinets through the front.

    * Each cabinet fits into a slot in the TSCIF unit, which encapsulates the rear and sides of each cabinet, while the open front extends beyond the enclosure.

    * The hot aisle containment system delivers waste heat back into the ceiling plenum, where it can be returned to the chiller.

 

Very cool video of the SuperNAP setup:

http://www.switchnap.com/pages/products/the-supernap-video.php

More pics of their T-Scif cooling system: http://www.switchnap.com/pages/tech-specs/thermal-scif.php

The statistics from their site:

407,000 square feet of space
 
250 MVA Switch owned substation
 
146 MVA of generator capacity
 
84 MVA of UPS supply
 
30,000 tons of system plus system cooling
 
4,500,000 CFM

30 cooling towers

100% heat containment using thermal-scif™

Designed for 1500 watts per sq. ft. density
 
7000+ cabinets
 
Armed 24/7/365 military trained Switch employed security staff
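
For those of us who think in metric, a small conversion sketch for the two headline figures. The only constant used is the exact definition 1 sq ft = 0.09290304 m²; the function names are just for illustration.

```python
SQ_M_PER_SQ_FT = 0.09290304  # exact: 0.3048 m squared


def sq_ft_to_sq_m(area_sq_ft):
    """Convert a floor area from square feet to square metres."""
    return area_sq_ft * SQ_M_PER_SQ_FT


def watts_per_sq_ft_to_kw_per_sq_m(density_w_per_sq_ft):
    """Convert a power density from W per sq ft to kW per square metre."""
    return density_w_per_sq_ft / SQ_M_PER_SQ_FT / 1000


print(round(sq_ft_to_sq_m(407_000)))                   # 37812
print(round(watts_per_sq_ft_to_kw_per_sq_m(1500), 1))  # 16.1
```

That is about 37,800 m² of floor space, and a design density of roughly 16 kW per square metre.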

 

May 27th, 2008 Posted by Jemimus | Sysadmin | no comments

Decision-making on the new Proxy solution


For your enjoyment, here is a slightly edited email I just sent to the department head and various other decision makers. It goes over some of the options we need to consider to solve the current issues with our internet access.

Names and places have been changed to protect the guilty ;)

And please excuse the spelling. I was in a hurry and I really don't care about spelling as much as I do content.

----------------------------

Hi all,

We are currently faced with some decisions that need to be made in regard to the Internet Proxy solution for the Netherlands and Belgium.

This is the current situation in regard to the proxy servers in Lala City and Chipville.

Server Lala City: LA-Server-S99
Server Chipville: CHIPVILLE-Server-S99

Both servers are HP DL360 G2 servers, and are now approaching 8 years of age. They are very out of warranty, and no hardware support can be expected from HP anymore regarding these.
Both servers run Windows 2000 Standard.
The Proxy software on both servers is ISA server 2000, running on the SQL MSDE engine. This software is still supported by Microsoft, but has been superseded by 2 newer versions.
In addition, we currently run the Surfcontrol web-filtering software, as a plug-in for ISA.
This software allows us to tightly control web-behaviour, for example to allow certain users access to certain sites, and to block entire categories of websites, or web-protocols.
We have built up a pretty extensive rule-set over the years on both machines, and both rulesets are largely identical.
The company "Surfcontrol" was acquired by Websense in 2007, and since that time the Surfcontrol software is no longer supported, no patches or service packs are being offered for download, and no licences are being extended or sold, forcing all former Surfcontrol customers, including us, to look for alternatives.
The software combination on these servers has caused us some issues in the past. Some elements of Surfcontrol have always been buggy, and as the hardware has aged, it has become unreliable.
Furthermore, the decision to use SQL MSDE has caused problems, because of its inherent 2 GB limit.

Lala City
The Lala City proxy server is due to be replaced with new hardware, located in SiteB. This action is outstanding as part of the TCR move project.
As part of this, a new server was purchased, together with W ISA server 2006, and SQL 2005
At the moment, no replacement for Surfcontrol has yet been purchased, although Dick Dickerson did get a cost estimate for the Websense software, based on a single server, 500 users, and 3 years of licensing. (included as attachment)
A decision on this has been on the back burner, due to the fact that we were also planning on moving the current ISA server to SiteB anyway, and using the Chipville ISA server as a backup.

Chipville
The old Proxy server in Chipville is in a similar state to the one in Lala City, although one of its 2 disks (that run in a mirror) failed last week.
This poses a serious risk to internet service continuity. It also represents a risk to the TCR move project, as this server is now no longer a reliable fallback while we move the Lala City server.

We need to decide how to proceed going forward.

The time factor
We have only a limited time to come up with a solution. Currently the situation in Chipville is more pressing, because of the hardware failure of the server there.
The big-bang server move from TCR Lala City is scheduled less than a month from now, and we need a stable and supportable solution at the very least in Chipville before that time, and ideally a solution for Lala City as well.

There are a number of options:

Option 1. Keep the current servers
The Lala City server can be moved to SiteB and continue to operate from there, serving Internet users (non-citrix) in the Netherlands.
However, the hardware and software is no longer supported, the software is in an unstable state due to past problems with Surfcontrol, and the ISA MSDE database.
Due to the advanced age of the hardware, it is only a matter of time before it fails. Moving it might actually break it too.
The Chipville server cannot operate as-is, on a failed hardware mirror. This absolutely needs to be replaced, more or less disqualifying this option.
Due to the above, I cannot recommend this option in any way.

Option 2. Outsource the Proxy service to European Datacenter / UK
This would involve redirecting all internet traffic from Netherlands and Belgium to an outside, centralised Proxy system for internet access.
This would simplify our support model somewhat, and remove the technical burden of supporting the solution ourselves.
The downside though, is that we no longer have direct control over what is allowed/disallowed over the Internet.
By default, as far as I have heard, no rules are in place for both the UK and European Datacenter proxy solutions, meaning that there are no limits on what people can do with the Internet connection, it would be a free-for-all, whereas right now, we have strict limits on usage.
This option should be considered. But the question has to be asked why the web-filtering function was ever needed in the first place. If web-filtering and control remains a business requirement, then this option cannot be considered.

Option 3. Hybrid In-country hosting / European Datacenter hosting
I have been made aware of a version of the European Datacenter hosting scenario, that includes re-directing in-country internet traffic to European Datacenter, but in combination with a local Proxy/web-filter server, running the Websense software. This would involve installing a local server with the "Websense" filtering software, and "chaining" it to the Websense Proxy server in European Datacenter. Many countries apparently already follow this model.
This has the advantage of retaining local control of a rulebase, allowing us to continue to restrict internet use where necessary, but with the advantage of no longer needing a local Internet line for basic Internet use. MEGACORP(TM) can also retain an amount of corporate internet-use control, via the gateway in European Datacenter, as all internet traffic eventually moves through there to get out. Currently MEGACORP(TM) does not impose any global restrictions on the Internet gateway in European Datacenter, as far as I have heard.
This option should be considered, however it will take some time to study and set up properly. The support model may be complicated because you are dealing with possible web-filtering and proxying in 2 different locations, supported by 2 different organisations. It would also require that local Websense software be purchased and supported. I have also been told by some that the connection via European Datacenter is very slow and not that useful for many operational tasks. This could hurt us, as we run a number of line-of-business web-based applications over the internet (HP Shipview, etc.).
We would also benefit from the fact that the Websense software can be centrally managed from 1 console, making it very easy to keep the Netherlands and Belgium rulesets identical, and simplifying reporting and failover.
I would recommend this option if we can be sure the performance is adequate for our business needs, and if the support model can be agreed upon quickly. The major downside of this solution currently is that it will take time to set up, and we don't have much time anymore.

Option 4. New installation In-Country
This involves basically rebuilding the 2 Proxy servers on new hardware, and installing fresh, current and supported Proxy and web-filtering software.
In this scenario we would use our own local Internet lines in SiteB and Chipville.
We would directly support the solution, and maintain direct control over the web-filter ruleset. This is the simplest support scenario.
Hardware for this is already available: The replacement of the Lala City server was already part of the TCR move project, as is the licence for ISA 2006 and SQL 2005.
Hardware for Chipville is also already available on site, in the form of a 3-year-old IBM server, however, this server may soon fall out of hardware support (needs to be checked).
Apart from the ISA and SQL license that would be needed for the Chipville server, we need new web-filtering software for both servers, again, IF the business still deems this a requirement.
If they don't, then this solution would provide unfiltered internet access to all (non-citrix) internet users in Netherlands and Belgium.
For the web-filtering requirement, I would at this time advise to also go with the Websense software, as they are currently regarded as the market leader, and their software is well supported and well known in the industry. (They are incorporating a lot of the Surfcontrol concepts as part of the acquisition.)
We need to look at the current available hardware for this. Almost all the hardware we have is 3 years old or older, so it may be advisable to consider purchasing a new piece of hardware for this solution in Chipville.
This option should be considered. It has the advantage of retaining central control and will be quick to set up, once the software has been purchased. The downside is that the Websense software is expensive, so we may want to consider looking at alternatives, even though it has become a de facto standard within MEGACORP(TM). Again, we have a time-constraint problem here.
We would also benefit from the fact that the Websense software can be centrally managed from 1 console, making it very easy to keep the Netherlands and Belgium rulesets identical, and simplifying reporting and failover.
I would recommend this option first and foremost, and it is the preferred solution technically, considering the circumstances.



Again, I wish to stress the time constraints we have: with less than a month before big-bang, we want a new solution up and running within the next 3 weeks!


-------------------------
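
The 2 GB MSDE limit mentioned in the email is the sort of thing that is easy to keep an eye on. A minimal sketch, assuming you can get the database size in bytes from somewhere (that part is not shown); the 80% warning threshold and the 1,800 MB example size are made up for illustration.

```python
MSDE_LIMIT_BYTES = 2 * 1024**3  # 2 GB per-database cap in MSDE 2000


def msde_headroom(db_size_bytes, limit_bytes=MSDE_LIMIT_BYTES, warn_at=0.8):
    """Return (fraction_used, bytes_remaining, needs_attention)."""
    fraction_used = db_size_bytes / limit_bytes
    remaining = max(limit_bytes - db_size_bytes, 0)
    return fraction_used, remaining, fraction_used >= warn_at


# Hypothetical 1,800 MB proxy log database
used, remaining, warn = msde_headroom(1_800 * 1024**2)
print(f"{used:.0%} used, {remaining // 1024**2} MB left, warn={warn}")
# 88% used, 248 MB left, warn=True
```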

May 26th, 2008 Posted by Jemimus | Sysadmin | no comments

Excellent Schneier Article on Selling Security (Planet Sysadmin)

May 26th, 2008 Posted by Jemimus | Uncategorized | no comments