Monday, March 31, 2014

Programming Makes Me Feel Dumb

I've made blog posts before about my foray back into programming. I'm not sure if it's curiosity or masochism that keeps drawing me back; curiosity about the technology, a fascination with the rules by which programming works, or the order it imposes on a computer environment. Maybe it's a combination of things. I've never quite put my finger on it.

I just know that I have a history of trying to dive into a programming language and becoming frustrated at how stupid I feel when I hit a point where things just don't make sense.

I was reminded of this while working through a tutorial for making a very basic wiki using Go. It was a really good tutorial; step by step, it not only built a simple wiki server, but showed how the design evolved along the way.

Here's how, when we connect, this handler will stream these HTML instructions to the client! Now in this part, we'll take that text in the reply and change it to a template. Create a text file with this in it, and change the function so it calls the template instead. Same results, different method of getting them!

I carefully read the tutorial and implemented the instructions, with an occasional personalized touch just to test that I understood what was happening. In the end I could trace out most of what it was doing. I was pretty proud of myself. But there were still limits to my understanding, and I found them most irritating.

For example, at the end of the exercise there were "some simple tasks" to tackle on my own. They're good to do, since they demonstrate an actual understanding...to some degree...of the material. One of them involved creating a handler for the web root, so that when people connect to it, the server replies with a page called FrontPage.

I made the mistake of thinking to myself, "That shouldn't be too hard."

I trace out the execution path. Go makes the initial handling simple; in main() there's:

func main() {
	http.HandleFunc("/view/", makeHandler(viewHandler))
	http.HandleFunc("/edit/", makeHandler(editHandler))
	http.HandleFunc("/save/", makeHandler(saveHandler))
	http.ListenAndServe(":8080", nil)
}

The wiki lets you open pages using http://server/view/PageName, or /edit/ or /save/, and anything else is a 404. That makes it fairly easy to decode what happens here. To create a handler for the web root, I think, I just add a line routing the root address to makeHandler, which takes the actual handler (I'll explain in a moment) as an argument. Like this:

http.HandleFunc("/", makeHandler(frontpageHandler))

I also make a frontpageHandler function. To save redundant coding, I just want to redirect requests for the web root to a page called FrontPage. Not overly efficient, since it's a redirect, but I figure it shouldn't be a huge penalty for what I'm doing.

func frontpageHandler(w http.ResponseWriter, r *http.Request) {
	http.Redirect(w, r, "/view/FrontPage", http.StatusFound)
}

This should pawn off the root page request to the /view handler. There's a problem, though: execution never gets that far. See, makeHandler, as part of the tutorial's design, checks to make sure you are trying to access particular pages and nothing else.

func makeHandler(fn func(http.ResponseWriter, *http.Request, string)) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		m := validPath.FindStringSubmatch(r.URL.Path)
		if m == nil {
			http.NotFound(w, r)
			return
		}
		fn(w, r, m[2])
	}
}

See that validPath.FindStringSubmatch, passing the URL? Yeah, that goes back to this:

var validPath = regexp.MustCompile("^/(edit|save|view)/([a-zA-Z0-9]+)$")

And there's the stupidhammer hitting me in the head. It's REGEX! Now, seeing this, I can sort of tell what it's doing. The call in makeHandler will only pass the request on to the edit/save/view handlers if the requested URL contains edit, save, or view, followed by a slash and an alphanumeric string. But how do I tell it to match a...nothing? I didn't know how to get the regex to match a lone slash, I tried several variations only to fail each time, and I was growing tired of hitting a wall.

Then I thought, "I would need to rewrite the request before it gets to the makeHandler! Then it'll pass the /view to the path!"

Only my attempts at that failed each time as well. I would have had to change where the application reads the URL from, since the path the existing code analyzes wouldn't be the rewritten one. Redirecting before the request reached makeHandler seemed like it should work, but I couldn't get it working from the initial handler setup in main().

I was getting a virtual bruise on my head before thinking, "Why pass it to makeHandler at all? I'm making a loop in the execution path anyway. It's not the most efficient or direct, but...it would work, wouldn't it?"

At this point I felt stupid for not understanding regex syntax well enough to just match a damn slash by itself (my only solace was that at least I was aware of this limitation), and now I felt stupid for the amount of time it took before I thought of just bypassing the problem function.

So I made a change in main:

http.HandleFunc("/", frontpageHandler)

Now this created a new problem. It turned out that Go sent everything that didn't match the edit/view/save handlers to frontpageHandler. I didn't test whether this was a cascade-rules thing...I had put frontpageHandler at the bottom of the list of handlers, so if it were at the top, would everything match the slash first and never hit the save/view/edit handlers? I didn't test that. (From what I can tell, Go's ServeMux matches the most specific registered pattern rather than going in registration order, so "/" acts as a catch-all either way.) I was excited that this seemed to be a step closer to solving the initial "handle the web root request" goal. And it did handle it! Only...it also handled everything else you threw randomly at the server. Whoops.

Next I figured my frontpageHandler needed to check that the request was truly for just the web root or return an error message. I went about making some changes:

func frontpageHandler(w http.ResponseWriter, r *http.Request) {
	if r.URL.Path == "/" {
		http.Redirect(w, r, "/view/FrontPage", http.StatusFound)
	} else {
		http.NotFound(w, r)
	}
}

This seemed to do it. Now a request for the web root returned FrontPage via a redirect through the existing /view code path, and a request for something like http://server/letsfeeditrandomtrash got a 404.

It works. It fits the criteria. But I still feel like I'm mentally defective, because I know that programmers like Matt Sherman and Matt Jibson wouldn't have solved the problem this way. They probably would have bypassed the whole problem by knowing the regex off the tops of their heads, and my solution was an inefficient hack.
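
If I had to guess at the regex-literate version of my hack...and this is just my guess, not anything from the tutorial, and untested by me...it's probably a second, trivial pattern, since anchoring both ends should make a pattern that matches a lone slash and nothing else:

var rootPath = regexp.MustCompile("^/$")

func frontpageHandler(w http.ResponseWriter, r *http.Request) {
	// "^/$" is anchored at both ends, so it matches the bare web root
	// and nothing else; everything unmatched still gets a 404.
	if rootPath.MatchString(r.URL.Path) {
		http.Redirect(w, r, "/view/FrontPage", http.StatusFound)
		return
	}
	http.NotFound(w, r)
}

Same shape as what I ended up with, just with the regex doing the matching instead of a string comparison. Which, honestly, makes me suspect the string comparison wasn't such a terrible hack after all.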

It's times like this that I get an overwhelming sense of the divide between "I made this work as requested!" and "this was a horrible hack, nowhere near the quality standard I want to hold myself to." And worse, I feel like I'm missing the ability to figure out how to bridge that gap.

I look at the source code for the tutorial and think to myself that I understand most of it. I miss some details...why does this look like it's passing by reference, or why is this a pointer...but I get the gist of what's happening and am able to trace what is going on (otherwise I'd still be figuring out why my frontpage function was 404'ing, having missed that validation check entirely).

I also understand that if I were coding the wiki example from scratch, I would have missed a lot of this approach. I wouldn't have thought to use regex patterns for validating the input. Many of these functions I wouldn't have thought to use; I'd still be looking up what the libraries in question can do. I mean...a function that takes a handler function as an argument and returns a wrapper function that validates the URL before calling it? The final tutorial version even sweeps up the template files at startup and parses them into a single variable that gets referenced, instead of pulling a template off disk during every request. I doubt I'd have thought of that.
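
That template sweep, if I'm remembering the tutorial right (treat this as a paraphrase from memory, not gospel), boils down to a couple of lines like this: parse the files once at startup, then execute the already-parsed template by name on each request (Page being the tutorial's page struct):

var templates = template.Must(template.ParseFiles("edit.html", "view.html"))

func renderTemplate(w http.ResponseWriter, tmpl string, p *Page) {
	// templates was parsed once at startup; this executes the cached
	// template instead of re-reading it from disk on every request.
	err := templates.ExecuteTemplate(w, tmpl+".html", p)
	if err != nil {
		http.Error(w, err.Error(), http.StatusInternalServerError)
	}
}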

So at this point I feel stupid because I couldn't figure out the proper regex...my solution was a horrible hack...and looking at the existing code, I wouldn't have done it this way if left to my own devices. The distance between where I am and where I want to be in competence is a gap larger than any ladder I can imagine having in my possession.

At the same time, I know that if I call it quits, I'll end up trying something again later. Maybe it'll be another run at Ruby on Rails. Maybe it'll be picking up Go again. Or Visual Basic. And I'll bang my head on the same limitations, but with a sliver more familiarity. It's happened before. Several times. Hence...here again. Same crossroads.

Perhaps coincidentally, around the time I was having this crisis of faith, I saw a post on Reddit from a person who had started his programming journey two years ago, developed an iOS game, and had virtually nil sales. A lot of people bitched that he was just trying to market his application, and the mods eventually removed the post, but he insisted he was trying to talk about the process he went through in creating it. I found that part interesting. Part of what he said:

A few weeks ago I released a game in the iOS AppStore called “Inject”. In 2012, I had this game idea/mechanic in my head that I thought was pretty cool – To use a touch screen to simulate an injection mechanism. At the time I didn’t know a single thing about programming or game development but the dream seemed possible. And so I decided to try. I struggled to learn how to program to bring this game to life. It was brutal. I wasn’t built to program things. Codeacademy lies. Not everyone can code. But in the end, I finished.

He also said,


I used Code Academy just to learn how languages work. I seriously had ZERO background in anything to do with programming. I was always bad at math. I even suck at most games but I like playing them.
I think once you understand how computer languages in general work. (you only need a really superficial understanding). You can then chose a language that you want to focus on and look up tutorials for that language/framework to build what you are trying to build. So if you're trying to make something for iOS look into obj-C /cocos if you have big balls. Or for something more newb friendly like game maker/stencyl etc.

Two years ago he started learning a programming language from scratch. Two years until it reached a point he felt he could release it on the iOS App Store. Two years of struggling until he "finished."

I suppose I can also keep struggling to make this work. I used to say that each small step was still a small step forward. I'll keep trying to make tiny steps forward to see what I can make.

Now...I'm off to find a good basic tutorial on HTML. Those templates don't magically work on their own, after all. 

Monday, March 24, 2014

Use a Single GOPATH in Go (GoLang)

I like when things can be neat and modular. It makes sense in my world. I like the idea that if something gets mucked up, it's easy to pull the "nuclear option" and just delete that bit and start over. It's part of the reason I like the Mac; most of the time when you want to delete an application, you just delete the .app bundle (there might be some preference files to clean up, but for the most part this isn't an issue).

Go, by default, likes creating static binaries. Sure, it may cost a little in binary size and memory use, but when it's a static binary, you just copy the binary and it works. No dependencies to fuss over or accidentally break.

So this philosophy extends to projects I'm working on. While playing with programming, I like to create separate work directories, so when I inevitably screw up I can delete it (if need be) and start over. Plus it kind of sandboxes my work, so I don't spread the muck among my learning endeavors.

Guess what? That's not the Go Way.

(Has anyone really worked on documenting the philosophy of Go? I mean, Ruby has the whole Rubyist way of doing everything; reading tutorials and blog posts about the language from people who aren't bitching about it would make you think the language is half programming language and half poetry, complete with a set of unwritten-but-documented rules that are half documentation and half teachings of a programming philosopher. But I digress...)

I wasn't completely aware of the Go Way until I posted about an issue on StackOverflow and talked to a coworker who is a fan of Go. He sent me an email with a link to a blog post called "Why I use just one GOPATH." It helped clarify some thinking for me.

After reading the documentation on workspaces, I started creating a directory per project, each with its own src, pkg, and bin directories, and pointed my GOPATH at whichever project I was working on. This was what I thought the instructions said, and nothing explicitly said not to do it. I was simply exporting GOPATH at the prompt when I needed to.

Don't do that.

You could do it that way, but it's a bit more of a pain, and it isn't the Go way. What Go wants you to do is put each project in its own subdirectory under a single workspace's /src. For example:

/home/myuser/goprojects
                       /src
                           /project1
                           /project2
                           /project3
                       /bin
                       /pkg

Each project goes into its own folder under src, executable binaries end up in /bin, and you get one GOPATH set to /home/myuser/goprojects.
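
With that layout, the setup is a single export, set once and left alone (bash syntax here; adjust the path to taste, and the second line is just a convenience so any binaries you build with go install are runnable):

export GOPATH=/home/myuser/goprojects
export PATH=$PATH:$GOPATH/bin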

While it still grates against my aesthetic tastes to mingle all the projects in one folder, it does simplify my Git repository, and it still somewhat sandboxes my application experimentation by keeping source code under /src.

So if you, like me, found the online documentation for Go vague enough that you bent it to fit your workflow rather than understanding the Go Way(tm), you're probably doing it wrong. Create one folder, point GOPATH at it, and create new applications as subfolders under $GOPATH/src.

Monday, March 17, 2014

Godoc is Missing; Another GoLang Adventure

I was working through some Go tutorials, and one told me that Go is kind of self-documenting via godoc. I tried the example and discovered that the executable wasn't found.

It turns out that in Go 1.2, if you installed from source, godoc is missing. From what I can piece together from searching online, the install-from-binary packages include godoc but the source install doesn't; as of 1.2, godoc was split out of the main tree into its own repository, separate from the compiler's source. Still want godoc? Here's how you can get it.

First, you need to have $GOROOT and $GOPATH set. You can do this at the command line, which will only last for your session, or you can add it to one of your environment files like .bash_profile or .bashrc so it persists across logins.

export GOROOT=`go env GOROOT`
export PATH=$PATH:$GOROOT/bin

The same approach works for setting GOPATH, with a catch: go env GOPATH shouldn't return a blank line. If it does, you need to export GOPATH yourself, with your workspace path after the equals sign. Also, GOPATH cannot match GOROOT. That would be bad. (Side note: Go won't allow it anyway.)

Time to get godoc.

go get code.google.com/p/go.tools/cmd/godoc

This installed the binary to my GOROOT/bin directory, alongside the compiler's own binaries. Godoc's source and package files are stored under GOPATH/src.
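
A quick sanity check that it landed somewhere on your PATH: ask it for some standard library documentation, or serve the full documentation site locally and browse to it on port 6060.

godoc fmt Printf
godoc -http=:6060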

It's kind of strange if you're not accustomed to Go's conventions...but if you want godoc, this is how you get it installed!

Friday, March 14, 2014

All Hard Drives Go To Heaven (How To Recover From Drive Failure)

One neat thing about Raspberry Pis is that they use SD cards for storage. They seem rather reliable, right up until their memory cells start going kerplooey. Unfortunately, SD cards tend to provide little in the way of storage capacity.

For this reason I added a scrap 500 GB laptop hard disk to the Pi I'm using to experiment with Go. It's connected to a SATA-USB converter, which in turn connects to a USB hub. Spinning drives, though remarkably reliable today, still apparently have limits.

The other day I was working on fixing a problem with a missing godoc install when the external drive wouldn't let me write to it. I checked with 'mount' and, sure enough, it was mounted as a read-only drive.

This should have triggered an alarm bell...after all, fstab lists the drive as mounted read-write, and I hadn't remounted it...but in my distracted state I chalked it up to a fluke and decided a reboot would sort it out.

sudo shutdown -r now

Oh, side note of some importance. I access this Pi through ssh. 

After giving it a few minutes to boot up, I tried reconnecting. Nothing.

Maybe it needed a hard restart? I pulled power to the USB hub, causing it all to power down. Count to ten. Plug it back in. After a bit of time, try connecting to it again. Nothing.

I pulled the unit out, plugged in an HDMI monitor, and booted again. It didn't take long to see the problem. The screen filled with errors as the kernel tried mounting the external drive.

Aw, damn.

Bad news: all my source code experimentation is gone. Good news: I had a spare drive to replace the dead one. Here's what you do.

First, the Pi eventually dropped to a root login at the console, without the network coming up. I copied /etc/fstab to /etc/fstab.old, then opened fstab and removed the UUID line that mounted the external drive. Then I shut down the Pi and switched out the drive.

Power it back up, and now networking, along with ssh, is back. 

Dmesg tells me the new drive, like the old, is on /dev/sda. 

I opened fdisk, removed all the existing partitions with the "d" command, then created one big partition with "n": primary ("p"), partition 1, accepting the defaults for the start and end sectors. Exit with the "w" command so fdisk writes the changes to the drive.

Next I formatted the drive with:

sudo mkfs.ext4 /dev/sda1

At this point you'd better be really sure you're working with the right drive or you're a little more than screwed.

Next I wanted the new drive to mount automatically at boot. I ran:

sudo blkid

...and copied the UUID to my clipboard (I'm running iTerm to ssh to the Pi, so copying and pasting is really easy). Then I re-created the fstab entry using a lazy method.

cd /etc
mv fstab fstab.orig
mv fstab.old fstab

I opened fstab in a text editor and replaced the dead drive's UUID with the new UUID from the clipboard. Save the file (you did launch the editor as root, right? You need root privileges to edit fstab...)
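
For reference, the finished line ends up looking something like this (the UUID and mount point here are made up for illustration; yours will differ):

UUID=a1b2c3d4-e5f6-7890-abcd-ef1234567890  /mnt/external  ext4  defaults  0  2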

Reboot. Did it mount? If yes, you're ready for the next step.

I re-created my go_projects folder with sudo and chowned the directory to my user.

Now the experimenting with git pays off (and this is why the post is also tagged with "programming").

git init
git remote add origin git@<my git server repo>:/<gitpath>
git pull origin master

This only pulled down a ./src folder. I added the other two workspace folders:

mkdir bin pkg

Yay! Back in business!

Source control is not only great for keeping your source code protected as you make changes and work in groups, but it also allows you to have a kind of distributed backup!

Monday, March 10, 2014

GoLang: Don't Put "Test" In Your Filename!

Still discovering weird things (or at least weird to me as a beginner) while trying to become familiar with Go.

While playing with my simple little speed-test program (I know I have a number of entries about this; I'm spreading out my scheduled postings, so it probably seems like I'm more fixated on this than I really am...) I threw the source text into a file called count_speed_test.go on another machine, just to check the time it took to run on that particular configuration.

It threw a rather wonky error. I couldn't make heads or tails of it.

I studied the text, which wasn't all that complicated, and couldn't find a typo. Nada. Zip. I created a second file named countspeed.go, inserted the same text, and it compiled.

Huh?

I used md5sum to compare checksums. They matched. The files were identical. The Go compiler didn't like the filename!

I sent a message to a programmer coworker dabbling in Go asking what happened; I included this text:
******
me@mymachine /go_projects/src/countspeed $ md5sum *
9a59ff21a94861590e1ecc94676db0b5  countspeed.go
9a59ff21a94861590e1ecc94676db0b5  count_speed_test.go
me@mymachine /go_projects/src/countspeed $ go build countspeed.go
me@mymachine /go_projects/src/countspeed $ ls
countspeed  countspeed.go  count_speed_test.go
me@mymachine /go_projects/src/countspeed $ go build count_speed_test.go
# command-line-arguments
runtime.main: illegal combination BL C_NONE C_NONE C_ADDR, 1 3
(2069) BL ,main.init+0(SB)
runtime.main: illegal combination BL C_NONE C_NONE C_ADDR, 1 3
(2076) BL ,main.main+0(SB)
runtime.main: illegal combination BL C_NONE C_NONE C_ADDR, 1 3
(2069) BL ,main.init+0(SB)
runtime.main: illegal combination BL C_NONE C_NONE C_ADDR, 1 3
(2076) BL ,main.main+0(SB)
main.main(0): not defined
main.init(0): not defined

me@mymachine /go_projects/src/countspeed $
************

I kind of wracked my brain trying to recall if I'd run into this on the tutorials or teaching guides or online documentation; pretty sure I hadn't. My coworker shot an email back:

"Yes, the _test file name is probably the issue. That convention is for actual tests (which you would invoke with go test)."

My suspicion was correct (yay, I could figure that much out!), and apparently the keywords I needed were "go test filename" to find the documentation. But don't count on it being apparent...the _test filename convention is only briefly touched upon, mentioned in passing, and most of the Go documentation focuses on the testing package and framework within the file rather than the filename itself.

In other words, don't name a Go source file with "_test" unless it's actually for testing your stuff. Go doesn't like it, and it's not really obvious that this is a convention. When I saw my coworker in actual real life later on, I thanked him for his reply, and he added that the "_test" filename thing is one of those things that is documented, but you really kind of have to know what you're looking for to find it.
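
For contrast, here's roughly what Go expects to find in a file with that suffix...a minimal sketch (the file and function names here are made up):

// countspeed_test.go: files ending in _test.go are only compiled by `go test`;
// `go build` skips them, which is why building one directly goes sideways.
package main

import "testing"

// Test functions must be named TestXxx and take *testing.T.
func TestCountSpeed(t *testing.T) {
	if 1+1 != 2 {
		t.Fatal("something is deeply wrong")
	}
}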

Oh, the joys of a young language and the growing body of documentation not quite "there" yet...

(Although the documentation they have is remarkably good for such a young language, overall.)

Friday, March 7, 2014

Interesting Speed Observation Using a Simple Ruby Script

One day I was a little bored and thinking about a simple speed test, which I quickly threw together and previously posted about. The thing is that I saw some inconsistencies in speed depending on how I was connected to the machine.

Here's the Ruby script I threw together:

for n in 1...1000000 do
        puts "#{n}"
end

Other pertinent information: 
Ruby version: ruby 2.0.0p247 (2013-06-27 revision 41674) [universal.x86_64-darwin13]
OS X version 10.9.2
2.4 GHz Core 2 Duo
4 GB RAM

Before leaving the office, I connected to the test machine through SSH. This is a straight SSH connection from a terminal, no VPN or anything like that. I ran the script with a simple

time ruby ./rubytest.rb

Results when SSH'd in:
Run 1:
real   0m54.064s
user   0m5.020s
sys    0m3.501s

Run 2:
real   0m53.146s
user   0m5.134s
sys    0m3.451s

Run 3:
real   0m54.862s
user   0m4.986s
sys    0m3.356s

Once I got home I ran the script on the terminal locally.
Run 1:
real   0m16.106s
user   0m5.324s
sys    0m3.840s

Run 2:
real   0m15.469s
user   0m5.307s
sys    0m3.771s

Run 3:
real   0m15.293s
user   0m5.344s
sys    0m3.750s

Was this something introduced by using SSH? I used SSH to connect to localhost and tried again...
Run 1:
real   0m10.477s
user   0m5.016s
sys    0m3.327s

Run 2:
real   0m10.894s
user   0m5.092s
sys    0m3.382s

Run 3:
real   0m11.278s
user   0m5.112s
sys    0m3.403s

I don't know the reason for this exactly. The amazing part was that connecting to localhost over SSH actually sped the script up. All I can figure is that Ruby is constrained by how fast output can actually be dumped into the console (maybe puts waits for some kind of acknowledgement that the output is really out before moving on to the next calculation?). Over the Internet, the lagged connection throttles that; from localhost, SSH gives an advantage despite its processing overhead, whether by compressing the console output or just by draining it faster than the terminal can render it.

For me it's speculation. I'll have to see if I can find answers somewhere.
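
One experiment I haven't run yet that would test the console-output theory: take the terminal out of the loop entirely and see whether the timing gap disappears.

time ruby ./rubytest.rb > /dev/null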

NINJA EDIT: I posted the question to unix.stackexchange.com and got some excellent answers!

Monday, March 3, 2014

System State Preservation (Deep Freeze, Fortres 101 and Clean Slate)

When I talk of system state preservation, I mean you want a system that is treated kind of like a turn-key kiosk. A locked down workstation, if you will.

I worked in a school environment for several years, and if there's one thing students love to do, it's destroy things. I've seen some outrageously entitled behavior, such as the destruction of phones, because certain teens have Mommies and Daddies who will buy them another one after they carelessly spider-web the touch screen. But some still manage to take it a level higher when they're using property that isn't theirs, such as school computers. I mean...yikes.

Sometimes it's not even malicious; it's just stupid. I won't get into the configuration and management issues involved in the politics of how much to allow installed on systems, or user privilege levels that enable installation of software. Suffice it to say that when you manage an environment with hundreds of users and only a few staff to oversee them, compromises get made.

That led to my experience with Deep Freeze from Faronics. Kind of neat, really. It seems to work by taking a snapshot of the system state when you "freeze" the workstation; every disk write after that is kept as a diff from the snapshot. Reboot, and the system rolls back to the snapshot.

Pro? It felt great to delete most of the Windows folder and still have the machine boot. Very cathartic. Oh, and it let students destroy the data on machines without the effect being permanent. We also had a control console that showed the status of workstations on the school network, and from it we could remotely freeze and thaw systems, reboot them, or push out Deep Freeze updates. That sort of thing. Still...good to destroy Windows on a bad day.

We had a situation once where malware was spreading on several systems. From the console we told all the machines to reboot at the same time...the malware was gone. Changes students made that they didn't save to their network home drives or a USB drive? Well, that went away too. Listen to the messages we send out or you lose stuff. Whoops.

Cons? Anything automated, like Windows Update, would get very, very confused. Push out an update, poll the workstation and it's updated; next day...it's not updated anymore. Same goes for antivirus. Oh, sure, there was a function built into Deep Freeze to define a "maintenance period," where the machines would reboot and stay "thawed" for several hours, meaning changes were allowed and permanent until the machine was "frozen" again. The problem is that this wasn't always reliable. Sometimes Windows Update didn't finish in the allotted time. You know what happens when you (automatically) reboot while it's still updating or something is stuck? Yeah, whoops.

Or Deep Freeze would be in a mode where it stayed thawed for X boot cycles. You know that thing Windows does where it says, "Oh, stuff is in use. Let's reboot, and at startup we'll update those files!", sets a reboot flag, and restarts? If the bad-timing fairy showed up, that reboot was the one where the system froze again. So it reboots, Deep Freeze freezes, Windows does the update, then reboots automatically back to the frozen-in-need-of-updates-at-startup image, and Windows proceeds to update and reboot again, ad infinitum. Breaking the stuck cycle required jumping through some fun hoops, because Deep Freeze was specifically made to prevent kids in schools from screwing with the image.

And of course there were glitches. We occasionally had systems that would report themselves as thawed while the management console said they were still frozen, and vice versa. It would usually take a reboot cycle or two via the console to convince the systems to revert to the desired state.

Overall, Deep Freeze worked well for creating a computer that students couldn't easily break. We could open directories and registry hives to full access for users, so they didn't get errors when trying to use the machines. Keeping systems patched to the latest level wasn't quite as pressing, since any malware that found its way onto a workstation would disappear at reboot. Relying on this mode of protection, of course, meant the systems were extremely vulnerable while thawed for maintenance, since we had given up trying to get antivirus to behave properly alongside Deep Freeze.

If you have an environment where you're trying to keep people from actively, or accidentally, destroying your system configurations, Deep Freeze works. It shines in a "lab" environment where you want sets of systems with a homogeneous configuration that stays intact when you're too grossly understaffed to properly monitor them.

Fast forward to today. I have a request to configure a small number of systems for use as a sort of communication appliance. We aren't trying to lock out configuration changes for the sake of locking people out, or out of fear that people will intentionally reconfigure the systems. Instead, the aim is to lock the systems into a simplified configuration, so they can be used with minimal training and without worry that someone will make permanent changes that confuse other users or break functionality. Additionally, we want to make it hard for people to accidentally misuse the systems in a way that gives someone access to documents with sensitive data.

Deep Freeze could work, but it would still require a lot of extra configuration to lock the systems down into a simplified interface. I was looking for something that locked down the system configuration but could also manipulate the interface to restrict the options people had when using the computers. I needed turnkey "appliances."

After a little digging around, I discovered there seem to be relatively few players with traction in this field. There are several small applications similar to Deep Freeze, according to the Wikipedia page, but none that seem to be really popular.

So I decided to test a combination of two products, Fortres 101 and Clean Slate, both made by the same company. Presumably this means they work in a complementary fashion. (Indeed, when I got the install CDs, each appears to carry several of their products, differentiated only by the licensing key; reminiscent of a Windows install disc that carries every version from Home to Ultimate, where the version you actually activate depends on the key entered in setup.)

Fortres 101 is aimed at securing the computer; I can configure it to hide the Start menu, prevent the system tray icons from working, prevent access to particular folders, and/or allow only particular applications to run. It also has a kiosk mode, where the computer will boot, log in, and run a particular application as the interface. There's a long list of options that can be allowed and disallowed, along with settings for whether the security applies just to users or to users and administrators alike (I wonder how hard it is to accidentally lock out all access to the computer?)

Clean Slate is more directly analogous to Deep Freeze; when active, changes to the computer are reverted to a pre-enabled state.

I alluded earlier that the installers for all of Fortresgrand's software are included on the application install CD; my impression is that their products are modular enough to run standalone, but intended to be used together as you purchase licenses. With Fortres and Clean Slate both activated, launching the administrator application...the interface for modifying the protection programs' configuration...shows both in the same application, integrated into a tree structure on the left side. By reading through the various options, I eventually reached a state where the computers acted close to the envisioned appliances in functionality.

But there were problems.

Kiosk mode was supposed to let the system boot up, log in as a user, and present a program as the interface. I tried getting it to launch a web page of menus for launching other programs, but that didn't work; I deactivated kiosk mode, yet it still logs in as the regular local user at startup. I like that it does this automatically, but why is it doing it when the kiosk mode tree isn't enabled? Is it a feature that, behind the scenes, is treated as just another entry in a list of available options, while the UI presents it as part of the kiosk branch of the option tree like a "related function"?

Or is something broken?

These machines, when worked on for maintenance, are accessed remotely. The configuration process was therefore carefully done through the administrative drive share and RDP. Even though I had it set not to enforce rules for administrators...and I'm a domain administrator on the network, and when I had restrictions in place for users, the interface was clearly different between my login and the local user login...I couldn't copy files to the machine's drives when accessing the system as myself. From the description, I should have been able to; I'm an administrator, after all. And it must see me differently, since the user-interface restrictions didn't apply when I logged in; I could clearly see they weren't being enforced.

So why couldn't I copy files?

Then there were obvious bug-related issues. A big one: the Start menu, which for a user was supposed to offer just the options to log off or reboot rather than an actual menu, started producing a message that Explorer had crashed. It relaunched Explorer, and the menu then appeared for users. After giving that error over several restarts, it stopped...but now users get the full Start menu when they click it, and I haven't figured out how to get back to the previous setting yet.

I don't know what mechanisms are used to manipulate the OS. I infer certain details from observed behavior, but I think these companies keep quiet to protect their proprietary implementations. For example, I remember using GPO many moons ago to prevent access to certain drive letters. Then I ran a file manager...perhaps it was File Manager from a previous version of Windows, my memory is foggy...and the locked-out drives appeared. Apparently GPO was just flipping settings in the registry that Windows Explorer consulted for what it was supposed to do; GPO wasn't affecting the operating system itself.

Deep Freeze works at the drive level; you can create a "thawspace" that saves data across boots, or specify which drives to freeze if you want to leave one persistent. Fortres and Clean Slate seem to let you specify drives you can save to, for selective persistence. If that's the case, it would seem to follow that Deep Freeze monitors drives and discards a diff-like image of your session's changes, while Clean Slate does something more along the lines of monitoring file and directory access. Someone made a comment to this effect on a blog, but I don't know how they gleaned the information, so I take it with a grain of salt. I do wonder if it's inaccurate, though, given that the website makes a claim that sounds like Windows Updates work normally while F101/CS is active.

From http://www.fortresgrand.com/products/cls/cls.htm: everything in my interactions with Deep Freeze indicates snapshot-like behavior, while the information on the Fortresgrand site seems consistent with some kind of logging of file changes...journaling...behavior that rolls back at restart or logoff.

So does Fortresgrand work using a combination of registry changes and policy enforcement from its own driver(s)? Or is it entirely self-contained? If I had enough time, I might be tempted to dig deeper into the inner workings. Part of me is a little worried that this journaling model isn't as secure as the snapshot-diff method. What decides that a file is part of a critical update and allowed to persist? And what if something is incompletely or improperly labeled, allowing a partial alteration that leads to system compromise or corruption?

That leads me to the big head-scratcher with our test prototype appliance systems. I configure the systems, then hand them off to a project manager to test, as he has the ultimate vision for how these should function. He hands the prototype back with some notes on tweaks, which I implement, and we repeat until he gives the final okay and I move to the next milestone: creating the system image to clone out.

For the past several weeks, I've been nailed by Windows updates that wouldn't apply. It seemed quite random, but there were two or three (the number would change depending on the reboot, but it was usually an MS anti-malware update tacked onto the two that kept re-listing and failing). I have no direct proof that F101/CS was the direct cause, but something was definitely corrupted.

I disabled F101. I triple-checked that CS was disabled. I tried installing from safe mode. I even uninstalled F101/CS, after carefully exporting the settings to files, since there was no way I could reproduce them by hand (fun fact: a file-identification utility thinks F101/CS uses SQLite 3.x on the back end), and re-ran updates. Still they failed.

There is a "system imaging" mode that was discussed in a blog entry by someone having trouble installing a service pack; apparently the system imaging label is actually a way to tell F101/CS to disable the driver at reboot. It didn't fix his problem, and it didn't fix my problem either, unfortunately.

I eventually got on the right track to a solution thanks to someone who wrote up a similar problem on their blog: corrupt and missing files in the update manifests. I've definitely never run into this problem before, and it's certainly possible it was coincidental with the other file-access oddities I hit while updating and configuring the system with F101/CS running. But could F101/CS have corrupted something in the update process? It's unfair to blame it without direct evidence, but I think it's fair to suspect it had a hand in it.

I got a little more suspicious of its behavior when I reinstalled F101/CS. Both asked for my license, and I was afraid the reinstallation would eat a total of 2 of my X licenses despite being a reinstall on the same hardware (I hated the prospect of having to call the company and explain we needed the licenses back...)

But nope. It knew the installs were reinstalls. It informed me I had X-1 licenses left.

More than that, it still had all my previous custom settings. The uninstallation apparently didn't fully erase the application. I'm always a little antsy about applications that don't fully uninstall when I tell them to.

In Conclusion...

The intention of the security settings and state preservation wasn't to lock people out; it was to create a system that defaults to behavior that makes it easier to use for a certain set of applications and discourages accidentally leaving sensitive documents on "public" systems.

There are a lot of settings in F101/CS that allow that kind of customization. But I'm a little leery of what appears to be the underlying implementation, the access quirks I hit when trying to copy files were annoying, and I'm troubled by the corrupted manifests in Windows Update.

If I were using system state preservation in a lab environment, I think I'd stick with Deep Freeze. If Windows had better support for locking a system down into a pseudo-kiosk mode, I'd probably have used Deep Freeze to round out securing these systems. Instead, I'm banking on the hope that F101 and CS step on each other's toes less than, say, F101 combined with Deep Freeze would.

If you've run Deep Freeze, Fortres 101 and/or Clean Slate, I'd love to hear your experiences. Were the issues I ran into an anomaly? Am I right to suspect something a little wonky going on with the implementation? Anyone have insights on how these applications work behind the scenes? Please do share in the comments...

Next...I hope I won't run into too many problems trying to image these systems...