Zac Stewart

Using scikit-learn Pipelines and FeatureUnions

Since I posted a postmortem of my entry to Kaggle's See Click Fix competition, I've meant to keep sharing things that I learn as I improve my machine learning skills. One that I've been meaning to share is scikit-learn's pipeline module. The following is a moderately detailed explanation and a few examples of how I use pipelining when I work on competitions.

Read more

Building A Language Identifier

I recently gave a talk on language identification at Big Nerd Ranch. The gist of it was extracting text from Wikipedia and training a naive Bayes classifier to predict the language of text. You can check out the resulting classifier running on Heroku.

Read more

Kaggle See Click Fix competition postmortem

This challenge was to predict the number of votes, comments, and views that issues created on See Click Fix would get. The provided datasets included the latitude and longitude, summary and description (both text fields), a source (mobile client, API, city-initiated, etc…), a created timestamp, and a category tag. Of course, the training dataset included the three items to be predicted.

Read more

Learning C++: A brainfuck Interpreter

As I try to steer my career away from generic application building and into scientific computing, I find myself wanting to learn C++. This is an odd admission for a Rubyist, as I usually see movement in the opposite direction, with tired C programmers washing up on the sweet shores of Ruby, sighing, with relief, that it just feels so right.

Read more

Establish Your Encrypted Channels Now

The purpose of this post isn't to bemoan the expanding surveillance state, warn of impending civil liberty revocation, or even to make you feel paranoid. I only want to talk sensibly about a few tools that we should all be comfortable using and know when we should use them.

Read more

Some Things I Read in 2012

I read a total of 14 books last year. I had set my goal for 15, but finished the year two-thirds of the way into three different books. I tend to read plurally.

Read more

Verifying Minecraft User Accounts

When I need to give my brain a rest, I like to play Minecraft on an interesting server known as Civcraft. The unique thing about this server is that it is an experiment in anarchy of sorts. There are no rules except not to exploit software glitches that could give you an unfair advantage. Robbery, murder, griefing and trolling of all sorts are completely legal within the rules of the server. As a result, there have evolved complex and organic societies complete with competing cities, marketplaces and even ad hoc police forces and bounty hunters.

Read more

Defining Abilities for Collections of Records Using CanCan

I've been using CanCan for managing role-based authorization in Rstrnt, my restaurant management solution. CanCan is a very simple and easy-to-use authorization library that works out-of-the-box with Devise (and any other authentication system that provides a current_user method). However, I had a use case that doesn't seem to be documented on the project's wiki.

Read more

Meow: A Growl Work-Alike for jQuery

jQuery Meow mimics Growl notifications. It supports all jQuery events, and you can bind it to various sources for message input, making it ideal for form validation, Rails flash notices, or a replacement for the alert() box.

See a demo

Deploying Your Jekyll Blog On Dreamhost Via Capistrano

I've gotten Jekyll working on my shared Dreamhost account, and I'm not just pushing the compiled pages to my webroot. The server compiles my Sass stylesheets, builds the static HTML pages with Jekyll, and even uses Pygments to generate syntax-aware HTML, all server-side.

Read more