College Scorecard Application
Building an application to serve College Scorecard open data
I'm still working on the College Scorecard dataset. Previously I explored the dataset in a real-world application, talked about how to clean the data, and worked with the data API.
Now I’ve decided to put this in a web application so others can use the dataset in the same flexible way that I have been using it. (A reminder: several popular college-search websites exist, but they are limited in the ways you can filter the data. They also tend to gather personal data, I suppose for ad generation.)
I pulled out some previous work in React and Flask, and started setting up an application. All was going well; I was able to show a couple of views of the data successfully. Here’s the women’s colleges view:

Pretty quickly, though, I ran up against a limit on the filters I wanted to use: not every field is indexed for filtering in the public API. The obvious answer is to import the dataset into a local PostgreSQL database. The dataset is wide though, much wider than the 1600-column-per-table limit in PostgreSQL. I’m working on transforming it into something I can use. Thankfully, the dataset administrators provided a nice data dictionary YAML file that I’m using to programmatically create tables and then import the data into distinct tables of fewer than 1000 columns each.
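Here is a minimal sketch of that splitting step, not the actual import code. It assumes the data dictionary has already been parsed (for example with PyYAML's `yaml.safe_load`) and reduced to a plain mapping of column name to SQL type. The names `split_columns`, `create_table_sql`, `build_schema`, and the `scorecard_part` table prefix are hypothetical, and using `unitid` as the shared key in every table is also an assumption.

```python
from typing import Dict, List

# Assumption: every split table repeats one key column so the parts can be joined.
KEY_COLUMN = "unitid"
# Keep each table under 1000 columns, key column included.
MAX_COLUMNS = 999


def split_columns(columns: Dict[str, str],
                  max_columns: int = MAX_COLUMNS) -> List[Dict[str, str]]:
    """Split a {column_name: sql_type} mapping into chunks that fit one table each."""
    names = [name for name in columns if name != KEY_COLUMN]
    per_table = max_columns - 1  # reserve one slot for the key column
    return [
        {name: columns[name] for name in names[start:start + per_table]}
        for start in range(0, len(names), per_table)
    ]


def create_table_sql(table_name: str, chunk: Dict[str, str]) -> str:
    """Build a CREATE TABLE statement for one chunk, with the key column first."""
    cols = [f'"{KEY_COLUMN}" integer PRIMARY KEY']
    cols += [f'"{name}" {sql_type}' for name, sql_type in chunk.items()]
    return f'CREATE TABLE "{table_name}" (\n    ' + ",\n    ".join(cols) + "\n);"


def build_schema(columns: Dict[str, str],
                 prefix: str = "scorecard_part") -> List[str]:
    """Return one CREATE TABLE statement per chunk, numbered from 1."""
    return [
        create_table_sql(f"{prefix}_{index}", chunk)
        for index, chunk in enumerate(split_columns(columns), start=1)
    ]
```

Calling `build_schema` on the parsed dictionary returns a list of statements that can be run against the database one at a time. Because every table carries the same key column, the pieces can be joined back together in queries that need fields from more than one part.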
The cool part is that once I have the data imported, I can combine it with other datasets that have location data. On a parallel thread, I’m taking a course on location analytics, using ESRI Business Analyst. I’ll write about ESRI and the course later, but I was thrilled to learn about new categories of location data. A (lat, long) place for everything, and everything in its (lat, long) place.