I started a new game with the kids recently: collecting photos of historical markers. It’s fun for everyone, rather educational, and has opened my eyes to a few other things as well. Day One is a perfect match. The project is kind of a journal and relies heavily on photos and geotagging, both of which Day One has awesome implementations of. I apply the tag ‘historicalmarker’ to each entry so I can find them later if I need to.
For whatever reason the kids really love this. I have them watching the roads whenever we drive anywhere. Whenever they see one of these markers they yell out “blue sign!” to let me know I need to pull over. They actually notice them quite far off in the distance so it gives me plenty of time to slow down.
We’ll read the sign together and then I run out and take a photo of it in Day One on my phone. A lot of times this sparks some sort of question or discussion for a bit too, which is a great side effect.
Later on I try to go back and type in the text from the sign so it’s more easily searchable. At some point I may try to put an OCR step in there to automate this. It shouldn’t be too difficult, as I currently have Day One syncing via Dropbox, which gives me direct access to all the markdown files and photographs Day One uses.
The whole process only takes about 90 seconds, so I told myself I should rarely have an issue taking the time to pull over. I was caught out on this point this past weekend. I was taking the family out for a movie when my son yelled “blue sign!” as we drove down the road. I was afraid we wouldn’t make it in time for the movie, so I said we couldn’t stop right then.
Really? 90 seconds … I couldn’t take 90 seconds to snap a quick picture? I suddenly became aware of how many of our time constraints are of our own making. There was plenty of time, and I should have taken it. We did drive back that way so we could stop. It turned out to be a very timely sign, too, given the debate going on around our local school budgets right now. The sign was for George Wolf, who was involved in the Free School Act of 1834, the foundation of the public school system. You can see the sign below.
It’s a wonderful way to show our children how the past continues to be relevant and make history seem more alive.
As I developed this module I became more and more excited about what the tools I kept finding could do if I integrated them. In discussing ideas with colleagues, one central theme kept coming up, said best by Peter Parker:
With great power comes great responsibility.
My goal in creating this module was to make some of the programming process a bit more transparent as well as quantifiable. After a few missteps creating sample reports to demo to people, I quickly realized how easy it would be to 1) use this as a litmus test and 2) deduce inaccurate meaning, potentially leading to 3) unfortunate decisions and consequences.
To be more specific: I became aware of how this should not be used to determine someone’s employability or value. That is not to say it can’t help inform a decision, just that this tool should not be judge and jury. Its intended purpose, and most complementary role, is to help teams gain a higher-level picture and historical understanding of their own or inherited projects. This understanding will undoubtedly raise questions and draw attention to certain aspects of a project, prompting further investigation. I find this to be both appropriate and healthy.
Let’s take a quick example. Suppose you are comparing two developers on a project to see how they are performing. You find that both are checking in code regularly, commenting appropriately, and even producing roughly the same amount of code. However, you see a major difference in code complexity. One developer’s code maintains fairly consistent complexity (easy to maintain) while the other’s rises steadily in complexity. Your initial concern could be that the code is not being structured properly or will be far too difficult to change at a later date.
But what if there is something else going on? What if you forgot to consider the breakdown in roles between the two developers: one was responsible for writing simple helper functions while the other was implementing some pretty hairy business logic handed to them via complex requirements? What may actually be happening here is not an indictment of either developer’s skill but rather the manifestation of a requirements gathering phase that didn’t expose some contradictory functionality. The report itself is still proving to be very valuable, but as with anything involving interpretation it is essential to tread cautiously.
Another topic has come up in discussion as well. One of the great ideas shared with me was to use this tool as a gatekeeper of sorts, validating thresholds before allowing pushes to production. This can offer a great deal of stability to a project. It can also have two opposite side effects: a false sense of security, and a roadblock preventing essential code from being deployed. In either case, nothing can replace a solid code review process. This tool definitely has a role, but it should be in helping a code reviewer look into various aspects of a project and glean insight, not replacing the code reviewer altogether.
Let’s walk through building a submodule for Project QA, shall we? There are three main steps to implementing your own extension to projectqa:
Create a fresh module
Create entities to store your data
Implement the processing hook
Creating a new module is outside the scope of this post. If you are new to module development or need a refresher, head over to drupal.org for all the resources.
I am curious what modules people are working on, though. If you start one I’d love to hear about it. It may be valuable to a wide enough audience to consider including it in the main module.
Create entities to store your data
This part of projectqa is intentionally left wide open. You are expected to develop the entities to store your data, which also means you have complete control over the structure of, and access to, that data. I highly recommend leveraging the Entity API module. It will make your life easier, as well as the lives of anyone building off of your work. Additionally, it’s already a dependency for projectqa, so you’ll have access to it on any system you’re building your projectqa submodule for.
Consider how you want to leverage your data, but also try to keep it as normalized as possible. For PHPLOC I decided to store the data in two tables: one for the extracted data and one for the calculated delta data. Putting it all in one table would have made for a far larger table, and far more information than needed when I’m only interested in the deltas. There’s a bit more on this in the next step.
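As a rough sketch of what that two-table approach can look like in a Drupal 7 `hook_schema()` implementation — the module, table, and field names here are illustrative placeholders, not the actual projectqa_phploc schema:

```php
/**
 * Implements hook_schema().
 *
 * Illustrative only: the real projectqa_phploc tables and fields differ.
 */
function mymodule_schema() {
  $schema['mymodule_metrics'] = array(
    'description' => 'Raw metrics extracted at each git commit.',
    'fields' => array(
      'id' => array('type' => 'serial', 'not null' => TRUE),
      'commit_id' => array(
        'description' => 'The projectqa git commit record this row belongs to.',
        'type' => 'int',
        'not null' => TRUE,
      ),
      'loc' => array('type' => 'int', 'not null' => TRUE, 'default' => 0),
      'ccn' => array('type' => 'int', 'not null' => TRUE, 'default' => 0),
    ),
    'primary key' => array('id'),
  );
  // Deltas get their own table so reports that only care about
  // change-over-time stay small and fast.
  $schema['mymodule_metrics_delta'] = $schema['mymodule_metrics'];
  $schema['mymodule_metrics_delta']['description'] = 'Per-commit deltas of the raw metrics.';
  return $schema;
}
```

Declaring these tables as entities through the Entity API module (via `hook_entity_info()`) is what makes the data easy to build on and expose to Views.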
This hook gets fired by the main projectqa module for each git commit in the history of a repo. You don’t need to worry about walking the repo history; that is taken care of for you. You also don’t need to execute any git commands, as the repo is already checked out to the proper location for you.
$repo_path: The system path where the repo is located.
$git_commit: The git commit to process.
The most important piece of information is the repo path that gets passed to you. This tells you where on the filesystem to look in order to start processing the code. Be sure not to alter any of the files, as that would pollute the code to be processed, both for your module and for any other module accessing that git commit after your code executes.
The git commit hash that is passed to you is more of an optional piece of information. You don’t need to do anything with it unless it is valuable to whatever data processing algorithm you are utilizing. The git commit hash is already being saved to the projectqa_gitcommit table for you.
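To make the shape of this concrete, here is a minimal sketch of a hook implementation. The hook name and both helper functions are my own assumptions for illustration — check projectqa’s documentation for the actual hook name and signature:

```php
/**
 * Sketch of a commit-processing hook implementation.
 *
 * NOTE: the hook name below is assumed for illustration; consult the
 * projectqa module for the real hook name and signature.
 *
 * @param string $repo_path
 *   The system path where the repo is checked out.
 * @param object $git_commit
 *   The git commit currently being processed.
 */
function mymodule_projectqa_process_commit($repo_path, $git_commit) {
  // Run your analysis tool against the checked-out tree. Treat the
  // checkout as read-only: no file changes, no git commands.
  $metrics = mymodule_analyze_directory($repo_path);

  // Save the raw numbers, keyed to the commit record that projectqa
  // has already stored for you.
  mymodule_save_metrics($git_commit, $metrics);
}
```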
In the case of the projectqa_phploc submodule, I was interested in calculating the deltas between commits and storing them so reporting could be easier and more performant. To accomplish this, I made this call from within my hook implementation:
This way I can access the correct records in my own table for the previous commit (remember: do not alter the filesystem or execute git commands), generate the diff against my current commit, and save it to the database along with my current values. For better organization and scalability I keep all delta values in a separate table (entity).
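The delta step itself is just per-metric subtraction. As an illustrative helper (my own naming, not part of the projectqa API):

```php
/**
 * Compute per-metric deltas between two commits' metric arrays.
 *
 * Illustrative helper, not part of the projectqa API.
 */
function mymodule_compute_delta(array $previous, array $current) {
  $delta = array();
  foreach ($current as $metric => $value) {
    // Metrics absent from the previous commit count as starting at zero.
    $prev = isset($previous[$metric]) ? $previous[$metric] : 0;
    $delta[$metric] = $value - $prev;
  }
  return $delta;
}

// e.g. previous array('loc' => 100) and current array('loc' => 120)
// yields array('loc' => 20).
```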
If you end up developing a submodule for projectqa be sure to stay in contact and keep an eye out for updates. I already have plans for a few more hooks to implement that may help a number of developers. I will also be implementing some functionality to help validate, reprocess, and catch up data if needed.
Previously I set up a Rube Goldberg method of publishing my Jekyll-based blog via Siri. Here’s a more streamlined update.
Every step in the process is a potential breaking point. Fortunately, the two biggest ones that were breaking on me are also fairly simple to replace. IFTTT works great but wasn’t always real-time, and Dropbox had a major outage this week that meant it couldn’t be used at all. I replaced both of them with a far more direct AppleScript:
The basic concept is the same, only now instead of creating an IFTTT rule to generate a text file on Dropbox that gets synced to my Mac mini where Hazel notices it and fires off an AppleScript …
… I have an AppleScript that checks Reminders.app directly, run by a launchd job every 15 seconds, similar to the process I set up in my post Create Reminder Tasks from Email. Now I’m keying directly off of Reminders.app and have the same flexibility to add new automation tasks at any point.
Ever wish there was a better way to keep an eye on a project you’re working on? After being inspired by this presentation at Drupalcon Portland I decided to create a Drupal module to help automate the code evaluation process: Project QA.
Don’t want to read? Skip ahead, there’s a video introduction and demo!
At its heart, Project QA is a Drupal module that scans git repos and offers hooks for processing them. By itself, Project QA does not evaluate your code at all.
When you tell Project QA to scan a repo, it checks the repo out locally (or updates it if it already has a copy from a previous run), and then begins stepping backwards through the git history one commit at a time. A record of the commit is saved, including the commit hash, the timestamp of the commit, and the author of the commit. After each commit is checked out a hook is fired to allow submodules to act on the repo in its current state. After all commits have been processed, Project QA evaluates the existing tags in the repo and matches them up with the imported commits.
It is pretty generic except for one major condition: every action or piece of data is always linked back to a specific git commit. The code being evaluated doesn’t need to be Drupal code, or even PHP code; it just needs to exist in a git repo you can access.
The real fun begins when you consider submodules. With this initial release I’ve included one submodule based on the PHPLOC tool. PHPLOC “is a tool for quickly measuring the size and analyzing the structure of a PHP project.” One of its great strengths is measuring cyclomatic complexity alongside a host of stats about the number of files, methods, lines of code, comments, etc.
The PHPLOC submodule uses the Project QA hook to run on every commit in a repository’s history. All data is saved into custom entities, along with delta information between git commits for faster and easier reporting. All of this information is exposed to Views, so writing custom reports is easy.
Some report ideas include:
Who is introducing the most complexity (for good or bad)?
Is code being checked in regularly during a project?
How does a particular module stack up against others?
Are your developers commenting their code enough?
Were there any major changes or spikes in the history of the code, indicating things such as a bad merge, extra modules or files, or maybe an onslaught of sloppy or duplicate code?
a.k.a. How I skim, read, and archive information on the web.
The more information I come across, the faster and more efficiently I need to be able to separate the wheat from the chaff. Additionally, I need to make each moment worth more: each moment I spend working or scouring the internet, as well as each moment I spend with my wife or kids. And whether or not we choose to publicly admit it, we all know deep down that we’re not efficient multi-taskers and that context-switching is very expensive.
So … here is how I’ve tried to divide up my time and make my information gathering more effective, efficient, and out of the way.
It all starts with the input. The two main input sources for me are RSS (in your face, Google Reader!) and Twitter. There’s no reason you can’t have other inputs, these are just the ones I use most. The reasons for software choices will become apparent as I walk through how I process this information.
For Twitter I use Tweetbot. There are two very straightforward reasons for this:
1. It’s available on all the platforms I use (OSX, iPhone, iPad) and reliably syncs my read location across all of them.
2. It integrates with a variety of read-later services, which becomes essential in subsequent steps.
RSS isn’t quite as straightforward but works just as well. My aggregation point is (currently) Feedly, though I do have a license for Fever that I’m strongly considering switching to. I use two main apps to read RSS, which doesn’t cause a sync problem since any hosted RSS solution has a concept of “read” articles that handles the syncing for you.
On my laptop I use ReadKit, though to be honest I check my RSS news streams on my laptop far less often than on mobile devices. ReadKit connects to a number of RSS hosted aggregators and a number of read-later services. Combination options galore!
For the iOS devices I use Reeder 2 which follows the same trend, a lot of various connection options.
At this point in the process what I have is:
Two main article/news input sources, RSS and Twitter
Apps to read those sources across all my platforms
The ability to sync what I’ve skimmed
The ability to send what I want to really read to another (centralized) service. In my case, I’ve chosen Readability.
When I’m reading articles in Readability and want to add them to my personal archive, I simply “favorite” them.
My basic process is to set time aside to skim through all the inputs. I treat those times as little bonuses throughout my day to take a break and see what’s going on in the rest of the world. It really is a “skim” – I usually don’t spend much time on this at all. If anything looks like something I want to focus on and dive into, I’ll send it to Readability for reading when I have more time and focus.
All of this is rather straightforward, until you get to my archival process. There are articles I find that I may want to reference later. I’ve tried just using Google to bring them back up. What I’ve found is that between the transient nature of websites and my own lack of memory (let’s face it, if my memory were perfect I wouldn’t need to refer back to one out of a million articles I previously read), my Google-fu rarely finds what my conscious mind is vaguely recalling from a distant digital past. My own archive of favorite articles is the only way to reliably retrieve them and limit the search pool enough to give me a chance at needle-in-a-haystack success. My format of choice for archiving things like this is PDF.
This is where my choice of Readability as a read-later service becomes important. Read-later services usually provide the wonderful service of stripping out all the site-specific graphics and advertisements, leaving just the valuable content behind for you to focus on. This feature is perfect for my archiving as well, since I’m interested in archiving the information, not the styles and trends (or oftentimes lack thereof) of web design. Readability was the only service that could reliably (and scriptably) Print-to-PDF this stripped-down view of the article.
To accomplish this archiving I use Fake. I don’t use Fake for a lot of things, but when I do it’s invaluable. Most times it is in lieu of APIs that don’t exist or aren’t available to me. You can think of Fake as Automator for Safari; non-Mac people may respond better to a comparison with Selenium. If you’re not familiar with Fake, be sure to check out the video on their homepage; it gives a great explanation.
So here’s the Fake part of my workflow:
In English, it does the following:
Goes to the Readability website
Logs in to the site with my credentials
Clicks on the “Favorites” link
Sets up a loop – for each article in my favorites list it does the following:
Clicks on the favorited article to load it
Grabs the article title
Uses keyboard shortcuts to tell OSX to Print-to-PDF (more on this next)
In the file save dialog, it enters the article title (grabbed previously) and hits return to save the file
Clicks on the link to un-favorite the article so it’s not processed again on a subsequent run
Clicks on the “Favorites” link to go back to the main favorites listing
Now that all the favorited articles have been processed, it logs out
Inside the loop above, in step 3, I mention keyboard shortcuts in OSX. By default most apps recognize command-p as the shortcut to print. You can see in the screenshot above of my Fake workflow that the embedded AppleScript actually calls this command-p shortcut twice. Why is this? MacSparky has a great video over on his site that outlines this trick. Basically, I’ve set up a separate keyboard shortcut, also mapped to command-p, that chooses the Save as PDF option once the print dialog comes up. This way my scripting can depend on the more reliable keyboard shortcuts instead of trying to code in ways to simulate mouse clicks on buttons.
Obligatory video demo:
At this point I’m running the Fake workflow manually as a vetting process to ensure it runs reliably. At some point it will be relegated to my Mac mini to run on a schedule (nightly?) and dump the PDFs into a watched folder for Hazel to process and file away for me.
For a while I’ve wanted to be able to add emails to whatever task manager I’m using, from any device I’m using. Currently I’m keeping things simple by using the default Reminders app for OSX and iOS, but this solution could work for any task management app that has an OSX counterpart and supports AppleScript.
I didn’t come up with this script entirely on my own, but I did customize it, alter the overall workflow, and work through some Gmail + Mavericks issues (now I’m starting to understand why Gmail’s IMAP bastardization is a real pain).
That will check all enabled accounts. If you want to only process one account you’ll have to modify the script a bit. You can adjust what goes in the notes field as well if you want (msgBody). I opted to copy in the text of the entire email in case I needed to reference it while I was on the go. The link to the original email is at the bottom, but that only works on OSX. I also tried to keep the From text short (and stripped out the email address) so that the display in Reminders is more readable.
To get the AppleScript running periodically to do this task, you’ll need to set it up to run via launchd or cron. I’m more familiar with cron, but it seems Apple is leaning towards launchd and adding additional functionality to it. I figured this would be a good, simple task to learn launchd on, so I’ll have it in my toolbelt later on.
What you need to do is create a new .plist file in ~/Library/LaunchAgents. Mine is named com.nateofnine.FlaggedEmailToReminders.plist and looks like this:
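Since the exact file is specific to your setup, here is a minimal sketch of what such a launchd plist generally looks like; the script path and 300-second interval are placeholders to adjust for your own machine:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN"
  "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
  <!-- Unique identifier for this job -->
  <key>Label</key>
  <string>com.nateofnine.FlaggedEmailToReminders</string>
  <!-- Run the AppleScript via osascript; path is a placeholder -->
  <key>ProgramArguments</key>
  <array>
    <string>/usr/bin/osascript</string>
    <string>/Users/you/Scripts/FlaggedEmailToReminders.scpt</string>
  </array>
  <!-- How often to run, in seconds -->
  <key>StartInterval</key>
  <integer>300</integer>
</dict>
</plist>
```

Load it once with `launchctl load ~/Library/LaunchAgents/com.nateofnine.FlaggedEmailToReminders.plist` and launchd will run the script every StartInterval seconds from then on.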
Hot on the heels of setting up my markdown-dropbox-blog workflow, I realized I needed a more reliable way to regenerate and rsync the rendered files. The workflow I had set up worked fine for new articles, but not for edits to already-published articles; it assumed I wanted to publish immediately and didn’t offer any easy way to handle failures.
Each year I find myself torn when it comes to New Year’s resolutions. I simultaneously notice the arbitrary significance of one day being culturally more important than the surrounding, identical days, while also understanding the benefit of cyclical patterns and the motivation for renewal. Ultimately I settle into a position of trying to utilize the cultural momentum without overemphasizing the occasion.
This year, fresh on the heels of growing a creepy ’stache to raise money for men’s health, I’m attracted to the concept of 30-day challenges. Matt Cutts, a prominent Google employee I admire and respect, tweeted a few times recently about considering 30-day challenges instead of traditional resolutions. Here’s a TED talk of him speaking to the idea and his experience with it:
Overall I love the concept, as well as the idea of refreshing my focus and motivation twelve times over the course of the year. I found 2012 to be a very rough and difficult year; in comparison, 2013 was incredible, and I can feel the momentum flowing into 2014 as well. Knowing I won’t be setting myself up for massive failed resolutions makes me even more excited! My plan is to announce my focus for the month and report back with a synopsis at the end of each month.
I’m going to ease into the year, though. First up is a new take on a cliché goal: I’ll be logging my food intake with LoseIt! each day. I already use it, but often I miss part or all of a day. Since I already have some momentum, I’d like to up the ante by being more consistent, making myself healthier while also feeding my borg-like desire for data. I presume the blog name makes a bit more sense now?