Week 2: "Why spend 30 minutes on a task when you can spend 30 hours automating it?" - Every Computer Scientist ever.

Monday, July 22, 2024

By:

Collins Kariuki

Hello there! In my last blog post, I shared the objectives of my internship at AIP, but I overlooked a key aspect of my role here. I am tasked with finding an effective way to automate the chapter activity report update process at AIP. To understand the current challenges, I decided to manually update the Zone 1 chapter activity Excel sheets. This hands-on approach would help me identify the best automation strategy. 

Having completed this manual update, I now realize why “proficient at Excel” isn’t on my resume. Although I can manage basic Excel formulas, the process was incredibly time-consuming, even with just 16 schools to update. This experience reminded me why I opted to learn Python instead of taking a Microsoft Office course back in Kenya. Teaching myself Python has been far more beneficial.

Despite the tedious nature of the manual work, it was essential. However, I knew I needed to automate this process to avoid repeating it for the remaining 17 zones. Thus, I combined my knowledge and expertise to develop a Python script to streamline this task. Let the automation journey begin!

I have a confession to make. I LOVE coding. I can easily spend hours hammering away at my laptop, immersed in writing code (whether it’s elegant or not is another matter). Even more, I enjoy documenting my coding process. So, I fired up Notion and began creating detailed notes with Toggle titles like there was no tomorrow. I was in the zone! Column by column, I updated the activity report, feeling ecstatic because my code worked. But it wasn’t without its challenges.

Let me share one problem I encountered. During the manual update process, I relied on the data being consistent, for example by using school names as the definitive reference for updating activity reports. However, the names are not always consistent: one Excel sheet might list “University of Massachusetts - Amherst” while another lists “University of Massachusetts Amherst”. These names look the same to us, but to a computer they are different strings. How could I solve this?

My mind immediately went to the Levenshtein distance, a concept I learned in my Algorithms class. The Levenshtein distance, also known as the edit distance, measures the difference between two strings: it is the minimum number of single-character edits (insertions, deletions, or substitutions) required to transform one string into the other. I thought, “I can code this easily.” But with limited time left in my internship, I decided to see if someone had already created a library for this. Thankfully, someone had (one of the many reasons I love Python). While I’m still finalizing everything, I’m excited because I get to code. And that’s my favorite part.
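
For the curious, here is a minimal sketch of the idea in plain Python. It is not the script or the library used for the AIP reports, just an illustration of how edit distance can pair up school names that differ slightly. The function names `levenshtein` and `closest_name` and the `max_distance` threshold are made up for this example.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn string a into string b."""
    # previous[j] holds the distance between the first i-1 characters
    # of a and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # turning a[:i] into "" takes i deletions
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


def closest_name(name, candidates, max_distance=3):
    """Return the candidate closest to name (ignoring case), or None
    if nothing is within max_distance edits. The threshold of 3 is an
    arbitrary illustrative choice, not a recommended value."""
    if not candidates:
        return None
    best = min(candidates, key=lambda c: levenshtein(name.lower(), c.lower()))
    if levenshtein(name.lower(), best.lower()) <= max_distance:
        return best
    return None


if __name__ == "__main__":
    sheet_a = "University of Massachusetts - Amherst"
    sheet_b_names = ["University of Massachusetts Amherst", "Boston University"]
    print(levenshtein(sheet_a, sheet_b_names[0]))
    print(closest_name(sheet_a, sheet_b_names))
```

The two spellings of the UMass Amherst name are only two edits apart (the hyphen and one space), so a small threshold is enough to treat them as the same school while still rejecting names that are genuinely different.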

In addition to my work activities, I had some fun experiences this week. I toured NIST and saw the impressive million-pound-force machine (see attached picture). Visiting NIST was bittersweet, but I enjoyed exploring the beautiful campus and imagining a future where I might work there. Special thanks to Charlotte and Jenna for organizing the tour!

I also visited Ford’s Theatre (where President Lincoln was assassinated) and the accompanying museum with Kai and Sonja. It was fascinating to soak in the history, especially being able to see and touch artifacts over a century old, including entering the room where Lincoln succumbed to his injuries.

I played Spikeball with some fellow interns near the Lincoln Memorial – such a quintessential DC experience! They got to see my competitive side, which was a lot of fun. Watching the sunset by the Potomac River was a beautiful and relaxing end to the week.

Overall, it was a wonderful week filled with memorable experiences. As my time in DC winds down, I’m going to miss the city and the amazing people I’ve met.

Collins Next to a Bust of Abraham Lincoln
NIST's Million-Pound-Force Machine
Sunset by the Potomac River