Archive

All published posts

2440 posts latest post 2026-04-21
Publishing rhythm
Apr 2026 | 41 posts

Automating my Post Starter

One thing we all dread is mundane work of getting started, and all the hoops it takes to get going. This year I want to post more often and I am taking some steps towards making it easier for myself to just get started.

When I start a new post I need to cd into my blog directory, start neovim in a markdown file with a clever name, copy some frontmatter boilerplate, update the post date, add tags, a description, and a cover.

hot and fast

...

Windowing Python Lists

In python data science we often will reach for pandas a bit more than necessary. While pandas can save us so much there are times where there are alternatives that are much simpler. The itertoolsandmore-itertools` are full of cases of this.

This post is a walkthrough of me solving a problem with more-itertools rather than reaching for a for loop, or pandas.

I am working on a one-line-link expander for my blog. I ended up doing it, just by modifying the markdown with python. I first split the post into lines with content.split('\n'), then look to see if the line appears to be just a link. One more safety net that I wanted to add was to check if there was whitespace around the line, this could not simply be done in a list comprehension by itself. I need just a bit of knowledge of the surrounding lines, enter more-itertools.

...

1 min read

Adding Audio to my blog posts

This is episode 1 of the Waylon Walker Audio experience, posts from waylonwalker.com{.hoverlink} in audio form.

So I have had this idea for awhile to add audio to my blog posts. The idea partly comes from the aws blog, if you have ever been on their blog you will have noticed that they have a voiced by amazon polly section.

Honestly I don’t know this is all new to me and I dont have much to go off of. For now its a test that may or may not work out.

...

Expand One Line Links

I wanted a super simple way to cross-link blog posts that require as little effort as possible, yet still looks good in vanilla markdown in GitHub. I have been using a snippet that puts HTML into the markdown. While this works, it’s more manual/difficult for me does not look the best, and does not read well as

The new card should be fully automated to expand with title, description, and cover image. Bonus if I am able to attach a comment behind it.

If you can call it a card 🤣. This card was just an image wrapped in an anchor tag and a paragraph tag. I found this was the most consistent way to get an image narrower and centered in both GitHub and dev.to.

...

Find and Replace in the Terminal.

grepr # grepr() {grep -iRl "$1" | xargs sed -i "s/$1/$2/g"} ```bash grepr() {grep -iRl "$1" | xargs sed -i "s/$1/$2/g"} grepd # grepd() {grep -iRl "$1" | xargs sed -i "/^$1/d"} CocSearch # :CocSearch published: false -g *.md

Resume Tips

customize for the job Why are you a good fit? What will you bring to the role? Give real outcomes give real experience Stop tech vomiting if you link to GitHub Make a profile readme Guide me to your best work have some activity if you link to LinkedIn Provide some benefit that is not on your resume Have a logical flow of experience (dont make me hunt for past experience) Keep it under 2 pages Who you know. Reference real experience Deployed 12 data pipelines with over 500 nodes to process 200GB of data at a Fortune 100 company vs Knowledge of Data Engineering methodology with python EC2 Dont be so fluffy
1 min read
Codeit Bro Interview

Codeit Bro Interview

use this profile image

Please share your professional role as a data scientist? [Also feel free to share about your personal projects, publications, etc.]

I graduated with a Mechanical Engineering Degree 8 years ago. Much of my work early in my career was wrapped around analyzing larger datasets for my group to understand quality, drive changes to improve quality or prove that quality was already good.

...

7 min read

reasons-to-kedro

There are many reasons that you should be using kedro. If you are on a team of Data Scientists/Data Engineers processing DataFrames from many data sources should be considering a pipeline framework. Kedro is a great option that provides many benefits for teams to collaborate, develop, and deploy data pipelines

What is Kedro

Kedro makes it super easy to get started with their cli that utilizes cookiecutter under the hood.

...

Reasons to Kedro

Reasons to Kedro # collaboration Sharable catalog small nodes over monolithic notebooks catalog easily load anything without needing to run No need to write read/write code pipeline No need to keep execution order in your head easily run a slice of a pipeline plugins pip install make your own hooks flexible expandable cli Reasons Not to Kedro # Already utilizing another DAG framework Data is not in a widely supported format Micro short-lived project Large Project / Deadline Use a lower profile project to learn first Team not willing to change Need minimal dependencies God Project - kedro owns everything??
1 min read

What's New in Kedro 0.16.6

Kedro 0.16.6 is out! Let’s take a look through the release notes

This is really exciting to see more deployment options coming from the kedro team. It really shows the power of the framework. The power of some of these orchestrations options is incredible.

Most of them hinge on a sweet combination of the kedro cli, docker image, and the pipeline knowing your nodes dependencies.

...

A brain dump of stories

I started making stories as kind of a brain dump a few times per day and posting them to [LinkedIn](https://www.linkedin.com/in/waylonwalker/(https://www.linkedin.com/in/waylonwalker/). Here are the last 11 days of stories.

I store all the stories on my website with the hopes of doing something with them on my own platform eventually. For now it makes it easy to make these posts.

cd static/stories ls | xargs -I {} echo '![](https://waylonwalker.com/stories/{})'

Stories 10-10-2020 - 10-21-2020 #

1 min read

Fix git commit author

I was 20 commits into a hackoberfest PR when I suddenly realized they they all had my work email on them instead of my personal email 😱. This is the story of how I corrected my email address on 19 individual commits after already submitting for a PR.

stop the bleeding

Before anything else set the email correctly!

...

3 min read

Designing a "Router" for kedro

I released a router-like plugin for kedro back in April 2020. This was not the first design, the idea actually came from one of the QB folks who taught me kedro nearly a year before. We were assembling our pipelines with something called nodes_global. It worked fairly well but did have some issues around being set as a global variable.

But…

One thing in particular that it did not lend itself well to was being able to create a packagable pipeline that I could pip install and append into any of my existing pipelines. Something I am still trying to work out, maybe I don’t need this. I think I have it working for our internal pipelines and it seems like the way to go, but we don’t necessarily end up using it.

...

4 min read

Reclaim memory usage in Jupyter

Today I ran into an issue where we had a one-off script that just needed to work, but it was just chewing threw memory like nothing.

It started with a colleague asking me How do I clear the memory in a Jupyter notebook, these are the steps we took to debug the issue and free up some memory in their notebook.

How do I clear the memory in a Jupyter notebook?

...

3 min read

Strip Trailing Whitespace from Git projects

A common linting error thrown by various linters is for trailing whitespace. I most often use flake8. I generally have [pre-commit](https://waylonwalker.com/pre-commit-is-awesome hooks setup to strip this, but sometimes I run into situations where I jump into a project without it, and my editor lights up with errors. A simple fix is to run this one-liner.

bash

git grep -I --name-only -z -e '' | xargs -0 sed -i -e 's/[ \t]\+\(\r\?\)$/\1/'

...