Posts tagged: python

All posts with the tag "python"

268 posts latest post 2026-03-31
Publishing rhythm
Jan 2026 | 3 posts

002

** 0.3.0 just launched with _ support πŸŽ‰

1 min

Kedro Static Viz 0.3.0 is out with Hooks Support

kedro-static-viz is out with support for the newly released hooks feature. This means that you can have kedro-static-viz automatically deploy a full gatsby site before_pipeline_run keeping your visualization always up to date.

Even though it is a static site there is no functionality lost. The only thing that’s missing is the flask server. With kedro-static-viz you can deploy your visualization to a number of static hosting providers such as GitHub pages free of charge with wicked...

...

Create Custom Kedro Dataset

Kedro provides an efficient way to build out data catalogs with their yaml api. It allows you to be very declaritive about loading and saving your data. For the most part you just need to tell Kedro what connector to use and its filepath. When running Kedro takes care of all of the read/write, you just reference the catalog key.

Under the hood there is an AbstractDataSet that each connector inherits from. It sets up a lot of the behind the scenes structure for us so that we dont have to. For the most part kedro has connectors for about anything that you want to load, csv, parquet, sql, json, from about anywhere, http, s3, localfile system are just some of the examples.

Here is a DataSet implementation from their docs. Here you can see the barebones example straight from the docs. Parameters from the yaml catalog will get passed in

Interrogate is a pretty awesome, brand new, cli for Python packages

As usual while listening to python bytes 181 I heard of a tool that I had to try out right away!

This thing is πŸ”₯ hot off the press folks, we’re talking the first release only 3 weeks ago. Its something that the python community needed years ago, and it belongs in your CI today. I had tried several tools that tried to do docstring coverage in the past but they were a bit cumbersome and were quickly forgotten about. Not interrogate, its dead simple!

Nothing I have tried has come close to being this good

...

2 min read

creating the kedro-preflight hook

Kedro Hooks Intro - kedro hooks are an exciting upcoming feature of kedro 0.16.0. They allow you to hook into catalog_created,pipeline_run, and node_run(nouns). With a before, or after (adjective). This really reminds me of reacts lifecycle hooks, that let you hook into various state of react web components. This is going to make kedro so extendable by the community. I am super pumped to see what the community is able to do with this ability.

kedro hooks are an exciting upcoming feature of kedro 0.16.0. They allow you to hook into catalog_created,pipeline_run, and node_run(nouns). With a before, or after (adjective). This really reminds me of reacts lifecycle hooks, that let you hook into various state of react web components. This is going to make kedro so extendable by the community. I am super pumped to see what the community is...

...

πŸ“ Kedro Preflight Notes

This is a very rough idea for a kedro package to prevent time lost to get partway through a pipeline run only to realize that you dont have access to data or resources.

TIL: Bind arguments to dynamically generated lambdas in python

This past week I had a really weird bug in my kedro pipeline. For some reason data running through my pipeline was coming out completely made no sense, but if I manually request raw data outside of the pipeline it matched expectations.

NOTE While this story is about a kedro pipeline, it can be applied anywhere closures are put into an iterable.

After a few days of looking at it off and on, I pinpointed that it was all the way down in the raw layer. Right as data is coming off of the database. For this I already had existing sql files stored and a read_sql function to get the data so I opted to just set up the pipeline to utilize the existing code as much as possible, leaning on the

...

2 min read

python-deepwatch

Is it possible to deep watch a single python function for changes?

keeping track of a python functions hash is quite simple. There is a__hash__ method attached to every python function. Calling it will return a hash of the function. If the function changes the hash will change.

[ins] In [1]: def test(): ...: return "hello" [ins] In [2]: test.__hash__() Out[2]: 8760526380347 [ins] In [3]: test.__hash__() Out[3]: 8760526380347 [ins] In [4]: def test(): ...: return "hello world" [ins] In [5]: test.__hash__() Out[5]: 8760525617988 [ins] In [6]: def test(): ...: return "hello" [ins] In [7]: test.__hash__() Out[7]: 8760526380491

Using hashlib provides a consistent hash.

...

1 min read

Four Github Actions for Python

If you are developing python packages and using GitHub here are four actions that you can use today to automate your release workflow. Since python tools generally have such a simple cli I have opted to use the cli for most of these, that way I know exactly what is happening and have more control over it if I need.

If you are developing python packages and using GitHub here are four actions that you can use today to automate your release workflow. Since python tools generally have such a simple cli I have opted to use the cli for most of these, that way I know exactly what is happening and have more control over it if I need.

flake8 is pythons quintessential linting tool to ensure that your code is up to the standards that you have set for the project, and to help prevent hidden bugs. I am a heavy user of black and isort as well, but for ci flake8 is typically considered the gold standard. black and isort will help you automate many...

...

Variables names don't need their type

So often I see a variables type() inside of its name and it hurts me a little inside. Tell me I’m right or prove me wrong below.

Pandas DataFrames are probably the worst offender that I see

# bad sales_df = get_sales() # good sales = get_sales()

Sometimes vanilla structures too!

...

What is Kedro

What is Kedro

This is my original what-is-kedro article. There is a brand new one

Kedro is an open source data pipeline framework. It provides guardrails to set your project up right from the start without needing to know deeply how to setup your own python library for data pipelining. It includes really great ways to manipulate catalogs and pipelines. This article will cover the 10K view of kedro, future articles will dive deper into each one.

...

Long variable names are good

🏷️ Long variable names are a good thing. Self documenting code is more important than poorly documented code. Simply adding a few characters to your variable names can go a long ways.

Scope is important

1 min read

simple click

cli tools are super handy and easy to add to your python libraries to supercharge them. Even if your library is not a cli tool there are a number of things that a cli can do to your library.

Things a cli can do to enhance your library.

πŸ†š print version πŸ•Ά print readme πŸ“ print changelog πŸ“ƒ print config ✏ change config πŸ‘©β€πŸŽ“ run a tutorial πŸ— scaffold a project with cookiecutter

...

SqlAlchemy Models

Make a connection # from sqlalchemy import create_engine def get_engine(): return create_engine("sqlite:///mode_examples.sqlite") Make a session # from sqlalchemy.orm import sessionmaker def get_session(): con = get_engine() Base.bind = con Base.metadata.create_all() Session = sessionmaker(bind=con) session = Session() return session Make a Base Class # from sqlalchemy.ext.declarative import declarative_base Base = declarative_base() Base.metadata.bind = get_engine() Make your First Model # class User(Base): __tablename__ = "users" username = Column('username', Text()) firstname = Column('firstname', Text()) lastname = Column('lastname', Text()) Make your own Base Class to inherit From # class MyBaseHelper: def to_dict(self): return {k: v for k, v in self.__dict__.items() if k[0] != "_"} def update(self, **attrs): for key, value in attrs.items(): if hasattr(self, key): setattr(self, key, value) Use the Custom Base Class # class User(Base, MyBaseHelper):...
1 min read

Building Cli apps in Python

Click primarily takes two forms of inputs Options and arguments. I think of options as keyword argument and arguments as regular positional arguments.

**From the Docs

To get the Python argument name, the chosen name is converted to lower case, up to two dashes are removed as the prefix, and other dashes are converted to underscores.

...

1 min read

Kedro

See all of my kedro related posts in [[ tag/kedro ]].

I am tweeting out most of these snippets as I add them, you can find them all here #kedrotips.

Below are some quick snippets/notes for when using kedro to build data pipelines. So far I am just compiling snippets. Eventually I will create several posts on kedro. These are mostly things that I use In my everyday with kedro. Some are a bit more essoteric. Some are helpful when writing production code, some are useful more usefule for exploration.

...