Check out aws [1] and their project aws-cli [2].
Universal Command Line Interface for Amazon Web Services
References:
[1]: https://github.com/aws
[2]: https://github.com/aws/aws-cli
Publishing rhythm
Create Configurable Kedro Hooks
There are two main ways to create kedro hooks: with modules and with classes.
Both use the same hook names for their functions or methods.
Class hooks seem a bit special as they give you a way to configure them so that they are a bit more generally useful.
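Here is a rough sketch of what a configurable class hook could look like. DebugHook and its should_print and prefix options are made up for illustration, and the try/except fallback is only there so the sketch runs without kedro installed. How the instance gets registered depends on your kedro version, so that part is left out.

```python
try:
    from kedro.framework.hooks import hook_impl
except ImportError:
    # Assumption: a no-op stand-in so this sketch runs without kedro installed.
    def hook_impl(func):
        return func


class DebugHook:
    """Hypothetical hook whose behaviour is configured at instantiation."""

    def __init__(self, should_print=True, prefix="[debug]"):
        self.should_print = should_print
        self.prefix = prefix
        self.seen = []  # names of the nodes this hook has observed

    @hook_impl
    def before_node_run(self, node, inputs):
        # Same hook name as a module-level hook, but configured through self.
        self.seen.append(str(node))
        if self.should_print:
            print(f"{self.prefix} running {node} with inputs {sorted(inputs)}")


# Calling the method directly here just to show that the configuration applies.
quiet = DebugHook(should_print=False)
quiet.before_node_run(node="split_data", inputs={"example_iris_data": None})
print(quiet.seen)
```

The same class can be instantiated several times with different settings, which is what makes class hooks feel more generally useful than module hooks.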
What is Kedro [1]
If you are completely unsure what kedro is, be sure to check out my what is kedro [2] post.
Installation # [3]
Create a new environment with your environment manager of choice. Here I will use conda. Then we will install kedro from PyPI.
conda create -n kedro_class_hooks -y
conda activate kedro_class_hooks # may also be source activate kedro_class_hooks or activate kedro_class_hooks
pip install kedro
Create a sample project # [4]
Kedro new # [5]
For more details check out my full post on kedro new [6]
For this post I really just want a working pipeline as fast as possible, so I am going to use the iris pipeline that is generated by the kedro new command in the CLI. It’s important that you answer y to create an example pi...
Brainstorming Kedro Hooks
This post is a 🧠 brainstorming work in progress. I will likely use it as a
storage location/brain dump of hook ideas.
What is Kedro 🤔 # [1]
If you are completely unsure what kedro is, be sure to check out my what is kedro [2] post.
after_catalog_created # [3]
- filepath replacer
- bucket replacer
before_pipeline_run # [4]
- preflight
- check that data exists
- run kedro_static_viz
- run mypy
- run interrogate
- run flake8
after_pipeline_run # [5]
- Great Expectations
- send email
- send slack
before_node_run # [6]
after_node_run # [7]
- Great Expectations
- save stats/meta data
Execution Order # [8]
Hooks are executed in reverse order of the hooks list.
Hooks with tryfirst will be moved to the end of the list, so they run first.
Hooks with trylast will be moved to the start of the list, so they run last.
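To make the ordering concrete, here is a small pure-Python model of it. This is not pluggy or kedro code, just a simplified sketch patterned on my understanding of how pluggy orders hook implementations; Impl, register, and call_order are made-up names.

```python
from dataclasses import dataclass


@dataclass
class Impl:
    """Hypothetical stand-in for one registered hook implementation."""
    name: str
    tryfirst: bool = False
    trylast: bool = False


def register(impls, impl):
    """Insert impl into the list the way the ordering rules describe."""
    if impl.trylast:
        impls.insert(0, impl)  # start of the list, so it is called last
    elif impl.tryfirst:
        impls.append(impl)  # end of the list, so it is called first
    else:
        # Plain hooks go after earlier plain hooks but before any tryfirst ones.
        i = len(impls) - 1
        while i >= 0 and impls[i].tryfirst:
            i -= 1
        impls.insert(i + 1, impl)


def call_order(impls):
    """Hooks are called in reverse order of the list."""
    return [impl.name for impl in reversed(impls)]


impls = []
for impl in [
    Impl("a"),
    Impl("b"),
    Impl("first", tryfirst=True),
    Impl("last", trylast=True),
    Impl("c"),
]:
    register(impls, impl)

print(call_order(impls))  # ['first', 'c', 'b', 'a', 'last']
```

Among plain hooks, the one registered last runs first, which is the "reverse order" part.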
- after_catalog_created
- before_pipeline_run
- args
- run_params = {'run_id': '2020-05-23T15.24.23.958Z', 'project_path': '/mnt/c/temp/kedro0160', 'env': 'local', 'kedro_version': '...
How to get Dev Comments from an article Url
I want to incorporate some of the wonderful comments, 💕, 🦄, and 🔖’s that I have been getting on dev.to on my website. I have dabbled once or twice to no avail. This time I am taking notes on my journey, so follow along and let’s get there together. By the end of this post, I will have a way to get comments from posts on the client-side thanks to the wonderfully open dev.to API.
The API # [1]
dev.to has an open API that allows us to easily get comments as HTML [2]. The API is documented at https://docs.forem.com/api/#tag/comments, so let’s take a look at it.
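As a quick offline sketch of the idea (in Python rather than client-side JavaScript), this builds the comments URL for an article and walks a comment tree. The a_id query parameter and the user, body_html, and children fields reflect my reading of the API docs, so treat them as assumptions; the sample data is made up, and fetch_comments is defined but not called so nothing here touches the network.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API_ROOT = "https://dev.to/api"  # assumption: public API root


def comments_url(article_id):
    """Build the comments endpoint URL for one article id."""
    return f"{API_ROOT}/comments?{urlencode({'a_id': article_id})}"


def fetch_comments(article_id):
    """Fetch the comment tree as parsed JSON (needs network access)."""
    with urlopen(comments_url(article_id)) as response:
        return json.load(response)


def flatten(comments, depth=0):
    """Yield (depth, username, body_html) for every comment and reply."""
    for comment in comments:
        yield depth, comment["user"]["username"], comment["body_html"]
        yield from flatten(comment.get("children", []), depth + 1)


# Made-up sample shaped like the response described above, so this runs offline.
sample = [
    {
        "user": {"username": "alice"},
        "body_html": "<p>Great post!</p>",
        "children": [
            {"user": {"username": "bob"}, "body_html": "<p>Agreed</p>", "children": []}
        ],
    }
]

for depth, username, body_html in flatten(sample):
    print("  " * depth + f"{username}: {body_html}")
```

Replies are nested under children, so the recursion is what keeps threads in order.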
Here we can...
Four github actions for your website
GitHub Actions is a new GitHub feature that will trigger GitHub to spin up a
virtual machine and run some tasks with some special access to your repo. It
can interact with comments/issues, it can clone your repo, and you can explicitly
pass in secrets so that it can commit back to the repo or deploy to another
service. The environment may be a Linux, Windows, or even a macOS machine. I
believe this is wildly incredible for the open-source community; putting these
tools in the same place that we are already collaborating is so convenient.
What can they do for my personal website? 🤔 # [1]
GitHub Actions can give you confidence that your site is up and running, uses the latest JavaScript packages, and does not have broken links. They can even take screenshots of what your website looks like on different screen sizes and operating systems.
- periodically check that the website is up
- update npm
- url checker
- screenshot website
srt32/uptime [2] # [3]
srt32/uptime [2] is an action that...
Adding google fonts to a gatsbyjs site
stack overflow link [1]
References:
[1]: https://stackoverflow.com/questions/47488440/how-do-i-add-google-fonts-to-a-gatsby-site
Create Custom Kedro Dataset
Kedro provides an efficient way to build out data catalogs with their yaml api. It allows you to be very declarative about loading and saving your data. For the most part you just need to tell Kedro what connector to use and its filepath. When running, Kedro takes care of all of the read/write; you just reference the catalog key.
But what is happening behind the scenes # [1]
Under the hood there is an AbstractDataSet that each connector inherits from. It sets up a lot of the behind-the-scenes structure for us so that we don’t have to. For the most part kedro has connectors for just about anything that you want to load (csv, parquet, sql, json) from just about anywhere; http, s3, and the local file system are just some examples.
Here is a barebones DataSet implementation straight from their docs. Parameters from the yaml catalog will get passed in.
from pathlib import Path
import pandas as pd
from kedro.io import AbstractDataSet
class MyOwnDataSet(...
Interrogate is a pretty awesome, brand new, cli for Python packages
As usual while listening to python bytes 181 [1] I heard of a tool that I had to try out right away!
This thing is 🔥 hot off the press folks, we’re talking the first release only 3 weeks ago. It’s something that the python community needed years ago, and it belongs in your CI today. I had tried several tools that tried to do docstring coverage in the past but they were a bit cumbersome and were quickly forgotten about. Not interrogate, it’s dead simple!
Nothing I have tried has come close to being this good
Interrogate # [2]
It runs documentation coverage for your python project. It allows you to set the minimum amount of docstring coverage for your project and has some great setup instructions right in the readme.
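To show what "docstring coverage" means, here is a tiny sketch using the standard library ast module. This is not how interrogate is implemented and it skips all of its options; docstring_coverage and SAMPLE are made-up names for illustration.

```python
import ast


def docstring_coverage(source):
    """Return (documented, total) for the module, its functions, and classes."""
    tree = ast.parse(source)
    kinds = (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)
    # The module itself counts as one documentable item.
    nodes = [tree] + [node for node in ast.walk(tree) if isinstance(node, kinds)]
    documented = sum(1 for node in nodes if ast.get_docstring(node))
    return documented, len(nodes)


SAMPLE = '''"""Module docstring."""

def documented():
    """Has a docstring."""

def undocumented():
    pass
'''

covered, total = docstring_coverage(SAMPLE)
print(f"{covered}/{total} documented ({covered / total:.0%})")  # 2/3 documented (67%)
```

A minimum coverage setting is then just a threshold on that ratio.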
Install it # [3]
Interrogate is on pypi so it is super simple to install with pip
pip install interrogate
run it # [4]
This is the best part, it’s super easy to run right from the command line! Just call it, and give it a path to run.
interrogate -v <path>
😲 I hav...
Just starred pyp [1] by hauntsaninja [2]. It’s an exciting project with a lot to offer.
Easily run Python at the shell! Magical, but never mysterious.
References:
[1]: https://github.com/hauntsaninja/pyp
[2]: https://github.com/hauntsaninja
I like econchick’s [1] project interrogate [2].
Explain yourself! Interrogate a codebase for docstring coverage.
References:
[1]: https://github.com/econchick
[2]: https://github.com/econchick/interrogate
drawing ascii boxes
When creating CLIs I often want some nice full-width characters. I find it tough to find them, and when I do half the time it is an image or something that cannot be copied 👿.
I rarely get very complex with my semi-manual ASCII art. I can do 98% of what I need with bars and corners. Using some simple full-width characters can really give your cli a nice clean look.
Example # [1]
I’d say 50% of what I need is just a full-width horizontal bar to give some visual flair or separation.
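A minimal Python sketch of both ideas, a full-width bar and a simple box. The bar and box helpers are made-up names, and the box assumes the text is plain single-width characters.

```python
import shutil


def bar(char="━", width=None):
    """Return a horizontal bar, defaulting to the current terminal width."""
    width = width or shutil.get_terminal_size().columns
    return char * width


def box(text):
    """Wrap a single line of text in square corners and bars."""
    inner = f" {text} "
    top = "┌" + "─" * len(inner) + "┐"
    middle = "│" + inner + "│"
    bottom = "└" + "─" * len(inner) + "┘"
    return "\n".join([top, middle, bottom])


print(bar())
print(box("kedro"))
```

Swapping in the round corners or double box characters only means changing the six characters in box.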
Bars # [3]
― ⍽ ⎸ ⎹ ␣ ─ ━ │ ┃
Square Corners # [4]
┌ ┍ ┎ ┏ ┐ ┑ ┒ ┓ └ ┕ ┖ ┗ ┘ ┙ ┚ ┛
Round Corners # [5]
╭ ╮ ╯ ╰ ╱ ╲ ╳
Harpoons # [6]
⃑ ⃬ ⃭ ↼ ↽ ↾ ↿ ⇀ ⇁ ⇂ ⇃ ⇋ ⇌ ⥊ ⥋ ⥌ ⥍ ⥎ ⥏ ⥐ ⥑ ⥒ ⥓ ⥔ ⥕ ⥖ ⥗ ⥘ ⥙ ⥚ ⥛ ⥜ ⥝ ⥞ ⥟ ⥠ ⥡ ⥢ ⥣ ⥤ ⥥ ⥦ ⥧ ⥨ ⥩ ⥪ ⥫ ⥬ ⥭ ⥮ ⥯
Double Boxes # [7]
═ ║ ╒ ╓ ╔ ╕ ╖ ╗ ╘ ╙ ╚ ╛ ╜ ╝ ╞ ╟ ╠ ╡ ╢ ╣ ╤ ╥ ╦ ╧ ╨ ╩ ╪ ╫ ╬
Dashed Boxes # [8]
┄ ┅ ┆ ┇ ┈ ┉ ┊ ┋ ╌ ╍ ╎ ╏
Connectors # [9]
├ ┝ ┞ ┟ ┠ ┡ ┢ ┣ ┤ ┥ ┦ ┧ ┨ ┩ ┪ ┫ ┬ ┭ ┮ ┯ ┰ ┱ ┲ ┳ ┴ ┵ ┶ ┷ ┸ ┹ ┺ ┻ ┼ ┽ ┾ ┿ ╀ ╁ ╂ ╃ ╄ ╅ ╆ ╇ ╈ ╉ ╊ ╋
Others # [10]
☐ ☑ ☒ ⫍ ⫎ ⮹ ⮽...
Check out rec [1] and their project safer [2].
🧷 A safer writer 🧷
References:
[1]: https://github.com/rec
[2]: https://github.com/rec/safer
creating the kedro-preflight hook
kedro hooks are an exciting upcoming feature of kedro 0.16.0. They allow you to hook into catalog_created, pipeline_run, and node_run (nouns), with a before or after prefix. This really reminds me of React’s lifecycle hooks, which let you hook into the various states of React web components. This is going to make kedro so extensible by the community. I am super pumped to see what the community is able to do with this ability.
What is Kedro [1]
If you are completely unsure what kedro is, be sure to check out my what is kedro post.
Docs # [2]
a w...
📝 Kedro Preflight Notes
This is a very rough idea for a kedro package to prevent time lost by getting partway through a pipeline run only to realize that you don’t have access to data or resources.
Must Haves # [1]
- check that inputs exist or are of a type to skip (sql)
Good to haves
- check that all input and output databases are accessible with good credentials
- check for s3 bucket access
- check for spark install
Implementation # [2]
@hook_spec
def before_pipeline_run(run_params, pipeline, catalog):
run params # [3]
{
"run_id": str,
"project_path": str,
"env": str,
"kedro_version": str,
"tags": Optional[List[str]],
"from_nodes": Optional[List[str]],
"to_nodes": Optional[List[str]],
"node_names": Optional[List[str]],
"from_inputs": Optional[List[str]],
"load_versions": Optional[List[str]],
"pipeline_name": str,
"extra_params": Optional[Dict[str, Any]]
}
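A rough sketch of the must-have check, assuming that pipeline.inputs() returns the names of the pipeline's free inputs and that catalog.exists(name) reports whether a dataset is there. PreflightHook, missing_inputs, and the Fake* classes are made up for illustration, the fallback hook_impl only exists so this runs without kedro, and skipping types such as sql is not handled.

```python
try:
    from kedro.framework.hooks import hook_impl
except ImportError:
    # Assumption: a no-op stand-in so this sketch runs without kedro installed.
    def hook_impl(func):
        return func


def missing_inputs(pipeline, catalog):
    """Return the sorted names of pipeline inputs the catalog cannot find."""
    return sorted(name for name in pipeline.inputs() if not catalog.exists(name))


class PreflightHook:
    """Hypothetical hook that fails fast before any node runs."""

    @hook_impl
    def before_pipeline_run(self, run_params, pipeline, catalog):
        missing = missing_inputs(pipeline, catalog)
        if missing:
            raise FileNotFoundError(f"Preflight failed, missing inputs: {missing}")


# Stand-ins for a kedro Pipeline and DataCatalog, so the check runs offline.
class FakePipeline:
    def inputs(self):
        return {"raw_cars", "raw_prices"}


class FakeCatalog:
    def __init__(self, existing):
        self.existing = set(existing)

    def exists(self, name):
        return name in self.existing


try:
    PreflightHook().before_pipeline_run(
        run_params={}, pipeline=FakePipeline(), catalog=FakeCatalog({"raw_cars"})
    )
except FileNotFoundError as error:
    print(error)
```

Raising inside before_pipeline_run is what stops the run before any time is lost on the first nodes.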
References:
[1]: #must-haves
[2]: #implementation
[3]: #run-params
The work on autoflake [1] by fake-name [2].
Removes unused imports and unused variables as reported by pyflakes
References:
[1]: https://github.com/fake-name/autoflake
[2]: https://github.com/fake-name
Check out trys [1] and their project sergey [2].
A tiny lil’ static site generator
References:
[1]: https://github.com/trys
[2]: https://github.com/trys/sergey
Maintaining multiple git remotes
git remote -v
git remote add gitlab <url>
git push gitlab main
📢 Announcing find-kedro
find-kedro is a small library to enhance your kedro experience. It looks through your modules to find kedro pipelines, nodes, and iterables (lists, sets, tuples) of nodes. It then assembles them into a dictionary of pipelines: each module creates a separate pipeline, and __default__ is a combination of all pipelines. This format is compatible with the kedro _create_pipelines format.
[4] # [5]
kedro is a ✨ fantastic project that allows for super-fast prototyping of data pipelines, while yielding production-ready pipelines. find-kedro enhances this experience by adding pytest-like node/pipeline discovery, eliminating the need to bubble up pipelines through modules.
When working on larger pipeline projects, it is advisable to break your project down into different sub-modules which requires knowledge of building python libraries, and knowing how to import each module correctly. While this is not too difficult, in some cases, it can trip up even the most se...
Explicit vs Implicit Returns in Javascript
Often when reading through javascript examples you will find some arrow
functions use parentheses () while others use braces {}. The key
difference is that parentheses will implicitly return the expression inside
them, while braces require an explicit return statement. It is important to
understand the difference between them because it is likely that you will find
code examples of both, and trying to edit code written differently than you’re
used to may have unintended consequences.
[1] # [2]
Arrow functions are one-liner functions in javascript that have two main syntactical ways to create the code block: with parentheses and with braces. Let’s take a look at both ways of creating arrow functions so that when we come across them in the wild it will all make sense.
[3] # [4]
Here is an example of an arrow function that will implicitly return its
expression without the return keyword. I believe that these are a bit more restricted
in that you cannot set variables inside them. They are ...
Twitter deepdives
Inspired by Chris Achard
My ideas # [1]
Python # [2]
- List comps
- Classes
- Inheritance
- Background
- Click
- Lambdas
Kedro # [3]
- Cataloging
- Custom datasets
- Reusable pipelines
- find-kedro
Learn kedro in 5 days # [4]
Email course inspired by learn d3 in 5 days
Mail # [5]
- Share your knowledge
- Practice
- Practice in public
- Make practice easy
- Share your notes
- Digital Gardening
- Own your content
- Build your audience
- Be nice
- Have empathy
- Learn your way
- Continuous learning
References:
[1]: #my-ideas
[2]: #python
[3]: #kedro
[4]: #learn-kedro-in-5-days
[5]: #mail