Categories
Me

YearCompass 2023-2024

In this post, I use the free YearCompass booklet to reflect on 2023 and to plan some professional goals for 2024.

Table of Contents

Introduction

Towards the end of last year, I used YearCompass for the first time because I wanted to commit to some 2023 goals. YearCompass is a proven and long-lived framework with over 18k Facebook Likes and availability in 52 languages, so it made sense to try it out.

It went very well! So much so that I have used YearCompass again to choose my 2024 professional goals. The first half of this post covers 2023; the second half 2024.

Firstly, let’s examine YearCompass itself.

YearCompass

From the YearCompass site:

YearCompass is a free booklet that helps you reflect on the year and plan the next one. With a set of carefully selected questions and exercises, YearCompass helps you uncover your own patterns and design the ideal year for yourself.

YearCompass.com

YearCompass started as a reflection tool for a small group of friends and was made publicly available in 2012. It is available as an A4 and A5 PDF, with options to fill out the booklet both digitally and by hand. YearCompass is currently available in 52 languages.

YearCompass positions itself as an alternative to New Year’s Resolutions. Each PDF has two sections. The first half examines the previous year and the second half considers the next one.

Each section consists of a series of prompts and questions. These guide the user through the reflection process and help them identify their priorities and plan for the future.

Some of the questions are:

  • What are you most proud of?
  • Who are the three people who influenced you the most?
  • What does the year ahead of you look like?

While prompts include:

  • List your three biggest challenges from last year.
  • This year, I will be bravest when…
  • I want to achieve these three things the most.

There are no hard and fast rules for completing YearCompass. The book suggests a break between sections, although some prefer to do the whole thing in one sitting. Personally, I don’t complete every section as by a certain point I have what I need.

This year, I had my 2022 and 2023 YearCompass PDFs open side by side. It made sense for 2022’s compass to inform 2023’s, and it gave me an idea of which goal-setting approaches worked best.

2023 Retrospective

In this section, I look back at my 2023 goals and see how things went with them.

Confidence Building & Anxiety Management

This goal was focused on self-belief. I wanted to bolster my confidence and improve my technical skillset.

2022 saw my first tech conference at AWS Summit London. While London is an intense experience for a socially anxious shark, it successfully expanded my comfort zone by putting me around unfamiliar faces with similar interests.

2023’s summit was easier on the senses, and I had chats with suppliers and AWS Solution Architects about topics including data lineage, lakehousing and event orchestration.

PXL 20230607 1106476362

I also attended some events closer to home: May’s DTX Manchester and October’s Data Relay Manchester.

PXL 20230517 142206077

Separately, I used my DataCamp subscription to improve my Data Engineering skills. Their Data Engineer track has several great Python courses and helped me benchmark my T-SQL skills. Their Object-Oriented Programming in Python and Writing Functions in Python courses also helped me plug some work-related gaps.

Helpfully, DataCamp has a new My Year In Data feature that summarises the 17 courses I completed this year:

DataCamp HoursLearning
DataCamp XPDayStreak

Finally, I also recertified my AWS SysOps Administrator certification in October using the now-traditional duo of Stephane Maarek and Tutorials Dojo. This certification validates my experience in deploying, managing, and operating workloads on AWS and puts me in a good position for my 2024 goals.

AWSSysOpsBadge

Collaborating & Communicating

This goal was focused on finding my voice and improving my work quality. I wanted to strengthen my contributions and increase my value.

A big part of this was making sure that I understood the languages and terms being used around me. My ultimate aim was to use and apply these terms correctly and appropriately. 2023 was the year I became familiar with data and programming terms including:

I also signed up for some Data Engineering-focused feeds to improve my industry knowledge. Examples include the Data Engineering Weekly and Seattle Data Guy newsletters, and the Advancing Analytics and nullQueries YouTube channels.

Collaboration-wise, I also hosted my first T-SQL Tuesday this year. It was great to work with Steve Jones, and I get to include myself on a pretty illustrious list of industry professionals now!

Finally, I also made it to my first User Group meeting! While I only made it to one event this year, I overcame a lot of personal anxiety there and look forward to exploring my local user groups more in 2024.

Knowledge Sharing & Presenting

This goal was focused on creating value. I wanted to improve my presentation skills, and find real-world applications for the knowledge gained from my posts and certifications.

It’s time to discuss New Stars of Data!

Bicuit & Terrabyte on a New Stars Of Data T-shirt

Upon viewing the event’s Call For Speakers, I saw a great chance to work on this goal. I’d already started writing a VSCode Data Wrangler post at the time (which ultimately became the New Stars Of Data Retrospective post) and quickly realised the post would lend itself very well to the requested abstract.

When creating the session, the combination of a sport I enjoy, data and code I’m familiar with and an impressive VSCode extension resulted in a smooth journey from storyboarding to delivery. The session was a pleasure to create and deliver, and was exactly the presenting introduction I was after!

I also wrote a blog series while creating the session, both as something to look back on and to potentially help future speakers.

2024 Goals

In this section, I use YearCompass to decide on my 2024 professional goals. For each goal, I’ll write a user story and then explain my reasoning.

Build Technology Projects

As a cloud enthusiast, I want to complete valuable project builds so that I can develop and validate my knowledge and skills, and have subject matter for future session abstracts.

It’s fair to say that 2023 has been a year of learning, with sources including:

All of which have given me ideas for things I can build! Some completely new things. Some things that have been gaining steam for a while. Other things that recent innovations have put within reach.

My first YearCompass 2024 goal is to start building them! As well as testing my skills and validating my knowledge, some of these projects would probably lend themselves to a session abstract or two!

Additionally, I’m considering studying towards an AWS Professional certification in 2025. So if I decide to go ahead with that, building a bunch of stuff would be well worth the effort and investment!

Finally, although I’ve gotten better at finishing projects since starting amazonwebshark there’s always room for improvement. This No Boilerplate video about The Cult Of Done Manifesto really resonated with me upon first watch, and I’ll be benchmarking my 2024 projects against it.

Build My Personal Brand

As an IT professional I want to build my personal brand so that I improve my soft skills and define my public image.

Continuing the build theme, my second YearCompass 2024 goal is focused on my soft skills and visibility.

I’ve spoken about confidence and anxiety previously. I will always be working on this, but it isn’t something I want to hide behind. As my contributions to this site and the wider community grow, I need to consider how those contributions influence the projected image of my personality, skills, experience, and behaviour.

Furthermore, in an age where AI tools and models are getting increasingly adept at a range of tasks, practising and demonstrating my soft skills is perhaps more important than ever. As technology becomes increasingly democratised, it is no longer enough to focus on technical skills alone.

I’ve already begun to establish my personal brand via amazonwebshark and social media. With my 2024 goals likely to put me in front of more fresh faces for the first time, now is definitely the time to make my personal brand a primary focus.

Build A Second Brain

As just a normal man I want to build a second brain so that I can organise my resources and work more efficiently.

For my final YearCompass 2024 goal, I want to take steps to solve a long-standing problem.

I have lots of stuff. Books, files, hyperlinks, videos…STUFF. Useful stuff, but unorganised and unstructured stuff. I also have lots of ideas. Ideas for efficiency and growth. Ideas for reliability and resilience. And I have various ways of capturing these ideas depending on where I happen to be. Even my car has a notepad.

Finally, and perhaps most importantly, I have several partially-enacted systems for handling all of this. Some systems turned out to be unfit for purpose. Some were overwhelmed, while others became unwieldy.

Recently, I’ve made efforts to organise and define everything. I’m already finding success with this, and with the recent discovery of the Second Brain and CODE methodologies I now have a framework to utilise. A well-built second brain will help organise my backlog, assist my day-to-day and support my future goals.

Summary

In this post, I used the free YearCompass booklet to reflect on 2023 and to plan some professional goals for 2024.

Having finished this post, I’m happy with my 2024 goals and am looking forward to seeing where the year takes me! I’ll post updates here and via my social, project and session links, which are available via the button below:

SharkLinkButton 1

Thanks for reading ~~^~~

Categories
DevOps & Infrastructure

Migrating amazonwebshark To SiteGround

In this post, I examine the process of migrating amazonwebshark to SiteGround and give an overview of the processes involved.

Table of Contents

Introduction

When I started amazonwebshark I had to make some infrastructure decisions. I registered the domain name with Amazon Route 53, and then needed to choose a blog hosting platform.

In December 2021 I took advantage of a Bluehost offer and paid £31.90 for their Basic WordPress Hosting package. This included a variety of services including:

My Bluehost renewal came through earlier this month, priced at £107.76. I’ve had great service from Bluehost and have no complaints, but that price was quite a leap. So, before I accepted it, I decided to do some research and see what my alternatives were.

Hosting Alternatives

In this section, I go through the results of my research into alternative hosting platforms.

Amazon Lightsail

I started by looking at Amazon Lightsail. Essentially, Lightsail is a simplified way of deploying AWS services like EC2, EBS and Elastic Load Balancing.

Lightsail pricing differs from most AWS services. Instead of the common Pay-As-You-Go pricing model, Lightsail has set monthly pricing. For example, a Linux server with similar memory, processing and storage to my Bluehost server currently costs $3.50 a month.

There is an important difference though. While Bluehost has teams of people responsible for tasks like server maintenance, database recovery, hard disk failures and security patches, with Lightsail the infrastructure would become my responsibility. I would save money over Bluehost, but at the cost of doing my own systems admin.

And the above list isn’t even exhaustive! It doesn’t include the setup and maintenance of services like CDN, SSL certificates and email accounts, all of which come with their own extra requirements and costs.

At this point, Bluehost was still on top.

SiteGround

SiteGround is in the same business as Bluehost. It offers a variety of hosting solutions for an array of use cases and has good standing in the industry.

SiteGround also had a great Black Friday offer this year! It was offering pretty much the same deal as Bluehost at £1.99 a month:

2022 11 25 SitegroundOfferScroll

This is INSANELY cheap, especially considering how much all this infrastructure costs to run!

SiteGround has also developed a free WordPress plugin to automate migrations from other hosting platforms. While this isn’t unique to them, a combination of good reviews, extensive services, low hassle and a great price was more than enough to get me on board.

Migrate What Exactly?

Before continuing, I thought it best to go into a bit of detail about what exactly is being migrated. I’ve mentioned servers, databases and domains, but what gets moved where? And why?

Well, because I’m moving things around on the Internet, I need to talk about the Domain Name System (DNS).

Wait! Come back!

Explain DNS Like I’m 5

What follows is a very simple introduction to DNS. There’s far more to DNS than this, but that’s beyond the scope of this post.

Let’s say I want to phone The Shark Trust. I can’t type “The Shark Trust” into my handset – I need their phone number. So I open my phone book, turn to the S section and find The Shark Trust. Next to this entry is a phone number: 01752 672008. I type that number into the handset and get through to their office.

DNS is like the Internet’s phone book. Websites are held on servers, and the ‘phone numbers’ for those servers are IP addresses. When I request a website like amazonwebshark.com, my web browser needs to know the IP address for the server holding the site’s data, for example 34.91.95.18.

Explain DNS With Pictures

This WebDeasy diagram show DNS at a high level:

WebDeasy: How the Domain Name System (DNS) works – Basics

When a URL is entered into a web browser, a query is sent to a DNS server. Using the phone book analogy, the web browser is asking the DNS server for the amazonwebshark.com phone number.

DNS servers don’t have any IP addresses, but they know which ‘phone book’ to look in. These ‘phone books’ are called name servers. The DNS server finds and contacts the right name server, which matches the amazonwebshark.com domain name to an IP address.

The DNS server then returns this IP address to the web browser, which uses it to contact the server hosting the amazonwebshark.com resources.

In the diagram, the DNS-Server represents Route 53. Route 53 holds DNS records for the amazonwebshark.com domain name, and knows where to find the name servers that have the amazonwebshark.com IP address.

The webdeasy.de server represents the Bluehost name servers. These servers can answer a variety of DNS queries, and are considered the ground truth for initial site visits and browser caching.

amazonwebshark’s DNS Setup

At the start of December 2022 the amazonwebshark.com domain name was hosted by Route 53, with an NS record pointing at the Bluehost name servers:

2022 12 02 Route53Bluehost

The basic infrastructure looked like this, with outbound requests in blue and inbound responses in green:

2022 12 27 amazonwebsharkDNSdiagram

And that’s it! To further explore DNS core concepts, this DNSimple comic is well worth a read and this Fireship video gives a solid, if a little more technical, account:

Data Migration

In this section, I start migrating my amazonwebshark data from Bluehost to SiteGround.

SiteGround has an automated migrator plugin that copies existing WordPress sites from other hosting platforms. And it’s very good! The process boils down to:

The process can also be seen in this Avada video:

The plugin copies all the amazonwebshark server files, scripts and database objects in a process that takes about five minutes. SiteGround then provides a temporary URL for testing and performance checks:

2022 12 02 SiteGroundMigratonComplete

After completely migrating amazonwebshark to SiteGround, the next step involves telling the amazonwebshark domain where to find the new server. Time for some DNS!

DNS Migration

In this section, I update the amazonwebshark DNS records with the SiteGround name servers.

I repointed the existing amazonwebshark NS record from Bluehost to SiteGround by updating the values in Route 53 from this:

2022 12 02 Route53Bluehost

To this:

2022 12 02 Route53SiteGround

My change then needed to propagate through the Internet. Internet Service Providers update their records at different rates, so changes can take up to 72 hours to complete worldwide.

Free DNS checking tools like WhatIsMyDNS can perform global checks on a domain name’s IP address and DNS record information. The check below was done after around 30 hours, by which time most of the servers were returning SiteGround IPs:

2022 12 06 DNSPropagationCheck

Any Problems?

First, the good news. There was no downtime while migrating amazonwebshark to SiteGround! During the migration, DNS queries were resolved by either Bluehost’s or SiteGround’s name servers. Both platforms had amazonwebshark data, so both could answer DNS queries.

Additionally, as I set a change freeze on amazonwebshark until the migration was over, there was no lost or orphaned content.

I did lose some WPStatistics hit statistics data though. There is no data for December 03 and December 04:

2022 12 18 WPStatisticsHits

This was my fault. The DNS propagation took longer than it should have because of a misunderstanding on my part!

So why was data lost? WPStatistics stores data in tables in the site’s MySQL database. When I first migrated my data on December 02, the Bluehost and SiteGround tables were the same. After that point, Bluehost continued to serve amazonwebshark until December 05, and wrote its statistics in the Bluehost MySQL tables.

It was only after I corrected my DNS mistake that SiteGround could start serving content and writing statistics on the SiteGround MySQL tables. So SiteGround didn’t record anything for December 03 and December 04, and as no additional data migration was done the statistics that Bluehost recorded never made it to the SiteGround tables.

I can recover this if I want to though. I took a full backup of my Bluehost data before ending the contract. That included a full backup of the Bluehost MySQL database with the WPStatistics tables. I’ll take a look at the tables at some point, see how the data is arranged and decide from there.

Future Plans

I’m considering moving amazonwebshark to a serverless architecture in 2023. While the migration was a success, servers still have inherent problems:

  • Servers can break or go offline.
  • They can be hacked.
  • They can be over or under-provisioned.

Serverless infrastructure could remove those pain points. I don’t use any WordPress enterprise features, and amazonwebshark could exist very well as an event-driven static website. Tools like Hugo and Jekyll are designed for the job and documented well, and people like Kendra Little and Chrissy LeMaire have successfully transitioned their blogs to serverless infrastructures.

The biggest challenge here isn’t architectural. If I moved to a serverless architecture, I would want something similar to the Yoast SEO analysis plugin. This plugin has really helped me improve my posts, and by extension has made them more enjoyable to write.

I’ve seen lots of serverless tooling for migrating resources and serving content, but not so much for SEO guidance and proofreading. Any amazonwebshark serverless migration would be contingent on finding something decent along these lines. After all, if the blog becomes a pain to write for then what’s the point?

Summary

In this post, I examined the process of migrating amazonwebshark to SiteGround and gave an overview of the processes involved.

I’m very happy with how things went overall! The heavy lifting was done for me, both companies were open and professional throughout and what could have been a daunting process was made very simple!

If this post has been useful, please feel free to follow me on the following platforms for future updates:

Thanks for reading ~~^~~

Categories
Developing & Application Integration

Production Code Qualities

In this post, I respond to November 2022’s T-SQL Tuesday #156 Invitation and give my thoughts on some production code qualities.

tsql tuesday

Table of Contents

Introduction

This month, Tomáš Zíka’s T-SQL Tuesday invitation was as follows:

Which quality makes code production grade?

Please be as specific as possible with your examples and include your reasoning.

Good question!

In each section, I’ll use a different language. Firstly I’ll create a script, and then show a problem the script could encounter in production. Finally, I’ll show how a different approach can prevent that problem from occurring.

I’m limiting myself to three production code qualities to keep the post at a reasonable length, and so I can show some good examples.

Precision

In this section, I use T-SQL to show how precise code in production can save a data pipeline from unintended failure.

Setting The Scene

Consider the following SQL table:

USE [amazonwebshark]
GO

CREATE TABLE [2022].[sharkspecies](
	[shark_id] [int] IDENTITY(1,1) NOT NULL,
	[name_english] [varchar](100) NOT NULL,
	[name_scientific] [varchar](100) NOT NULL,
	[length_max_cm] [int] NULL,
	[url_source] [varchar](1000) NULL
)
GO

This table contains a list of sharks, courtesy of the Shark Foundation.

Now, let’s say that I have a data pipeline that uses data in amazonwebshark.2022.sharkspecies for transformations further down the pipeline.

No problem – I create a #tempsharks temp table and insert everything from amazonwebshark.2022.sharkspecies using SELECT *:

When this script runs in production, I get two tables with the same data:

2022 11 02 SQLResults1

What’s The Problem?

One day a new last_evaluated column is needed in the amazonwebshark.2022.sharkspecies table. I add the new column and backfill it with 2019:

ALTER TABLE [2022].sharkspecies
ADD last_evaluated INT DEFAULT 2019 WITH VALUES
GO

However, my script now fails when trying to insert data into #tempsharks:

2022 11 02 SQLResults2Sharp
(1 row affected)

(4 rows affected)

Msg 213, Level 16, State 1, Line 17
Column name or number of supplied values does not match table definition.

Completion time: 2022-11-02T18:00:43.5997476+00:00

#tempsharks has five columns but amazonwebshark.2022.sharkspecies now has six. My script is now trying to insert all six sharkspecies columns into the temp table, causing the msg 213 error.

Doing Things Differently

The solution here is to replace row 21’s SELECT * with the precise columns to insert from amazonwebshark.2022.sharkspecies:

While amazonwebshark.2022.sharkspecies now has six columns, my script is only inserting five of them into the temp table:

2022 11 02 SQLResults3Sharp

I can add the last_evaluated column into #tempsharks in future, but its absence in the temp table isn’t causing any immediate problems.

Works The Same In Other Environments

In this section, I use Python to show the value of production code that works the same in non-production.

Setting The Scene

Here I have a Python script that reads data from an Amazon S3 bucket using a boto3 session. I pass my AWS_ACCESSKEY and AWS_SECRET credentials in from a secrets manager, and create an s3bucket variable for the S3 bucket path:

When I deploy this script to my dev environment it works fine.

What’s The Problem?

When I deploy this script to production, s3bucket will still be s3://dev-bucket. The potential impact of this depends on the AWS environment setup:

Different AWS account for each environment:

  • dev-bucket doesn’t exist in Production. The script fails.

Same AWS account for all environments:

  • Production IAM roles might not have any permissions for dev-bucket. The script fails.
  • Production processes might start using a dev resource. The script succeeds but now data has unintentionally crossed environment boundaries.

Doing Things Differently

A solution here is to dynamically set the s3bucket variable based on the ID of the AWS account the script is running in.

I can get the AccountID using AWS STS. I’m already using boto3, so can use it to initiate an STS client with my AWS credentials.

STS then has a GetCallerIdentity action that returns the AWS AccountID linked to the AWS credentials. I capture this AccountID in an account_id variable, then use that to set s3bucket‘s value:

More details about get_caller_identity can be found in the AWS Boto3 documentation.

For bonus points, I can terminate the script if the AWS AccountID isn’t defined. This prevents undesirable states if the script is run in an unexpected account.

Speaking of which…

Prevents Undesirable States

In this section, I use PowerShell to demonstrate how to stop production code from doing unintended things.

Setting The Scene

In June I started writing a PowerShell script to upload lossless music files from my laptop to one of my S3 buckets.

I worked on it in stages. This made it easier to script and test the features I wanted. By the end of Version 1, I had a script that dot-sourced its variables and wrote everything in my local folder $ExternalLocalSource to my S3 bucket $ExternalS3BucketName:

#Load Variables Via Dot Sourcing
. .\EDMTracksLosslessS3Upload-Variables.ps1


#Upload File To S3
Write-S3Object -BucketName $ExternalS3BucketName -Folder $ExternalLocalSource -KeyPrefix $ExternalS3KeyPrefix -StorageClass $ExternalS3StorageClass

What’s The Problem?

NOTE: There were several problems with Version 1, all of which were fixed in Version 2. In the interests of simplicity, I’ll focus on a single one here.

In this script, Write-S3Object will upload everything in the local folder $ExternalLocalSource to the S3 bucket $ExternalS3BucketName.

Problem is, the $ExternalS3BucketName S3 bucket isn’t for everything! It should only contain lossless music files!

At best, Write-S3Object will upload everything in the local folder to S3 whether it’s music or not.

At worst, if the script is pointing at a different folder it will start uploading everything there instead! PowerShell commonly defaults to C:\Windows, so this could cause all kinds of problems.

Doing Things Differently

I decided to limit the extensions that the PowerShell script could upload.

Firstly, the script captures the extensions for each file in the local folder $ExternalLocalSource using Get-ChildItem and [System.IO.Path]::GetExtension:

$LocalSourceObjectFileExtensions = Get-ChildItem -Path $ExternalLocalSource | ForEach-Object -Process { [System.IO.Path]::GetExtension($_) }

Then it checks each extension using a ForEach loop. If an extension isn’t in the list, PowerShell reports this and terminates the script:

ForEach ($LocalSourceObjectFileExtension In $LocalSourceObjectFileExtensions) 

{
If ($LocalSourceObjectFileExtension -NotIn ".flac", ".wav", ".aif", ".aiff") 
{
Write-Output "Unacceptable $LocalSourceObjectFileExtension file found.  Exiting."
Start-Sleep -Seconds 10
Exit
}

So now, if I attempt to upload an unacceptable .log file, PowerShell raises an exception and terminates the script:

**********************
Transcript started, output file is C:\Files\EDMTracksLosslessS3Upload.log

Checking extensions are valid for each local file.
Unacceptable .log file found.  Exiting.
**********************

While an acceptable .flac file will produce this message:

**********************
Transcript started, output file is C:\Files\EDMTracksLosslessS3Upload.log

Checking extensions are valid for each local file.
Acceptable .flac file.
**********************

To see the code in full, as well as the other problems I solved, please check out my post from June.

Summary

In this post, I responded to November 2022’s T-SQL Tuesday #156 Invitation and gave my thoughts on some production code qualities. I gave examples of each quality and showed how they could save time and prevent unintended problems in a production environment.

Thanks to Tomáš for this month’s topic! My previous T-SQL Tuesday posts are here.

If this post has been useful, please feel free to follow me on the following platforms for future updates:

Thanks for reading ~~^~~