Categories
AI & Machine Learning

Microsoft AI-900: Artificial Fintelligence

In this post, I talk about my recent experience with the Microsoft AI-900 certification and the resources I used to study for it.

Introduction

On 04 November 2022, I earned the Microsoft Certified Azure AI Fundamentals certification. I’ve had my eye on the AI-900 since passing the SC-900 over the summer. Last month I found the time to sit down with it properly! This is my fourth Microsoft certification, joining my other badges on Credly.

Firstly, I’ll talk about my motivation for studying for the Microsoft AI-900. Then I’ll talk about the resources I used and how they fitted into my learning plan.

Motivation

In this section, I’ll talk about my reasons for studying for the Microsoft AI-900.

Increased Effectiveness

A common Data Engineering task is extracting data. This usually involves structured data, which have well-defined data models that help to organise and map the data available.

Sources of structured data include:

  • CSV data extracts.
  • Excel spreadsheets.
  • SQL database tables.

Increasingly, insights are being sought from unstructured data. This is harder to extract, as unstructured data aren’t arranged according to preset data models or schemas.

Examples of unstructured data sources include:

  • Inbound correspondence.
  • Recorded calls.
  • Social media activity.

Historically, extracting unstructured data needed special equipment, complex software and dedicated personnel. In recent years, public cloud providers have produced Artificial Intelligence and Machine Learning services aimed at quickly and easily extracting unstructured data.

In the case of Microsoft Azure, these include:

Knowing that these tools exist and understanding their use cases will help me create future data pipelines and ETL processes for unstructured data sources. This will add value to the data and will make me a more effective Data Engineer.

And on that note…

Skill Diversification

Recently I was introduced to the idea of T-shaped skills in a CollegeInfoGeek article by Ransom Patterson. Ransom summarises a T-shaped person as having:

…deep knowledge/skills in one area and a broad base of general supporting knowledge/skills.

“The T-Shaped Person: Building Deep Expertise AND a Wide Knowledge Base” Ransom Patterson on CollegeInfoGeek

Ransom’s article made me realise that I’ve been developing T-shaped skills for a while. I’ve then applied these skills back to my Data Engineering role. For example:

My studying for the AI-900 is a continuation of this. This isn’t me saying “I want to be a Machine Learning Engineer now!” This is me seeing a topic, being interested in it and examining how it could be useful for my personal and professional interests.

Multi-Cloud Fluency

This kind of follows on from T-shaped skills.

Earlier in 2022, Forrest Brazeal examined the benefits of multi-cloud fluency, and built a case summarised in one of his tweets:

This applies to the data world pretty well, as many public cloud services can interact with each other across vendor boundaries.

For example:

With multi-cloud fluency, decisions can be made based on using the best services for the job as opposed to choosing services based on vendor or familiarity alone.

This GuyInACube video gives an example of this using the Microsoft Power BI Service:

To connect the Power BI Service to an AWS data source, a data gateway needs to be running on an EC2 instance to handle authentication. This introduces server costs and network management.

Conversely, data stored in Azure (Azure SQL Database in the video) can be accessed by other Azure services with a single click. As a multi-cloud fluent Data Engineer in this scenario, I now have options where previously there was only one choice.

Improved multi-cloud fluency means I can use AWS for some jobs and Azure for others, in the same way that I use Windows for some jobs and Linux for others. It’s about having the knowledge and skills to choose the best tools for the job.

Resources

In this section, I’ll talk about the resources I used to study for the Microsoft AI-900.

John Savill

John Savill’s Technical Training YouTube channel started in 2008. Since then he’s created a wide range of videos from deep dives to weekly updates. In addition, he has numerous playlists for many Microsoft certifications including the AI-900.

Having watched John’s SC-900 video I knew I was in good hands. John has a talent for simple, straightforward discussions of important topics. His AI-900 video was the first resource I used when starting to study, and the last resource I used before taking the exam.

Exceptional work as usual John!

Microsoft Learn

Microsoft Learn was my main study resource for the AI-900. It has a lot going for it! The content is up to date, the structure makes it easy to dip in and out and the knowledge checks and XP system keep the momentum up.

To start, I attended one of Microsoft’s Virtual Training Days. The courses are free, and their AI Fundamentals course currently provides a free certification voucher when finished. Microsoft Product Manager Loraine Lawrence presented the course and it was a great introduction to the various Azure AI services.

Complementing this, Microsoft Learn has a free learning path with six modules tailored for the AI-900 exam. These modules are well-organised and communicate important knowledge without being too complex.

The modules include supporting labs for learning reinforcement. The labs are well documented and use the Azure Portal, Azure Cloud Shell and Git to build skills and real experience.

I didn’t end up using the labs due to time constraints, but someone else had me covered on that front…

Andrew Brown

Andrew Brown is the CEO of ExamPro. He has numerous freeCodeCamp videos, including his free AI-900 one.

I’ve used some of Andrew’s AWS resources before and found this to be of his usual high standard. The video is four hours long, with dozens of small lectures that are time-stamped in the video description. This made it easy to replay sections during my studies.

Andrew also includes two hours of him using Azure services like Computer Vision, Form Recognizer and QnAMaker. This partnered with the Microsoft Learn material very well and helped me understand and visualise topics I wasn’t 100% on.

Summary

In this post, I talked about my recent experience with the Microsoft AI-900 certification and the resources I used to study for it. I can definitely use the skills I’ve picked up moving forwards, and the certification is some great self-validation!

If this post has been useful, please feel free to follow me on the following platforms for future updates:

Thanks for reading ~~^~~

Categories
Developing & Application Integration

Production Code Qualities

In this post, I respond to November 2022’s T-SQL Tuesday #156 Invitation and give my thoughts on some production code qualities.

Introduction

This month, Tomáš Zíka’s T-SQL Tuesday invitation was as follows:

Which quality makes code production grade?

Please be as specific as possible with your examples and include your reasoning.

Good question!

In each section, I’ll use a different language. Firstly I’ll create a script, and then show a problem the script could encounter in production. Finally, I’ll show how a different approach can prevent that problem from occurring.

I’m limiting myself to three production code qualities to keep the post at a reasonable length, and so I can show some good examples.

Precision

In this section, I use T-SQL to show how precise code in production can save a data pipeline from unintended failure.

Setting The Scene

Consider the following SQL table:

USE [amazonwebshark]
GO

CREATE TABLE [2022].[sharkspecies](
	[shark_id] [int] IDENTITY(1,1) NOT NULL,
	[name_english] [varchar](100) NOT NULL,
	[name_scientific] [varchar](100) NOT NULL,
	[length_max_cm] [int] NULL,
	[url_source] [varchar](1000) NULL
)
GO

This table contains a list of sharks, courtesy of the Shark Foundation.

Now, let’s say that I have a data pipeline that uses data in amazonwebshark.2022.sharkspecies for transformations further down the pipeline.

No problem – I create a #tempsharks temp table and insert everything from amazonwebshark.2022.sharkspecies using SELECT *:

When this script runs in production, I get two tables with the same data:

2022 11 02 SQLResults1

What’s The Problem?

One day a new last_evaluated column is needed in the amazonwebshark.2022.sharkspecies table. I add the new column and backfill it with 2019:

ALTER TABLE [2022].sharkspecies
ADD last_evaluated INT DEFAULT 2019 WITH VALUES
GO

However, my script now fails when trying to insert data into #tempsharks:

2022 11 02 SQLResults2Sharp
(1 row affected)

(4 rows affected)

Msg 213, Level 16, State 1, Line 17
Column name or number of supplied values does not match table definition.

Completion time: 2022-11-02T18:00:43.5997476+00:00

#tempsharks has five columns but amazonwebshark.2022.sharkspecies now has six. My script is trying to insert all six sharkspecies columns into the five-column temp table, causing the Msg 213 error.

Doing Things Differently

The solution here is to replace row 21’s SELECT * with the precise columns to insert from amazonwebshark.2022.sharkspecies:

While amazonwebshark.2022.sharkspecies now has six columns, my script is only inserting five of them into the temp table:

2022 11 02 SQLResults3Sharp

I can add the last_evaluated column into #tempsharks in future, but its absence in the temp table isn’t causing any immediate problems.
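
The failure and the fix can be reproduced away from SQL Server. The following is a minimal sketch in Python, assuming an in-memory SQLite database as a stand-in for the amazonwebshark database. The table and column names come from the script above. The single sample row and the load_temp helper are hypothetical, and SQLite's error text differs from SQL Server's Msg 213.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (an assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE sharkspecies (
        shark_id INTEGER PRIMARY KEY AUTOINCREMENT,
        name_english TEXT NOT NULL,
        name_scientific TEXT NOT NULL,
        length_max_cm INTEGER,
        url_source TEXT
    )"""
)
# One sample row so there is something to copy.
conn.execute(
    "INSERT INTO sharkspecies (name_english, name_scientific) "
    "VALUES ('Whale Shark', 'Rhincodon typus')"
)

TEMP_DDL = """CREATE TEMP TABLE tempsharks (
    shark_id INTEGER, name_english TEXT, name_scientific TEXT,
    length_max_cm INTEGER, url_source TEXT)"""

SELECT_STAR = "SELECT * FROM sharkspecies"
SELECT_EXPLICIT = (
    "SELECT shark_id, name_english, name_scientific, length_max_cm, url_source "
    "FROM sharkspecies"
)


def load_temp(select_sql):
    """Rebuild the five-column temp table, fill it from select_sql, return the row count."""
    conn.execute("DROP TABLE IF EXISTS tempsharks")
    conn.execute(TEMP_DDL)
    conn.execute(f"INSERT INTO tempsharks {select_sql}")
    return conn.execute("SELECT COUNT(*) FROM tempsharks").fetchone()[0]


# SELECT * works while the source table still has five columns.
rows_before = load_temp(SELECT_STAR)

# The schema change: a sixth column, backfilled with 2019.
conn.execute("ALTER TABLE sharkspecies ADD COLUMN last_evaluated INTEGER DEFAULT 2019")

# SELECT * now supplies six values for a five-column table and fails.
try:
    load_temp(SELECT_STAR)
    star_error = None
except sqlite3.OperationalError as exc:
    star_error = str(exc)

# Naming the columns keeps the insert working after the schema change.
rows_explicit = load_temp(SELECT_EXPLICIT)

print("SELECT * error:", star_error)
print("Rows copied with explicit columns:", rows_explicit)
```

The explicit column list is the only difference between the failing and the working insert, which is the point of this section: the script states exactly what it needs, so unrelated schema changes don't break it.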

Works The Same In Other Environments

In this section, I use Python to show the value of production code that works the same in non-production.

Setting The Scene

Here I have a Python script that reads data from an Amazon S3 bucket using a boto3 session. I pass my AWS_ACCESSKEY and AWS_SECRET credentials in from a secrets manager, and create an s3bucket variable for the S3 bucket path:

When I deploy this script to my dev environment it works fine.

What’s The Problem?

When I deploy this script to production, s3bucket will still be s3://dev-bucket. The potential impact of this depends on the AWS environment setup:

Different AWS account for each environment:

  • dev-bucket doesn’t exist in Production. The script fails.

Same AWS account for all environments:

  • Production IAM roles might not have any permissions for dev-bucket. The script fails.
  • Production processes might start using a dev resource. The script succeeds but now data has unintentionally crossed environment boundaries.

Doing Things Differently

A solution here is to dynamically set the s3bucket variable based on the ID of the AWS account the script is running in.

I can get the AccountID using AWS STS. I’m already using boto3, so can use it to initiate an STS client with my AWS credentials.

STS then has a GetCallerIdentity action that returns the AWS AccountID linked to the AWS credentials. I capture this AccountID in an account_id variable, then use that to set s3bucket‘s value:

More details about get_caller_identity can be found in the AWS Boto3 documentation.

For bonus points, I can terminate the script if the AWS AccountID isn’t defined. This prevents undesirable states if the script is run in an unexpected account.
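
A minimal sketch of this approach is below. The account IDs, the s3://prod-bucket name and the function names are hypothetical placeholders, and I'm assuming the credentials are passed straight to the STS client rather than through a session. get_caller_identity is the boto3 call mentioned above, and its response includes an Account key.

```python
import sys

# Hypothetical account IDs and bucket names, for illustration only.
BUCKETS_BY_ACCOUNT = {
    "111111111111": "s3://dev-bucket",
    "222222222222": "s3://prod-bucket",
}


def resolve_bucket(account_id, buckets=BUCKETS_BY_ACCOUNT):
    """Return the bucket path for an AWS account, or None if the account is not recognised."""
    return buckets.get(account_id)


def current_account_id(access_key, secret_key):
    """Ask AWS STS which account the supplied credentials belong to."""
    # Imported here so the mapping logic above can be exercised without boto3 installed.
    import boto3

    sts = boto3.client(
        "sts",
        aws_access_key_id=access_key,
        aws_secret_access_key=secret_key,
    )
    return sts.get_caller_identity()["Account"]


def main(access_key, secret_key):
    """Pick the bucket for the current account, or stop if the account is unexpected."""
    account_id = current_account_id(access_key, secret_key)
    s3bucket = resolve_bucket(account_id)
    if s3bucket is None:
        # Terminate rather than guess which bucket to use.
        sys.exit(f"Unexpected AWS account {account_id}. Exiting.")
    return s3bucket
```

Keeping the account-to-bucket mapping in one place means the same script can be deployed unchanged to every environment, and an unrecognised account stops the script before it touches any bucket.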

Speaking of which…

Prevents Undesirable States

In this section, I use PowerShell to demonstrate how to stop production code from doing unintended things.

Setting The Scene

In June I started writing a PowerShell script to upload lossless music files from my laptop to one of my S3 buckets.

I worked on it in stages. This made it easier to script and test the features I wanted. By the end of Version 1, I had a script that dot-sourced its variables and wrote everything in my local folder $ExternalLocalSource to my S3 bucket $ExternalS3BucketName:

#Load Variables Via Dot Sourcing
. .\EDMTracksLosslessS3Upload-Variables.ps1


#Upload File To S3
Write-S3Object -BucketName $ExternalS3BucketName -Folder $ExternalLocalSource -KeyPrefix $ExternalS3KeyPrefix -StorageClass $ExternalS3StorageClass

What’s The Problem?

NOTE: There were several problems with Version 1, all of which were fixed in Version 2. In the interests of simplicity, I’ll focus on a single one here.

In this script, Write-S3Object will upload everything in the local folder $ExternalLocalSource to the S3 bucket $ExternalS3BucketName.

Problem is, the $ExternalS3BucketName S3 bucket isn’t for everything! It should only contain lossless music files!

At best, Write-S3Object will upload everything in the local folder to S3 whether it’s music or not.

At worst, if the script is pointing at a different folder it will start uploading everything there instead! PowerShell commonly defaults to C:\Windows, so this could cause all kinds of problems.

Doing Things Differently

I decided to limit the extensions that the PowerShell script could upload.

Firstly, the script captures the extensions for each file in the local folder $ExternalLocalSource using Get-ChildItem and [System.IO.Path]::GetExtension:

$LocalSourceObjectFileExtensions = Get-ChildItem -Path $ExternalLocalSource | ForEach-Object -Process { [System.IO.Path]::GetExtension($_) }

Then it checks each extension using a ForEach loop. If an extension isn’t in the list, PowerShell reports this and terminates the script:

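ForEach ($LocalSourceObjectFileExtension In $LocalSourceObjectFileExtensions)
{
    # Stop the script if any file has an extension that isn't on the lossless allowlist
    If ($LocalSourceObjectFileExtension -NotIn ".flac", ".wav", ".aif", ".aiff")
    {
        Write-Output "Unacceptable $LocalSourceObjectFileExtension file found.  Exiting."
        Start-Sleep -Seconds 10
        Exit
    }
    Else
    {
        Write-Output "Acceptable $LocalSourceObjectFileExtension file."
    }
}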

So now, if I attempt to upload an unacceptable .log file, PowerShell reports it and terminates the script:

**********************
Transcript started, output file is C:\Files\EDMTracksLosslessS3Upload.log

Checking extensions are valid for each local file.
Unacceptable .log file found.  Exiting.
**********************

While an acceptable .flac file will produce this message:

**********************
Transcript started, output file is C:\Files\EDMTracksLosslessS3Upload.log

Checking extensions are valid for each local file.
Acceptable .flac file.
**********************
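
The same guard isn't tied to PowerShell. Here is a minimal sketch of the allowlist check in Python, assuming the same four lossless extensions. The function name and the temporary sample folder are hypothetical. Unlike the Get-ChildItem version, this sketch only looks at files and ignores subfolders.

```python
import tempfile
from pathlib import Path

# The same lossless extensions the PowerShell script accepts.
ALLOWED_EXTENSIONS = {".flac", ".wav", ".aif", ".aiff"}


def unacceptable_extensions(folder, allowed=ALLOWED_EXTENSIONS):
    """Return the sorted extensions of files in folder that are not on the allowlist."""
    return sorted(
        {
            path.suffix.lower()
            for path in Path(folder).iterdir()
            if path.is_file() and path.suffix.lower() not in allowed
        }
    )


# Demonstration against a throwaway folder holding one good and one bad file.
with tempfile.TemporaryDirectory() as folder:
    for name in ("track.flac", "session.log"):
        (Path(folder) / name).touch()
    bad = unacceptable_extensions(folder)

print(bad)  # ['.log']
```

An upload step would only run when the returned list is empty, so anything unexpected stops the process before a single file is sent.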

To see the code in full, as well as the other problems I solved, please check out my post from June.

Summary

In this post, I responded to November 2022’s T-SQL Tuesday #156 Invitation and gave my thoughts on some production code qualities. I gave examples of each quality and showed how they could save time and prevent unintended problems in a production environment.

Thanks to Tomáš for this month’s topic! My previous T-SQL Tuesday posts are here.

Thanks for reading ~~^~~

Categories
Me

Fixing A Broken Tap With George Pólya

In this post, I use the principles in “How To Solve It” by George Pólya to diagnose and fix my broken kitchen tap. Yes – really.

Introduction

Bit of a change this time. Let me set the scene.

It’s time to top up Wolfie’s water bowl, so to the kitchen sink we go. Two unexpected events happen when the tap is turned on:

  1. The water flow goes mental and starts spraying everywhere.
  2. Something gets launched out of the tap into the water bowl:
Aerator Initial

My first thought is that the tap is broken and that I’ll need to buy a new one. And then get a plumber to fit it. Great.

But wait. Last year I fixed some broken panes in our greenhouse. This year I’ve built a potting bench, fixed a leaky water butt and mounted a shower rail. Is this a problem I can solve?

This Doesn’t Sound Like Technology

True. It is, though, a chance to write a post I’ve fancied doing for a while. And this set of circumstances was too compelling to pass up.

Last year I became aware of a book called “How To Solve It” by George Pólya. The recommendation included a chart based on the book, similar to this one:

Source: KPMathematics

What struck me was how close these steps were to the Systems Development Life Cycle I was taught at college. My interest was piqued.

Around the same time, I was getting to grips with my new Data Engineer role. Since then, I’ve used the “How To Solve It” principles to help me complete work both for my role and for this blog.

Now, faced with a new unfamiliar situation, I can demonstrate how the “How To Solve It” principles can be applied beyond mathematics. In this case I’m fixing a broken tap, but this could just as easily be a Python bug, a poorly performing SQL query or an AWS authentication issue.

Here is my plan:

  • Firstly, I’ll examine the “How To Solve It” book.
  • Secondly, I’ll look at the author of the book – George Pólya.
  • Then I’ll look at each of the George Pólya principles, relating them to the broken tap problem I want to solve.

Let’s start with the book.

How To Solve It

Source: Penguin

‘A superb book on how to think fresh thoughts … A walk inside Pólya’s mind as he builds up maxims on how to comprehend a problem, how to build up a strategy, and then how to test it.’

David Bodanis, Guardian

‘Everyone should know the work of George Polya on how to solve problems’

Marvin Minsky

How To Solve It can be bought on Penguin’s website.

History

How To Solve It was written in 1945 by George Pólya. Since then, the book has stayed in print and has been translated into over a dozen languages. It has sold more than 1 million copies, making it one of the most widely circulated mathematics books in history.

Four Principles

How To Solve It explains in non-technical terms how to think about invention, discovery, creativity and analysis. Central to this are four principles:

  1. First. You have to understand the problem.
  2. Second. Find the connection between the data and the unknown. You may be obliged to consider auxiliary problems if an immediate connection cannot be found. You should obtain eventually a plan of the solution.
  3. Third. Carry out your plan.
  4. Fourth. Examine the solution obtained.

The book also poses several questions for each principle. They aim to stimulate thought and produce the answers needed to satisfy each principle.

These can be seen in the below image from the book’s first edition:

2022 10 25 HowToSolveItInsideCover

They are also available as text from the University of Utah’s summary of 1957’s second edition.

Although How To Solve It was written with mathematics in mind, the book’s principles have been applied to additional disciplines over the decades. Pólya seems to take great care not to limit the scope of How To Solve It, speaking of problems in general terms throughout the book.

One such example is this extract:

A great discovery solves a great problem but there is a grain of discovery in the solution of any problem. Your problem may be modest; but if it challenges your curiosity and brings into play your inventive faculties, and if you solve it by your own means, you may experience the tension and enjoy the triumph of discovery.

“How To Solve It” – George Pólya

How To Solve It remains in high regard to this day. The Math Sorcerer produced this video in July 2022, and his affection for the book is clear.

Sources

Next, let’s look at the book’s author – George Pólya.

George Pólya

George Pólya
Source: MacTutor

George Pólya (December 13 1887 – September 7 1985, aged 97) was a Hungarian mathematician. He was a professor of mathematics from 1914 to 1940 at ETH Zürich in Switzerland, and from 1940 to 1953 at Stanford University in the United States, having moved there during World War 2.

After retiring from Stanford, Pólya remained active in his field. He continued his association with Stanford as Professor Emeritus well into his 90s and taught a course in their Computer Science Department in 1978.

Works

In pure mathematics, Pólya made important discoveries in fields including probability, real and complex analysis, combinatorics, geometry, number theory and mathematical physics.

Several of his discoveries bear his name, including:

Pólya also authored and contributed to numerous books and articles throughout his life, a selection of which can be seen on Wikipedia.

Recognition

Pólya was well-regarded by his peers and awards given to him included:

“He has given a new dimension to problem-solving by emphasizing the organic building up of elementary steps into a complex proof, and conversely, the decomposition of mathematical invention into smaller steps.”

and:

“Problem solving a la Polya serves not only to develop mathematical skill but also teaches constructive reasoning in general.”

Sources

There is much more to know about Pólya. The following links detail his life, works and legacy in far greater detail:

Now I’m going to apply the George Pólya principles to my broken tap!

Applying The Principles

In the following sections, I will apply each George Pólya principle from How To Solve It to my tap problem. In each section I will:

  • Quote each principle in full.
  • State the supporting questions that I’ll answer.
  • Relate these to my tap problem.

Principle 1: Understanding The Problem

First. You have to understand the problem.

“How To Solve It” – George Pólya
  • What is the unknown? What are the data? What is the condition?

The Unknown

The unknown is what I want. Here, I want to restore the tap’s original flow rate.

The Data

The data is the information available. This is what was expelled from the tap:

Aerator Initial

Other data:

  • Water was still flowing from the tap.
  • The flow of water was under more pressure than before.

The Condition

The condition is the link between the unknown and the data. Here, whatever has come out of the tap has changed the water’s flow but hasn’t obstructed it.

Principle 2: Devising A Plan

Second. Find the connection between the data and the unknown. You may be obliged to consider auxiliary problems if an immediate connection cannot be found. You should obtain eventually a plan of the solution.

“How To Solve It” – George Pólya
  • Do you know a related problem? Do you know a theorem that could be useful?
  • Here is a problem related to yours and solved before. Could you use it? Could you use its result? Could you use its method?

I searched Google for the phrase “kitchen tap water flow changed”. There was an immediate common thread in the results:

2022 10 26 GoogleResults

The Estes Services link gave a useful definition of an aerator:

“The aerator on your faucet is a mesh screen and covers the water outlet. The aerator catches minerals and other debris in your pipes. It also helps save water by introducing air into the water stream.”

“How to Fix Low Water Pressure in Kitchen” on Estes

Getting somewhere! This led me to “Everything you Need to Know About Tap Aerators” on TapWarehouse, which includes this:

“They save you water by adding oxygen to the flow (and that means saving pennies) and reduce splashing around the bowl of the basin.”

“Everything you Need to Know About Tap Aerators” on TapWarehouse

Mesh screen? Reduced splashing? This definitely sounded like the right area!

Solved Problems

At this point, what came out of the tap sounded very much like an aerator. However, there’s no cleaning something that’s disintegrated, so it was time for a replacement.

TapWarehouse to the rescue again:

“If your existing tap already has an aerator, simply turn it anticlockwise until it’s unscrewed from the tap. Then, simply screw in the new aerator until it’s secure, being careful not to screw it too tightly.”

“How can I Install a Tap Aerator?” on TapWarehouse

TapWarehouse also gave advice on aerator types. There are male and female aerators depending on the tap. There are also various aerator sizes ranging from 16mm to 28mm.

Planned Solution

Based on this research, the solution needed the following steps:

  • Remove the broken aerator.
  • Confirm the aerator type.
  • Confirm the aerator size.
  • Buy a replacement aerator.
  • Fit the replacement aerator.
  • Test the replacement aerator.

Principle 3: Carrying Out The Plan

Third. Carry out your plan.

“How To Solve It” – George Pólya
  • Carrying out your plan of the solution, check each step. Can you see clearly that the step is correct?

Time to remove the broken aerator! Straight into a problem. It wouldn’t budge.

Fortunately, there’s a DIY StackExchange! Advice ranged from WD-40 to vinegar to a hammer and chisel (!), but in the end I used my heat gun on the aerator and removed it with pliers.

I then determined that I needed a 24mm male aerator as a replacement. One trip to B&Q later and:

Aerator Replacement

Fitting the new aerator was a simple matter of screwing it on.

Principle 4: Looking Back

Fourth. Examine the solution obtained.

“How To Solve It” – George Pólya
  • Can you check the result?

BEHOLD:

Tap with running water

Summary

In this post, I used the principles in “How To Solve It” by George Pólya to diagnose and fix my broken kitchen tap. I applied each of the Pólya principles to my problem, and was able to solve it by answering the relevant questions and doing some investigation with the knowledge gained.

Thanks for reading ~~^~~