Posts - Page 15 of 22 - amazonwebshark

Production Code Qualities

Post author By Damien Jones
Post date November 8, 2022

In this post, I respond to November 2022’s T-SQL Tuesday #156 Invitation and give my thoughts on some production code qualities.

Introduction
Precision
Works The Same In Other Environments
Prevents Undesirable States
Summary

Introduction

This month, Tomáš Zíka’s T-SQL Tuesday invitation was as follows:

Which quality makes code production grade?

Please be as specific as possible with your examples and include your reasoning.

Good question!

In each section, I’ll use a different language. Firstly I’ll create a script, and then show a problem the script could encounter in production. Finally, I’ll show how a different approach can prevent that problem from occurring.

I’m limiting myself to three production code qualities to keep the post at a reasonable length, and so I can show some good examples.

Precision

In this section, I use T-SQL to show how precise code in production can save a data pipeline from unintended failure.

Setting The Scene

Consider the following SQL table:

USE [amazonwebshark]
GO

CREATE TABLE [2022].[sharkspecies](
	[shark_id] [int] IDENTITY(1,1) NOT NULL,
	[name_english] [varchar](100) NOT NULL,
	[name_scientific] [varchar](100) NOT NULL,
	[length_max_cm] [int] NULL,
	[url_source] [varchar](1000) NULL
)
GO

This table contains a list of sharks, courtesy of the Shark Foundation.

Now, let’s say that I have a data pipeline that uses data in amazonwebshark.2022.sharkspecies for transformations further down the pipeline.

No problem – I create a #tempsharks temp table and insert everything from amazonwebshark.2022.sharkspecies using SELECT *:

When this script runs in production, I get two tables with the same data:

What’s The Problem?

One day a new last_evaluated column is needed in the amazonwebshark.2022.sharkspecies table. I add the new column and backfill it with 2019:

ALTER TABLE [2022].sharkspecies
ADD last_evaluated INT DEFAULT 2019 WITH VALUES
GO

However, my script now fails when trying to insert data into #tempsharks:

(1 row affected)

(4 rows affected)

Msg 213, Level 16, State 1, Line 17
Column name or number of supplied values does not match table definition.

Completion time: 2022-11-02T18:00:43.5997476+00:00

#tempsharks has five columns but amazonwebshark.2022.sharkspecies now has six. My script is now trying to insert all six sharkspecies columns into the temp table, causing the Msg 213 error.

Doing Things Differently

The solution here is to replace row 21’s SELECT * with the precise columns to insert from amazonwebshark.2022.sharkspecies:

While amazonwebshark.2022.sharkspecies now has six columns, my script is only inserting five of them into the temp table:

I can add the last_evaluated column into #tempsharks in future, but its absence in the temp table isn’t causing any immediate problems.
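
As a rough illustration, the same failure and fix can be reproduced outside SQL Server. This sketch uses Python’s built-in sqlite3 module rather than T-SQL, so it is an analogue and not the production script: the schema prefix is dropped, IDENTITY becomes INTEGER PRIMARY KEY, and a TEMP table stands in for #tempsharks. The two sample rows are placeholders, not the real table contents.

```python
import sqlite3

# The five columns the temp table expects, named explicitly.
COLUMNS = "shark_id, name_english, name_scientific, length_max_cm, url_source"


def build_source(conn):
    """Create a cut-down sharkspecies table with two placeholder rows."""
    conn.execute(
        """CREATE TABLE sharkspecies (
               shark_id INTEGER PRIMARY KEY,
               name_english TEXT NOT NULL,
               name_scientific TEXT NOT NULL,
               length_max_cm INTEGER,
               url_source TEXT)"""
    )
    conn.executemany(
        "INSERT INTO sharkspecies (name_english, name_scientific) VALUES (?, ?)",
        [("Whale Shark", "Rhincodon typus"),
         ("Great White Shark", "Carcharodon carcharias")],
    )


def copy_to_temp(conn, select_list):
    """Rebuild the five-column temp table and fill it from sharkspecies."""
    conn.execute("DROP TABLE IF EXISTS tempsharks")
    conn.execute(
        """CREATE TEMP TABLE tempsharks (
               shark_id INTEGER, name_english TEXT, name_scientific TEXT,
               length_max_cm INTEGER, url_source TEXT)"""
    )
    conn.execute(f"INSERT INTO tempsharks SELECT {select_list} FROM sharkspecies")
    cursor = conn.execute("SELECT COUNT(*) FROM tempsharks")
    row_count = cursor.fetchone()[0]
    cursor.close()
    return row_count


conn = sqlite3.connect(":memory:")
build_source(conn)
rows_before = copy_to_temp(conn, "*")  # five columns into five: fine

# Add the sixth column, mirroring the ALTER TABLE statement in this section.
conn.execute("ALTER TABLE sharkspecies ADD COLUMN last_evaluated INTEGER DEFAULT 2019")

try:
    copy_to_temp(conn, "*")  # six columns into five: rejected
    star_error = None
except sqlite3.Error as exc:
    star_error = str(exc)

rows_after = copy_to_temp(conn, COLUMNS)  # explicit columns still work
```

After the new column is added, the SELECT * copy fails on the column count mismatch while the explicit column list carries on working.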

Works The Same In Other Environments

In this section, I use Python to show the value of production code that works the same in non-production.

Setting The Scene

Here I have a Python script that reads data from an Amazon S3 bucket using a boto3 session. I pass my AWS_ACCESSKEY and AWS_SECRET credentials in from a secrets manager, and create an s3bucket variable for the S3 bucket path:

When I deploy this script to my dev environment it works fine.

What’s The Problem?

When I deploy this script to production, s3bucket will still be s3://dev-bucket. The potential impact of this depends on the AWS environment setup:

Different AWS account for each environment:

dev-bucket doesn’t exist in Production. The script fails.

Same AWS account for all environments:

Production IAM roles might not have any permissions for dev-bucket. The script fails.
Production processes might start using a dev resource. The script succeeds but now data has unintentionally crossed environment boundaries.

Doing Things Differently

A solution here is to dynamically set the s3bucket variable based on the ID of the AWS account the script is running in.

I can get the AccountID using AWS STS. I’m already using boto3, so I can use it to create an STS client with my AWS credentials.

STS then has a GetCallerIdentity action that returns the AWS AccountID linked to the AWS credentials. I capture this AccountID in an account_id variable, then use that to set s3bucket’s value:

More details about get_caller_identity can be found in the AWS Boto3 documentation.

For bonus points, I can terminate the script if the AWS AccountID isn’t defined. This prevents undesirable states if the script is run in an unexpected account.
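
A minimal sketch of this approach might look like the following. The account IDs, the prod bucket name and the function names are hypothetical; only s3://dev-bucket comes from the scenario above. The STS call is kept in its own function so that the account-to-bucket lookup can be checked without AWS access.

```python
# Hypothetical account IDs and bucket paths, for illustration only.
BUCKETS_BY_ACCOUNT = {
    "111111111111": "s3://dev-bucket",
    "222222222222": "s3://prod-bucket",
}


def get_account_id(access_key, secret):
    """Return the AWS AccountID linked to the supplied credentials."""
    import boto3  # imported here so the lookup below works without boto3

    sts = boto3.client(
        "sts",
        aws_access_key_id=access_key,
        aws_secret_access_key=secret,
    )
    return sts.get_caller_identity()["Account"]


def bucket_for_account(account_id):
    """Map an AccountID to its bucket, stopping if the account is unknown."""
    try:
        return BUCKETS_BY_ACCOUNT[account_id]
    except KeyError:
        # Terminate rather than guess which bucket to use.
        raise SystemExit(f"Unexpected AWS account {account_id}. Exiting.")
```

In the script itself this would be used as s3bucket = bucket_for_account(get_account_id(AWS_ACCESSKEY, AWS_SECRET)), so the same code resolves to a different bucket in each environment.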

Speaking of which…

Prevents Undesirable States

In this section, I use PowerShell to demonstrate how to stop production code from doing unintended things.

Setting The Scene

In June I started writing a PowerShell script to upload lossless music files from my laptop to one of my S3 buckets.

I worked on it in stages. This made it easier to script and test the features I wanted. By the end of Version 1, I had a script that dot-sourced its variables and wrote everything in my local folder $ExternalLocalSource to my S3 bucket $ExternalS3BucketName:

#Load Variables Via Dot Sourcing
. .\EDMTracksLosslessS3Upload-Variables.ps1


#Upload File To S3
Write-S3Object -BucketName $ExternalS3BucketName -Folder $ExternalLocalSource -KeyPrefix $ExternalS3KeyPrefix -StorageClass $ExternalS3StorageClass

What’s The Problem?

NOTE: There were several problems with Version 1, all of which were fixed in Version 2. In the interests of simplicity, I’ll focus on a single one here.

In this script, Write-S3Object will upload everything in the local folder $ExternalLocalSource to the S3 bucket $ExternalS3BucketName.

Problem is, the $ExternalS3BucketName S3 bucket isn’t for everything! It should only contain lossless music files!

At best, Write-S3Object will upload everything in the local folder to S3 whether it’s music or not.

At worst, if the script is pointing at a different folder it will start uploading everything there instead! PowerShell commonly defaults to C:\Windows, so this could cause all kinds of problems.

Doing Things Differently

I decided to limit the extensions that the PowerShell script could upload.

Firstly, the script captures the extensions for each file in the local folder $ExternalLocalSource using Get-ChildItem and [System.IO.Path]::GetExtension:

$LocalSourceObjectFileExtensions = Get-ChildItem -Path $ExternalLocalSource | ForEach-Object -Process { [System.IO.Path]::GetExtension($_) }

Then it checks each extension using a ForEach loop. If an extension isn’t in the list, PowerShell reports this and terminates the script:

#Check Each Extension Against The Accepted List
ForEach ($LocalSourceObjectFileExtension In $LocalSourceObjectFileExtensions)
{
    If ($LocalSourceObjectFileExtension -NotIn ".flac", ".wav", ".aif", ".aiff")
    {
        #Report The Unacceptable Extension, Pause So The Message Can Be Read, Then Stop
        Write-Output "Unacceptable $LocalSourceObjectFileExtension file found.  Exiting."
        Start-Sleep -Seconds 10
        Exit
    }
}

So now, if I attempt to upload an unacceptable .log file, PowerShell reports the problem and terminates the script:

**********************
Transcript started, output file is C:\Files\EDMTracksLosslessS3Upload.log

Checking extensions are valid for each local file.
Unacceptable .log file found.  Exiting.
**********************

While an acceptable .flac file will produce this message:

**********************
Transcript started, output file is C:\Files\EDMTracksLosslessS3Upload.log

Checking extensions are valid for each local file.
Acceptable .flac file.
**********************

To see the code in full, as well as the other problems I solved, please check out my post from June.

Summary

In this post, I responded to November 2022’s T-SQL Tuesday #156 Invitation and gave my thoughts on some production code qualities. I gave examples of each quality and showed how they could save time and prevent unintended problems in a production environment.

Thanks to Tomáš for this month’s topic! My previous T-SQL Tuesday posts are here.

Thanks for reading ~~^~~

Tags Amazon Web Services, GitHub, Microsoft SQL Server, PowerShell, Python, SQL, T-SQL Tuesday, Troubleshooting

Fixing A Broken Tap With George Pólya

Post author By Damien Jones
Post date October 31, 2022

In this post, I use the principles in “How To Solve It” by George Pólya to diagnose and fix my broken kitchen tap. Yes – really.

Introduction

Bit of a change this time. Let me set the scene.

It’s time to top up Wolfie’s water bowl, so to the kitchen sink we go. Two unexpected events happen when the tap is turned on:

The water flow goes mental and starts spraying everywhere.
Something gets launched out of the tap into the water bowl:

My first thought is that the tap is broken and that I’ll need to buy a new one. And then get a plumber to fit it. Great.

But wait. Last year I fixed some broken panes in our greenhouse. This year I’ve built a potting bench, fixed a leaky water butt and mounted a shower rail. Is this a problem I can solve?

This Doesn’t Sound Like Technology

True. It is, though, a chance to write a post I’ve fancied doing for a while. And this set of circumstances was too compelling to pass up.

Last year I became aware of a book called “How To Solve It” by George Pólya. The recommendation included a chart based on the book, similar to this one:

What struck me was how close these steps were to the Systems Development Life Cycle I was taught at college. My interest was piqued.

Around the same time, I was getting to grips with my new Data Engineer role. Since then, I’ve used the “How To Solve It” principles to help me complete work both for my role and for this blog.

Now, faced with a new unfamiliar situation, I can demonstrate how the “How To Solve It” principles can be applied beyond mathematics. In this case I’m fixing a broken tap, but this could just as easily be a Python bug, a poorly performing SQL query or an AWS authentication issue.

Here is my plan:

Firstly, I’ll examine the “How To Solve It” book.
Secondly, I’ll look at the author of the book – George Pólya.
Then I’ll look at each of the George Pólya principles, relating them to the broken tap problem I want to solve.

Let’s start with the book.

How To Solve It

‘A superb book on how to think fresh thoughts … A walk inside Pólya’s mind as he builds up maxims on how to comprehend a problem, how to build up a strategy, and then how to test it.’
David Bodanis, Guardian

‘Everyone should know the work of George Polya on how to solve problems’
Marvin Minsky

How To Solve It can be bought on Penguin’s website.

History

How To Solve It was written in 1945 by George Pólya. Since then, the book has stayed in print and has been translated into over a dozen languages. It has sold more than 1 million copies, making it one of the most widely circulated mathematics books in history.

Four Principles

How To Solve It explains in non-technical terms how to think about invention, discovery, creativity and analysis. Central to this are four principles:

First. You have to understand the problem.
Second. Find the connection between the data and the unknown. You may be obliged to consider auxiliary problems if an immediate connection cannot be found. You should obtain eventually a plan of the solution.
Third. Carry out your plan
Fourth. Examine the solution obtained.

The book also poses several questions for each principle. They aim to stimulate thought and produce the answers needed to satisfy each principle.

These can be seen in the below image from the book’s first edition:

They are also available as text from the University of Utah’s summary of 1957’s second edition.

Although How To Solve It was written with mathematics in mind, the book’s principles have been applied to additional disciplines over the decades. Pólya seems to take great care not to limit the scope of How To Solve It, speaking of problems in general terms throughout the book.

One such example is this extract:

A great discovery solves a great problem but there is a grain of discovery in the solution of any problem. Your problem may be modest; but if it challenges your curiosity and brings into play your inventive faculties, and if you solve it by your own means, you may experience the tension and enjoy the triumph of discovery.
“How To Solve It” – George Pólya

How To Solve It is still held in high regard to this day. The Math Sorcerer produced this video in July 2022, and his affection for the book is clear.

Sources

How To Solve It on Wikipedia

Next, let’s look at the book’s author – George Pólya.

George Pólya

George Pólya (December 13, 1887 – September 7, 1985, aged 97) was a Hungarian mathematician. He was a professor of mathematics at ETH Zürich in Switzerland from 1914 to 1940, and at Stanford University in the United States from 1940 to 1953, having moved there during World War 2.

After retiring from Stanford, Pólya remained active in his field. He continued his association with Stanford as Professor Emeritus well into his 90s and taught a course in their Computer Science Department in 1978.

Works

In pure mathematics, Pólya made important discoveries in fields including probability, real and complex analysis, combinatorics, geometry, number theory and mathematical physics.

Several of his discoveries bear his name, including:

Pólya Criterion in Probability Theory.
Pólya Peaks in Complex Function Theory.
Pólya Enumeration Theorem in Combinatorics.

Pólya also authored and contributed to numerous books and articles throughout his life, a selection of which can be seen on Wikipedia.

Recognition

Pólya was well regarded by his peers. His awards and honours included:

Membership of the American National Academy of Sciences, the American Academy of Arts and Sciences and the California Mathematics Council.
Honorary Membership of the Hungarian Academy Of Sciences, the London Mathematical Society and the Swiss Mathematical Society.
Distinguished Service Award from the Mathematical Association Of America, the citation for which included comments like:

“He has given a new dimension to problem-solving by emphasizing the organic building up of elementary steps into a complex proof, and conversely, the decomposition of mathematical invention into smaller steps.”

and:

“Problem solving a la Polya serves not only to develop mathematical skill but also teaches constructive reasoning in general.”

Sources

There is much more to know about Pólya. The following links cover his life, works and legacy in far greater detail:

Now I’m going to apply the George Pólya principles to my broken tap!

Applying The Principles

In the following sections, I will apply each George Pólya principle from How To Solve It to my tap problem. In each section I will:

Quote each principle in full.
State the supporting questions that I’ll answer.
Relate these to my tap problem.

Principle 1: Understanding The Problem

First. You have to understand the problem.
“How To Solve It” – George Pólya

What is the unknown? What are the data? What is the condition?

The Unknown

The unknown is what I want. Here, I want to restore the tap’s original flow rate.

The Data

The data is the information available. This is what was expelled from the tap:

Other data:

Water was still flowing from the tap.
The flow of water was under more pressure than before.

The Condition

The condition is the link between the unknown and the data. Here, whatever has come out of the tap has changed the water’s flow but hasn’t obstructed it.

Principle 2: Devising A Plan

Second. Find the connection between the data and the unknown. You may be obliged to consider auxiliary problems if an immediate connection cannot be found. You should obtain eventually a plan of the solution.
“How To Solve It” – George Pólya

Do you know a related problem? Do you know a theorem that could be useful?
Here is a problem related to yours and solved before. Could you use it? Could you use its result? Could you use its method?

I searched Google for the phrase “kitchen tap water flow changed”. There was an immediate common thread in the results:

The Estes Services link gave a useful definition of an aerator:

“The aerator on your faucet is a mesh screen and covers the water outlet. The aerator catches minerals and other debris in your pipes. It also helps save water by introducing air into the water stream.”
“How to Fix Low Water Pressure in Kitchen” on Estes

Getting somewhere! This led me to “Everything you Need to Know About Tap Aerators” on TapWarehouse, which includes this:

“They save you water by adding oxygen to the flow (and that means saving pennies) and reduce splashing around the bowl of the basin.”
“Everything you Need to Know About Tap Aerators” on TapWarehouse

Mesh screen? Reduced splashing? This definitely sounded like the right area!

Solved Problems

At this point, what came out of the tap sounded very much like an aerator. However, there’s no cleaning something that’s disintegrated, so it was time for a replacement.

TapWarehouse to the rescue again:

“If your existing tap already has an aerator, simply turn it anticlockwise until it’s unscrewed from the tap. Then, simply screw in the new aerator until it’s secure, being careful not to screw it too tightly.”
“How can I Install a Tap Aerator?” on TapWarehouse

TapWarehouse also gave advice on aerator types. There are male and female aerators depending on the tap. There are also various aerator sizes ranging from 16mm to 28mm.

Planned Solution

Based on this research, the solution needed the following steps:

Remove the broken aerator.
Confirm the aerator type.
Confirm the aerator size.
Buy a replacement aerator.
Fit the replacement aerator.
Test the replacement aerator.

Principle 3: Carrying Out The Plan

Third. Carry out your plan.
“How To Solve It” – George Pólya

Carrying out your plan of the solution, check each step. Can you see clearly that the step is correct?

Time to remove the broken aerator! Straight into a problem. It wouldn’t budge.

Fortunately, there’s a DIY StackExchange! Advice ranged from WD-40 to vinegar to a hammer and chisel (!), but in the end I used my heat gun on the aerator and removed it with pliers.

I then determined that I needed a 24mm male aerator as a replacement. One trip to B&Q later and:

Fitting the new aerator was a simple matter of screwing it on.

Principle 4: Looking Back

Fourth. Examine the solution obtained.
“How To Solve It” – George Pólya

Can you check the result?

BEHOLD:

Summary

In this post, I used the principles in “How To Solve It” by George Pólya to diagnose and fix my broken kitchen tap. I applied each of the Pólya principles to my problem, and was able to solve it by answering the relevant questions and doing some investigation with the knowledge gained.

Thanks for reading ~~^~~

Tags Academia, Shark Shelf

Data & Analytics

Writing User Stories For An iTunes Dashboard

Post author By Damien Jones
Post date September 30, 2022

In this post, I use Agile Methodology to collect requirements and write user stories for my iTunes dashboard.

Introduction

In my recent posts, I’ve been building a basic data pipeline for an iTunes export file. So far I have:

Built a Python ETL that extracts iTunes export data into a Pandas DataFrame and transforms some columns.
Saved the transformed data to Amazon S3 as a Parquet file.
Ingested the data into an Amazon Athena table.
Established a secure connection between Athena and Microsoft Power BI using Simba Athena.

Now I can think about analysing this data. I’m building a dashboard for this, as dashboards offer advantages including:

Dashboards communicate information quickly without having to run reports or write queries.
It is easy to sort and filter dashboards.
Using dashboard visuals doesn’t require knowledge of the maths and scripting behind them.
Dashboard visuals can interact with each other and respond to changes and selections.

Before I start thinking about my dashboard I should review the data. My preferred way of doing this is to create a data dictionary, so let’s start there.

Data Dictionary

In this section, I will explain what a data dictionary is and create one for the iTunes data in my Athena table.

Data Dictionary Introduction

A data dictionary is data about data, just as a dictionary contains information on and definitions of words. IBM’s Dictionary of Computing defines a data dictionary as:

“a centralized repository of information about data such as meaning, relationships to other data, origin, usage, and format”
IBM Dictionary of Computing

Typical attributes of data dictionaries include:

Data type
Description
Conditions: Is the data required? Does it have dependencies?
Default value
Minimum and maximum values

There are numerous online resources about data dictionaries. I like this video as a gentle introduction to the topic using real-world examples:

Now let’s apply this to my iTunes data.

iTunes Data Dictionary

Here, I have written a short data dictionary to give my data some context. I have divided the dictionary into data types and given a brief description of each field.

Some fields have two field names because they will be renamed on the dashboard:

album will become key as iTunes has no field for musical keys, so I use album instead.
name will become title for clarity.
tracknumber will become bpm as, although iTunes does have a BPM field, this is not included in its export files.

Strings:

album / key: A track’s musical key.
artist: A track’s artist(s).
genre: A track’s music genre.
name / title: A track’s title and mix.

Integers:

myrating: A track’s rating in iTunes. Min 0 Max 100 Interval +20.
myratingint: A track’s rating as an integer. Min 0 Max 5 Interval +1.
plays: A track’s total play count.
tracknumber / bpm: A track’s tempo in beats per minute.
year: A track’s year of release.

DateTimes:

dateadded: The date & time a track was added to iTunes.
datemodified: The date & time a track was last modified in iTunes.
lastplayed: The date & time a track was last played in iTunes.
dateaddeddate: The date a track was added to iTunes.
datemodifieddate: The date a track was last modified in iTunes.
lastplayeddate: The date a track was last played in iTunes.
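
The relationship between the paired fields can be sketched in a few lines. This assumes that myratingint is simply myrating divided by 20, and that each field ending in date is the date part of its date & time counterpart. The ranges above suggest this, but the transformation code isn’t shown in this post, and the function names are made up for illustration.

```python
from datetime import date, datetime


def rating_to_int(myrating: int) -> int:
    """Convert an iTunes rating (0 to 100 in steps of 20) to a 0 to 5 integer."""
    if myrating not in range(0, 101, 20):
        raise ValueError(f"Unexpected iTunes rating: {myrating}")
    return myrating // 20


def to_date_only(value: datetime) -> date:
    """Drop the time part, e.g. to derive dateaddeddate from dateadded."""
    return value.date()
```

Rejecting ratings outside the documented range means bad source data surfaces in the pipeline instead of on the dashboard.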

For context, this is a typical example of a track’s details in the iTunes GUI:

For completeness, dateadded and datemodified are recorded in the File tab.

Now that the data is defined, I can start thinking about the dashboard. But where do I start?

Beginning The Design Process

In this section, I talk about how I got my dashboard design off the ground.

I didn’t decide to write user stories for my iTunes dashboard straightaway. This post took a few versions to get right, so I wanted to spend some time here running through my learning process.

Learning From The Past

This isn’t the first time I’ve made a dashboard for my own use. However, I have some unused and abandoned dashboards that usually have at least one of these problems:

They lack clarity and/or vision, resulting in a confusing user experience.
The use of excessive tabs causes a navigational nightmare.
Visuals are either poorly or incorrectly chosen.
Tables include excessive amounts of data. Even with conditional formatting, insights are hard to find.

When I started designing my iTunes dashboard, I was keen to make something that would be useful and stand the test of time. Having read Information Dashboard Design by Stephen Few, I learned that the problems above are common in dashboard design.

In his book, Stephen gives guidance on design considerations and critiques of sample dashboards that are just as useful now as they were when the book was published in 2006.

So I was now more clued up on design techniques. But that’s only half of what I needed…

What About The User?

I searched online for dashboard design tips while waiting for the book to arrive. While I did get useful results, they didn’t help me answer questions like:

How do I identify a dashboard’s purpose and requirements?
How do I justify my design choices?
What do I measure to confirm that the dashboard adds value?

I was ultimately pointed in the direction of Agile Analytics: A Value-Driven Approach to Business Intelligence and Data Warehousing by Dr Ken Collier. In this book, Ken draws on his professional experience to link the worlds of Business Intelligence and Agile Methodology, including sections on project management, collaboration and user stories.

(I should point out that I’ve only read Chapter 1: Agile Analytics: Management Methods and Chapter 4: User Stories for BI Systems so far, but I’m working on it!)

Wait! I’ve used Agile before! Writing user stories for my iTunes dashboard sounds like a great idea!

So let’s talk about Agile.

Introducing Agile

In this section, I will introduce some Agile concepts and share resources that helped me understand the theory behind them.

Agile Methodology

Agile is a software development methodology focusing on building and delivering software in incremental and iterative steps. The 2001 Manifesto for Agile Software Development declares the core Agile values as:

Individuals and interactions over processes and tools.
Working software over comprehensive documentation.
Customer collaboration over contract negotiation.
Responding to change over following a plan.

Online Agile resources are plentiful. Organisations like Atlassian and Wrike have produced extensive Agile resources that are great for beginners and can coach the more experienced.

For simpler introductions, I like Agile In A Nutshell and this Development That Pays video:

Epics

Kanbanize defines Epics as:

“…large pieces of work that can be broken down into smaller and more manageable work items, tasks, or user stories.”
“What Are Epics in Agile? Definition, Examples, and Tracking” by Kanbanize

I found the following Epic resources helpful:

Firstly, Wrike’s Complete Guide to Agile Epics is great for newcomers, with good examples of epics and explanations of how epics link to other elements like user stories and personas.

Secondly, Atlassian’s article about Epics goes into considerable detail and introduces topics like estimates and measuring.

Finally, this Dejan Majkic video explains Epics at a high level:

User Stories

Digité defines a User Story as:

“…a short, informal, plain language description of what a user wants to do within a software product to gain something they find valuable.”
“User Stories: What They Are and Why and How to Use Them” by Digité

I found the following User Story resources helpful:

Firstly, Atlassian’s article about User Stories explains the theory behind them, introducing User Stories as part of a wider Agile Project Management section.

Secondly, Wrike’s guide on How to Create User Stories offers practical advice including five steps for writing user stories and an introduction to the INVEST acronym.

This femke.design video gives a great personable introduction to User Stories:

Finally, this Atlassian video assumes some knowledge of User Stories and has more of a training feel:

Personas

Wrike defines Personas as:

“…fictional characteristics of the people that are most likely to buy your product. Personas provide a detailed summary of your ideal customer including demographic traits such as location, age, job title as well as psychographic traits such as behaviors, feelings, needs, and challenges.”
“What Are Agile Personas?” by Wrike

The full Wrike Agile Personas article expands on this definition and offers guidance on getting started with personas.

This Atlassian video gives additional tips and advice on creating personas:

That’s all the theory for now! Let’s begin writing the user stories I’m going to use for my iTunes dashboard, starting by creating a persona.

My Persona

In this section, I will create the persona I’ll use to write the epic and user stories for my iTunes dashboard. But why bother?

Why Create A Persona?

Some sources consider personas to be optional while others prioritise them highly. I’m using one here for a few reasons:

Firstly, I’m my own customer here. Using a persona will make it easier to identify my requirements, and ringfence ‘engineer me’ from ‘user me’.
Secondly, the persona will help to focus the project. Once the persona’s goals have been defined, they will present a clear target to work towards.
Finally, creating a persona makes this post far easier to write! Pulling my requirements out of thin air doesn’t feel authentic, and writing about myself in the third person is WEIRD.

So who is this persona?

Introducing Biscuit

Meet my stakeholder, Biscuit:

That’s not what stakeholder means! – Ed

Being Biscuit

Biscuit is a sassy, opinionated shark that will tell anyone who listens about when he met Frank Sidebottom.

Biscuit likes listening to dance music. He has a collection of around 3500 tracks and uses iTunes smart playlists to listen to them. His collection ranges from Deep House at around 115 BPM to Drum & Bass at around 180 BPM.

Biscuit is a bit of a music geek. He found out about music theory when he saw key notations on his Anjunabeats record sleeves and wondered what they meant. He uses Mixed In Key to scan his collection, so each track has rich metadata.

Biscuit has various playlists depending on his mood, location and/or activity. He likes to choose between recent favourites and tunes he’s not heard for a while.

Biscuit doesn’t use music streaming services and doesn’t want to due to the internet requirement and the bitrates offered.

Biscuit’s Challenges

The current process of creating new smart playlists involves spending time looking through existing playlists and using trial and error. This usually takes at least an hour and doesn’t guarantee good results.

As the current process of generating new playlists is undesirable, existing and legacy playlists are being used more often. This is creating problems:

Tracks in those playlists get disproportionate plays and become stale, while tracks not in those playlists don’t get played for ages.
Underused playlists consume iTunes compute resources and iPhone storage space.
Time and money are being spent on adding new tracks to the collection as no simple process exists to identify lesser played tracks.

Biscuit’s Goals

Biscuit wants a quicker way to use his iTunes data for building smart playlists. Specifically, he wants to know about the tracks he plays most and least often so that future smart playlists can be focused accordingly.

Biscuit would like to visualise his iTunes data in a dashboard. The dashboard should show:

If any listening trends or behaviours exist.
Traits of disproportionately overplayed and underplayed tracks.

Biscuit also wants to know if the following factors influence how many times a track is played:

BPM: Is there a relationship between a track’s tempo and its play count?

Date Added: Recently added tracks will usually have fewer plays, but what about tracks that have been in the collection longer? Do tracks exist with older dateadded dates and low play counts?

Rating: The assumption would be that tracks with higher ratings will get played more. Is this correct?

Year: Are tunes produced in certain years played more than others? Are there periods of time that future smart playlists should target?

My Epic And User Stories

In this section, I will write the epic and user stories that I will use to design and create my iTunes dashboard.

Designing a dashboard would usually be a single user story in an epic. Instead, I’ve allocated a user story to each dashboard visual to help keep me focused, as time is currently tight and it can be challenging to find time for this!

Epic: iTunes Play Counts Dashboard

As a playlist builder, Biscuit wants to use a dashboard to analyse the play counts in his iTunes data so that he can simplify the process of creating new smart playlists.

ACCEPTANCE CRITERIA:

Biscuit can analyse play totals and see how they are distributed between bpm, dateadded, rating and year fields.
Biscuit can use the dashboard, instead of iTunes, for architecting new smart playlists.
Biscuit can access the dashboard on PC and mobile.
The dashboard’s operational costs must be minimal.

User Story: Plays Visual

As a playlist builder, Biscuit wants to see play totals so that it is easier to review and manage his current play intervals.

ACCEPTANCE CRITERIA:

Biscuit can see and sort totals of individual plays.
Biscuit can see and sort totals of current play intervals.
The dashboard can be filtered by plays and current play intervals.
The dashboard must use the following play intervals:
- P0: unplayed.
- P01-P05: between 1 and 5 plays.
- P06-P10: between 6 and 10 plays.
- P11-P20: between 11 and 20 plays.
- P21-P30: between 21 and 30 plays.
- P31+: over 30 plays.
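The play intervals above could be assigned with a small function like the one below. This is a minimal sketch rather than the dashboard’s actual logic: the `play_interval` name is hypothetical, and it assumes play counts arrive as non-negative whole numbers.

```python
def play_interval(plays: int) -> str:
    """Return the play interval label for a track's play count."""
    if plays < 0:
        # Assumption: a negative play count is bad data, not a real value.
        raise ValueError("plays cannot be negative")
    if plays == 0:
        return "P0"
    if plays <= 5:
        return "P01-P05"
    if plays <= 10:
        return "P06-P10"
    if plays <= 20:
        return "P11-P20"
    if plays <= 30:
        return "P21-P30"
    return "P31+"
```

Checking the upper bound of each interval in ascending order means every play count lands in exactly one interval, with no gaps or overlaps at the boundaries.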

User Story: BPMs Visual

As a playlist builder, Biscuit wants to see how plays are distributed between track BPMs so that he can identify how BPMs influence which tracks are played most and least often.

ACCEPTANCE CRITERIA:

Biscuit can see relationships between BPMs and play totals at a high level.
Biscuit can both filter the visual and drill down for added precision.
The dashboard can be filtered by both BPMs and current BPM intervals.
The dashboard must use the following BPM intervals:
- B000-B126: 126 BPM and under.
- B127-B129: 127 BPM to 129 BPM.
- B130-B133: 130 BPM to 133 BPM.
- B134-B137: 134 BPM to 137 BPM.
- B138-B140: 138 BPM to 140 BPM.
- B141-B150: 141 BPM to 150 BPM.
- B151+: 151 BPM and over.
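The BPM intervals can be treated as a lookup against a sorted list of upper bounds. The sketch below uses Python’s `bisect` module for that lookup. The `bpm_interval` name and the two constants are hypothetical, and it assumes BPMs are stored as non-negative whole numbers.

```python
from bisect import bisect_left

# Inclusive upper bound of every interval except the open-ended last one.
BPM_UPPER_BOUNDS = [126, 129, 133, 137, 140, 150]

# One label per interval, in the same order as the bounds, plus the final bucket.
BPM_LABELS = [
    "B000-B126",
    "B127-B129",
    "B130-B133",
    "B134-B137",
    "B138-B140",
    "B141-B150",
    "B151+",
]


def bpm_interval(bpm: int) -> str:
    """Return the BPM interval label for a track's tempo."""
    if bpm < 0:
        # Assumption: a negative BPM is bad data, not a real value.
        raise ValueError("bpm cannot be negative")
    # bisect_left returns the index of the first bound that is >= bpm,
    # which is the position of the interval containing bpm.
    return BPM_LABELS[bisect_left(BPM_UPPER_BOUNDS, bpm)]
```

Keeping the bounds and labels in data rather than in branches means an interval can be changed by editing the two lists, without touching the function.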

User Story: Ratings Visual

As a playlist builder, Biscuit wants to see how plays are distributed between iTunes ratings so that he can identify how ratings influence which tracks are played most and least often.

ACCEPTANCE CRITERIA:

Biscuit can see relationships between ratings and play totals at a high level.
Biscuit can both filter the visual and drill down for added precision.
The dashboard can be filtered by rating.

User Story: Date Added Visual

As a playlist builder, Biscuit wants to see how plays are distributed relative to when tracks were added to the collection so that he can identify tracks with abnormally high and low play totals relative to how long they have been in the collection.

ACCEPTANCE CRITERIA:

Biscuit can see relationships between the year tracks were added to the collection and play totals at a high level.
Biscuit can both filter the visual and drill down for added precision.
The dashboard can be filtered by the years that tracks were added in.

User Story: Year Visual

As a playlist builder, Biscuit wants to see how plays are distributed relative to track production years so that he can identify how production years influence which tracks are played most and least often.

ACCEPTANCE CRITERIA:

Biscuit can see relationships between production years and play totals at a high level.
Biscuit can both filter the visual and drill down for added precision.
The dashboard can be filtered by production year.

Summary

In this post, I used Agile Methodology to collect requirements and wrote user stories for my iTunes dashboard.

I created a data dictionary to give context to my iTunes data, examined some high-level Agile concepts and used a persona to write five user stories that I will use to create my iTunes dashboard.

Thanks for reading ~~^~~

Tags Agile, iTunes, Music, Project: iTunes Export Data Pipeline (2022)
