In this post, I discuss our late German Shepherd Wolfie and outline a project that utilises music data in his memory.
Introduction
On 24 January 2025, our German Shepherd Wolfie sadly passed away.

Wolfie was a long, fluffy boi with a big presence and a big mane. He enjoyed sniffing things, tilting his head and barking at foxes. He was loud, proud and bushy-browed.
Wolfie was also psychic. No one could leave the house without his knowledge, and no one could enter it without having a snout-first search.

Wolfie struggled with various health issues throughout his life, including eating difficulties, muscle problems and genetic defects. Despite this, Wolfie was a cherished family member for four years before he sadly lost his battle with kidney failure and suspected cancer.

Our walks often involved music, and I regularly exposed Wolfie to my music library. He grew so accustomed to iPods during walks that he would bark whenever I picked one up!

After he was gone, I found myself revisiting the songs we shared. Music became a way to cherish those memories, so I wanted to create something meaningful in his memory…
Project Wolfie
This section explores what Project Wolfie is, the music data it utilises and its goals.
Definition
This project has been on my mind for some time now, and I suppose this was the push it needed to take shape. Project Wolfie is a data-driven initiative that explores the patterns hidden in my music collection. It analyses track metadata, listening habits and technical attributes to find insights, trends and recommendations.
Here, Wolfie is short for:
Waveform Observations Library For Intelligence Engineering
Let’s break this down:
Waveform: A visual illustrating a track’s traits like timbre, pitch and dynamics. Time is represented on the horizontal axis, while the vertical axis reflects amplitude.
Here is a sample waveform:

Observations Library: A consolidated data repository containing information about my music’s properties and my listening habits. The data consists of various types, structures and formats, and will be stored, cleaned and enriched for further use.
For Intelligence Engineering: The AI and BI use cases for the observations library. Here, interactive data visualisation and machine learning services will use the data to uncover patterns, predict trends and generate personalised recommendations.
Data
Music files contain more than just sound – they hold layers of metadata that are crucial to Project Wolfie.
This section explores the different types of metadata related to my music collection, highlighting their functions and purposes. I have assigned these categories using my understanding and intended use of the data.
Technical Metadata
Technical Metadata refers to the measurable and technical attributes of a music file. It tends to include numerical values and audio properties, and is commonly found by analysing the track using applications like Audacity, foobar2000 and Mixed In Key, as well as Python libraries like Librosa.
Examples include:
- What is the track’s initial tempo and key?
- What is the track’s duration, and how loud is it?
- What are the track’s spectrographic and harmonic properties?
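As a rough illustration of Technical Metadata, the sketch below derives duration, peak level and RMS loudness from raw audio samples. It uses a synthetic sine wave and only the Python standard library. The function names (`sine_wave`, `technical_metadata`) are hypothetical, and a real pipeline would more likely read audio files through a library such as Librosa.

```python
import math


def sine_wave(freq_hz: float, seconds: float, sample_rate: int = 44_100,
              amplitude: float = 0.5) -> list[float]:
    """Generate a mono sine wave as a list of samples in the range [-1, 1]."""
    n_samples = int(seconds * sample_rate)
    return [amplitude * math.sin(2 * math.pi * freq_hz * n / sample_rate)
            for n in range(n_samples)]


def to_dbfs(level: float) -> float:
    """Convert a linear level (1.0 = full scale) to decibels relative to full scale."""
    return 20 * math.log10(level) if level > 0 else float("-inf")


def technical_metadata(samples: list[float], sample_rate: int) -> dict[str, float]:
    """Derive simple numeric attributes from raw audio samples."""
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return {
        "duration_s": len(samples) / sample_rate,
        "peak_dbfs": to_dbfs(peak),
        "rms_dbfs": to_dbfs(rms),
    }


# One second of a 440 Hz tone at half of full scale.
samples = sine_wave(freq_hz=440.0, seconds=1.0)
meta = technical_metadata(samples, sample_rate=44_100)
```

Tempo and key detection need considerably more signal processing than this, which is why dedicated tools are usually used for them.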
Descriptive Metadata
Descriptive Metadata refers to the contextual and identifying information about a music track. It tends to include text-based details and is commonly found both within the track’s properties and on websites like Beatport and Discogs.
Examples include:
- Who produced the track, and what is it called?
- What is the track’s genre?
- Which label published the track, and when?
Interaction Metadata
Interaction Metadata refers to engagement and listening behaviours. It typically includes dates, integers and timestamps, and is commonly generated by digital music services like iTunes and Spotify.
Examples include:
- When was the last time a track was played or skipped?
- How many times has a track been played?
- What rating has a track been assigned?
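To show how the three categories could sit together, here is a minimal sketch of a single track record. The class names, field names and sample values are all placeholders chosen for illustration, not a finalised schema for Project Wolfie.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class TechnicalMetadata:
    bpm: float
    key: str
    duration_s: float


@dataclass
class DescriptiveMetadata:
    artist: str
    title: str
    genre: str
    label: str
    release_year: int


@dataclass
class InteractionMetadata:
    play_count: int = 0
    skip_count: int = 0
    rating: Optional[int] = None            # e.g. 1-5 stars; None if unrated
    last_played: Optional[datetime] = None  # None if never played


@dataclass
class TrackRecord:
    technical: TechnicalMetadata
    descriptive: DescriptiveMetadata
    interaction: InteractionMetadata


# Placeholder values only; not taken from a real library.
track = TrackRecord(
    technical=TechnicalMetadata(bpm=128.0, key="G minor", duration_s=372.0),
    descriptive=DescriptiveMetadata(
        artist="Example Artist", title="Example Track", genre="House",
        label="Example Label", release_year=2012,
    ),
    interaction=InteractionMetadata(
        play_count=14, rating=5, last_played=datetime(2025, 1, 10, 8, 30),
    ),
)
```

Keeping the categories in separate classes reflects that each tends to come from a different source and changes at a different rate: interaction data updates constantly, while technical data rarely changes once analysed.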
Deliverables
Here are the objectives I’m pursuing in Project Wolfie. Given their complexity, they will be divided into multiple epics and spread out over an extended period.
Data Lakehouse
So far, I have discussed the importance, types and applications of data. To put this data to use, I need to fulfil a few requirements:
- Ingesting and storing data from multiple sources.
- Transforming and cleaning data at scale.
- Enriching and aggregating data for analytics and consumption.
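The three requirements above can be sketched as a tiny in-memory pipeline. This is only an illustration of the flow: the stage names (`ingest`, `clean`, `aggregate`), the sample rows and the choice to treat a missing play count as zero are all assumptions, and a real lakehouse would use managed storage and processing services instead of Python lists.

```python
from collections import defaultdict

# Hypothetical raw rows, as they might arrive from two different sources.
raw_rows = [
    {"source": "itunes", "title": " Example Track ", "genre": "house", "plays": "14"},
    {"source": "itunes", "title": "Another Track", "genre": "Techno", "plays": "3"},
    {"source": "csv_export", "title": "Third Track", "genre": "HOUSE", "plays": None},
]


def ingest(rows):
    """Copy rows unchanged, so the original data can always be replayed."""
    return [dict(row) for row in rows]


def clean(rows):
    """Trim text, standardise casing and convert play counts to integers."""
    cleaned = []
    for row in rows:
        cleaned.append({
            "source": row["source"],
            "title": row["title"].strip(),
            "genre": row["genre"].strip().title(),
            # Assumption: a missing play count is treated as zero plays.
            "plays": int(row["plays"]) if row["plays"] is not None else 0,
        })
    return cleaned


def aggregate(rows):
    """Total plays per genre, ready for a dashboard or report."""
    totals = defaultdict(int)
    for row in rows:
        totals[row["genre"]] += row["plays"]
    return dict(totals)


plays_by_genre = aggregate(clean(ingest(raw_rows)))
```

Each stage only reads the output of the previous one, so a fault in cleaning or aggregation can be fixed and rerun without ingesting the sources again.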
In short, I need a Data Lakehouse. I’ve written about them before and have followed the Medallion Architecture through bronze, silver and gold layers. For Project Wolfie and moving forward, I’ll be using the well-documented and supported AWS reference architecture:

I find this clearer and more regimented than the Medallion Architecture. It also aligns with the points made in Simon Whiteley’s Advancing Analytics video, which I agree with.
Of course, a good Data Lakehouse isn’t possible without good data…
Quality & Observability
A Data Lakehouse’s effectiveness depends on data quality and observability. Project Wolfie must address factors like:
Veracity & Validation Checks: Verify data accuracy. Checks such as schema validation, null checks and data quality rules can identify issues early, stopping incorrect data from propagating downstream.
Anomaly Detection: Identify patterns that validation often misses, like volume spikes and missing periods. Timely anomaly detection spares downstream resources from remedial work and reduces unforeseen cloud and developer costs.
Lineage Tracking: Track the data’s journey from ingestion to consumption, documenting all transformations and processes. This is vital for debugging, auditing and validation.
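As a minimal sketch of the first two factors, the code below runs schema, null and range checks on a single row, then looks for missing days and volume spikes across daily row counts. The schema, the BPM range and the spike threshold are illustrative assumptions, not rules Project Wolfie has settled on.

```python
from datetime import date, timedelta
from statistics import median

# Hypothetical schema: column name -> expected Python type.
SCHEMA = {"title": str, "bpm": float, "rating": int}


def validate(row: dict) -> list[str]:
    """Return a list of problems found in one row (empty list = valid)."""
    problems = []
    for column, expected_type in SCHEMA.items():
        if column not in row:
            problems.append(f"missing column: {column}")
        elif row[column] is None:
            problems.append(f"null value: {column}")
        elif not isinstance(row[column], expected_type):
            problems.append(f"wrong type: {column}")
    # Example data quality rule: tempo should fall in a plausible range.
    bpm = row.get("bpm")
    if isinstance(bpm, float) and not 40.0 <= bpm <= 250.0:
        problems.append("bpm out of range")
    return problems


def missing_days(days_with_data: set[date], start: date, end: date) -> list[date]:
    """List the dates in [start, end] for which no rows arrived."""
    span = (end - start).days + 1
    all_days = (start + timedelta(days=i) for i in range(span))
    return [day for day in all_days if day not in days_with_data]


def volume_spikes(daily_counts: dict[date, int], factor: float = 3.0) -> list[date]:
    """Flag days whose row count exceeds `factor` times the median count."""
    typical = median(daily_counts.values())
    return [day for day, count in daily_counts.items() if count > factor * typical]
```

For example, `validate({"title": "A", "bpm": 400.0, "rating": None})` reports both the null rating and the implausible tempo, so the row can be quarantined before it reaches downstream tables.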
Governance & Security
A Data Lakehouse must balance accessibility and control. Governance and security protocols protect data while encouraging responsible usage.
I own all Project Wolfie data, so I have permission to process it. Additionally, there is no sensitive information or PII. However, there are other factors to consider:
Access Controls: Establish guidelines for who and what can access Project Wolfie resources. This safeguards data and services from unauthorised access, misuse and malicious activities.
Data Controls: Establish criteria for availability, backups, and structure. This aids in managing costs, ensuring disaster recovery, and maintaining schema consistency.
Monitoring & Logging: Track access patterns and record changes to data and infrastructure. This improves visibility into potential threats, as well as cost-related opportunities and vulnerabilities.
AI & BI Use Cases
Finally, I want to extract value and insights from Project Wolfie using Artificial Intelligence (AI) and Business Intelligence (BI). I have data from 2021 onwards from a music collection I started in the early 2000s, so I have lots to work with!
BI Use Cases (Dashboards, Analytics, Insights)
Listening Trends: Identify traits of my collection’s most frequently played and best-represented music. Analyse listening patterns over time to find trends.
Library Optimisation: Find rarely played tracks to add to playlists. Recognise songs that are often played and recommend alternatives for variety.
Distribution Analysis: Analyse my collection’s main genres, publishers and record labels, and investigate the connections between different elements (e.g., “The most popular tracks are typically in the 120-130 BPM range”). Create reports that show diversity and spread (e.g., “90% of house tracks are in five minor keys”).
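The distribution analysis idea can be sketched with a few lines of standard-library Python. The track tuples below are invented sample data, and the helper names (`bpm_bucket`, `share_by`) are hypothetical. In practice a BI tool or SQL query would likely do this work.

```python
from collections import Counter

# Hypothetical library extract: (genre, key, bpm, play_count).
tracks = [
    ("House", "A minor", 124.0, 40),
    ("House", "G minor", 128.0, 35),
    ("House", "A minor", 126.0, 22),
    ("Techno", "F minor", 135.0, 8),
    ("Trance", "D major", 138.0, 5),
]


def bpm_bucket(bpm: float, width: int = 10) -> str:
    """Label a tempo with its range, e.g. 124.0 -> '120-130' (upper bound exclusive)."""
    lower = int(bpm // width) * width
    return f"{lower}-{lower + width}"


def share_by(values) -> dict[str, float]:
    """Percentage of items that fall under each distinct value."""
    counts = Counter(values)
    total = sum(counts.values())
    return {value: round(100 * count / total, 1) for value, count in counts.items()}


# Spread of the collection across genres.
genre_share = share_by(genre for genre, _, _, _ in tracks)

# Total plays per tempo range, to see where the most-played tracks sit.
plays_by_bpm = Counter()
for _, _, bpm, plays in tracks:
    plays_by_bpm[bpm_bucket(bpm)] += plays
```

With this sample data, `plays_by_bpm.most_common(1)` points at the 120-130 range, which is the kind of statement a dashboard would surface.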
AI Use Cases (Machine Learning, Automation, Predictions)
AI-Powered Personalised Playlists: Create playlists using the existing library based on properties like BPMs, keys and previous listening patterns, similar to Spotify Wrapped.
Smart Music Recommendations: Use content-based filtering to suggest search criteria for new music based on my existing collection and listening habits (e.g., “Try G minor tracks at 128 BPM from the early 2010s”).
Predictive Analysis: Use Technical and Descriptive Metadata from new tracks to predict how they will be rated based on my existing library’s metadata (e.g., “This track has harmonic similarities to 70% of your highly rated tracks.”).
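To give a feel for the predictive idea, here is a deliberately simple sketch that measures how much a new track resembles the highly rated part of a library. It is a hand-written similarity heuristic, not a trained machine learning model. The `Track` fields, the matching rule (same key, tempo within 4 BPM) and the rating threshold are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Track:
    title: str
    bpm: float
    key: str
    rating: Optional[int] = None  # 1-5 stars; None for unrated (e.g. new) tracks


def is_similar(a: Track, b: Track, bpm_tolerance: float = 4.0) -> bool:
    """Treat two tracks as similar if they share a key and a close tempo."""
    return a.key == b.key and abs(a.bpm - b.bpm) <= bpm_tolerance


def similarity_to_favourites(new_track: Track, library: list[Track],
                             min_rating: int = 4) -> float:
    """Share (0-1) of highly rated library tracks that resemble the new track."""
    favourites = [t for t in library
                  if t.rating is not None and t.rating >= min_rating]
    if not favourites:
        return 0.0
    matches = sum(is_similar(new_track, t) for t in favourites)
    return matches / len(favourites)


# Invented sample library and candidate track.
library = [
    Track("A", 128.0, "G minor", rating=5),
    Track("B", 126.0, "G minor", rating=4),
    Track("C", 140.0, "F minor", rating=5),
    Track("D", 128.0, "G minor", rating=2),
]
candidate = Track("New", 127.0, "G minor")
score = similarity_to_favourites(candidate, library)
```

A score like this could feed a message such as the example above, while a fuller version would learn the matching rule from the rating data instead of hard-coding it.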
Summary
In this post, I discussed our late German Shepherd Wolfie and outlined a project that utilises music data in his memory.
Wolfie enjoyed scent games and retrieving toys, making Project Wolfie’s mission to find and return data and insights a fitting tribute. As the project evolves, I will strengthen its capabilities using new architectures and technologies, honouring Wolfie’s spirit – one track at a time.
Wolfie was more than just a pet; he was a companion and a guardian each day. I miss you, big man. Take care out there.

Thanks for reading ~~^~~
 
		