
Ryan's Listening Data Journal

This has been a personal little side project of mine. I’ve been building data sets of my friends’ Spotify statistics to figure out, statistically, what their favorite songs and bands are. For this walkthrough, I will be using my own data.

To start, I got the data into an Excel spreadsheet. This is what it looks like:

It’s a little messy here, but the data that matters to us is in columns ‘B’ and ‘D’. These give us the song title and the artist name.

Then with a little Excel math, we can find how many times each artist appears in this spreadsheet:

All this is doing is displaying how many times each artist appears. For example, blink-182 appears twice. This isn’t perfect: if a song has multiple artists, that combination is counted separately from the main artist’s total. In this screenshot you can see there is 1 occurrence of a song by Linkin Park and Kiiara. This is a Linkin Park song called “Heavy”. But Linkin Park alone appears 6 times, so “Heavy” is being counted separately from the rest of Linkin Park’s discography. We have to go in and fix this manually.
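For anyone who would rather do this count outside of Excel, here is a rough Python sketch of the same idea. The rows below are made-up sample data, not my real spreadsheet, and I’m assuming multiple artists show up in one cell separated by a comma, which may not match every export.

```python
from collections import Counter

# Hypothetical sample rows as (song title, artist) pairs.
# These stand in for columns B and D; they are not the real data.
rows = [
    ("Heavy", "Linkin Park, Kiiara"),
    ("Numb", "Linkin Park"),
    ("In the End", "Linkin Park"),
    ("All the Small Things", "blink-182"),
    ("I Miss You", "blink-182"),
]


def count_artists(rows):
    """Count how many times each exact artist string appears."""
    return Counter(artist for _title, artist in rows)


counts = count_artists(rows)

# Print artists from most to least frequent.
for artist, n in counts.most_common():
    print(artist, n)
```

Because the count is based on the exact text in the artist cell, “Linkin Park, Kiiara” ends up as its own entry instead of adding to Linkin Park’s total, which is the same problem described above.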



Now I’ve gone in and done this. I checked the data and made sure it’s consistent. I’ve fixed the duplicates and corrected any mistakes. The data is now in a new sheet and is ready to be used. Let’s track my Spotify data.
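I did this cleanup by hand, but part of it could probably be automated. The sketch below is not what I actually did: it assumes the first artist listed in a cell is the main one and that artists are separated by a comma and a space. The function names and sample rows are made up for illustration.

```python
from collections import Counter

# Hypothetical sample rows as (song title, artist) pairs, not the real data.
rows = [
    ("Heavy", "Linkin Park, Kiiara"),
    ("Numb", "Linkin Park"),
    ("In the End", "Linkin Park"),
    ("All the Small Things", "blink-182"),
]


def primary_artist(artist_field, separator=", "):
    """Return the first listed artist (assumed to be the main artist)."""
    return artist_field.split(separator)[0].strip()


def count_primary_artists(rows):
    """Count songs per main artist, folding collaborations into that artist."""
    return Counter(primary_artist(artist) for _title, artist in rows)


fixed_counts = count_primary_artists(rows)
print(fixed_counts["Linkin Park"])
```

This would still need a manual check afterwards, since an artist whose name itself contains a comma would be split incorrectly.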

I’ve put my data into Tableau and now it’s all nice and neat.

Here’s one way I can present this data, although I don’t love the way it looks. I think I’d rather use a treemap.

That looks a lot better to me. I’m just going to put the finishing touches on this and I have my data presented neatly.

This is just an example of what I would use a dataset like this for: a test run, basically. In the future I would use this to track listening trends on a larger scale. Perhaps I could collect the raw data of a group of people (like Emerson, maybe) and then present it like this, showing the demographics of the school’s listening habits. It could make for an interesting story.