Enhancing Data Processing and Trend Prediction in SoundSoar

Sep 29, 2024

5 min read

Introduction

This month has been a pivotal period of growth and refinement for my capstone project, SoundSoar. Central to my efforts was solidifying the data processing pipelines and enhancing the machine learning models to deliver more accurate trend predictions. A key focus was feature importance analysis, which improved the model’s predictions by identifying which Spotify attributes, such as tempo, valence, and popularity, are most influential. Ensuring the accuracy of these calculations required careful attention to the popularity history I’ve been tracking and storing over time. Getting this right was essential for generating reliable trend predictions and improving the app's overall performance.


Below, I'll dive into the specific features and functionalities that brought this vision to life, from integrating detailed track data to presenting it interactively. These developments, shown in the images below, illustrate how users engage with trending songs and individual track details:


Features Developed


  1. Spotify Authentication and API Integration:

    I implemented a login system using the allauth library to allow users to authenticate with their Spotify credentials. This lays the groundwork for Spotify SSO integration, which I plan to complete in the next phase. The Spotify Web API has been integrated to retrieve relevant data, such as playlists, track details, and audio features. This integration is essential for the core functionality of the application, as it provides real-time data for trend prediction and user playlists.
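As a rough illustration of what "retrieving track details and audio features" can look like once the responses arrive, here is a minimal sketch that merges the two payloads into one flat record. It is not the actual SoundSoar code: `TrackRecord` and `build_track_record` are hypothetical names, and the sample dictionaries are hand-written stand-ins that only use a few fields (`id`, `name`, `popularity`, `tempo`, `valence`, `danceability`).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrackRecord:
    """Flat record combining a track object with its audio features."""
    spotify_id: str
    name: str
    popularity: int
    tempo: float
    valence: float
    danceability: float


def build_track_record(track: dict, features: dict) -> TrackRecord:
    """Merge a track payload and an audio-features payload into one record."""
    if track["id"] != features["id"]:
        raise ValueError("track and audio features refer to different IDs")
    return TrackRecord(
        spotify_id=track["id"],
        name=track["name"],
        popularity=int(track["popularity"]),
        tempo=float(features["tempo"]),
        valence=float(features["valence"]),
        danceability=float(features["danceability"]),
    )


# Hand-written sample payloads (illustrative only, not real API output).
sample_track = {"id": "abc123", "name": "Example Song", "popularity": 72}
sample_features = {"id": "abc123", "tempo": 118.0, "valence": 0.64, "danceability": 0.81}
record = build_track_record(sample_track, sample_features)
```

Checking that both payloads carry the same ID before merging is a cheap guard against pairing features with the wrong track when responses are fetched in batches.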


  2. Data Management:

    To maintain an up-to-date and structured database, I built a pipeline to handle Spotify data. This pipeline manages multiple aspects of track records, such as storing their popularity history, audio features, and retrieval frequency. The database is designed to handle relational data efficiently, allowing the system to map each track's attributes dynamically while keeping track of changes in popularity over time. On top of that, I protected data integrity by adding validation steps before each update, so that duplicate or inconsistent records are not written.
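One way to picture the popularity history and the duplicate check is the sketch below. It assumes a SQLite database and a "one snapshot per track per day" rule. The table names, column names, and the `record_popularity` helper are all hypothetical, not the project's actual schema.

```python
import sqlite3

# Hypothetical schema: one row per track, one history row per track per day.
SCHEMA = """
CREATE TABLE IF NOT EXISTS track (
    spotify_id TEXT PRIMARY KEY,
    name       TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS popularity_history (
    spotify_id  TEXT NOT NULL REFERENCES track(spotify_id),
    recorded_on TEXT NOT NULL,
    popularity  INTEGER NOT NULL,
    UNIQUE (spotify_id, recorded_on)
);
"""


def record_popularity(conn, spotify_id, name, recorded_on, popularity):
    """Store one snapshot; return True only if a new history row was added."""
    # Validate before writing: Spotify popularity is a 0-100 score.
    if not 0 <= popularity <= 100:
        raise ValueError(f"popularity out of range: {popularity}")
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR IGNORE INTO track (spotify_id, name) VALUES (?, ?)",
            (spotify_id, name),
        )
        cur = conn.execute(
            "INSERT OR IGNORE INTO popularity_history "
            "(spotify_id, recorded_on, popularity) VALUES (?, ?, ?)",
            (spotify_id, recorded_on, popularity),
        )
    return cur.rowcount == 1


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
first = record_popularity(conn, "abc123", "Example Song", "2024-09-28", 72)
repeat = record_popularity(conn, "abc123", "Example Song", "2024-09-28", 72)
```

Letting the `UNIQUE` constraint reject repeats means a sync that runs twice on the same day cannot inflate the history, even if the application-level check is skipped.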


  3. Automating Data Sync:

    Instead of relying on manual updates, I automated the process of retrieving new Spotify data using Windows Task Scheduler. I scheduled PowerShell scripts to call Python functions periodically, ensuring that the system always had the most recent track data. By automating this process, the application can continuously improve its trend predictions without the need for manual input. The periodic sync also ensures that any new tracks added to Spotify or updates to existing tracks are captured promptly, contributing to more accurate trend predictions.
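The Python side of such a scheduled job might be shaped like the sketch below. It assumes the scheduler only needs a function that runs one sync and reports success or failure through an exit code. `sync_tracks`, `run_job`, and the in-memory `remember` store are hypothetical placeholders for the real Spotify fetch and database write.

```python
import logging

logger = logging.getLogger("sync_job")


def sync_tracks(fetch, store):
    """Fetch track snapshots and store them; return how many were new."""
    added = 0
    for snapshot in fetch():
        if store(snapshot):
            added += 1
    return added


def run_job(fetch, store):
    """Run one sync and return a process exit code (0 = success, 1 = failure)."""
    try:
        added = sync_tracks(fetch, store)
    except Exception:
        logger.exception("sync failed")
        return 1
    logger.info("sync finished, %d new snapshots", added)
    return 0


# Placeholder store that remembers (track, date) pairs in memory.
seen = set()


def remember(snapshot):
    key = (snapshot["id"], snapshot["date"])
    if key in seen:
        return False
    seen.add(key)
    return True


# Placeholder fetch returning made-up snapshots, including one repeat.
exit_code = run_job(
    lambda: [
        {"id": "abc123", "date": "2024-09-28"},
        {"id": "abc123", "date": "2024-09-28"},
        {"id": "def456", "date": "2024-09-28"},
    ],
    remember,
)
# A scheduled script would typically end with: raise SystemExit(run_job(...))
```

Returning a non-zero exit code on failure matters for unattended runs, because it lets the scheduler's task history show that a sync did not complete.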


  4. Feature Importance Analysis:

    Understanding which features influence trend predictions the most is key for fine-tuning the model. To do this, I implemented feature importance analysis using machine learning algorithms like Random Forest and HistGradientBoosting. These models are particularly suited for handling high-dimensional data with non-linear relationships. I used these algorithms to rank features based on their contribution to the prediction output. This analysis not only provided insights into which Spotify attributes (such as tempo, valence, or danceability) had the greatest impact but also helped me focus the model on the most relevant data, improving prediction accuracy.


  5. Trend Prediction Model:

    The trend prediction model is a crucial component of SoundSoar, designed to analyze and forecast the popularity of songs over time. I employed machine learning algorithms, specifically Random Forest and HistGradientBoosting, to develop robust predictive models. Using a carefully selected set of features (such as valence, tempo, danceability, and historical popularity metrics), I trained the models to recognize patterns in the data. The training process involved hyperparameter tuning to optimize model performance and improve prediction accuracy. The model evaluates these features to classify songs into trending categories, enabling users to discover emerging hits effectively. This data-driven approach enhances the overall user experience by basing recommendations on reliable predictions.
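The tuning step can be sketched with scikit-learn's `GridSearchCV`. This is an assumption about tooling rather than a description of the actual training code. The data is synthetic, the three classes stand in for hypothetical trend categories (for example up, down, and stable), and the parameter grid is deliberately tiny.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for track features and three trend categories.
X, y = make_classification(
    n_samples=300,
    n_features=4,
    n_informative=3,
    n_redundant=0,
    n_classes=3,
    n_clusters_per_class=1,
    random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Small illustrative grid; a real search would cover more values.
param_grid = {"n_estimators": [50, 100], "max_depth": [3, None]}
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    cv=3,
    scoring="accuracy",
)
search.fit(X_train, y_train)

best_params = search.best_params_
# Score the refitted best model on data the search never saw.
test_accuracy = search.score(X_test, y_test)
```

Keeping a test split outside the cross-validated search gives an accuracy estimate that is not biased by the parameter selection itself.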

    Effective storage and updating of trend prediction models are vital for maintaining the accuracy and reliability of SoundSoar's recommendations. I implemented a structured approach to save the trained models using joblib, which allows for efficient serialization and deserialization. This ensures that the models can be easily retrieved for future predictions without the need for retraining. Additionally, I established a systematic process for updating the models by scheduling regular synchronization tasks that incorporate new data from Spotify. By doing so, the models continuously adapt to changing trends and user preferences, ensuring that the application remains relevant and useful. Below, you can see a screenshot displaying both the active and historical trend models, highlighting their performance and the evolution of predictions over time.
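A minimal sketch of the save-and-reload cycle with joblib follows. The model is trained on random data, and the file name and the metadata keys (`features`, `trained_on`) are hypothetical choices for illustration, not the project's actual storage layout.

```python
import tempfile
from datetime import date
from pathlib import Path

import joblib
import numpy as np
from sklearn.ensemble import RandomForestClassifier

FEATURES = ["tempo", "valence", "danceability", "popularity"]

# Train a small throwaway model on random data.
rng = np.random.default_rng(0)
X = rng.random((100, len(FEATURES)))
y = (X[:, 0] > 0.5).astype(int)
model = RandomForestClassifier(n_estimators=20, random_state=0).fit(X, y)

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "trend_model.joblib"  # hypothetical file name
    # Store the model together with the metadata needed to use it later.
    joblib.dump(
        {
            "model": model,
            "features": FEATURES,
            "trained_on": date.today().isoformat(),
        },
        path,
    )
    restored = joblib.load(path)

# The reloaded model should reproduce the original predictions exactly.
same_predictions = bool(
    np.array_equal(model.predict(X), restored["model"].predict(X))
)
```

Saving the feature list alongside the model helps when a retrained version changes the inputs, since older files still record which columns they expect. As with pickle, joblib files should only be loaded from trusted sources.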



  6. Custom Playlist Integration:

    I defined a new custom playlist type for charts and integrated the Sound Soar Suggestions playlist, showcasing the top 25 trending-up tracks. This feature enhances user experience by providing personalized song recommendations based on my predictive analysis.
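Selecting the top 25 trending-up tracks could work roughly as in the sketch below. It assumes each prediction carries a trend label and a confidence score. The `Prediction` type, the label strings, and `top_trending_up` are hypothetical names, and the sample predictions are made up.

```python
from typing import NamedTuple


class Prediction(NamedTuple):
    spotify_id: str
    trend: str         # assumed labels: "up", "down", or "stable"
    confidence: float  # assumed score in [0, 1] for the predicted label


def top_trending_up(predictions, limit=25):
    """Return IDs of the most confident "up" predictions, best first."""
    rising = [p for p in predictions if p.trend == "up"]
    rising.sort(key=lambda p: p.confidence, reverse=True)
    return [p.spotify_id for p in rising[:limit]]


# Made-up predictions: 30 rising tracks and 10 falling ones.
sample = [Prediction(f"up{i}", "up", i / 100) for i in range(30)]
sample += [Prediction(f"down{i}", "down", 0.9) for i in range(10)]
playlist_ids = top_trending_up(sample)
```

Filtering by label before sorting keeps a highly confident "down" prediction from crowding out a moderately confident "up" one.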


Retrospective


This month has marked significant progress in integrating Spotify into SoundSoar, which has been a key element in enhancing the app's functionality. The authentication process using allauth has paved the way for a seamless user experience, allowing users to connect their Spotify accounts effortlessly. This integration is not just about data retrieval; it’s about creating a personalized experience for users who want tailored song recommendations based on real-time trends. Although I initially aimed to incorporate data from social media platforms for a broader perspective, I found that focusing on Spotify provides a solid foundation. As I move forward, I plan to explore additional integrations that could enrich user insights.


Configuring data synchronization tasks was another critical component of this month’s work. Automating data retrieval using Windows Task Scheduler has streamlined the process and ensured that the application remains updated with the latest track information. This approach minimizes manual input, allowing me to concentrate on refining the model and user experience. Reworking how I calculate popularity metrics has led to more meaningful insights into trending tracks. By adapting these metrics to align better with user behavior, I’ve made the predictions more accurate and relevant.


Initially, I considered using Matplotlib and Seaborn for visualizations, but I quickly recognized the need for interactivity to enhance user engagement. My attempts to use Bokeh were met with installation issues and configuration challenges, which proved frustrating. Ultimately, I transitioned to Plotly, and I’ve found it to be an excellent choice. Its interactive capabilities have significantly improved how users interact with data, allowing them to explore trends more dynamically. This shift has not only enhanced the user interface but also deepened my understanding of effective data presentation.

Looking ahead, I’m excited to keep building on what I’ve learned this month. Diving into the intricacies of data management and machine learning model optimization has been eye-opening, and I see plenty of room for improvement. While I’m currently working in a Windows environment, I know that getting comfortable with Unix systems will help me tackle deployment and scalability challenges down the road. Overall, this experience has been incredibly valuable, and I’m committed to using these insights to enhance SoundSoar and make it a more effective tool for users.
