A notable component of my pipeline was leveraging modern transformer-based language models for feature creation. By applying BERT embeddings to short text fields, I was able to infuse deep semantic understanding into a classical machine learning setting. This approach bridged the gap between large-scale, pre-trained NLP techniques and a relatively quick-to-deploy regression model.
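The idea can be sketched as follows. This is a minimal illustration, not the original pipeline: the hash-based `embed_text` function is a lightweight stand-in for a real BERT encoder (which would produce 768-dimensional vectors), and the feature names and dimensions are assumptions for the example.

```python
import numpy as np

EMB_DIM = 8  # a real BERT-base embedding would be 768-dimensional

def embed_text(text: str) -> np.ndarray:
    """Stand-in for a BERT encoder: maps a short text field to a fixed-size vector.
    In practice this would call a pre-trained model (e.g. via the transformers library)."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.standard_normal(EMB_DIM)

def build_features(rows):
    """Concatenate each row's tabular features with its text embedding."""
    feats = []
    for tabular, text in rows:
        feats.append(np.concatenate([np.asarray(tabular, dtype=float),
                                     embed_text(text)]))
    return np.vstack(feats)

# Hypothetical rows: (tabular features, short text field)
rows = [([120.0, 3.0], "arena show, upper bowl"),
        ([45.0, 1.0], "club gig, general admission")]
X = build_features(rows)
print(X.shape)  # (2, 2 + EMB_DIM) -> (2, 10)
```

The resulting matrix `X` can be fed directly to any classical regressor, which is what makes this pattern quick to deploy.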
While building and refining this ticket price predictor, I encountered significant hurdles due to the highly skewed nature of the dataset: most ticket prices clustered at lower levels, while a minority extended into disproportionately high tiers. Attempts to address this skew with log transformations and alternative loss functions, such as a Poisson loss, yielded only limited benefit due to inherent data constraints.
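The log-transform approach mentioned above can be sketched in a few lines. The price values here are invented for illustration; the point is that `log1p` compresses the long tail before fitting and `expm1` inverts the transform at prediction time.

```python
import numpy as np

# Hypothetical skewed ticket prices: many cheap tickets, a long expensive tail.
prices = np.array([20.0, 25.0, 30.0, 35.0, 40.0, 60.0, 250.0, 900.0])

# Fit the regressor on log1p(price) so the tail compresses,
# then map predictions back with expm1.
y_log = np.log1p(prices)
recovered = np.expm1(y_log)

print(np.allclose(recovered, prices))  # True: round-trip is exact up to float error
print(prices.std(), y_log.std())       # the transform sharply reduces the spread
```

Even with the compressed target, the tail examples remain rare, which is one reason the transform alone could not fully fix the skew problem.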
Nevertheless, by integrating advanced feature engineering, optimizing hyperparameters, and experimenting with multiple regressors, I achieved a Mean Absolute Error (MAE) of 8.52 using Random Forest on the final test set. This performance demonstrates the model’s practical utility, although the skewed data distribution and the limited predictive power of the features still constrain further accuracy gains.
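The evaluation setup can be sketched as below. This uses synthetic data rather than the real feature matrix, so the printed MAE is not the reported 8.52; the hyperparameters shown are illustrative assumptions, not the tuned values.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the engineered feature matrix and ticket prices.
rng = np.random.default_rng(0)
X = rng.standard_normal((500, 6))
y = 50 + 10 * X[:, 0] - 5 * X[:, 1] + rng.normal(0, 3, size=500)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

# Illustrative hyperparameters; the real ones would come from a tuning search.
model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X_tr, y_tr)

mae = mean_absolute_error(y_te, model.predict(X_te))
print(f"MAE: {mae:.2f}")
```

Reporting MAE on a held-out test set, as here, keeps the metric in the same units as the target (ticket price), which is what makes the 8.52 figure directly interpretable.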
In the future, I intend to enrich the dataset with more extensive textual descriptions to make even better use of BERT for identifying semantic information; these descriptions could come from Wikipedia venue pages or Spotify artist bios.