DeepFit: Deep Learning Based Fitness Center Equipment Use Prediction

Introduction

According to the U.S. Department of Health and Human Services, “Physical activity, along with proper nutrition, is beneficial to people of all ages, backgrounds, and abilities. And it is important that everyone gets active: over the last 20 years, there’s been a significant increase in obesity in the United States”. Given the importance of physical activity and the long sedentary working hours of modern life, fitness centers play a crucial role in promoting good health and wellness. Access to fitness center equipment at times that fit an individual’s schedule is essential for that person to engage in physical activity and maintain a healthy lifestyle.

Modeling and accurately predicting fitness center equipment usage and availability is therefore valuable for improving fitness and well-being, as it gives people the flexibility to plan their schedule and exercise at their convenience. Beyond helping individuals, a data-driven approach to modeling and predicting equipment usage supports planning the optimal square footage for a new fitness center and deciding which kinds of equipment to purchase and install.

In this article, we discuss DeepFit, a deep-learning-based system that predicts future fitness center equipment usage from historical data. To this end, we design a Long Short-Term Memory (LSTM) based sequence-to-sequence model that captures the dependencies in the data. The sequence-to-sequence model comprises an encoder and a decoder, each of which is a deep Recurrent Neural Network (RNN) whose basic cell is an LSTM cell. We evaluate DeepFit on equipment usage data collected from a university campus fitness center over a period of 1.5 years and demonstrate that it accurately predicts future fitness center equipment usage. We show that DeepFit significantly outperforms the linear regression and ARIMA baselines in terms of Root Mean Squared Error (RMSE) and Mean Absolute Error (MAE).

Problem Statement and Data

We tackle the fitness center equipment usage prediction problem, where the objective is to accurately predict future equipment usage based on past data. We cast this as a time series prediction problem: given an input sequence x₁, x₂, …, xₙ, which corresponds to the machine usage for the last n time steps, the goal is to generate predictions y₁, y₂, …, yₖ of the machine usage for the next k time steps.
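The input/output framing above can be illustrated with a small windowing helper. This is a minimal sketch rather than the authors' code: the function name `make_windows`, the default stride of one step, and the toy series are assumptions chosen for illustration.

```python
def make_windows(series, n, k, stride=1):
    """Split a usage series into (input, target) pairs.

    Each input holds n consecutive readings and each target holds the
    k readings that immediately follow it.
    """
    pairs = []
    last_start = len(series) - (n + k)
    for start in range(0, last_start + 1, stride):
        x = series[start:start + n]
        y = series[start + n:start + n + k]
        pairs.append((x, y))
    return pairs


# Toy example (made-up numbers): 6 readings, n = 3 inputs, k = 2 targets.
usage = [10, 12, 15, 11, 9, 14]
windows = make_windows(usage, n=3, k=2)
```

With a stride of one, consecutive windows overlap in all but one reading, so a short series still yields many training pairs.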

We collect machine usage data from a university campus fitness center for around 515 days (28 November 2017 to 27 April 2019). We collect data for four machines (Cardio, Free Weights, Strength Machines and Synergy360) for three time periods during the day: morning, afternoon and evening. Data is collected from 6:00 am to 10:59 am for the morning session, 11:00 am to 3:59 pm for the afternoon session and 4:00 pm to 8:59 pm for the evening session. We record multiple values during each session and average them to get one reading per session. The gym is closed during holidays such as Christmas, Thanksgiving and Memorial Day, so we do not record usage data on those days. Apart from this, the data contains 9% missing values. We fill these missing values with the average of the five previous readings for that particular day and session.
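The imputation rule can be sketched as follows. The article does not spell out the details, so this sketch assumes that "that particular day and session" means the same weekday and session (for example, all Monday-morning readings), that previously imputed values may be reused, and that a gap with no earlier reading is left unfilled. The name `fill_missing` is hypothetical.

```python
from statistics import mean


def fill_missing(readings, lookback=5):
    """Fill None entries with the mean of up to `lookback` earlier values.

    `readings` is the chronological list of values for ONE weekday/session
    combination (assumed interpretation of the article's rule).
    """
    filled = []
    for value in readings:
        if value is None:
            # Earlier readings, including ones imputed by this function.
            history = [v for v in filled if v is not None][-lookback:]
            # Leave the gap if there is nothing earlier to average.
            value = mean(history) if history else None
        filled.append(value)
    return filled


# Toy example with made-up numbers: the gap becomes the mean of 10 and 20.
monday_mornings = fill_missing([10, 20, None, 40])
```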

Figures 1a and 1b show the best times to visit the recreational center based on the availability of machines with respect to time of day and month of year. From Figure 1a, we observe that usage for all machines is lowest in the morning, higher in the afternoon and highest in the evening. This is expected, as college students generally wake up late and have more classes scheduled during the afternoons. As expected, Figure 1b shows that June and July have the lowest usage, as most students are not on campus during the summer session. Similarly, May, August, December and January also have lower usage because the school is in session for only part of these months.

RNN based Encoder-Decoder Model

Figure 3: RNN-based sequence-to-sequence model

DeepFit comprises an encoder-decoder based sequence-to-sequence deep model, as shown in Figure 3. The model consists of two components, an encoder and a decoder, each of which is an RNN. An RNN consists of a network of neural nodes organized in layers, with directed connections from one layer to the next. At the highest level, the encoder accepts an input sequence x₁, x₂, …, xₙ, which corresponds to the equipment usage in the last n time steps, and generates a hidden encoded vector C that encapsulates the information in the input sequence. This encoded vector is given as input to the decoder, which generates y₁, y₂, …, yₖ, the predicted equipment usage for the next k time steps.

Internally, at each time step t, an RNN maintains a hidden state hₜ that is updated from the input xₜ and the previous hidden state hₜ₋₁ using a non-linear function f. The hidden state serves as memory: after the entire input sequence has been read, it is the summary C that captures the information in the whole input sequence. The decoder then uses this summary C to generate the output sequence, predicting the next value yₜ from its hidden state at each step. We apply a ReLU activation function to each decoder output to prevent the prediction of negative equipment usage values.
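In symbols, the recurrence described above can be written roughly as follows. The encoder lines restate the text. The decoder state s_j, the decoder update g, the output weights w and b, and the start input y_0 are notational assumptions introduced here for illustration, since the article does not specify them.

```latex
\begin{aligned}
h_t &= f(x_t,\, h_{t-1}), \qquad t = 1, \dots, n, \\
C   &= h_n, \\
s_0 &= C, \qquad s_j = g(y_{j-1},\, s_{j-1}), \qquad j = 1, \dots, k, \\
y_j &= \mathrm{ReLU}\!\left(w^\top s_j + b\right) = \max\!\left(0,\; w^\top s_j + b\right).
\end{aligned}
```

The final line is where the ReLU guarantees that every predicted usage value is non-negative.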

In the standard RNN architecture, the neural nodes are usually composed of basic activation functions such as tanh and sigmoid. During the training phase, the weights are learned by the backpropagation algorithm, which propagates errors through the network. However, these basic activation functions can cause RNNs to suffer from the vanishing/exploding gradient problem, in which the gradient becomes either infinitesimally small or extremely large, respectively. This prevents the RNN from learning long-term dependencies in the data. To overcome this problem, we use LSTM cells as the basic cell in both the encoder and the decoder to capture and store relevant long-term temporal dependencies in the data. LSTM cells mitigate the well-known vanishing/exploding gradient problem through gates that control what the cell retains, including the ability to ‘forget’.
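To make the role of the gates concrete, here is a minimal scalar sketch of one step of the standard LSTM cell formulation. It is an illustration, not DeepFit's implementation: real cells use weight matrices and vectors, and the parameter values below are hand-picked, hypothetical numbers rather than learned weights.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, p):
    """One step of a scalar LSTM cell; p maps gate name -> (w, u, b)."""
    def pre(name):
        w, u, b = p[name]
        return w * x + u * h_prev + b

    f = sigmoid(pre("forget"))        # how much of the old cell state to keep
    i = sigmoid(pre("input"))         # how much of the new candidate to write
    o = sigmoid(pre("output"))        # how much of the cell state to expose
    g = math.tanh(pre("candidate"))   # candidate value to write
    c = f * c_prev + i * g            # additive cell-state update
    h = o * math.tanh(c)
    return h, c


# Hand-picked parameters: forget gate near 1, input gate near 0.
params = {
    "forget": (0.0, 0.0, 10.0),
    "input": (0.0, 0.0, -10.0),
    "output": (0.0, 0.0, 10.0),
    "candidate": (1.0, 0.0, 0.0),
}
h, c = 0.0, 0.8
for x in [5.0, -3.0, 2.0]:
    h, c = lstm_step(x, h, c, params)
# c stays close to its initial value of 0.8 regardless of the inputs.
```

Because the cell state is updated additively and scaled by the forget gate, a gate value near 1 lets information persist across many steps, while a value near 0 erases it.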

Training and Implementation Details

We split the data into two parts, 75% for training and 25% for testing, and use TensorFlow to implement the deep learning model. We use equipment usage data for the past 2 weeks to predict 1 week into the future. We train our models on a high-performance computing cluster available at our university. The configuration used on the cluster for all experiments is 4 cores and 8 GB RAM.
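A split of this kind might look like the sketch below. The article does not say how the split is made, so this assumes a chronological split (earlier readings for training, later ones for testing), which is common for time series. The constants `INPUT_STEPS = 14` and `OUTPUT_STEPS = 7` further assume one reading per day for a single machine and session; both names are hypothetical.

```python
def chronological_split(series, train_fraction=0.75):
    """Split a time-ordered series into training and test parts."""
    cut = int(len(series) * train_fraction)
    return series[:cut], series[cut:]


# Placeholder series standing in for one machine/session; not real data.
readings = list(range(100))
train, test = chronological_split(readings)

# Assumed window sizes: 2 weeks of history, 1 week of forecast,
# with one reading per day for a given session.
INPUT_STEPS = 14
OUTPUT_STEPS = 7
```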

At training time, the encoder and decoder are trained jointly using the backpropagation algorithm. We adopt unguided training as the training methodology. In unguided training, the decoder uses its previous predicted output value as the input to its next step. One of the main benefits of unguided training is that it enables a better exploration of the state space, which results in superior prediction performance at test time. At both training and test times, we slide a window over each equipment usage series with a step of one to obtain the input sequences. This ensures that we achieve the maximum overlap between the sequences used. We incorporate L2 regularization in our model to minimize overfitting.
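The feedback loop at the heart of unguided training can be sketched as follows. This is a schematic under stated assumptions: `step_fn` stands in for one decoder step (an LSTM step in the article's model), `toy_step` is a made-up linear map rather than a trained network, and feeding the ReLU-clipped value back into the next step is an assumption.

```python
def relu(v):
    return max(0.0, v)


def unguided_decode(step_fn, state, first_input, k):
    """Roll the decoder forward k steps, feeding each prediction back in.

    step_fn(prev_output, state) -> (raw_output, new_state)
    """
    predictions = []
    prev = first_input
    for _ in range(k):
        raw, state = step_fn(prev, state)
        prev = relu(raw)          # usage cannot be negative
        predictions.append(prev)  # the next step sees this prediction
    return predictions


# Hypothetical toy decoder step, used only to exercise the loop.
def toy_step(prev_output, state):
    new_state = 0.5 * state + 0.1 * prev_output
    return new_state - 1.0, new_state


preds = unguided_decode(toy_step, state=4.0, first_input=10.0, k=3)
```

The contrasting scheme, often called teacher forcing, would feed the true previous value instead of `prev` during training. With unguided training the decoder sees its own predictions at both training and test time.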

Experimental Results

In this section, we demonstrate the superior prediction performance of DeepFit by comparing it with two baseline approaches, linear regression and ARIMA.

Linear Regression – A statistical model that fits the best straight line to the data.

ARIMA – Auto-Regressive Integrated Moving Average, popularly known as ARIMA, is a statistical model that comprises three terms: the autoregressive term (AR), the differencing term (I) and the moving average term (MA).
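As an illustration of the simpler baseline, the sketch below fits a least-squares line to a window of past readings and extends it into the future. The article does not state which regressors its linear regression uses, so regressing usage on the time index is an assumption here, and the names `fit_line` and `forecast_line` are hypothetical.

```python
def fit_line(values):
    """Ordinary least-squares line through (0, v0), (1, v1), ..."""
    n = len(values)
    t_mean = (n - 1) / 2
    v_mean = sum(values) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (v - v_mean) for t, v in enumerate(values))
    slope = sxy / sxx
    intercept = v_mean - slope * t_mean
    return slope, intercept


def forecast_line(values, k):
    """Extend the fitted line k steps past the end of `values`."""
    slope, intercept = fit_line(values)
    n = len(values)
    return [intercept + slope * (n + j) for j in range(k)]


# Toy example with made-up numbers that lie exactly on a line.
history = [2.0, 4.0, 6.0, 8.0]
future = forecast_line(history, k=2)
```

A straight line cannot follow weekly or seasonal patterns, so a baseline of this kind would be expected to drift further from the truth as the forecast horizon grows.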

The main metrics used in our evaluation are root mean squared error (RMSE) and mean absolute error (MAE). We present results for RMSE below.
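For reference, both metrics can be computed as in the short sketch below. The sample values are made up for illustration and are not taken from the DeepFit experiments.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error: penalizes large errors more heavily."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(actual, predicted):
    """Mean absolute error: the average size of the errors."""
    absolute = [abs(a - p) for a, p in zip(actual, predicted)]
    return sum(absolute) / len(absolute)


# Made-up readings and predictions.
actual = [10.0, 12.0, 9.0, 15.0]
predicted = [11.0, 12.0, 7.0, 15.0]
error_rmse = rmse(actual, predicted)
error_mae = mae(actual, predicted)
```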

Figure 4 shows the results for Free Weights for all three sessions (morning, afternoon and evening). We observe from the figure that DeepFit significantly outperforms the baselines for each session. We also observe that DeepFit makes better predictions further into the future, as its RMSE values increase only gradually. In contrast, the prediction performance of linear regression and ARIMA becomes considerably worse as they predict further into the future. We attribute the superior performance of DeepFit to its sequence-to-sequence modeling, which takes the entire input sequence into account to predict the output. From our experiments, we observe that DeepFit outperforms the baselines for all machines and all three sessions. DeepFit provides an average performance improvement of 16% over ARIMA and 18% over linear regression.
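A comparison of this kind could be computed along the lines below: RMSE at each forecast step across all test windows, and a relative improvement over a baseline. The article does not define how its improvement percentages are calculated, so the formula in `percent_improvement` is an assumption, and all numbers are made up.

```python
import math


def rmse_per_horizon(actual_windows, predicted_windows):
    """RMSE at each forecast step, computed across all test windows."""
    k = len(actual_windows[0])
    scores = []
    for j in range(k):
        squared = [(a[j] - p[j]) ** 2
                   for a, p in zip(actual_windows, predicted_windows)]
        scores.append(math.sqrt(sum(squared) / len(squared)))
    return scores


def percent_improvement(baseline_error, model_error):
    """Relative error reduction versus a baseline (assumed definition)."""
    return 100.0 * (baseline_error - model_error) / baseline_error


# Made-up windows: two test windows, each forecasting two steps ahead.
actual_windows = [[1.0, 2.0], [3.0, 4.0]]
predicted_windows = [[1.0, 4.0], [3.0, 0.0]]
horizon_rmse = rmse_per_horizon(actual_windows, predicted_windows)
gain = percent_improvement(baseline_error=4.0, model_error=3.0)
```

Plotting the per-step values for each method gives a curve whose slope shows how quickly accuracy degrades with the forecast horizon.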