This article analyzes and forecasts three-day passenger demand for the public transport system in Santiago de Chile. Before the company invests resources in reconstructing the public transit system, it needs a reliable prediction of future demand.
The goal is to build a model that forecasts demand, measured as the number of passengers arriving at the terminal, over a future three-day period. The model uses three weeks of historical transit data from the company’s data warehouse. Historical demand was recorded at 15-minute intervals between 6:30 and 22:00.
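To make the structure of the data concrete, the sketch below builds the 15-minute time grid for a single day. It assumes that both endpoints (6:30 and 22:00) are recorded, which the description above does not state explicitly; the function name `daily_slots` and the example date are hypothetical.

```python
from datetime import datetime, timedelta

def daily_slots(day, start="06:30", end="22:00", step_minutes=15):
    """Return the timestamps of one day's recording slots.

    Assumption: both the start and end times are included in the grid.
    """
    t = datetime.strptime(f"{day} {start}", "%Y-%m-%d %H:%M")
    last = datetime.strptime(f"{day} {end}", "%Y-%m-%d %H:%M")
    slots = []
    while t <= last:
        slots.append(t)
        t += timedelta(minutes=step_minutes)
    return slots

# Hypothetical date, used only to illustrate the grid.
slots = daily_slots("2024-01-01")
print(len(slots), slots[0].time(), slots[-1].time())
```

Under this assumption each day contributes 63 observations, so a three-day forecast horizon covers 189 intervals.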
- PROPOSED MODEL:
After exploring the data, I developed a regression-based model that accounts for the multiple patterns in the weekly and daily data. For example, within the week, demand is lowest on weekends; within the day, there are two demand peaks, during the rush hours.
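The exact predictors of the model are not listed here, so the following is only a minimal sketch of one way such a model could look: an additive regression with an indicator for the day of the week and an indicator for the time slot. When every day and slot combination appears equally often (a balanced design, as with three complete weeks), the least-squares fit of this additive model reduces to simple averages, which is what the code computes. The function names and the toy data are hypothetical.

```python
from collections import defaultdict
from statistics import mean

def fit_additive(records):
    """Fit demand = grand mean + day effect + slot effect.

    records: iterable of (day_of_week, slot, demand).
    Assumption: the design is balanced, so the least-squares
    estimates equal the group means minus the grand mean.
    """
    by_day, by_slot, all_y = defaultdict(list), defaultdict(list), []
    for day, slot, y in records:
        by_day[day].append(y)
        by_slot[slot].append(y)
        all_y.append(y)
    grand = mean(all_y)
    day_eff = {d: mean(v) - grand for d, v in by_day.items()}
    slot_eff = {s: mean(v) - grand for s, v in by_slot.items()}
    return grand, day_eff, slot_eff

def predict(model, day, slot):
    grand, day_eff, slot_eff = model
    return grand + day_eff[day] + slot_eff[slot]

# Hypothetical, noise-free toy data: 3 weeks x 7 days x 4 slots.
# Days 5 and 6 stand for the weekend and have lower demand;
# slots 1 and 3 stand for the two rush-hour peaks.
true_day = {d: (-20 if d >= 5 else 10) for d in range(7)}
true_slot = {0: 5, 1: 30, 2: 0, 3: 40}
records = [(d, s, 50 + true_day[d] + true_slot[s])
           for _week in range(3) for d in range(7) for s in range(4)]

model = fit_additive(records)
print(predict(model, 0, 3))  # weekday, evening peak
print(predict(model, 5, 1))  # weekend, morning peak
```

With real data the design may be unbalanced or include further terms, in which case a general least-squares routine would be needed instead of these averages.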
The forecasting model was then validated on a separate validation set that was not used in the model-building stage (which used the training set). Its performance appears adequate for forecasting three-day demand at 15-minute intervals.
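The validation step can be sketched as follows. The series is split chronologically, with the most recent observations held out, and the forecasts are compared against the held-out values. The error measures shown (MAE and RMSE) are common choices and an assumption on my part, since the measures actually used are not named here; the mean forecast stands in for the regression model, and the numbers are made up.

```python
import math

def split_holdout(series, n_valid):
    """Hold out the last n_valid observations as the validation set."""
    if not 0 < n_valid < len(series):
        raise ValueError("n_valid must be between 1 and len(series) - 1")
    return series[:-n_valid], series[-n_valid:]

def error_metrics(actual, forecast):
    """Return (MAE, RMSE) for paired actual and forecast values."""
    errors = [a - f for a, f in zip(actual, forecast)]
    mae = sum(abs(e) for e in errors) / len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    return mae, rmse

# Hypothetical demand values, for illustration only.
series = [10, 12, 15, 11, 9, 14, 16, 13]
train, valid = split_holdout(series, 3)

# Placeholder forecast: the training mean, repeated over the horizon.
forecast = [sum(train) / len(train)] * len(valid)
mae, rmse = error_metrics(valid, forecast)
print(round(mae, 2), round(rmse, 2))
```

Keeping the split chronological matters for time series: a random split would let the model see observations from the period it is supposed to forecast.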
The forecast for a future three-day period is shown below. Table 1 in Appendix A lists the forecasted values from the linear regression model, rounded up to the nearest integer. The forecast patterns correspond to those found in the historical dataset: demand is highest on weekdays, and within the day, the peak hour is between 18:00 and 19:00.
The forecasting model is fairly automated and can continue to produce forecasts for future periods as long as new data are provided. Most statistical software (such as SAS, Minitab, and XLMiner) can be used to deploy this model. An important assumption of the forecasting model is that travel in future weeks behaves similarly to travel during the period for which we had data. The model is not likely to forecast demand accurately when demand patterns change drastically, such as during a festival, large event, or strike.