Short description
The main objective was to compare estimates of emissions and fuel consumption calculated from GPS data with those calculated from on-board device (OBD) data. My task was to explore both datasets and compare them.
GPS data
The dataset contained GPS trajectories of 2910 vehicles over 7 days. There were 2910 folders, each containing 7 CSV files, one per day of records. Each folder also contained one more CSV file, which stored the start and end time of each trip.
I used MATLAB (I have since shifted completely to R) to trim the GPS files and create a CSV file for each trip. The algorithm was:
- create a list of folder names (vehID)
- loop over all folders
  - read the csv file which contains the start and end times of trips
  - create a list of the remaining csv files in the folder (dayID)
  - loop over all 7 GPS csv files
    - trim and create new files based on the trip information
    - write them as csv files named vehID_dayID_tripID (e.g. 4210_121225_4.csv)
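The steps above can be sketched in Python. This is not the original MATLAB code, only a minimal illustration under assumptions: the trip file name `trips.csv`, the column names `trip_id`, `start`, `end` and `timestamp`, and ISO-style timestamps are all hypothetical, since the real file layout is not described here.

```python
import csv
from datetime import datetime
from pathlib import Path

TRIPS_FILE = "trips.csv"  # hypothetical name of the per-vehicle trip file
TIME_COL = "timestamp"    # hypothetical name of the GPS time column


def read_rows(path):
    """Read a CSV file into a list of dicts keyed by the header row."""
    with open(path, newline="") as f:
        return list(csv.DictReader(f))


def split_vehicle_folder(folder, out_dir):
    """Write one CSV per trip for one vehicle folder; return the paths written."""
    folder, out_dir = Path(folder), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    veh_id = folder.name
    # Assumed trip columns: trip_id, start, end.
    trips = [
        (t["trip_id"], datetime.fromisoformat(t["start"]), datetime.fromisoformat(t["end"]))
        for t in read_rows(folder / TRIPS_FILE)
    ]
    written = []
    for gps_file in sorted(folder.glob("*.csv")):
        if gps_file.name == TRIPS_FILE:
            continue  # the remaining files are the daily GPS records
        day_id = gps_file.stem
        rows = read_rows(gps_file)
        if not rows:
            continue
        times = [datetime.fromisoformat(r[TIME_COL]) for r in rows]
        for trip_id, start, end in trips:
            # Keep only the GPS points recorded between trip start and end.
            selected = [r for r, t in zip(rows, times) if start <= t <= end]
            if not selected:
                continue  # this trip did not take place on this day
            out_path = out_dir / f"{veh_id}_{day_id}_{trip_id}.csv"
            with open(out_path, "w", newline="") as f:
                writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
                writer.writeheader()
                writer.writerows(selected)
            written.append(out_path)
    return written


def split_all(root, out_dir):
    """Loop over every vehicle folder under root (out_dir should lie outside root)."""
    written = []
    for folder in sorted(Path(root).iterdir()):
        if folder.is_dir():
            written.extend(split_vehicle_folder(folder, out_dir))
    return written
```

Calling `split_all("raw", "trips")` would walk every vehicle folder under a hypothetical `raw` directory and collect the trip files in `trips`. Each daily file is read once and filtered per trip, which keeps the logic close to the list above at the cost of some repeated scanning.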
This produced 43,000 CSV files, one per trip. Trip-level files were created for the OBD dataset in the same way. When we compared the speed profiles of the GPS and OBD datasets, we found some issues with the OBD set that were not reported in the documentation but were later confirmed by the data.
Below is a figure showing speed (km/h) against time (in seconds); the dark green line represents GPS and the red line represents OBD. It was observed that the OBD records lagged, and that the lag varied with time.
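One way to quantify a time-varying lag like this is to estimate it separately in consecutive windows of the trip. The sketch below is not the method used in the project, only an illustration under assumptions: both speed series are taken to be resampled to the same fixed interval (say 1 s) and aligned by index, and the function names `best_lag` and `lag_by_window` are hypothetical. For each window it tries every shift up to `max_lag` samples and keeps the one with the smallest mean squared speed difference.

```python
def best_lag(gps, obd, start, stop, max_lag):
    """Return the shift k (in samples) that minimises the mean squared
    difference between gps[i] and obd[i + k] for start <= i < stop.
    A positive k means the OBD record trails the GPS record by k samples.
    Returns None if no shift keeps the window inside the OBD series."""
    best_k, best_err = None, float("inf")
    for k in range(-max_lag, max_lag + 1):
        if start + k < 0 or stop + k > len(obd):
            continue  # the shifted window would run off the OBD series
        err = sum((gps[i] - obd[i + k]) ** 2 for i in range(start, stop)) / (stop - start)
        if err < best_err:
            best_k, best_err = k, err
    return best_k


def lag_by_window(gps, obd, window, max_lag):
    """Estimate the lag in consecutive, non-overlapping windows.
    Returns a list of (window_start_index, lag_in_samples) pairs."""
    return [
        (start, best_lag(gps, obd, start, start + window, max_lag))
        for start in range(0, len(gps) - window + 1, window)
    ]
```

For example, `lag_by_window(gps_speed, obd_speed, window=30, max_lag=8)` would return one lag estimate per 30-sample window. The window has to be long enough to contain some speed variation, because a window of constant speed matches equally well at every shift.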
