Functional Data Privacy Algorithms for User Based Insurance

Tamsin Spelman, David Wood, Robert Whittaker, Michal Kubiak

Research output: Book/Report › Commissioned report


By monitoring each driver’s driving characteristics individually, car insurance premiums can be set to directly reflect that driver’s risk. Since such premiums tend to be cheaper, their uptake has increased in recent years, even outside the original market of young drivers. However, a lot of data (including GPS) is collected about each car journey in order to judge the driver’s ability, which raises privacy issues; a particular concern is that every car journey can be reconstructed from that data. We looked at what to change in (or remove from) the collected data so that drivers’ journeys could not be reconstructed, while retaining as much information about each driver as possible, so that the insurance company can still study a driver’s characteristics and potentially use the bulk data for testing new methods in future.
Control F1 does not want to use private key cryptography, for customer-relations reasons. We deduced that GPS data would have to be deleted to retain privacy; however, a quick experiment and a literature review suggest that heading and distance-travelled data would still be sufficient to reconstruct journeys.
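The reconstruction risk from heading and distance data can be illustrated with simple dead reckoning. The following is a minimal sketch, not the method used in the report: it assumes each data point is a hypothetical `(bearing_deg, distance_m)` pair, with the bearing measured clockwise from north, that the points are in time order, and that the area is small enough to treat as a flat plane.

```python
import math


def reconstruct_path(samples, start=(0.0, 0.0)):
    """Rebuild a relative path from ordered (bearing_deg, distance_m) samples.

    Assumptions (not from the report): bearings are degrees clockwise from
    north, distances are in metres, and a flat-plane approximation is used.
    Returns a list of (east_m, north_m) positions relative to `start`.
    """
    east, north = start
    path = [(east, north)]
    for bearing_deg, distance_m in samples:
        theta = math.radians(bearing_deg)
        east += distance_m * math.sin(theta)   # eastward component
        north += distance_m * math.cos(theta)  # northward component
        path.append((east, north))
    return path


# Hypothetical journey: 100 m north, then 50 m east.
path = reconstruct_path([(0.0, 100.0), (90.0, 50.0)])
```

Even without any GPS fix, the shape of the path is recovered here; matching that shape against a road map is what would turn it into an absolute journey.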
We considered deleting the GPS data and time data and then randomising the order of all the data points of a journey. This removes most of the information about the journey, but the “estimated journey vector” constructed from the bearing and distance data is still retained. The estimated journey vector matched the journey vector calculated from the actual GPS data most accurately for longer journeys and for non-circular journeys. Also, particularly for longer journeys, it can be used to identify similar journeys, e.g. someone’s commute.
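The reason the journey vector survives this step is that it is a sum of per-point displacements, and a sum does not depend on the order of its terms. A minimal sketch follows, assuming hypothetical `(bearing_deg, distance_m)` data points, bearings measured clockwise from north, a flat-plane approximation, and that "randomising" means shuffling the order of the points; none of these details are confirmed by the report.

```python
import math
import random


def journey_vector(samples):
    """Sum per-point displacements into a single (east_m, north_m) vector.

    Assumes bearings in degrees clockwise from north and distances in metres.
    """
    east = sum(d * math.sin(math.radians(b)) for b, d in samples)
    north = sum(d * math.cos(math.radians(b)) for b, d in samples)
    return east, north


def shuffle_points(samples, seed=None):
    """Return a copy of the data points in random order (time order is lost)."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    return shuffled


# Hypothetical journey with four data points.
journey = [(0.0, 100.0), (90.0, 50.0), (45.0, 30.0), (180.0, 20.0)]
original = journey_vector(journey)
anonymised = journey_vector(shuffle_points(journey, seed=1))
# The two vectors agree up to floating-point rounding.
```

A circular journey sums to a vector near zero, so small per-point errors dominate the result, which is consistent with the observation that non-circular journeys were matched more accurately.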
Cars slow down at junctions and traffic lights, so data points are more commonly taken just before a turn. This affects the accuracy of the bearing and distance data. We studied the error in the “local estimated journey vector” caused by a junction. We suspect that distance errors grow faster than bearing errors, which agrees with the analysed data.
Original language: English
Publication status: Published - 2015
