K-means cluster analysis and seismicity partitioning for Pakistan

Khaista Rehman, Paul W. Burton, Graeme A. Weatherill

Research output: Contribution to journal › Article › peer-review

41 Citations (Scopus)


Pakistan and the western Himalaya form a region of high seismic activity located at the triple junction between the Arabian, Eurasian and Indian plates. Four devastating earthquakes have resulted in significant numbers of fatalities in Pakistan and the surrounding region in the past century (Quetta, 1935; Makran, 1945; Pattan, 1974; and the recent 2005 Kashmir earthquake). It is therefore necessary to develop an understanding of the spatial distribution of seismicity and the potential seismogenic sources across the region. This forms an important basis for the calculation of seismic hazard, a crucial input to the seismic design codes needed to begin to mitigate effectively the high earthquake risk in Pakistan. The development of seismogenic source zones for seismic hazard analysis is driven by both geological and seismotectonic inputs. Despite the many developments in seismic hazard analysis in recent decades, the manner in which seismotectonic information feeds the definition of the seismic source can, in many parts of the world including Pakistan and the surrounding regions, remain a subjective process driven primarily by expert judgment. Whilst much research is ongoing to map and characterise active faults in Pakistan, knowledge of the seismogenic properties of the active faults is still incomplete in much of the region. Consequently, seismicity, both historical and instrumental, remains a primary guide to the seismogenic sources of Pakistan. This study uses a cluster analysis approach to identify spatial differences in seismicity, which can form a basis for delineating seismogenic source regions. Seismicity partitioning for Pakistan is examined with respect to the earthquake database, seismic cluster analysis and seismic partitions in a seismic hazard context. A magnitude-homogeneous earthquake catalogue has been compiled from various available earthquake data.
The earthquake catalogue covers a time span from 1930 to 2007 and an area from 23.00° to 39.00°N and 59.00° to 80.00°E. A threshold magnitude of 5.2 is applied for the K-means cluster analysis. The study uses the traditional metrics of cluster quality, together with a metric set in a seismic hazard context, to attempt to constrain the preferred number of clusters found in the data. The spatial distribution of earthquakes from the catalogue was used to define the seismic clusters for Pakistan, which can be used further in the process of defining seismogenic sources and corresponding earthquake recurrence models for estimates of seismic hazard and risk in Pakistan. Consideration of the different approaches to cluster validation in a seismic hazard context suggests that Pakistan may be divided into K = 19 seismic clusters, including some portions of the neighbouring countries of Afghanistan, Tajikistan and India.
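
As an illustration only, the kind of workflow described above (apply the magnitude threshold, partition epicentres with K-means, then compare a cluster-quality measure across candidate values of K) might be sketched as follows. This is not the authors' implementation. The catalogue is synthetic, the function names `filter_catalogue` and `kmeans` are hypothetical, latitude and longitude are treated as planar coordinates for simplicity, and the within-cluster sum of squares stands in for the unspecified "traditional metrics of cluster quality".

```python
import random

MIN_MAGNITUDE = 5.2  # threshold magnitude quoted in the abstract


def filter_catalogue(events, min_mag=MIN_MAGNITUDE):
    """Keep (lat, lon) of events whose magnitude is at or above min_mag."""
    return [(lat, lon) for lat, lon, mag in events if mag >= min_mag]


def sq_dist(p, q):
    """Squared Euclidean distance in degree space (a simplifying assumption)."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def kmeans(points, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centroids, wcss)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise from k distinct events
    labels = None
    for _ in range(n_iter):
        # Assignment step: nearest centroid for every epicentre.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stable, so the partition has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = (
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                )
    # Within-cluster sum of squares: lower means tighter clusters.
    wcss = sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))
    return labels, centroids, wcss


if __name__ == "__main__":
    # Synthetic events around three arbitrary centres inside the study
    # window (23-39 N, 59-80 E); these are NOT data from the paper.
    rng = random.Random(42)
    centres = [(30.0, 67.0), (25.5, 63.0), (35.0, 73.5)]
    events = [
        (rng.gauss(c[0], 0.8), rng.gauss(c[1], 0.8), rng.uniform(4.5, 7.5))
        for c in centres
        for _ in range(100)
    ]
    epicentres = filter_catalogue(events)
    for k in range(2, 7):
        _, _, wcss = kmeans(epicentres, k)
        print(f"K = {k}: within-cluster sum of squares = {wcss:.2f}")
```

In practice a single random initialisation can settle in a poor local optimum, so several restarts per K are usually compared. A great-circle distance would also be more appropriate than planar degrees over a region of this size.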

Original language: English
Pages (from-to): 401-419
Number of pages: 19
Journal: Journal of Seismology
Issue number: 3
Early online date: 30 Dec 2013
Publication status: Published - 1 Jul 2014


  • K-means
  • Pakistan
  • Seismic clusters
  • Seismic point source
  • Seismicity
