In many applications, such as finance and e-commerce, it is valuable to have a thorough understanding of the type of customers one is dealing with. This work focuses on detecting customer segments in online peer-to-peer (P2P) lending data from Lending Club (2019) with cluster analysis, and on tracking these segments over time. We use the framework proposed by Sangam and Om (2018): we cluster with K-means, K-prototypes, agglomerative clustering, DBSCAN and Gaussian Mixture Models (GMM) to identify customer segments, and we apply labeling strategies based on extra trees classifiers and a Hybrid Data Labeling Algorithm (HDLA). In the application we conclude that, owing to their incremental approach, K-means, K-prototypes and GMM allow for smooth cluster tracking while retaining the overall partition structure over time. DBSCAN is ineffective at detecting multiple segments in this setting, whereas agglomerative clustering is ineffective at tracking clusters because its partition structure shifts significantly from quarter to quarter.
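
The idea of tracking segments across quarters can be illustrated with a minimal sketch. It is not the implementation used in this work, nor the exact procedure of Sangam and Om (2018). It assumes scikit-learn, uses synthetic two-dimensional data in place of the Lending Club features, and uses hypothetical names throughout (`make_quarter`, `X_q1`, `X_q2`). K-means is fitted on a first quarter, an extra trees classifier learns the resulting segment labels, and the second quarter is then labeled by the classifier and re-clustered starting from the previous centroids, so that segment indices stay comparable over time.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import ExtraTreesClassifier


def make_quarter(rng, centers, n_per_cluster=100, scale=0.3):
    """Draw synthetic 2-D points around each center (stand-in for loan features)."""
    return np.vstack(
        [rng.normal(c, scale, size=(n_per_cluster, 2)) for c in centers]
    )


rng = np.random.default_rng(0)

# Hypothetical segment centers; the second quarter drifts slightly.
centers_q1 = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
centers_q2 = centers_q1 + 0.2

X_q1 = make_quarter(rng, centers_q1)
X_q2 = make_quarter(rng, centers_q2)

# Step 1: cluster the first quarter.
km_q1 = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X_q1)

# Step 2: learn the segment labels with an extra trees classifier.
clf = ExtraTreesClassifier(n_estimators=100, random_state=0)
clf.fit(X_q1, km_q1.labels_)

# Step 3: label the new quarter with the classifier.
labels_q2_clf = clf.predict(X_q2)

# Step 4: update the clustering incrementally by initialising K-means
# at the previous centroids, which keeps cluster indices aligned.
km_q2 = KMeans(
    n_clusters=3, init=km_q1.cluster_centers_, n_init=1, random_state=0
).fit(X_q2)

# Per-segment centroid shift between quarters, and how often the
# classifier labels agree with the updated clustering.
shift = np.linalg.norm(km_q2.cluster_centers_ - km_q1.cluster_centers_, axis=1)
agreement = np.mean(labels_q2_clf == km_q2.labels_)

print("centroid shift per segment:", shift)
print("classifier/cluster agreement:", agreement)
```

Small centroid shifts together with high agreement correspond to the kind of smooth tracking described above. A method whose partition is rebuilt from scratch each quarter offers no such alignment between consecutive partitions.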

cluster analysis, cluster tracing, customer segmentation, customer classification, alternative finance, peer-to-peer lending, K-means clustering, K-prototypes clustering, hierarchical clustering, DBSCAN, Gaussian Mixture Models
Zhelonkin, M.
Erasmus School of Economics

Aerts, L. (2020, May 28). Tracking Customer Segments in Alternative Finance using time-evolving Cluster Analysis. Econometrie. Retrieved from