Big Data and Credit Scoring in Indonesia

Darmawan Zaini, Chief Technology & Product Officer, UangTeman

As one of the hottest emerging markets in the world, forecast by the IMF to overtake Brazil and Russia by the end of the decade, Indonesia is a pretty exciting market to be in right now. With over 104.2 million people accessing the internet and 65.2 million people owning a smartphone in 2016, it is one of the biggest online markets worldwide. At the same time, only a small share of people in emerging markets have their data in public credit registries. This is the challenge we’re currently facing at UangTeman, as the first online short-term micro loan provider in Indonesia: how do you assess someone’s creditworthiness when they have no credit history whatsoever?

Finding alternative/nontraditional data is easier said than done in Indonesia, as most of these data are not publicly accessible. That brings us to two other data sources: online data (browsing history, social media data) and mobile data (call logs, location data, top-up information, etc.).

These kinds of data are easier to obtain, provided the user gives proper consent.

Online and mobile behavioral data may not be enough on their own to make good credit assessments, but they can definitely improve the results when used on top of traditional credit scoring methods. We can, for instance, look at users’ social media and/or email behavior and correlate it with repayment or default behavior. Because we are only assuming that these correlations exist, we must always conduct experiments to validate them. At the end of the day, behavioral data can really be an indicator of someone’s financial behavior.
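As a minimal sketch of what correlating a behavioral signal with default behavior can look like, the snippet below computes a plain Pearson correlation between one numeric feature and a 0/1 default flag. The feature name (`monthly_top_ups`) and every number in it are made up purely for illustration; they are not UangTeman data, and this is not a description of our actual engine.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical toy data: one behavioral feature per borrower and
# whether that borrower defaulted (1) or repaid (0).
monthly_top_ups = [1, 2, 2, 3, 5, 6, 7, 8]
defaulted = [1, 1, 0, 1, 0, 0, 0, 0]

r = pearson(monthly_top_ups, defaulted)
print(f"correlation between top-ups and default: {r:.2f}")
```

A negative value here would suggest that, in this toy sample, borrowers with more top-ups default less often. A correlation found this way is only a hypothesis; it still has to be validated by experiment before it is used in a scoring decision.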

Having validated our assumptions on this online and mobile “Big Data”, we ended up with some “Smart Data” on which we could build algorithms, patterns and mathematical models. Combined with other publicly available regional financial data, we were able to put together a (potentially) smart engine that helps with our creditworthiness assessments at scale, letting us run experiments on subsets of our applicants and continuously fine-tune the engine. Although these experiments can prove costly in the beginning, each iteration gives us a smarter engine. With our engine’s Gini coefficient surpassing the 0.6 mark, we find ourselves with better-quality borrowers, reflected in lower delinquency and default rates.
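For readers unfamiliar with the metric: in credit scoring, the Gini coefficient is commonly derived from the area under the ROC curve as Gini = 2 × AUC − 1, so 0 means the score ranks borrowers no better than chance and 1 means it ranks every defaulter as riskier than every non-defaulter. The sketch below computes it directly from pairwise comparisons. The function names and the toy scores are illustrative assumptions, not our production code or data.

```python
def auc(risk_scores, defaulted):
    """Chance that a random defaulter has a higher risk score than a
    random non-defaulter (ties count as half a win)."""
    bad = [s for s, d in zip(risk_scores, defaulted) if d]
    good = [s for s, d in zip(risk_scores, defaulted) if not d]
    if not bad or not good:
        raise ValueError("need at least one defaulter and one non-defaulter")
    wins = 0.0
    for b in bad:
        for g in good:
            if b > g:
                wins += 1.0
            elif b == g:
                wins += 0.5
    return wins / (len(bad) * len(good))


def gini(risk_scores, defaulted):
    """Gini coefficient of a risk score, using Gini = 2 * AUC - 1."""
    return 2.0 * auc(risk_scores, defaulted) - 1.0


# Hypothetical toy data: a higher score means the model thinks the
# borrower is riskier; defaulted is 1 for default, 0 for repayment.
risk_scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2, 0.1, 0.6]
defaulted = [1, 1, 0, 0, 0, 0, 0, 1]

print(f"Gini: {gini(risk_scores, defaulted):.3f}")
```

The pairwise loop is quadratic in the number of borrowers, which is fine for a small illustration; on a real portfolio a rank-based computation or a library routine would be the more practical choice.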

As we strive to provide responsible financial inclusion to the under-banked at speed and scale, data is definitely our Holy Grail. Data is everywhere, just sitting around waiting to be analyzed. Theoretically, you can assess creditworthiness from almost anything, as long as you have the risk appetite for it. Although traditional underwriting methods and human judgment are not to be underestimated, we dream of building a credit scoring engine that can make well-informed decisions from all sorts of alternative data. Who knows, soon we may be able to score someone’s creditworthiness from their sleep cycle.