When an algorithm decides on the loan
Whether it's a satnav, a dating app or a smart watch: algorithms are becoming steadily more powerful and are now an integral part of many people’s everyday lives. But they often have unwelcome side effects. A detailed study by BaFin (see info box “BaFin’s approach”) illustrates how banks use algorithms in their lending operations and what benefits and problems may be involved.
Algorithms make decisions in the shadows. They use a variety of criteria and processes to determine whether clients are creditworthy. Although this can be both a sensitive and a risky decision, it is increasingly being handled by computers.
Banks do not tell their clients whether they use Machine Learning to process their loan applications. This is a fundamental difference from, say, fully automated portfolio management, where investors knowingly opt for this type of management of their securities investments – which is usually also cheaper.
At a glance: BaFin's approach
For a number of years now, a variety of BaFin's sectors have been addressing Artificial Intelligence, Big Data and Machine Learning in interdisciplinary teams. Last year, BaFin experts from banking supervision, consumer protection and innovations in financial technology were able to gain an overview of how banks in Germany use algorithms in their client-facing business.
Their analysis of how algorithms are used in the lending client business was particularly revealing. Supplementing the analysis of their own data on less significant institutions (LSIs) in Germany, the BaFin supervisors exchanged information with banking associations, IT service providers and credit agencies. These organisations gave them a deep dive into their processes and explained their positions on the associated technological and social uncertainties. This resulted in a practice-based, empirically saturated and issues-focused overview that can serve as a basis for BaFin’s future positioning.
Automated procedures in retail banking
In non-risk-relevant retail business, institutions are using both semi-automated and fully automated procedures in the lending process. Institutions determine the risk relevance threshold on the basis of the Minimum Requirements for Risk Management (MaRisk). This mainly depends on the institution’s size and business strategy.
Some institutions use fully automated procedures, for example in the instalment loan business. In most cases, bank employees are no longer involved unless the applicant is a new client. In this case, for example, they document the results of the credit rating, confirm the lending conditions and enter them in the contract.
BaFin did not identify any fully automated procedures in the building finance business. Nor did it expect this, as such processes are not likely to meet the current high requirements for expert collateral appraisal in the real estate sector.
For this reason, even decisions on building loans that banks can issue within a few minutes via online procedures remain subject to verification of the data and the value of the collateral by bank employees.
Corporate banking: Not without human involvement
Corporate loans are considered to be risk-relevant business above a certain threshold. In such cases, a second vote is mandatory and bank employees must be involved. They must document their risk assessment. They can use a variety of automated and algorithm-based assessment procedures to do this.
In practice, client advisors obtain information about the company and also generate the application scoring in some of the cases examined by BaFin. In the case of commercial loans, they often merely record the qualitative information, while back-office credit analysts prepare the quantitative rating information using standardised financial statement analyses and debt-servicing capacity calculations.
The credit institutions then process the manually entered or automatically recorded data using algorithm-based procedures for the business client rating. Some banks automatically import the underlying annual financial statements and, if applicable, additional data from financial and earnings planning into the relevant rating system. These processes are acceptable as long as they meet the requirements for the explainability, transparency and reliability of the data used in automated processes specified in the seventh amendment to MaRisk.
Banks develop their own processes
Some larger credit institutions themselves develop the algorithm-based rating, scoring or decision-making systems used in their lending processes. However, most institutions or their associations involve external service providers in the development of these systems. These service providers increasingly employ Machine Learning and Artificial Intelligence methodologies.
The automated decision-making techniques currently used in the lending process are typically based on proven Machine Learning techniques such as logistic regression. In some cases, such systems can operate without explicitly programmed rules.
Instead, the system derives these rules independently by identifying patterns in the data sets (“training data sets”) or recognising correlations, in each case using a predefined mathematical-statistical method.
The system then uses them to analyse and evaluate new data sets. In the case of lending, the training data sets consist of data gathered from past loan applications and the decisions made or their outcome. The systems thus generated can be readjusted over and over with new training data sets (recalibration).
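The training and recalibration cycle described above can be sketched in a few lines. This is a minimal illustration, not any bank's actual system: the features (normalised income, debt ratio), the outcomes and the gradient-descent settings are all invented for the example.

```python
import math

def train_logistic(data, weights=None, lr=0.1, epochs=500):
    """Fit logistic-regression weights on (features, outcome) pairs via
    simple gradient descent. Passing existing weights continues training
    from them, i.e. recalibrates the model with new data."""
    n = len(data[0][0])
    w = list(weights) if weights else [0.0] * (n + 1)  # last entry = bias
    for _ in range(epochs):
        for x, y in data:
            z = sum(wi * xi for wi, xi in zip(w, x)) + w[-1]
            p = 1.0 / (1.0 + math.exp(-z))  # predicted repayment probability
            for i in range(n):
                w[i] += lr * (y - p) * x[i]
            w[-1] += lr * (y - p)
    return w

def repayment_probability(w, x):
    z = sum(wi * xi for wi, xi in zip(w, x)) + w[-1]
    return 1.0 / (1.0 + math.exp(-z))

# Historical applications: (normalised income, debt ratio) -> 1 = repaid
history = [((0.9, 0.1), 1), ((0.8, 0.2), 1), ((0.2, 0.9), 0), ((0.3, 0.8), 0)]
w = train_logistic(history)
# Recalibration: continue from the current weights with fresh outcomes
w = train_logistic([((0.7, 0.3), 1), ((0.1, 0.9), 0)], weights=w)
```

The fitted weights can then be applied to score new applications, and further batches of outcomes can be folded in the same way, which is the "recalibration" the text refers to.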
Machine Learning needs balanced training data
The algorithm's prediction quality depends critically on the scope and balance of the training data sets. If the training data is not balanced in certain respects, for example if it is not representative as far as the target user group is concerned, distortions may arise that lead to certain groups of people being disadvantaged or privileged. Another factor is that it is not clear at the outset what should be considered as balanced training data.
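A first, crude check for the kind of imbalance described above is simply to measure how each group is represented in the training data. The sketch below is purely illustrative: the "region" attribute, the sample shares and the flagging threshold (half the uniform share) are assumptions chosen for the example, not a supervisory standard.

```python
from collections import Counter

def group_balance(records, group_key):
    """Return each group's share of the training data and flag groups
    that fall below half the uniform share (an illustrative threshold)."""
    counts = Counter(r[group_key] for r in records)
    total = sum(counts.values())
    shares = {g: c / total for g, c in counts.items()}
    threshold = 0.5 / len(counts)  # assumption: half of a uniform split
    flagged = [g for g, s in shares.items() if s < threshold]
    return shares, flagged

# Hypothetical training records with a region attribute
records = [{"region": "urban"}] * 90 + [{"region": "rural"}] * 10
shares, flagged = group_balance(records, "region")
# rural applicants make up 10% of the data, below the 25% threshold
```

In practice a representation check like this is only a starting point; as the text notes, what counts as "balanced" is itself unclear at the outset and depends on the target user group.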
The various Machine Learning approaches differ in how well they can model complexity and deal with unstructured data, among other things. Logistic regression, which is often used in credit assessment, captures linear relationships. Its calculations are relatively easy to understand. It allows an explanation of how a decision is reached and how strongly a characteristic is factored into a decision. This means that this approach is transparent in principle, and also that it can be controlled.
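The explainability of logistic regression mentioned above comes from its coefficients: exponentiating a coefficient gives the multiplicative change in the odds for a one-unit change in that characteristic. The feature names and coefficient values below are invented for illustration only.

```python
import math

# Hypothetical fitted coefficients of a credit-scoring logistic regression
# (log-odds of default per unit of each standardised characteristic)
coefficients = {"net_income": -1.2, "existing_debt": 0.8, "age_of_account": -0.3}

def odds_ratio(coef):
    """exp(coef): multiplicative change in the odds of default
    for a one-unit increase in the characteristic."""
    return math.exp(coef)

# Ranking by |coefficient| shows how strongly each characteristic
# is factored into the decision - the basis of the model's transparency.
for name, c in sorted(coefficients.items(), key=lambda kv: abs(kv[1]), reverse=True):
    print(f"{name:>15}: odds ratio {odds_ratio(c):.2f}")
```

An odds ratio below 1 (here, higher net income) lowers the modelled odds of default, one above 1 raises them; this is the kind of per-characteristic explanation the text says the approach permits.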
Algorithms may discriminate
Algorithm-based decision-making systems offer many advantages from the perspective of the banks. They can automate and accelerate many processes, and they save on staff. In an ideal scenario, the systems make decisions based on common standards and are therefore objective. Clients can apply for a loan at any time and receive an approval or rejection within a few minutes. In the case of small loans, the money can be made available almost immediately via an instant payment.
At the same time, experts are constantly drawing attention to the social risks of automated decision-making systems. They are warning that algorithms can seize on, reinforce and widen existing forms of discrimination. They point out that human prejudices, misunderstandings and bias flow into these software systems, which will determine our everyday life to an ever greater extent.1 BaFin already highlighted this several years ago.
Against this backdrop, BaFin also examined the specific characteristics of the clients whose applications are evaluated by the automated systems. This revealed only a few cases in which rating procedures involved discrimination, for example by gender. The procedures have since been revised.
According to BaFin's findings, the banking industry has been working for a number of years to eliminate discrimination against applicants by its algorithmic procedures. For example, banks explicitly no longer use critical characteristics such as residence and origin as parameters for risk classification procedures. This also applies to the Schufa scores collected by the credit institutions.
However, it is not possible to entirely eliminate correlations between financial parameters such as net income or assets and personal characteristics such as gender, residence or origin. With regard to such correlations, however, it needs to be emphasised that considering the economic circumstances of borrowers is required by supervisory law.
In addition to potential discrimination, consumer protection organisations are concerned, especially in times like these, that consumers could become increasingly overindebted through the use of automated systems or that they could be denied a loan despite having a good credit rating. They are also calling for greater transparency about the use of automated decision-making systems, as well as the establishment of bodies that can explain the decisions made and can handle complaints. From a data protection perspective, it is also necessary to keep an eye on how personal data is handled, stored, secured and shared – especially in the case of external service providers.
Supervisory requirements are technology-neutral
The MaRisk amendment also requires banks to pay even greater attention to consumer protection aspects in future. Risk classification procedures should therefore be examined even more closely by the banking supervisory authorities in future to determine whether they use parameters that are unlawfully discriminatory. However, not every form of discrimination is against the law.
Rather, it is essential to consider which factors can be regarded as appropriate for lending decisions, for example.
The question of whether algorithm-based techniques in the lending business are allowed under supervisory law does not depend on the degree of automation or the technology used. The decisive factor is the risk relevance of the loans as well as the compliance of these processes with the minimum requirements for prediction quality, data quality, explainability and transparency. In future, the MaRisk amendment will combine these requirements in a separate module AT 4.3.5 MaRisk. They are intentionally worded to be technology-neutral.
In addition, human involvement in the decision-making processes for model development and validation must always remain recognisable and documented in accordance with supervisory law.
Glossary of terms
Instant payment: Transfers that make the amount being transferred available to the payee within ten seconds of receipt of the order. In the EU, this has been possible since 2017 using the SEPA Instant Credit Transfer scheme.
Machine Learning (ML): A sub-field of Artificial Intelligence. It is based on applied statistics and mathematical optimisation. There are various definitions of ML. In its study “Big Data meets Artificial Intelligence”, BaFin defines Machine Learning very broadly as the notion of giving computers the ability to learn from data and experience through suitable algorithms. Compared with rule-based approaches, the system learns without the programmer specifying which outcomes should be derived from certain data constellations, and how. Computers can thus construct a model of their world and better solve the tasks that are assigned to them.
Linear/logistic regression: A classic statistical methodology that has proven its worth in practice for many decades and is now used in ML. The coefficients (influencing variables) of the models generated by the regression can be easily analysed and interpreted. This makes the model transparent and a powerful tool for data analysis. However, regression is limited to linear models and therefore cannot normally adequately capture more complex correlations.2
MaRisk: Based on section 25a of the German Banking Act (Kreditwesengesetz – KWG), the Minimum Requirements for Risk Management (MaRisk) provide banks with a comprehensive, principle-based framework developed together with the banking industry, which at the same time still gives institutions scope for tailored implementation.
Schufa score: Schufa Holding AG is a credit agency in which savings banks and cooperative banks hold a majority interest. The Schufa score provides information on the probability that an individual or company will meet their payment obligations. To do this, Schufa evaluates large quantities of data, some of which it receives directly from partner banks. Various characteristics play a role here, such as whether the client has repaid previous loans or how many accounts, loans and credit cards they have. The Schufa score is used by a large number of banks in Germany when lending money; mobile phone companies also use it. Consumers can request their personal Schufa score once a year free of charge (copy of the personal data under Article 15 of the GDPR) and review it for any errors in the data.
Footnotes: