OCBC Bank has found a way to tweak the Monetary Authority of Singapore’s (MAS) Veritas open-source toolkit so that artificial intelligence/machine learning (AI/ML) developers can test the “fairness” of their models faster and more frequently, and has successfully contributed the optimisations back to the project via GitHub.
Head of data science Andrea Pisoni revealed the bank’s “first ever open source contribution” in a brief LinkedIn post late last month.
More details of the pull request, which was formally merged (that is, approved) on Monday, have now emerged from a briefing held on September 9 involving MAS and OCBC speakers.
Veritas is a large-scale initiative involving the Singapore government and financial sector that aims to “strengthen internal governance around the application of AI and the management and use of data.”
OCBC Bank joined Veritas in phase two.
Earlier this year, a Veritas toolkit was released to help financial institutions automate assessments of the “fairness” of their AI and ML models.
“When the MAS released the toolkit, we wanted to make sure this toolkit could fully integrate into all the different components of machine learning engineering and MLOps that we already made available to our data scientists,” Pisoni told the September 9 webinar.
“What we did at the time is that we forked the GitHub repository and we spent a significant amount of ‘calories’ [energy] expanding the capabilities of the Veritas toolkit and sometimes, we think, enhancing it.
“At a certain point, we realised some of these enhancements were actually pretty good ideas and it would have been a good idea to share them back to the wider community.
“So we basically merged some of these enhancements back into the open source branch and we made it public.”
When Pisoni spoke about the pull request at the briefing, it was still awaiting approval.
GitHub repository updated
The GitHub repository was updated this week to show that OCBC’s optimisations have now been incorporated into the toolkit.
The changes by OCBC aim to enhance Veritas’ performance.
“We haven't changed the methodology, we haven't added new functionalities,” Pisoni said. “We have made the current functionalities of the tool run a lot faster.”
Performance tests published by OCBC Bank claim a “speed-up” of at least 10x for “computing AI fairness metrics”.
“The speed-up scales linearly with the dataset, so you will see [improvements] both with small and large datasets, but of course when you have large datasets it gets really significant,” Pisoni said.
“With a 10 million row dataset, today it would take about one hour and 40 minutes to compute fairness, while after our changes it only takes 10 minutes.”
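The quoted timings can be checked against the 10x claim with some quick arithmetic. The short Python sketch below uses only the figures Pisoni gave for the 10 million row case. The variable names are illustrative, and the throughput figures are derived from those two timings rather than published by OCBC.

```python
# Figures quoted at the briefing for a 10 million row dataset.
rows = 10_000_000
before_minutes = 1 * 60 + 40  # "about one hour and 40 minutes"
after_minutes = 10            # "it only takes 10 minutes"

# Ratio of the two run times.
speedup = before_minutes / after_minutes

# Implied throughput in rows per second (derived, not published).
rows_per_second_before = rows / (before_minutes * 60)
rows_per_second_after = rows / (after_minutes * 60)

print(f"speed-up: {speedup:.0f}x")
print(f"before: ~{rows_per_second_before:,.0f} rows/s")
print(f"after:  ~{rows_per_second_after:,.0f} rows/s")
```

On these numbers the ratio is 10, which matches the “at least 10x” figure in OCBC's published tests.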
Pisoni said there are good reasons to want to calculate the fairness of models a lot faster than the toolkit initially allowed, mostly due to the iterative and experimental way in which models are created.
“It means now you can compute [fairness] iteratively while you develop and then… compute it potentially many times for different combinations of models,” he said.
“Usually, you wouldn't compute fairness only one time - you are developing your model, you are trying different things, and as you try different things you want to compute fairness to understand where you stand in the fairness requirements.
“So you probably have to compute fairness many times, and if every time it takes hours to compute, that's not going to fly.”
Pisoni noted that developers - and the institutions they work for - also want quick feedback on the fairness of models.
“Computing fairness more efficiently allows for what we call a better granularity of unfairness identification, where you can really go and understand what are the pockets of customers in your group that are being treated unfairly,” he said.
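To illustrate what finer-grained “unfairness identification” could look like, here is a minimal Python sketch. It does not use the Veritas toolkit's API, and it is not OCBC's code. It assumes one common fairness measure, the gap in approval rates between groups (often called demographic parity difference). The function names and the toy data are hypothetical.

```python
import numpy as np

def approval_rates(approved, group):
    """Approval rate per group, computed with vectorised NumPy calls."""
    labels, idx = np.unique(group, return_inverse=True)
    totals = np.bincount(idx, minlength=len(labels))
    positives = np.bincount(idx, weights=approved, minlength=len(labels))
    return dict(zip(labels.tolist(), (positives / totals).tolist()))

def parity_gap(approved, group):
    """Largest difference in approval rate between any two groups."""
    rates = approval_rates(approved, group)
    return max(rates.values()) - min(rates.values())

def parity_gap_by_segment(approved, group, segment):
    """Recompute the gap separately inside each customer segment."""
    result = {}
    for seg in np.unique(segment).tolist():
        mask = segment == seg
        result[seg] = parity_gap(approved[mask], group[mask])
    return result

# Hypothetical toy data: 1 = approved, 0 = declined.
approved = np.array([1, 1, 1, 1, 1, 1, 0, 0])
group = np.array(["a", "a", "b", "b", "a", "a", "b", "b"])
segment = np.array(["young"] * 4 + ["senior"] * 4)

overall_gap = parity_gap(approved, group)                  # 0.5
gap_by_segment = parity_gap_by_segment(approved, group, segment)
print(overall_gap, gap_by_segment)  # 0.5 {'senior': 1.0, 'young': 0.0}
```

In this toy example the overall gap of 0.5 hides the fact that the whole disparity sits in the “senior” segment. Each extra segment means another pass over the data, which is one reason a faster computation makes this kind of drill-down more practical.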
‘Big step’ for the bank
Pisoni said the decision to open-source its optimisation efforts was “a big step” for the bank.
“It was the first time for OCBC making any code public,” he said.
“You can imagine we had to go through some [internal] approval processes and some difficult questions to answer, but I'm happy to say the leadership at OCBC was very supportive in our innovation journey and so hopefully this is just the first of many open source contributions that OCBC will make.”
Pisoni said that OCBC backed the utility of its changes and saw a possibility for them to benefit the broader sector as well as Singapore. Veritas is part of Singapore’s National AI Strategy and a high-visibility program for the government.
“It’s not just all the other banks that are going to be able to compute [fairness checks] faster, but their customers are probably not going to be impacted by unfair decisions a lot more, and that fosters a better overall fair AI ecosystem for the nation,” Pisoni said.
More contributions possible
Pisoni added that the performance boosts apply to the toolkit's two banking-related use cases for AI/ML models: credit risk scoring and customer marketing.
He said that OCBC had “other ideas on how to increase the performance even further which we haven't merged back into the open source system yet.”
“We don't want to overwhelm the reviewers,” he said. “The [first] pull request is quite limited so that they can focus on these [optimisations], review them, get confidence and then merge [them].
“But we will have more to come.”