iTnews Asia

Singapore launches generative AI evaluation sandbox

To create a standard set of benchmarks to assess products.

By Abbinaya Kuzhanthaivel on Nov 2, 2023 10:38AM

The Infocomm Media Development Authority of Singapore (IMDA) along with the AI Verify Foundation has launched the Generative AI Evaluation Sandbox, a new initiative to enable the evaluation of trusted artificial intelligence (AI) products and reveal potential gaps.

IMDA said the Sandbox will make use of a new Evaluation Catalogue to set out common baseline methods and recommendations to assess generative AI products.

The catalogue compiles commonly used technical testing tools, organises them according to what they test for and their methods, and recommends a baseline set of tests.

The aim is to establish a common language and support "broader, safe and trustworthy adoption of Gen AI," it added.

IMDA said the Sandbox will also help build evaluation capabilities beyond what currently resides with model developers.

It will involve players in the third-party testing ecosystem, to help model developers understand what external testers would look for in responsible AI models.

"Where possible, each Sandbox use case should involve an upstream Gen AI model developer, a downstream application deployer and a third-party tester to demonstrate how the different players in the ecosystem can work together," IMDA explained.

The initiative has garnered support from global market players like Amazon Web Services, Anthropic, Google, and Microsoft and several other participants including Deloitte, Nvidia, EY, IBM, OCBC Bank and Singtel.

By involving regulators like the Singapore Personal Data Protection Commission (PDPC), the Sandbox will offer space for experimentation and development and allow all parties along the supply chain to be transparent about their needs.

Further, IMDA expects the Sandbox to uncover gaps in current evaluations, particularly in domain-specific applications such as human resources and in culture-specific areas, which are currently underdeveloped.

"Sandbox will develop benchmarks for evaluating model performance in specific areas that are important for use cases, and for countries like Singapore because of cultural and language specificities," IMDA said.

The authority said it is collaborating with Anthropic on a Sandbox project that uses the catalogue to identify aspects for red teaming, which challenges plans, policies and assumptions used in AI by adopting an adversarial approach.

IMDA will deploy Anthropic's models and research tooling platform to develop red-teaming methodologies for Singapore's diverse linguistic and cultural landscape. For instance, AI models will be tested for their ability to perform within the country's multilingual context.

The agency is now inviting industry partners to collaboratively build evaluation tools and capabilities in the new sandbox.

Earlier in July, IMDA partnered with Google to launch the privacy-enhancing technologies (PET) x Privacy Sandbox to support businesses that wish to pilot PET projects.

The Singapore government had also launched two sandboxes, one for exclusive use by government agencies and the other for enterprises to develop and test Gen AI applications.

© iTnews Asia