DeepMind & Databricks: Revolutionizing AI with Kaggle Game Arena

OpenAI released GPT-5 to the public during the first week of August 2025, one of the biggest AI news stories of the year. Meanwhile, in the chess world, Google quietly partnered with world No. 1 Magnus Carlsen to launch Kaggle Game Arena, a new benchmarking platform where AI models and agents compete head-to-head in a variety of strategic games to help chart new frontiers for trustworthy AI evaluation. The code is open source: https://github.com/google-deepmind/game_arena/

Kaggle Game Arena Launch

Onboarding Game Arena to Databricks

Google released the source code, but it is not easy to manage the infrastructure needed to run tournaments, to review both the results and the AI reasoning behind each move, and to manage the various AI models.

Fortunately, Databricks has all the components readily available:

Mosaic AI Gateway: lets us register external models and manage their performance and API keys
Foundation Model API: comes with Llama, Claude and gpt-oss models out of the box
Managed MLflow: experiment tracking and agent tracing
Databricks Apps: securely hosts the interface for Game Arena
Databricks SQL: for post-game analysis
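
To make the split between the first two components concrete, here is a minimal sketch of a lookup that sends a model name to a Foundation Model API endpoint when one exists and to an AI Gateway endpoint for an external model otherwise. The endpoint names, the `Route` class and `resolve_endpoint` are hypothetical and only illustrate the idea; they are not taken from the Game Arena repository or from a real workspace.

```python
from dataclasses import dataclass

# Hypothetical endpoint names, for illustration only. Check the Serving page
# of your own workspace for the endpoints that actually exist there.
FOUNDATION_ENDPOINTS = {
    "llama": "example-foundation-llama",
    "claude": "example-foundation-claude",
}
GATEWAY_ENDPOINTS = {
    "gpt-5": "example-gateway-gpt-5",
}


@dataclass(frozen=True)
class Route:
    endpoint: str  # serving endpoint name to query
    source: str    # "foundation_model_api" or "ai_gateway"


def resolve_endpoint(model, foundation=FOUNDATION_ENDPOINTS, gateway=GATEWAY_ENDPOINTS):
    """Prefer a built-in Foundation Model API endpoint, else fall back to AI Gateway."""
    key = model.lower()
    if key in foundation:
        return Route(foundation[key], "foundation_model_api")
    if key in gateway:
        return Route(gateway[key], "ai_gateway")
    raise KeyError(f"no serving endpoint registered for model {model!r}")


if __name__ == "__main__":
    for name in ("Llama", "gpt-5"):
        print(name, "->", resolve_endpoint(name))
```

Keeping the lookup in one place means the game loop only deals with model names, while API keys stay with the gateway configuration.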

Vibe coding with GPT-5 in GitHub Copilot

With so much hype around GPT-5, let's see whether it lives up to expectations. We start with the following prompt:

Can you migrate the code so that it uses Databricks’ features

Foundation model API if available
AI Gateway that encapsulates external models
MLflow tracing: https://mlflow.org/docs/latest/genai/tracing/ for agent chain of thoughts, pls create one experiment per game, aka no autolog()
chess move logging in databricks sql, along with the mlflow experiment id and a unique gameid, format is one game per PGN per row
create a databricks app to host game areana (https://docs.databricks.com/aws/en/dev-tools/databricks-apps/get-started)
create a UI wrapper on top of game areana for parameters selection
Allow tournament setup in UI
Allow live boardcast in lichess: https://github.com/lichess-org/broadcaster
For each round, display a game bracket in the UI: https://www.kaggle.com/benchmarks/kaggle/chess-text/tournament
Update ReadMe for Databricks Apps deployment instructions
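
The logging request in the prompt (one game per row, stored as PGN, together with the MLflow experiment ID and a unique game ID) could be pictured roughly as follows. This is a minimal sketch under stated assumptions: the function names, the column names and the event label are hypothetical, the header uses only a subset of the standard PGN tags, and the moves are not validated.

```python
import uuid
from datetime import date


def moves_to_pgn(san_moves, white, black, result="*", played_on=None):
    """Render a list of SAN moves as one PGN string (headers plus movetext)."""
    played_on = played_on or date.today()
    headers = [
        ("Event", "Game Arena on Databricks"),  # hypothetical label
        ("Date", played_on.strftime("%Y.%m.%d")),
        ("White", white),
        ("Black", black),
        ("Result", result),
    ]
    header_text = "\n".join(f'[{tag} "{value}"]' for tag, value in headers)
    parts = []
    for ply, san in enumerate(san_moves):
        if ply % 2 == 0:  # White's move starts a new move number
            parts.append(f"{ply // 2 + 1}.")
        parts.append(san)
    parts.append(result)  # PGN movetext ends with the game result
    return f"{header_text}\n\n{' '.join(parts)}"


def build_game_row(experiment_id, san_moves, white, black, result="*"):
    """Build one table row per game; the column names are assumptions."""
    return {
        "game_id": str(uuid.uuid4()),
        "mlflow_experiment_id": experiment_id,
        "white_model": white,
        "black_model": black,
        "result": result,
        "pgn": moves_to_pgn(san_moves, white, black, result),
    }


if __name__ == "__main__":
    row = build_game_row("123", ["e4", "e5", "Nf3", "Nc6"], "model-a", "model-b", "1/2-1/2")
    print(row["pgn"])
```

A row like this could then be appended to a table and queried from Databricks SQL; the write itself is omitted because it depends on how the app connects to the warehouse.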

We still need to understand code and Databricks

I’d say GPT-5 created about 50% of the code in no time, which was a great start, but the remainder required Databricks expertise to get working.

Let's start by examining what GPT-5 did correctly:

It understood Game Arena and was able to write a wrapper around it.
It created a very barebones UI that works but is not impressive.
It understood how to use MLflow and set up the code correctly.
It updated the documentation for me, although it follows the flavor of Game Arena (we will host it on Databricks Apps, so it’s a little different).

Now where do we go from here?

It was a good start because I very quickly understood what needed to change in the codebase, so I didn’t have to spend a lot of time learning Game Arena. Unfortunately, I can only dream of driving this from start to finish in plain English. Perhaps GPT-5 still needs to read some Databricks books to understand how things are done.

Below are the things that I had to teach GPT-5:

We need to prioritize the Foundation Model API endpoints and use the Databricks SDK to query them.
We need to create a special class in Game Arena (model_generation_sdk) to handle all Databricks requests.
Logging the thoughts into managed MLflow is the trickiest part because it involves both the client side and the server side. It requires a thorough understanding of automatic versus manual tracing. In this case, I opted for the decorator approach.
GPT-5 does not understand the folder structure or the app.yml file for Databricks Apps. We need to provide the documentation.
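
The decorator approach to manual tracing mentioned above can be sketched as follows. This is a minimal illustration, not the code from the repository: `choose_move`, `start_game_experiment` and the experiment naming scheme are hypothetical, and the model call is replaced by a trivial stand-in. `mlflow.trace` and `mlflow.set_experiment` are the real MLflow calls; the fallback branch exists only so the sketch runs where a recent MLflow is not installed.

```python
try:
    import mlflow

    trace = mlflow.trace  # decorator that records each call as a span in a trace
except (ImportError, AttributeError):
    # Fallback so the sketch runs without a recent MLflow: a no-op decorator
    # that supports both @trace and @trace(...).
    mlflow = None

    def trace(func=None, **_kwargs):
        if func is None:
            return lambda wrapped: wrapped
        return func


def start_game_experiment(game_id):
    """Create or select one experiment per game; returns its ID (None without MLflow)."""
    if mlflow is None:
        return None
    # The naming scheme is an assumption. On Databricks, experiment names are
    # typically workspace paths rather than bare names.
    return mlflow.set_experiment(f"game_arena_{game_id}").experiment_id


@trace(name="choose_move", span_type="LLM")
def choose_move(fen, legal_moves):
    """Hypothetical stand-in for the model call; a real agent would query an endpoint."""
    move = sorted(legal_moves)[0]
    return {
        "move": move,
        "rationale": f"picked {move} out of {len(legal_moves)} legal moves",
    }


if __name__ == "__main__":
    start_game_experiment("demo")
    print(choose_move("startpos", ["e2e4", "d2d4"]))
```

With the decorator, the function's inputs and return value are captured on the span, so the rationale returned by the model ends up in the trace without any logging code inside the game loop.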

Finally, after some back and forth with AI, we can capture the Game Arena output in Databricks:

The results align with the recordings from Game Arena:

We can also replay the game in our Databricks Chess App! The possibilities are endless!

The source code can be found on GitHub:

https://github.com/rwforest/game_arena/tree/databricks_apps

Author: Jason Yip, Director of Data and AI, Tredence Inc.
