
GSOpt: Human-in-the-loop Batched Optimization

Contents

  1. Overview
  2. Google Sheet Usage
  3. Backend Architecture
  4. The Optimizer
  5. Pros and Cons
  6. Recommendations and Examples
  7. Benchmark Test Results


Overview

Optimizing real-world engineering problems is often challenging. When the function you want to optimize is expensive, noisy, and lacks a simple mathematical form (a “black box” with no derivative information), many traditional methods fall short. GSOpt targets exactly this class of problem by implementing a human-in-the-loop, batched-submission optimization workflow for noisy black-box problems. The whole process is orchestrated through a Google Sheet, making it easy for engineers to use.

The basic workflow is:

  1. The optimizer suggests a batch of initial points based on the parameter settings.
  2. You run the experiments and copy the results into the Google Sheet.
  3. You ask the optimizer for a new batch of promising points, using the analysis plots to track the optimization's progress. Repeat as needed.

The Google Sheet acts as the central hub, storing all experimental data, providing basic analysis plots, and serving as the user interface for interacting with the optimization engine. All the optimization code and macros are in this repo: https://github.com/PaulENorman/gsopt.

Google Sheet Usage

Make a Copy of the Sheet

To use GSOpt, first save a copy of the template to your own Google Drive.

(Screenshot: Make a Copy)

Opening the Sidebar

To begin, click Extensions → GSOpt → Open Sidebar from the top menu to open the optimization sidebar.

(Screenshot: Top Menu)

Sheet Tabs Overview

(Screenshot: Sheet Tabs)

The workbook contains three main tabs: Data, Analysis, and Parameter Settings.

Optimizer Settings

(Screenshot: Optimizer Settings)

The sidebar provides controls for configuring the optimization algorithm, including the surrogate model (regressor) and the acquisition function.

For recommendations on which settings to use, see Recommendations and Examples.

Optimizer Controls

(Screenshot: Optimizer Controls)

Use the sidebar buttons to control the optimization process: Initialize generates the first batch of points to test, and Ask requests a new batch based on the data entered so far.

Entering Data After Initialization

(Screenshot: Data Post-Initialization)

After clicking Initialize, the optimizer populates the Data sheet with an initial batch of parameter combinations to test. After running your experiments, enter the results in the Objective column. The in-sheet charts on the Analysis tab will update automatically.

Entering Data After Ask

(Screenshot: Data Post-Ask)

After clicking Ask, new parameter combinations are appended to the Data sheet. Run your experiments and enter the objective values. Repeat the Ask → Experiment → Enter Data cycle to continue the optimization.

Convergence Plot

(Screenshot: Convergence Plot)

Shows the best objective value found so far at each iteration. For minimization problems, this curve should generally decrease as better solutions are found.

Evaluations Matrix

(Screenshot: Evaluations Matrix)

Visualizes the sampling locations in parameter space. Over time, you should see clustering around promising regions.

Objective Partial Dependence

(Screenshot: Partial Dependence Plots)

Shows how the modeled objective varies with each parameter while marginalizing over the others. See the scikit-optimize documentation for details.

Parallel Coordinate Plots

(Screenshot: Parallel Coordinate Plots)

Provides a multi-axis view of the parameters and objective values.

Parameter Settings

(Screenshot: Parameter Settings)

Configure your optimization problem on this tab, including the parameter names and the objective.

When you update names or the objective in Parameter Settings, the headers in the Data sheet update automatically. Charts on the Analysis tab refresh when objective values are edited.

Backend Architecture

The GSOpt backend is a lightweight Python web service built with Flask. It is designed to be stateless, which makes it robust and easy to deploy on serverless platforms like Google Cloud Run. For each API call, the optimizer is rebuilt from the settings and data provided in the request.

The key components are:

The Optimizer

The optimization is powered by scikit-optimize, a robust and popular library for sequential model-based optimization. GSOpt uses its Bayesian optimizer to navigate the search space intelligently. You can find more details about the library on the scikit-optimize website.

Bayesian Optimization with scikit-optimize

Bayesian optimization is an efficient strategy for finding the maximum or minimum of black-box functions. It works by building a probabilistic model of the objective function (the “surrogate model”) and using that model to select the most promising points to evaluate next. This approach is effective for problems where each function evaluation is costly (e.g., time-consuming experiments or expensive computations).

In GSOpt, you can configure the scikit-optimize backend by choosing the surrogate model (regressor) and the acquisition function.

Pros and Cons

Pros:

Cons:

Recommendations and Examples

Based on our testing, we recommend starting with the SKOPT-GP (Gaussian Process) regressor and the gp_hedge acquisition function, which performed well across our benchmarks. See the benchmark results in the next section for details.

Benchmark Test Results

The plots below show the performance of different optimizer configurations on standard benchmark functions. These tests were run using the evaluate.py script in the repository.

Test Configuration

The results were generated with the following settings to simulate a realistic use case:

Regressor Performance

The following plot compares the performance of different surrogate models (regressors) on the Rosenbrock function, a classic difficult non-convex problem. All optimizers used the gp_hedge acquisition function. The SKOPT-GP (Gaussian Process) model consistently finds a better solution faster than the tree-based methods.

(Plot: Rosenbrock Regressor Comparison)

Acquisition Function Performance

This plot compares different acquisition functions for the SKOPT-GP optimizer on the Ackley function, which has many local minima. The gp_hedge strategy shows strong, consistent performance. LCB with a high kappa (k=4.0) also explores effectively, while LCB with a low kappa (k=0.5) exploits more and converges more slowly on this particular problem.

(Plot: Ackley Acquisition Function Comparison)