Exercise 3 moves to the second dataset, MATH, which holds benchmark results on MATH Level 5 — high-school competition math — for a long list of AI models with their release dates. The three subparts build toward the hardest LAMBDA in the exam: 3a asks you to find the top-performing model in each quarter of a year (or range of years), returning a dynamic array of year / quarter / model / score; 3b asks you to make the year arguments optional so FINDBEST() with no arguments queries the whole dataset; 3c asks you to plot score against date and write a paragraph about what the pattern means. The LAMBDA in 3a is where most students slow down, because it has two things happening at once; a helper function that does the per-quarter work, and an outer function that sweeps over all the years and quarters and assembles the output. Build the helper first, confirm it works on one quarter, and only then wrap it in the outer loop.

Quick orientation to the MATH table

MATH has four columns: id, model, score, date. id is a long string like o1-mini-2024-09-12_high; model is the friendlier name like o1-mini; score is a fraction between 0 and 1; the proportion of problems solved correctly; date is the release date as an Excel date value. One row per model release. (Solutions has date and score swapped; see note at 3c.) The table spans from mid-2023 (earliest entry is gpt-4-0613 from June 2023) through early May 2025, with 95 rows. Several models show up multiple times as they're rerun at different settings (Claude 3.7 Sonnet has entries for 16K, 32K, and 64K thinking budgets, for example); each gets its own row with its own score.

You'll use every column. date drives the quarter filtering, score is what you aggregate, model is what you report, and id is uninteresting for this exercise.

Excel Labs: same reminder as before; the LAMBDAs in 3a and 3b live in Excel Labs under the names FINDBEST and FINDBEST_. Lecture 4 introduces the tool; Lecture 6 introduces the LET-and-optional-arguments idiom that 3b relies on.

The shape of 3a, before any code

Try it first.Given a year and a quarter, how would you find the best-performing model? Write the steps in plain English before looking at the LAMBDA.

Building the helper first — FINDBEST_

The official solution splits the work into two named LAMBDAs: FINDBEST_ (with a trailing underscore) does the per-quarter work, and FINDBEST wraps it with a sweep over all quarters. The trailing-underscore convention is just a name — it says "this is a helper, normally called from inside FINDBEST, but you can still call it yourself for testing".

Excel
FINDBEST_ = LAMBDA(year, quarter,
    LET(
        start,   DATE(year, (quarter - 1) * 3 + 1, 1),
        end,     EOMONTH(start, 2),

        dates,   MATH[date],
        models,  MATH[model],
        scores,  MATH[score],

        mask,            (dates >= start) * (dates <= end),
        filtered_scores, IFERROR(FILTER(scores, mask), NA()),
        filtered_models, IFERROR(FILTER(models, mask), NA()),

        has_data,   AND(COUNTA(filtered_scores) > 0,
                        ISNUMBER(SUM(filtered_scores))),
        max_score,  IF(has_data, MAX(filtered_scores), NA()),
        best_model, IF(has_data,
                       XLOOKUP(max_score, filtered_scores, filtered_models),
                       "no data"),
        score_out,  IF(has_data, max_score, "no data"),

        HSTACK(year, quarter, best_model, score_out)
    )
)

That is the longest LAMBDA in the exam, so take it in pieces.

The date boundaries. DATE(year, month, day) constructs an Excel date value from three integers. DATE(2024, 7, 15) is the date value for July 15, 2024. The expression (quarter - 1) * 3 + 1 turns a quarter number into the starting month: Q1 → 1, Q2 → 4, Q3 → 7, Q4 → 10. So start is the first day of the quarter.

EOMONTH(start_date, months_to_add) returns the last day of the month that is months_to_add months away from start_date. EOMONTH("2024-01-01", 2) is "2024-03-31" — start of January plus two months lands in March, and EOMONTH snaps to the last day of that month. So end is the last day of the third month of the quarter — i.e., the end of the quarter. That's what the hint was getting at.

The column aliases. dates, models, scores are just shorthand for the three MATH columns. LET lets you name them once and reuse them.

The filter. mask is a boolean column: TRUE on rows whose date falls between start and end inclusive. The (a >= b) * (c <= d) shape is the same boolean-multiplication-as-AND idiom you saw in Exercise 1 (1f, 1g). filtered_scores and filtered_models apply the mask to the score and model columns respectively, and wrap the result in IFERROR(..., NA()) for the case where zero rows match — FILTER errors if nothing passes, and IFERROR swaps the error for an #N/A marker.

The has-data check. Two conditions wrapped in AND. The real work is done by ISNUMBER(SUM(filtered_scores)): if filtered_scores is #N/A (i.e., IFERROR replaced a no-match FILTER error), then SUM(#N/A) propagates the error, and ISNUMBER(#N/A) is FALSE, so the quarter is correctly flagged as having no data. The COUNTA(filtered_scores) > 0 check is mostly defensive — COUNTA counts any non-empty cell, including error cells, so it returns 1 even when filtered_scores is just a single #N/A. So COUNTA > 0 is essentially always TRUE in this design; it doesn't change the outcome but it's a belt-and-braces sanity check that the array isn't somehow empty.

The outputs. max_score is the best score in the quarter, or #N/A if no data. best_model is the model that achieved it, found with XLOOKUP(lookup_value, lookup_array, return_array) — look up max_score inside filtered_scores, return the matching entry of filtered_models. If there's no data, it becomes "no data". score_out mirrors the score: real value if data, text "no data" otherwise. The reason score_out and max_score are separate is that max_score has to stay numeric for the XLOOKUP, while the reported score can be text.

The assembly. HSTACK(year, quarter, best_model, score_out) places the four outputs in a one-row, four-column block. That's what FINDBEST_(year, quarter) returns: a single row with four cells.

Save this as FINDBEST_ in Excel Labs, then test it: =FINDBEST_(2024, 3) should spill 2024 | 3 | o1-mini | 0.8918; the highest score in July–September 2024 belongs to the high-effort o1-mini reasoning model (id o1-mini-2024-09-12_high) released 12 September 2024. If you get "no data" back in both slots, check the quarter mapping — Q3 2024 should map to July–September 2024, not the wrong months.

Common mistake: forgetting the -1 in (quarter - 1) * 3 + 1. Writing quarter * 3 + 1 maps Q1 to 4 (April), not January. Read your DATE(year, ..., 1) back and sanity-check the month.

Another easy slip: EOMONTH(start, 3) instead of EOMONTH(start, 2). Adding 3 months from January 1 lands in April — you'd include April in Q1, which is wrong. You want 2 months, so the quarter ends at the last day of March.

Lecture tie-in. LET with many locals is Lecture 6 (06lecturefilled.xlsx sheet "Local variables with LET"). For FILTER with a boolean mask, 03lecturefilled.xlsx sheet "6 UNIQUE and FILTER" has the core examples, and Solutions.LARGEFILTERBY in 04exercises.xlsx is a worked parallel. For XLOOKUP, see 03lecturefilled.xlsx sheet "4 The LOOKUP functions". The AND(COUNTA(...) > 0, ISNUMBER(SUM(...))) has-data pattern isn't explicitly taught in lecture, but it's a direct descendant of the missing-value guards in Solutions.IFBLANK (04exercises.xlsx) — same idea of checking emptiness before using a value.

Wrapping with FINDBESTMAKEARRAY over quarters

The outer function loops over every quarter in the year range and stacks the results, with a header row on top.

Excel
FINDBEST = LAMBDA([start_year], [end_year],
    LET(
        years, YEAR(MATH[date]),
        start, IF(ISOMITTED(start_year), MIN(years), start_year),
        end_,  IF(ISOMITTED(end_year),   MAX(years), end_year),

        n, (end_ - start + 1) * 4,

        results, MAKEARRAY(n, 4,
            LAMBDA(i, j,
                LET(
                    q_index, i - 1,
                    year,    start + INT(q_index / 4),
                    quarter, MOD(q_index, 4) + 1,
                    best,    FINDBEST_(year, quarter),
                    INDEX(best, j)
                )
            )
        ),

        VSTACK({"Year","Quarter","Model","Score"}, results)
    )
)

Walk it top to bottom.

Resolving the year range. years is the column of years from the MATH table's date column. start is start_year if supplied, otherwise the smallest year in the dataset. end_ is end_year if supplied, otherwise the largest. ISOMITTED(arg) returns TRUE if the function was called without that argument — that's the whole 3b optionality trick, which we'll come back to below.

start and end_ are renamed to avoid collision with Excel's internal keyword end. Some Excel versions let you use end as a local name; others don't. Picking end_ with a trailing underscore sidesteps the question.

Counting total quarters. n is the total number of quarters in the range: years times four. (2025 - 2023 + 1) * 4 = 12, so for years 2023-2025 you build a 12-row output.

The core loop: MAKEARRAY. MAKEARRAY(rows, cols, LAMBDA(i, j, body)) builds a dynamic array with rows rows and cols columns by calling the supplied two-argument LAMBDA for each (i, j) position. i is the row number (1 through rows), j is the column number (1 through cols). Here rows = n (total quarters) and cols = 4 (year, quarter, model, score).

                 j=1     j=2       j=3        j=4
                 year  quarter    model      score          q_index  year  quarter
               +------+---------+----------+----------+
         i=1   | 2023 |    1    | no data  | no data  |        0    2023     1
         i=2   | 2023 |    2    | GPT-4    | 0.230    |        1    2023     2
         i=3   | 2023 |    3    | Claude 2 | 0.117    |        2    2023     3
         ...
         i=12  | 2025 |    4    | no data  | no data  |       11    2025     4
               +------+---------+----------+----------+

For each (i, j), the inner LET does: - q_index = i - 1; a zero-based quarter index, running 0, 1, 2, ..., n-1. - year = start + INT(q_index / 4) — integer division of q_index by 4 gives the year offset from start. Quarters 0-3 are year 0, 4-7 are year 1, etc. INT(x) rounds toward negative infinity, which for non-negative integers is just truncation. - quarter = MOD(q_index, 4) + 1 — modulo 4 plus 1 maps 0, 1, 2, 3 to 1, 2, 3, 4. So the first quarter in each year is Q1, as expected. - best = FINDBEST_(year, quarter); a one-row, four-column result from the helper. - INDEX(best, j) — pick out column j of the helper's result. j = 1 is year, j = 2 is quarter, j = 3 is model, j = 4 is score.

MAKEARRAY repeats that per-cell LAMBDA n * 4 times and assembles an n-by-4 block. For start = 2023, end_ = 2025, you get 12 rows × 4 columns.

The header. VSTACK(a, b, ...) stacks arrays vertically; VSTACK({"Year","Quarter","Model","Score"}, results) puts a one-row header on top of the 12-row body. The {...} literal is Excel's array-literal syntax — comma separates columns inside a row, semicolon separates rows. That's the header row you want.

Save this as FINDBEST in Excel Labs. Then =FINDBEST(2023); a call with only the first argument — spills a 13-row, 4-column block: a header plus all 12 quarters from Q1 2023 through Q4 2025. That happens because end_year is omitted and ISOMITTED defaults it to MAX(years) = 2025 (the latest year in the MATH dataset), so the function covers three years rather than just one. The first five rows match the exam's expected example output for FINDBEST(2023):

Year  Quarter  Model     Score
2023  1        no data   no data
2023  2        GPT-4     0.230
2023  3        Claude 2  0.117
2023  4        GPT-4     0.400

Those numbers are in the exam spec — use them as your sanity check for the 2023 portion of the output. Rows 6–13 carry on through 2024 and 2025; the exam example only shows 2023 because part (a) of the question imagines a simpler version of the function that returned just the year you asked for. The official (b) version with ISOMITTED does more, and that's the version saved in the solution workbook.

One efficiency note people sometimes raise: every MAKEARRAY cell calls FINDBEST_(year, quarter) and picks out one column with INDEX. That means FINDBEST_ gets called four times for the same (year, quarter); once for each column — when it really only needs to be called once. That's the only technical wart in the official solution. Rewriting it to call the helper once per quarter would speed things up, but the exam doesn't penalise you for the direct-translation approach; it's a recognised trade-off.

One thing that trips people up: mixing up INT(q_index / 4) with q_index / 4. The latter gives you 0, 0.25, 0.5, 0.75, 1, 1.25, ...; a float series, and start + 0.25 isn't a valid year. INT (or FLOOR) is the floor toward zero you need.

Second common mistake: hardcoding years in the MAKEARRAY loop, e.g., year = 2023 + INT(q_index / 4). Do that and FINDBEST(2024, 2025) still starts in 2023. Keep the loop driven by start.

Lecture parallel. MAKEARRAY is introduced in 05lecturefilled.xlsx sheet "BYROW, BYCOL, and MAKEARRAY". Open that sheet and read the worked examples carefully. The LAMBDA-inside-LAMBDA nesting you see in MAKEARRAY(n, 4, LAMBDA(i, j, ...)) is the defining shape of MAKEARRAY, so if the outer LAMBDA(i, j, ...) looks weird, that's the sheet to re-read. For VSTACK and array literals, same lecture's sheet "HSTACK and VSTACK". For the LET scaffolding, 06lecturefilled.xlsx sheet "Local variables with LET". Open Solutions.AVERAGEN and Solutions.FOOBAR to see the same name-then-use rhythm.

3b — Optional arguments with ISOMITTED

3b is a small but important modification: FINDBEST should work with no arguments. FINDBEST() must return the same result as FINDBEST(smallest_year_in_data, largest_year_in_data).

Predict.How would you make both arguments optional so that calling FINDBEST() still works? Write the idea in plain English before reading on.

years is the column of years across the whole dataset. MIN(years) is the earliest year present; MAX(years) is the latest. If the caller supplied start_year, use it; if not, fall back to MIN(years). Same for end_year with MAX.

The exam is explicit that you must not hardcode 2023 and 2025 here that would break if the dataset gets updated. Querying the data for its own extremes is the right shape.

Save this as your updated FINDBEST. Now =FINDBEST() with no arguments spills the whole dataset's worth of quarters, and the sanity check is that =FINDBEST() equals =FINDBEST(2023, 2025) (the exam says so directly).

Easy to get wrong: forgetting the square brackets around the optional arguments. If the LAMBDA header says LAMBDA(start_year, end_year, ...) (no brackets), Excel rejects the call as having missing arguments before ISOMITTED ever runs — ISOMITTED only works when the parameters are declared optional with brackets. The definition must be LAMBDA([start_year], [end_year], ...) — bracketed names are optional.

Second common mistake: defaulting only one argument. FINDBEST(2024) should give 2024 to 2025 (the default end year), and FINDBEST() should give 2023 to 2025. You need both IF(ISOMITTED(...), ...) lines.

From the lectures. ISOMITTED and optional LAMBDA arguments are introduced in 06lecturefilled.xlsx sheet "Optional arguments" that is the direct lecture for this subpart. Read Solutions.FOOBAR in that file's Excel Labs; it's a three-argument LAMBDA with two optional arguments defaulted via exactly this pattern. For a longer worked example, read Solutions.AVERAGEN in the same file — it takes an optional largest flag with a TRUE default, same idiom.

3c — Plot score against date, and write about the trend

3c: scatter plot with date on the x-axis and score on the y-axis, then a short paragraph on the pattern and its implications.

Click path

  1. Pick date and score and select them with date on the left. (score is actually left of date in the starter; see below for the fix.)

    But score is left of date in mine — what do I do?

  2. Insert tab → Scatter → first option (dots, no lines).
  3. The y-axis is a fraction from 0 to 1 — no log scale needed here, and log scale would actually be wrong for a fraction. Leave the y-axis linear.
  4. The x-axis should be a date axis, so Excel lays out points by true date position (not just categorically). Right-click the x-axis → Format Axis → Axis type → Date axis (this is usually selected automatically for scatter plots, but worth confirming).
  5. Title and label the axes — "MATH Level 5 score by release date", Date, Score.

What you should see

A roughly linear climb on the linear axis from mid-2023 (scores around 0.03 to 0.40 — Llama-2-70B and Mistral 7B near the floor, GPT-4-0613 around 0.23, gpt-4-1106-preview reaching 0.40 by Q4) to early 2025 (top scores clustering 0.90 to 0.97 — o3-mini-high at 0.965, DeepSeek-R1 at 0.931, Claude 3.7 Sonnet 64K at 0.912), with a lot of spread — different-size models have very different scores but a clear upward trend. Towards the end, the top scores crowd against the ceiling at 1.0 because the benchmark has a maximum possible score.

What to write

Four points to mention in the paragraph. The solution workbook has a model version you can paraphrase.

  1. The trend is roughly linear over two years on the score axis. Scores grow from near-zero to near-1 over about eighteen months, and the rate looks roughly constant on the linear axis.
  2. Current top scores are essentially 1.0. Models like o4-mini-2025-04-16_high and o3-2025-04-16_high score around 0.978; o3-mini-2025-01-31_high scores 0.965, and Gemini 2.5 Pro reaches 0.959. The benchmark ceiling is being hit.
  3. If the trend continued past 1, we'd expect future models to do well on harder math too but since the benchmark caps at 1.0, we can't actually measure further progress with this benchmark.
  4. This is benchmark saturation. When a test maxes out, a harder test is needed to differentiate the next generation of models. The MATH benchmark is already being replaced in industry by successors like Putnam-2024 and FrontierMath.

The official wording from the solution: "The scores have increased roughly linearly since the first model was released. Currently, the top score is approximately 1. This suggests that modern AI models are very strong at solving high school-level math problems. If the trend continues beyond a score of 1, we would expect future models to perform well on more advanced mathematics too. However, since this benchmark measures scores on a scale from 0 to 1, we cannot observe performance beyond that upper limit. This limitation is known as benchmark saturation." Paraphrase rather than copy.

The "which AI company is most focused on mathematical capabilities" question from 3a (asked alongside the FINDBEST(2023) output); the expected answer is OpenAI, which dominates the quarterly best-in-quarter leaderboard with GPT-4 (Q2 and Q4 2023), o1-mini (Q3 2024), o1 (Q4 2024), o3-mini (Q1 2025), and o4-mini (Q2 2025) — six of the nine quarters in the dataset. Anthropic also tops three quarters (Claude 2 in Q3 2023, Claude 3 Opus in Q1 2024, Claude 3.5 Sonnet in Q2 2024). Google and DeepSeek post strong individual scores (Gemini 2.5 Pro at 0.959, DeepSeek-R1 at 0.931) but don't win any quarter.

Watch out: using a line chart instead of scatter. Line chart connects dots, which looks like you're claiming a continuous function of time — not what you want when models are discrete releases. Scatter is correct.

Second common mistake: putting log scale on the score axis. Scores are fractions; log(0.1) is -1, log(0.9) is -0.046. A log axis squeezes high scores together and stretches low scores. On a linear axis 0 to 1 the visual spacing matches the quantities, which is what you want.

Where this came from. Chart construction is in Part 2 of the course; the chart lecture 02Charts.pdf and its companion workbook 02Charts.xlsx. For scatter plots with date x-axes specifically, 02Charts.xlsx sheet "2-AxesScales" has the worked examples; sheet "1-BadCharts" has the cautionary cases. The benchmark-saturation commentary isn't formally taught but it's the reason Jonas and Adam keep emphasising that the choice of benchmark matters; an observation worth making explicit in your write-up.

If this still feels shaky

If the MAKEARRAY construction is the block, stop working on FINDBEST and open 05lecturefilled.xlsx sheet "BYROW, BYCOL, and MAKEARRAY". Work through each worked example in Excel Labs until you can predict what a 3x3 MAKEARRAY(3, 3, LAMBDA(i, j, i * 10 + j)) call produces. Then come back to FINDBEST.

If the helper function FINDBEST_ errors on a particular quarter, test it in isolation: =FINDBEST_(2023, 1) should return {2023, 1, "no data", "no data"} (there's no 2023 Q1 data in MATH). =FINDBEST_(2024, 3) should give a real model and real score. Getting each quarter right in isolation is much easier than debugging it through the MAKEARRAY wrapper.

If ISOMITTED is the block, open 06lecturefilled.xlsx sheet "Optional arguments" and read Solutions.FOOBAR and Solutions.AVERAGEN. Both are two- or three-argument LAMBDAs that fill in defaults for missing arguments exactly the way 3b wants.

Glossary

Array literal

A constant array written inline, e.g., {"Year","Quarter","Model","Score"}. Commas separate columns inside a row; semicolons separate rows. Used in VSTACK inside FINDBEST.

COUNTA(range)

Counts non-empty cells in range, including cells containing errors (COUNTA(NA()) is 1, not 0). Used in FINDBEST_'s has_data check as a defensive sanity check; the real work of distinguishing "real scores" from "no data" is done by the ISNUMBER(SUM(...)) clause that sits next to it inside the AND.

DATE(year, month, day)

Constructs an Excel date value from three integers. Used to build the start-of-quarter date in FINDBEST_.

Date axis

A chart axis type that lays out values by real date position, not by ordered position in the data. The right setting for scatter plots with date x-axes.

Dynamic array

A formula result that spills into a rectangular block of cells. FINDBEST returns a dynamic array; the top-left cell holds the formula, the rest of the block is filled in by Excel.

EOMONTH(date, months)

Returns the last day of the month that is months months after date. EOMONTH("2024-01-01", 2) is "2024-03-31". Used to compute the end-of-quarter date.

Excel Labs

The Microsoft Research add-in that provides the Advanced Formula Environment; the pane where LAMBDAs are saved by name. Covered in Lecture 4.

FILTER(array, keep_condition)

Returns rows of array where keep_condition is TRUE. Used to restrict MATH to one quarter's rows.

HSTACK(a, b, ...)

Stacks arrays horizontally. Used at the end of FINDBEST_ to combine year, quarter, model, score into one row.

IFERROR(value, alt)

If value is any error, return alt instead. Used to handle quarters with no matching data by substituting NA().

INDEX(array, row, column)

Returns the single cell at (row, column) of array. Used in FINDBEST to pull one column from the helper's four-column result.

INT(x)

Rounds x toward negative infinity. For non-negative values it's the same as truncation. Used in FINDBEST to convert a zero-based quarter index into a year offset.

ISOMITTED(arg)

Returns TRUE if the LAMBDA was called without arg. Only valid on arguments declared optional with [brackets] in the LAMBDA header. The central trick of 3b.

LAMBDA

A reusable function, defined with LAMBDA(args, body) and saved by name in Excel Labs. Optional arguments are written LAMBDA([arg1], [arg2], ...).

LET(name1, val1, ..., final)

Binds local names to values. Doesn't change results; makes long formulas readable. FINDBEST_ alone uses twelve locals; FINDBEST adds another nine across its outer and inner LETs.

MAKEARRAY(rows, cols, LAMBDA(i, j, body))

Builds a rows-by-cols array by calling the two-argument LAMBDA once per cell. i is the row index (1-based), j is the column index. The sweep-and-assemble primitive behind FINDBEST.

MATH

The Excel table on the MATH sheet. Columns: id, model, date, score. One row per model release with its MATH Level 5 benchmark score.

MAX(range) / MIN(range)

Largest / smallest numeric value. Used in FINDBEST to compute default year bounds and in FINDBEST_ to find the best score.

MOD(n, divisor)

Remainder of n / divisor. MOD(5, 4) is 1. Used in FINDBEST to convert a zero-based index into a quarter number 1-4.

NA()

Produces the #N/A error deliberately. Used to mark "no data" in FINDBEST_.

Scatter plot

A chart type where each point sits at its real (x, y) coordinate with no connecting line. Used in 3c with date on x and score on y.

SUM(range)

Adds numeric values in range. Used inside the has_data check to detect whether the filtered scores are real numbers (summing #N/A gives #N/A).

Trailing-underscore helper

A naming convention: FINDBEST_ is the helper called by FINDBEST. Nothing special to Excel, just a hint to readers.

VSTACK(a, b, ...)

Stacks arrays vertically. Used at the end of FINDBEST to place the header row above the body.

XLOOKUP(lookup, in_array, return_array)

Finds lookup in in_array and returns the corresponding entry from return_array. Used in FINDBEST_ to map the max score back to its model.

YEAR(date)

Returns the four-digit year of an Excel date. Applied to MATH[date] in FINDBEST to compute the dataset's year range when start_year or end_year is omitted.