Exercise 2 — Charts, log axes, and FLOP-by-year

Exercise 2 pivots from pure LAMBDA work into charts. It still uses the AIMODEL table from Exercise 1, but now you're plotting three scatter charts (subparts a and b), writing a paragraph or two about what each plot shows, then building two more LAMBDAs (MAXFLOPS and TOTALFLOPS) that aggregate FLOP by year, and finally charting those and tying them to NVIDIA's revenue. The four subparts form a natural progression: look at the raw data, spot the exponential trend, cut the data by year, then reason about what the trend implies about real-world spending. Written commentary is a first-class deliverable here — there are no special marks for a slick chart if you say nothing about it.

Quick orientation to the data

The AIMODEL table has eight columns; Exercise 2 uses four of them. Date is each model's release date as a real Excel date value (so YEAR(date) gives the calendar year). FLOP is the training compute in floating-point operations — huge numbers, typically 1E+23 to 5E+26. Tokens is how much training data the model saw (text for language models, frames for video, etc.), also huge (most language models sit roughly between 1E+12 and 4E+13, with the rare video model going much higher — Cosmos-1.0-Diffusion-14B Video2World hits 9E+15). Parameters is the model size — usually 1E+9 to 1E+12.

Two things you already noticed in Exercise 1 will bite you again here. Some rows have blanks for FLOP, Tokens, or Parameters. Excel's chart engine skips genuinely blank cells, so they don't break the chart; the rows that do break it are the ones where you accidentally left a 0 in place of a missing value (the spurious zeros from 1a's IF/0 confusion). log(0) is undefined, so a 0-valued point silently vanishes on a log axis; the row appears in your data but no dot shows up, which can look like it's plotted at the bottom but actually isn't there at all. Second, every column is an Excel table column, which means AIMODEL[FLOP] refers to exactly those values — you don't need to drag or hardcode ranges.

Because this is the second report, Excel Labs gets only a one-line reminder here: you still need it; the two LAMBDAs in 2c (MAXFLOPS and TOTALFLOPS) should be saved there under those names. Lecture 4 introduced the tool, and Solutions.* LAMBDAs inside every lecture exercise file from 4 onward are worked parallels worth opening when something feels unfamiliar.

2a — Scatter plot of FLOP vs. date with a log y-axis

The exam asks for a scatter plot with release date on the x-axis and FLOP count on the y-axis. FLOP counts span from roughly 1E+23 to almost 5E+26 — three to four orders of magnitude. Plotted on a normal linear y-axis, the largest model towers over the page and every smaller model hugs the x-axis in an unreadable smear. A logarithmic axis — one where each tick mark multiplies by 10 instead of adding a constant — compresses that range into a readable plot.

Predict.If FLOP values range from 1E+23 to almost 5E+26, what happens to the smaller values when you plot them on a linear y-axis?

Click path for the chart

Select the two columns of interest. Click the Date header of AIMODEL, then Ctrl+click the FLOP header — this selects both columns including the header cells, which Excel uses for axis labels.
Insert tab → Scatter chart group (it's the one with the dots) → pick the first option, "Scatter" (just dots, no connecting lines). Excel drops a chart into the sheet with Date on the x-axis and FLOP on the y-axis.
Right-click the vertical axis (the y-axis on the left, where the huge FLOP numbers are) → Format Axis. A pane slides in from the right.
In the Format Axis pane, tick the Logarithmic scale checkbox. Leave the base as 10.
Click the chart title and retype it to something descriptive like "Training FLOPs by release date". Click the axis labels and make sure they say Date and FLOP.

That's it for the load-bearing work. Cosmetic polish — colours, marker size, gridlines, removing the empty legend — is nice but doesn't earn marks. The load-bearing bits are the chart type (scatter, not line), the axis type (log, base 10, on FLOP), and which column is which.

What you should see

Dots scatter in a wide vertical band each year, with the band's upper edge climbing year-over-year — from 1E+23 to a few times 1E+24 in 2022, all the way up to 4.6E+26 for the Grok-3 outlier in 2025. The lower edge stays near 1E+23 throughout: even in 2025, smaller models like Evo 2 and Qwen3-0.6B sit at the floor. On the log axis the upper edge looks roughly linear; a straight-line shape on a log axis means exponential growth in the underlying numbers.

What to write (two or three sentences is enough)

The solution workbook makes roughly these points: the maximal FLOP grows linearly on the log axis (so the underlying trend is exponential), there's a floor somewhere around 1E+23 that holds throughout the dataset, and each year contains a wide spread of models — not just frontier ones like Grok-3, but also smaller ones near the floor. Two or three sentences in your own words capturing those three points is enough.

Common mistake: confusing a scatter plot with a line chart. Line charts connect consecutive points with lines, which implies continuity — fine for evenly-spaced monthly data (you'll see this in Task 3), wrong for discrete model releases at uneven dates. Scatter just places each point at its (date, value) without connecting them.

Another easy slip: forgetting the log step and writing commentary about a linear plot where everything looks like zero. If your plot is readable only for the top few models, the log checkbox is what you missed.

Lecture tie-in. Charts are covered in Part 2 of the course (Adam Lee's half — PivotTables, Charts, Programming, TypeScript, ExcelScripts). The direct lecture reference is 02Charts.pdf and its companion workbook 02Charts.xlsx. For scatter plots with log y-axes specifically, sheet "2-AxesScales" has the worked examples; sheet "1-BadCharts" shows what goes wrong without log scaling. Part 1 of the course (Jonas Moss's LAMBDA lectures 01-07 lecturefilled.xlsx) doesn't cover charts directly.

2b — The same idea, twice: tokens vs. date and parameters vs. date

This is mechanical repetition with two different y-columns. Repeat the click path exactly — select Date + Tokens, insert scatter, log the y-axis; then Date + Parameters, insert scatter, log the y-axis. You end up with three scatter plots total (2a's FLOP plot, plus tokens and parameters).

What you should see

Tokens: roughly linear trend on the log axis again, but noisier than FLOP. A big outlier sits near 1E+16 in January 2025 — that's Cosmos-1.0-Diffusion-14B Video2World, a robotics/video model from NVIDIA that trained on video frames, where each frame is many tokens by count. Tokens don't grow as dramatically as FLOP year-over-year because model builders have gotten better at squeezing more compute into the same training data (more epochs, better sampling).

Parameters: a much flatter trend on the log axis than FLOP. Most params sit between 1E+10 and 1E+12, with a few high outliers. The line stays flatter because most of the recent big models are mixture-of-experts: you can have a trillion total parameters but only activate a slice of them per query, so the total stops being the meaningful number.

What to write

The solution's commentary is short and to the point: "The number of tokens increase roughly linearly by date on the log-axis, but there are some outliers, especially the model with token count of ~1.00E16, which is Cosmos-1.0-Diffusion-14B Video2World. Perhaps video models use more tokens than text models? The number of parameters used is relatively constant compared to FLOPs, but with some weaker outliers." Again, paraphrase this rather than copy. The substance is: tokens grow-ish with a noisy outlier that is worth naming, and parameters are more stable than you might have expected.

Partial-credit alternative: if three separate scatter plots feel like a lot of clicking, you can combine FLOP, Tokens, Parameters into one scatter plot with three data series. Right-click the chart → Select Data → Add each column as a series against Date. It's less readable but gets you most of the marks because the axes and trends are all visible. Just make sure the log scale still applies; each series shares the y-axis.

Easy to get wrong: worrying about blank Tokens / Parameters cells. Excel's chart engine skips blank entries entirely — they're not plotted at all. (If you'd replaced the blanks with 0 first and then tried to chart on a log axis, log(0) is undefined and those points silently vanish — different mechanism from blanks but the same end result, no dot in the plot.) For this subpart, leave the blanks as blanks and don't filter the data before charting.

Lecture parallel. Same as 2a — nothing new on this one.

2c — `MAXFLOPS` and `TOTALFLOPS`, aggregating FLOP by year

Back to to LAMBDAs. Define two functions:

MAXFLOPS(year); the largest single FLOP count among models released in that calendar year.
TOTALFLOPS(year); the sum of all FLOP counts across models released that calendar year.

Both share the same core pattern: build a boolean mask that says "this row's year matches the argument", apply it to the FLOP column, aggregate.

The official LAMBDAs

Excel

MAXFLOPS = LAMBDA(year,
    LET(
        flops, AIMODEL[FLOP],
        dates, AIMODEL[Date],
        mask,  YEAR(dates) = year,
        MAX(FILTER(flops, mask))
    )
)

TOTALFLOPS = LAMBDA(year,
    LET(
        flops, AIMODEL[FLOP],
        dates, AIMODEL[Date],
        mask,  YEAR(dates) = year,
        SUM(FILTER(flops, mask))
    )
)

Walking the LET locals in dependency order: flops is the whole FLOP column; dates is the whole Date column; mask is a boolean array the same height as the table, TRUE on every row where YEAR(date) = year and FALSE elsewhere. FILTER(flops, mask) keeps only the FLOP values from the matching rows, and then MAX (or SUM) aggregates.

Two functions you haven't met yet in this report:

YEAR(date) takes an Excel date value and returns its four-digit year as a number. YEAR("2024-07-15") is 2024. Applied to a whole column AIMODEL[Date], it returns a column of years.

SUM(range) adds up the numeric values in range. Ignores blanks and text. You've seen it a hundred times; it behaves exactly the same when fed a filtered array.

Save both LAMBDAs in Excel Labs. Then anywhere in the workbook, =MAXFLOPS(2024) gives the largest FLOP count from 2024 releases, and =TOTALFLOPS(2024) gives the sum across all 2024 releases.

The two LAMBDAs differ in exactly one character — MAX becomes SUM. That's a tell that the shared shape is worth pulling out into one helper: "filter FLOP by year, aggregate". If you wanted to, you could collapse them into one parameterised function, but the exam wants two named ones, so give it what it asks for.

Watch out: forgetting that dates is a column of dates, not a column of years. Writing mask, dates = year never matches anything — you'd be comparing a date value like 2024-07-15 to the integer 2024, which is always false. YEAR(dates) = year is the fix.

Second common mistake: applying the mask to dates instead of flops. FILTER(dates, mask) gives you a column of dates in that year — then MAX and SUM still work, but on dates, not FLOPs. It'll return wrong numbers. Make sure you filter the column you want to aggregate.

From the lectures. LET with multiple local variables is the backbone of Lecture 6 (06lecturefilled.xlsx sheet "Local variables with LET" and "Simplify functions with LET"). Open that file and read Solutions.AVERAGEN or Solutions.FOOBAR in Excel Labs — you'll see the same "introduce locals, then do the aggregate" shape. For the YEAR(dates) = year mask-on-a-column pattern, see 03lecturefilled.xlsx sheet "6 UNIQUE and FILTER"; every FILTER(column, column = x) example is the same idea with a different predicate.

2d — The fiscal-year table and the bar charts

2d puts MAXFLOPS and TOTALFLOPS to work. Fill in the table for fiscal years 2021 through 2026 using the functions you just defined, build two bar charts (one for Max FLOPS, one for Total FLOPS), and comment on which of the two correlates more strongly with NVIDIA's revenue. Then interpret the 2026 value.

The fiscal-year mapping; a small trap

NVIDIA's fiscal year ends January 31, so their fiscal 2025 covers calendar-year 2024 (February 2024 through January 2025, mostly). The exam maps model FLOPs to "the following fiscal year": a 2024 calendar-year model counts toward NVIDIA's FY2025. Concretely, the table you fill in is

Fiscal year	Calendar year covered	MAX FLOPS	TOTAL FLOPS
FY2021	CY2020	`MAXFLOPS(2020)`	`TOTALFLOPS(2020)`
FY2022	CY2021	`MAXFLOPS(2021)`	`TOTALFLOPS(2021)`
FY2023	CY2022	`MAXFLOPS(2022)`	`TOTALFLOPS(2022)`
FY2024	CY2023	`MAXFLOPS(2023)`	`TOTALFLOPS(2023)`
FY2025	CY2024	`MAXFLOPS(2024)`	`TOTALFLOPS(2024)`
FY2026	CY2025	`MAXFLOPS(2025)`	`TOTALFLOPS(2025)`

If your numbers disagree with the solution workbook by one year, the fiscal-year shift is almost always the culprit.

A couple of the seed cells are pre-filled in the exam sheet to help you check yourself: C38 = 3.14E+23 (MAX in FY2021) and D40 = 1.03E+25 (TOTAL in FY2023). Match those and you know your fiscal-year shift is correct.

Building the bar charts

Select B37:C43 (the header row at row 37 plus the six fiscal-year rows in rows 38 through 43, covering FY2021 through FY2026 with their Max FLOPS values). Insert → Column/Bar Chart → Clustered Column. Give it a title like "Max FLOPs by fiscal year". If the bars are all visually flat except the last, apply log scale to the y-axis (right-click axis → Format Axis → Logarithmic). The exam asks for bar charts specifically — log scale isn't required but makes the exponential trend visible.
Repeat for the Total FLOPS chart: select B37:B43 then Ctrl+select D37:D43 (two non-contiguous ranges; the year column and the Total FLOPS column, both with their headers). Insert → Clustered Column, title it "Total FLOPs by fiscal year".

What to write

This subpart has two commentary pieces: which metric tracks NVIDIA revenue better, and what the 2026 value tells you.

Pause.Between the MAX FLOP and the TOTAL FLOP each year, which do you expect to track NVIDIA revenue more closely, and why? Take a minute before reading on.

The honest answer is both are close, and the total is slightly cleaner. NVIDIA's revenue scales with how many chips get sold, and chips get sold in proportion to how much total training compute the industry is running — that's TOTALFLOPS. MAXFLOPS captures only the frontier model, which is one customer's order, not the whole market. You'd expect the total to correlate more tightly with NVIDIA's revenue. The solution workbook backs this up: from FY2023 to FY2025, TOTALFLOPS jumps about 14× then about 3× year-over-year (uneven, but always upward), while NVIDIA revenue grows about 2× per year over the same window. Compute is growing faster than revenue, but they move together.

The NVIDIA revenue table, cleaned (the exam wording lists revenue for FY2022 through FY2025 only — FY2021 and FY2026 are not provided):

NVIDIA fiscal year	Revenue (USD billions)
FY2021	not provided
FY2022	26.914
FY2023	26.974
FY2024	60.922
FY2025	130.497
FY2026	not provided

Revenue is essentially flat from FY2022 to FY2023, then roughly doubles year-on-year from FY2023 onward (FY2023 → FY2024 is about 2.3×, FY2024 → FY2025 about 2.1×). TOTALFLOPS grew faster — about 14× from FY2023 to FY2024, then about 3× from FY2024 to FY2025 which is much steeper than revenue's roughly 2× per year. That gap is consistent with compute efficiency (FLOP per dollar) improving on the chip side, so dollars-of-revenue can double while the training compute they enable multiplies several times.

Interpreting the 2026 value

The FY2026 row maps to calendar 2025, which is the year the dataset cuts off in. TOTALFLOPS(2025) sums FLOPs for all models released in 2025 that made it into the dataset. Because the cutoff is partway through the year, 2025 is incomplete — models released after the cutoff aren't in there. Your 2026 number is a lower bound on the full-year total.

The solution workbook makes roughly these points: the FY2026 value corresponds to calendar 2025, which isn't finished yet, so total and max FLOPs both have room to climb. The last couple of years have been roughly linear on the log scale (i.e. exponential), and a similar extrapolation suggests another ~10× over last year is plausible. The points to make in your write-up: the value is already large, 2025 is incomplete, the exponential trend extrapolates to roughly 10× more, and somebody has to pay for that compute — NVIDIA is the main seller of the chips that run it, so NVIDIA revenue likely roughly doubles again.

If that's too fiddly: if you can't get MAXFLOPS and TOTALFLOPS to work as LAMBDAs, you can compute the per-year max and total with MAXIFS and SUMIFS — two built-ins that do exactly filter-then-aggregate. For example:

Excel

= MAXIFS(AIMODEL[FLOP],
         AIMODEL[Date], ">=2024-01-01",
         AIMODEL[Date], "<=2024-12-31")

and similarly SUMIFS for the total. You lose the "parameterise by year" convenience, but the table numbers will match. The exam wants named LAMBDAs for full marks, but this still earns partial credit.

Where this came from. The LET-with-locals + FILTER construction in MAXFLOPS / TOTALFLOPS is the direct image of 06lecturefilled.xlsx sheet "Simplify functions with LET" — that's exactly the pedagogical purpose of the LAMBDAs defined there (things like AVERAGEN, MEDIANN, PRODUCTNEW). For the bar chart construction, the Part 2 chart lecture 02Charts.pdf is the relevant material; its companion workbook 02Charts.xlsx has worked examples on sheets "1-BadCharts", "2-AxesScales", and "4-AmountsProportions". For the year-masking idea, 03lecturefilled.xlsx sheet "6 UNIQUE and FILTER" is your home base.

If this still feels shaky

The three scatter plots in 2a and 2b are mostly click-path, so if the result doesn't look right, the commonest cause is forgetting to log-scale the y-axis — right-click axis → Format Axis → Logarithmic. For MAXFLOPS and TOTALFLOPS, the LET-with-FILTER pattern is worked in 06lecturefilled.xlsx. Open Excel Labs there and read Solutions.AVERAGEN, which sets up locals then wraps an aggregate. If the fiscal-year mapping is where you get stuck, sketch it on paper: NVIDIA FY ends Jan 31, so FYn covers most of calendar year (n-1); check your MAXFLOPS(2020) = 3.14E+23 (which is the FY2021 row, since FY2021 maps to CY2020) against the exam's pre-filled C38 before filling in the rest.

Glossary

AIMODEL: The Excel table on the sheet of the same name in exam_part1.xlsx. Each row is one AI model; the columns used by Exercise 2 formulas and charts are Date, FLOP, Tokens, Parameters. (Model shows up in the written commentary about outliers (Cosmos, Grok-3) but isn't referenced by any formula.)
Axis type: Charts have two axes (x and y) and each has a type. For scatter plots both are value axes; for bar charts the x-axis is typically a category axis. The scale of a value axis can be linear or logarithmic.
Bar chart: A chart type where each category gets a rectangular bar whose length encodes a value. In Excel, "Clustered Column" is the vertical variant; "Bar" is the horizontal variant. Used in 2d for the fiscal-year totals.
Category axis: An axis where positions are labels rather than numeric values. A bar chart's x-axis is typically categorical.
Dynamic array: A formula result that spills into multiple cells. MAXFLOPS and TOTALFLOPS return single values, but the table in 2d uses six calls to them.
Excel Labs: The Microsoft Research add-in that provides the Advanced Formula Environment where LAMBDAs are saved by name. Lecture 4 introduces it.
FILTER(array, keep_condition): Returns the rows of array where keep_condition is TRUE. Used in both MAXFLOPS and TOTALFLOPS.
FLOP: Floating-point operation. A count of arithmetic operations (plus, minus, times, divide on decimal numbers) used to train a model. Values in this exam range from roughly 1E+23 (the smallest model) to almost 5E+26 (Grok-3).
Fiscal year: The 12-month period a company uses for financial reporting. NVIDIA's ends January 31, so fiscal 2025 mostly covers calendar 2024.
LAMBDA: A reusable function, defined with LAMBDA(args, body) and saved by name in Excel Labs or Name Manager.
LET(name1, val1, ..., final): Binds local names to values inside a formula. Doesn't change results; makes formulas readable and avoids re-typing the same range.
Legend: The small box on a chart that labels each data series. In a one-series scatter plot, it's usually unnecessary and can be deleted.
Logarithmic axis: An axis where each tick multiplies by a constant (usually 10) instead of adding one. Compresses many orders of magnitude into a readable range. Required in 2a and 2b for the FLOP / Tokens / Parameters y-axes.
MAX(range) / MIN(range): Largest / smallest numeric value in range. Ignores blanks and text.
MAXIFS / SUMIFS: Built-in functions that compute max / sum across a range subject to one or more conditions. Partial-credit alternative to LAMBDAs in 2d.
Scatter plot: A chart where each data point is placed at its (x, y) coordinate with no connecting lines. Used in 2a and 2b with Date on x and a quantity on y.
SUM(range): Adds the numeric values in range. Used in TOTALFLOPS.
Trendline: A best-fit line or curve Excel can overlay on a scatter plot. Not required for this exam, but useful for quantifying the exponential trend in 2a if you want to go beyond the written commentary.
Value axis: An axis where position encodes a numeric value. Both axes of a scatter plot are value axes. Supports linear or logarithmic scaling.
YEAR(date): Returns the four-digit year of an Excel date value. YEAR("2024-07-15") is 2024. Applied to a column of dates, returns a column of years — central to the masks inside MAXFLOPS and TOTALFLOPS.

Quick orientation to the data

2a — Scatter plot of FLOP vs. date with a log y-axis

Click path for the chart

What you should see

What to write (two or three sentences is enough)

2b — The same idea, twice: tokens vs. date and parameters vs. date

What you should see

What to write

2c — MAXFLOPS and TOTALFLOPS, aggregating FLOP by year

The official LAMBDAs

2d — The fiscal-year table and the bar charts

The fiscal-year mapping; a small trap

Building the bar charts

What to write

Interpreting the 2026 value

If this still feels shaky

Glossary

2c — `MAXFLOPS` and `TOTALFLOPS`, aggregating FLOP by year