About Inference

Inference is a short competition for people who like thinking carefully about uncertainty. Three problems, three weeks, no code required. Each round we put up a question pulled from the kind of reasoning that shows up in trading, gambling, and statistics. You read the problem, work out an answer, write a short explanation, and submit. Once the round closes we score everyone and post a leaderboard.

The whole thing runs on the honour system. Submit what you reckon is right, show your working, and try to learn something.

Rules

Three submissions per round, total. You can submit up to three times before the round closes, and only the latest submission is scored. Use them wisely.
Submissions need an answer in the format the problem asks for and at least 50 characters of reasoning.
Talking through a problem with friends is fine. Submitting the same answer and reasoning as someone else is not.
Using an LLM is allowed but discouraged. The point of this is to learn something, and the way you learn is by working through it yourself.
Tied scores share the same rank. If two people end up on the same score, they both get that position.
Be civil. Display names that are abusive will be hidden from the leaderboard.

FAQ

How is scoring done?

Each round has its own grading rule, described in the problem statement. Your submission is simulated and scored on raw performance (profit, PnL, or whatever the round measures).

Raw scores are then normalised per round:

points = 1000 × (your_raw_score - worst_raw_score) / (best_raw_score - worst_raw_score)

The highest raw score in each round gets 1000 points and the lowest gets 0. Everyone else is placed linearly between those endpoints.
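If it helps to see the formula as code, here is a minimal Python sketch. The function name and the sample scores are made up for illustration. The formula above does not say what happens when every raw score in a round is identical (the denominator would be zero), so this sketch simply raises an error in that case rather than guessing a policy.

```python
def normalise_round(raw_scores):
    """Map raw scores (name -> raw score) onto a 0..1000 scale.

    Implements: points = 1000 * (raw - worst) / (best - worst)
    """
    best = max(raw_scores.values())
    worst = min(raw_scores.values())
    spread = best - worst
    if spread == 0:
        # Not covered by the published formula; flagged rather than guessed.
        raise ValueError("all raw scores are equal; normalisation is undefined")
    return {name: 1000 * (raw - worst) / spread for name, raw in raw_scores.items()}


# Hypothetical round: raw PnL for three participants.
example = normalise_round({"ana": 120.0, "ben": -30.0, "cy": 45.0})
```

In this example the top raw score maps to 1000, the bottom to 0, and the participant exactly halfway between them lands on 500.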

Your season score is the sum of your normalised scores across all rounds (max 3000). The season leaderboard ranks by this total. Per-round raw scores are also shown for reference.
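A rough Python sketch of how a season table could be built from per-round points. Two things here are assumptions, not stated rules: a round you did not enter is treated as contributing 0, and after a tie the next rank skips ahead (1, 1, 3). The rules only say that tied scores share a rank. The function name and sample data are illustrative.

```python
def season_leaderboard(round_points):
    """Sum normalised points per participant and assign tie-aware ranks.

    round_points: list of dicts (name -> normalised points), one per round.
    Returns a list of (rank, name, total), best total first.
    """
    totals = {}
    for points in round_points:
        for name, p in points.items():
            # Assumption: rounds a participant skipped simply add nothing.
            totals[name] = totals.get(name, 0.0) + p

    ordered = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)

    board = []
    for i, (name, total) in enumerate(ordered):
        if i > 0 and total == ordered[i - 1][1]:
            rank = board[-1][0]   # tied with the previous entry: share its rank
        else:
            rank = i + 1          # assumption: ranks skip after a tie (1, 1, 3)
        board.append((rank, name, total))
    return board


# Hypothetical three-round season; "cy" skipped the last round.
board = season_leaderboard([
    {"ana": 1000, "ben": 0, "cy": 500},
    {"ana": 0, "ben": 1000, "cy": 500},
    {"ana": 500, "ben": 500},
])
```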

Reasoning is not directly scored, but it gets read and is used to spot dodgy submissions.

When do rounds open and close?

Each round opens at 10:00 AM GMT and closes at midnight GMT seven days later. Rounds run back-to-back, one per week. A countdown timer on each round page shows exactly how long you have left.

When are answers and rankings released?

As soon as the round closes. The full answer key and a write-up go on the round's page, and final rankings appear on the leaderboard at the same time.

Can I use an LLM?

You can, but try not to. This is meant as a learning exercise. Most of the value is in working through the problem yourself, getting it wrong, and working out why.

Is there a prize?

Bragging rights and a spot on the leaderboard, for now. As the comp grows the plan is to bring on sponsors and offer real prizes for top finishers.

What happens to my data?

Your email is used for sign-in and the verification link, and never shared. Your display name and scores appear publicly on the leaderboard for closed rounds. Your written reasoning is private to you and to whoever is scoring.

How does the rating system work?

We use Glicko-2 (Glickman, 2001), adapted for ranked competitions rather than head-to-head games. Each round acts as a rating period. Within a round, participants who scored higher are treated as having beaten those who scored lower in a set of virtual pairwise matches.
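To make the "virtual pairwise matches" idea concrete, here is a small Python sketch that turns one round's scores into match outcomes. The function name is made up, and treating equal scores as a draw (0.5 each) is an assumption; the description above only covers higher versus lower scores.

```python
from itertools import combinations


def virtual_matches(round_scores):
    """Turn one round's scores into pairwise results.

    round_scores: dict of name -> score for a single round.
    Returns name -> list of (opponent, outcome), where outcome is
    1.0 for a win, 0.0 for a loss, 0.5 for a draw.
    """
    matches = {name: [] for name in round_scores}
    for a, b in combinations(round_scores, 2):
        score_a, score_b = round_scores[a], round_scores[b]
        if score_a > score_b:
            out_a, out_b = 1.0, 0.0
        elif score_a < score_b:
            out_a, out_b = 0.0, 1.0
        else:
            out_a = out_b = 0.5  # assumption: equal scores count as a draw
        matches[a].append((b, out_a))
        matches[b].append((a, out_b))
    return matches


# Hypothetical round with one clear winner and a tie for second.
pairs = virtual_matches({"ana": 900, "ben": 400, "cy": 400})
```

With N participants each person ends up with N - 1 virtual results for the round, which is what then feeds the rating update.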

Every participant starts at a rating of 1500 with a high rating deviation (RD). RD measures how uncertain the system is about your true skill. It decreases as you play more rounds and increases if you skip seasons, reflecting the fact that your ability may have changed while you were away.

A rating marked “provisional” means the RD is still above 150, so the estimate is rough. After a few rounds, your RD drops and the rating stabilises.
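For the curious, here is a sketch of the standard Glicko-2 update for one rating period, following Glickman's published description of the algorithm. It is not necessarily the exact code behind these ratings: the system constant tau = 0.5 and the volatility values used here are assumptions borrowed from Glickman's worked example, and the function names are made up. It shows both behaviours described above: RD shrinking when you play and growing when you sit a period out.

```python
import math

SCALE = 173.7178  # Glicko-2 constant for converting to its internal scale


def _g(phi):
    return 1.0 / math.sqrt(1.0 + 3.0 * phi ** 2 / math.pi ** 2)


def glicko2_update(rating, rd, sigma, results, tau=0.5, eps=1e-6):
    """One rating-period update.

    results: list of (opp_rating, opp_rd, outcome), outcome in {1, 0.5, 0}.
    Returns (new_rating, new_rd, new_sigma).
    """
    mu, phi = (rating - 1500.0) / SCALE, rd / SCALE
    if not results:
        # Skipped period: rating and volatility stay put, RD grows.
        return rating, SCALE * math.sqrt(phi ** 2 + sigma ** 2), sigma

    v_inv = 0.0      # inverse of the estimated variance v
    score_sum = 0.0  # sum of g(phi_j) * (outcome - expected outcome)
    for opp_rating, opp_rd, outcome in results:
        mu_j, phi_j = (opp_rating - 1500.0) / SCALE, opp_rd / SCALE
        g_j = _g(phi_j)
        e_j = 1.0 / (1.0 + math.exp(-g_j * (mu - mu_j)))
        v_inv += g_j ** 2 * e_j * (1.0 - e_j)
        score_sum += g_j * (outcome - e_j)
    v = 1.0 / v_inv
    delta = v * score_sum

    # Solve for the new volatility with the Illinois root-finding variant.
    a = math.log(sigma ** 2)

    def f(x):
        ex = math.exp(x)
        top = ex * (delta ** 2 - phi ** 2 - v - ex)
        return top / (2.0 * (phi ** 2 + v + ex) ** 2) - (x - a) / tau ** 2

    lo = a
    if delta ** 2 > phi ** 2 + v:
        hi = math.log(delta ** 2 - phi ** 2 - v)
    else:
        k = 1
        while f(a - k * tau) < 0:
            k += 1
        hi = a - k * tau
    f_lo, f_hi = f(lo), f(hi)
    while abs(hi - lo) > eps:
        mid = lo + (lo - hi) * f_lo / (f_hi - f_lo)
        f_mid = f(mid)
        if f_mid * f_hi <= 0:
            lo, f_lo = hi, f_hi
        else:
            f_lo /= 2.0
        hi, f_hi = mid, f_mid
    new_sigma = math.exp(lo / 2.0)

    phi_star = math.sqrt(phi ** 2 + new_sigma ** 2)
    new_phi = 1.0 / math.sqrt(1.0 / phi_star ** 2 + 1.0 / v)
    new_mu = mu + new_phi ** 2 * score_sum
    return SCALE * new_mu + 1500.0, SCALE * new_phi, new_sigma


def is_provisional(rd, threshold=150.0):
    """Provisional means the RD is still above the threshold quoted above."""
    return rd > threshold


# Glickman's worked example: a 1500-rated player with RD 200 who beats a
# 1400 (RD 30) and loses to a 1550 (RD 100) and a 1700 (RD 300).
new_rating, new_rd, new_sigma = glicko2_update(
    1500.0, 200.0, 0.06,
    [(1400.0, 30.0, 1), (1550.0, 100.0, 0), (1700.0, 300.0, 0)],
)
```

On that example the update gives a rating of about 1464 and an RD of about 151.5, so the player would still count as provisional after one period. Calling the function with an empty results list leaves the rating unchanged and nudges the RD upward.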

Ratings become publicly visible after Season 1. You can view them on the Leaderboard under the “Ratings” tab.

Issues

Found a bug or have a question? Open an issue on GitHub.