The Interledger Community 🌱

Gaia Dempsey

Metaculus Futures: Building a Sustainable Model for Forecasting – Grant Report #1

Project Update

Metaculus is a science-inspired forecasting technology platform that enables our community to collectively reason better and empowers public, nonprofit, and private-sector entities to predict, plan for and better respond to a range of developments. Our most popular forecasting work covers Covid-19, science and technology progress, artificial intelligence development, the global economy, and much more.

In highly uncertain times, tools such as ours are especially valued: in 2020, a global pandemic, a historically unprecedented US Presidential election, a vacant Supreme Court seat, massive wildfires and widespread riots across the United States, and the most impressive worldwide scientific collaboration effort in history all drove high levels of traffic and participation on our platform. Through moderated community questions and statistically aggregated forecasts, Metaculus enables anyone with an internet connection to access the best current human knowledge about the future.

Our scoring algorithms keep track of the individual forecasts made by participants over time. When we empirically identify forecasters who tend to be both accurate and well-calibrated, we give their input more weight in our optimized community aggregation, the Metaculus Prediction, so that we can separate the signal from the noise with the greatest possible accuracy.

This solution means that brand new forecasters who are learning about judgmental, model-based, and probabilistic forecasting can test and hone their skills on Metaculus without worrying that they are "ruining" the accuracy of the Metaculus Prediction. Our system elegantly provides a training ground for newcomers, a sifting mechanism that identifies the most skilled forecasters, and accountability and transparency for all.
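To make the weighting idea concrete, here is a minimal sketch in Python. It is not the actual Metaculus Prediction algorithm, which this report does not specify. The Brier-score-based weight, the weight floor, the function names, and the sample forecasts are all illustrative assumptions.

```python
from statistics import mean


def brier_score(forecasts, outcomes):
    """Mean squared error between probabilities and 0/1 outcomes (lower is better)."""
    return mean((p - o) ** 2 for p, o in zip(forecasts, outcomes))


def skill_weight(forecasts, outcomes):
    """Hypothetical weight: 1.0 for a perfect record, 0.0 for a record
    no better than always answering 50%."""
    baseline = 0.25  # Brier score of always answering 0.5
    return max(0.0, 1.0 - brier_score(forecasts, outcomes) / baseline)


def weighted_prediction(current, weights, floor=0.05):
    """Weighted average of current forecasts. The (assumed) floor keeps
    forecasters without a strong track record from being ignored entirely."""
    w = [max(floor, x) for x in weights]
    return sum(wi * p for wi, p in zip(w, current)) / sum(w)


# Track records on three resolved binary questions (1 = happened, 0 = did not).
outcomes = [1, 0, 1]
track_records = {
    "veteran": [0.9, 0.1, 0.8],   # accurate and confident
    "newcomer": [0.5, 0.5, 0.5],  # uninformative so far
}
weights = [skill_weight(f, outcomes) for f in track_records.values()]

# Forecasts from the same two people on a new, unresolved question.
current = [0.7, 0.3]
aggregate = weighted_prediction(current, weights)
print(f"unweighted mean: {mean(current):.2f}, weighted aggregate: {aggregate:.2f}")
```

In this toy setup the aggregate sits much closer to the veteran's forecast than a simple average would, while the newcomer still contributes a little and can earn more weight as their record improves.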

At the beginning of September 2020, we had approximately 12,000 registered forecasters on the Metaculus platform. As of the beginning of March 2021, we have well over 16,000, meaning that our user base grew by a full third in six months.

In that time, we've launched the largest AI forecasting tournament in the world (the launch was covered by Forbes), partnered with The Economist on forecasting 25 key events in 2021, published an op-ed in The Hill highlighting our forecasting work on Covid variants and our related public health policy recommendations (which the Biden administration later adopted, though probably not just at our behest), and trialled scoring a public figure – Vox co-founder Matt Yglesias – on his forecasts for the year ahead. (This last item foreshadows some planned features related to public figures.)

Our forecasts are featured in three new regular publications read by forecasters in both business and academia.

At the same time, we've made significant progress on upgrading our platform infrastructure and usability.

Since the beginning of this year, we've brought our founding CTO, Max, back on board full-time. His presence has not only resulted in much faster engineering progress overall, but also in scalability being at the core of all our major engineering decisions, giving us all confidence in the long-term viability of our platform tools and capabilities.

Here is an overview of the most important recent feature, design, and overall product updates:

  • Design upgrades including "Bright Mode" – a toggle-able view that provides a white background and dark text, as opposed to the reverse, which makes Metaculus much easier to read for non-programmers – plus a new clean header design that aligns with modern web application design expectations
  • Comment Up/Down votes
  • Moderator Workflow Improvements, including the ability to sort the Pending Queue by questions that individual moderators are actively moderating, by questions that have had no moderation yet, and by newest/oldest. Also, the ability to temporarily suspend user accounts, update question categories, and more.
  • New Community Guidelines with information about what we expect from platform participants in terms of Etiquette, and what they can expect in terms of Moderation Rules and Sanctions. This represents an important milestone in setting expectations with our users, especially newcomers, as much of the etiquette of Metaculus has developed over time and has been implicitly understood by long-time veterans as a sort of tribal knowledge. Writing it down helps to clarify the behavior we expect to see – treating others with respect, for instance – and levels the playing field for all. As we said in the Guidelines themselves, "We greatly value the contributions of our diverse community of forecasters, question authors, and forum participants, and we hope that these guidelines will promote, enhance, and safeguard a vibrant community forecasting space for many years to come."
  • Infrastructure upgrades: upgrades to Python 3.9 and the latest version of PostgreSQL, plus MVP automated testing infrastructure and feature flags for beta testers
  • Possibly the most significant change is a "seismic shift" in our backend architecture that enables three new data objects called Projects, Organizations, and Notebooks. Only with this new database architecture do we feel we can implement Web Monetization in a scalable way.

Progress on objectives

One of the core goals of the project was to implement the Web Monetization standard in order to enable a fully scalable forecasting bounty system on the Metaculus platform.

When we got into the weeds of Web Monetization technology, we found that it only enables streaming payments suitable for content consumption, and not the sort of directed micropayments that we would need in order to adopt the system as part of a functioning bounty system.

As a Plan B, we came up with a way to meaningfully integrate Web Monetization into the Metaculus platform by rewarding question and notebook authorship with the streamed payments of Coil subscribers. Our Plan B still aligns with our core values, which put our community at the center, along with knowledge and truth-seeking, accountability, transparency, accuracy, and collective intelligence.
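As a rough illustration of Plan B, the sketch below shows how a page could direct a Coil subscriber's streamed payment to the author of the question or notebook being viewed. It assumes the meta-tag form of Web Monetization in use at the time of writing, in which a page declares a payment pointer in a `<meta name="monetization">` tag. The wallet addresses, the fallback pointer, and the function name are hypothetical, and this is not our production implementation.

```python
from html import escape

# Hypothetical platform-level payment pointer, used when an author has not set one.
PLATFORM_POINTER = "$wallet.example.com/platform"


def monetization_meta_tag(author_pointer=None):
    """Return the meta tag for a question or notebook page.

    If the author has registered a payment pointer, streamed payments for
    time spent on the page go to the author; otherwise they go to the
    (assumed) platform fallback.
    """
    pointer = author_pointer or PLATFORM_POINTER
    return f'<meta name="monetization" content="{escape(pointer, quote=True)}">'


# A notebook whose author has a (hypothetical) payment pointer:
print(monetization_meta_tag("$wallet.example.com/alice"))
# A question whose author has not set one:
print(monetization_meta_tag())
```

Because the payment follows whichever page the subscriber is reading, this design rewards authorship in proportion to attention, which is a different incentive from the directed bounties envisioned in Plan A.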

We would generally have preferred to have the option to stick to Plan A because it would've enabled a truly remarkable new capability on the platform that elegantly solves a major incentive challenge, but Plan B still creates positive incentives that generate value for multiple stakeholders, so it is something we are happy to pivot toward.

In part because of this pivot, and in part because we haven't been able to staff up as rapidly as we'd hoped, we have requested an extension that will enable us to implement Plan B with excellence.

The core Web Monetization implementation is already complete. What remains is mostly centered on UX and UI (which is also why the "Design" line item in our budget has yet to receive much of a dent).

In addition to the platform upgrades described above, a few more are currently underway:

  • On-platform notifications, which we expect will increase both ease-of-use and engagement
  • A new search filter capability, enabling users to more easily find questions and forecasts and navigate the platform
  • Updated Scoring Rule and Track Record capabilities that enable better incentives and more detailed comparisons of forecaster performance
  • And most significantly, we have been working on a tremendous new capability entitled Forecasting Causes, which is a bundle of features that allow us to work much more closely and effectively with nonprofit partners.

There are three reasons why we have chosen to focus on Causes first. First, our forecasting community includes a strong contingent of self-identified Effective Altruists who are motivated by a desire to utilize rationality tools, including forecasting, to do the greatest possible good in the world. Second, we have strong interest from nonprofit partners in utilizing Metaculus forecasts in real-world decision-making scenarios. And third, the scalable set of tools that we can build for nonprofits will enable us to learn and test out new capabilities, which we believe will be equally useful, with modest modifications, to three additional core audiences: educators, businesses, and government agencies.

Key activities

There are four major Activities in our proposal:

  1. The implementation of a Web Monetization integration with the Metaculus platform.
  2. Improving the usability and discoverability of the Metaculus platform.
  3. The development of a communications infrastructure for Metaculus content, including web-monetizable content.
  4. The development of partnerships with organizations that can use Metaculus forecasts in the real world, including in public health and public policy.

Earlier sections in this report have given a status update on Activity #1 (specifically our pivot from Plan A to Plan B, which is well-defined). I've also mentioned numerous improvements we've already made and plan to make soon with respect to Activity #2 – in short, we are in the midst of a usability revolution, thanks in no small part to the tremendous effort and leadership of our design advisor and consultant Steven!

Let's focus for a moment, then, on Activity #3. First, some context. Metaculus has long been a place where the statistically-minded – quants, physicists, researchers – have been happy making rigorous probabilistic forecasts and obsessing over the details of our (fairly mathematically complex) scoring rules (click on "Here are the details" to get at the underlying equations). The not-so-statistically-minded, on the other hand, have often told me that they find the platform to be a bit "intimidating."

Activity #3 is for this second group of people (and actually, so are Activities #2 and #4).

The goal of our new communications infrastructure is to take a platform that has science, mathematics, and data engineering deeply ingrained in its DNA and produce outputs that are legible and interesting to the average layperson. If this seems difficult to do, consider that it is what science journalists do, and what the entire field of science communication is dedicated to.

There are multiple levels at which this goal may be approached. I think of them in terms of a) design and user experience – making the experience feel comprehensible and familiar, b) object-level content – for instance, writing about forecasting in a way that's interesting and relatable, and c) wide and diverse distribution – ensuring that we can and do reach people beyond the already-statistically-minded bubble that Metaculus first appealed to.

An absolutely critical piece of infrastructure that underpins the achievement of all three levels a), b), and c) is a feature we call Notebooks. There's nothing particularly magical or mysterious about Notebooks – they should function in a "comprehensible and familiar" way after all – but they need to enable both our team and our user community to "write about forecasting in a way that's interesting and relatable." Meaning, basically, that instead of simply viewing a forecasting question and related probability distribution on, say, cumulative two-dose vaccinations by the end of the month, you can read an eloquent essay on the topic of Covid forecasts (this one written by our primary Covid researcher, Juan), which puts the forecasts in their proper context.

Crucially, Metaculus Notebooks should be able to fully integrate Metaculus forecasts natively.

As mentioned above, the backend of Notebooks has been built, and the frontend will be ready and tested on approximately Monday 3/22. That is in time for a forecasting essay contest we plan to launch shortly thereafter utilizing Notebooks. The contest will serve to familiarize our community with the feature, incentivize the generation of lots of interesting content (there will be several thousand dollars allocated as prize funds for the contest, plus highly regarded judges), and work out any technical and UX kinks before turning on the spigot of monetization.

In addition to Notebooks themselves, there is a panoply of smaller features that nonetheless support the broader goal of a highly-usable communications infrastructure, including things like forecast embedding improvements, social sharing preview improvements, a bit of font and styling work, and deeper integration capability with the news ecosystem.

We now turn to Activity #4. Over the past several months, we have been developing partnerships with organizations that we simply could not be more excited to work with. I already mentioned The Economist, and though he's not "an organization," I'll reiterate that we're very excited about our recent collaboration with the author Matt Yglesias.

We've also been very effectively collaborating with Lehigh University's Computational Uncertainty Lab on Consensus Forecasting to Improve Public Health: Mapping the Evolution of COVID-19 in the U.S. – more on some of the latest results of this collaboration can be read in this recent LU write-up.

Several other important partnerships are well underway and will be announced soon, but since this is a public forum, I won't mention them here just yet. I would be happy to discuss them privately with the GftW grant administration team.

Communications and marketing

While we frequently discuss our work in public, I'll focus this section just on the marketing budget envisioned for this project.

Once the Notebook capability is up and running and Forecasting Causes have been launched, we are planning a podcast ad campaign. (We plan to test out podcast ads since we've found declining returns over the last few months with Reddit, Facebook, and Google ads.)

We will also likely need a viable marketing alternative to the account drops we had envisioned as a promotional tool in web monetization Plan A, which involved a scalable bounty system. The replacement should align with the structure of our Plan B, namely a reward system for the top monthly authors on the Metaculus platform. We haven't nailed down exactly what this should look like yet, but perhaps an expanded essay contest would work well here.

What’s next?

We plan to complete and launch Notebooks and to complete the implementation of our web monetization Plan B, which mostly means finishing the relevant UI/UX work and executing the marketing launch (unless by some miracle we find out that there's a way for us to circle back to our Plan A, perhaps based on new changes in the WM ecosystem that we're currently unaware of).

We will also launch our first essay contest and the Forecasting Causes capability, and we will announce our first major FC nonprofit partnership at launch.

What community support would benefit your project?

For technical members of the WM community, we'd love to hear of any updates to the Web Monetization ecosystem that would enable us to build a scalable, micropayments-based bounty system for forecasting on Metaculus.

If you work with or for a nonprofit and you're interested in probabilistic forecasting and related tools to support complex real-world decision-making, we'd love to hear from you.

Looking further ahead, we're also very interested in working with educators who want to explore the use of forecasting tools in the classroom, at graduate, undergraduate and high school levels. Current experiments underway are in math and physics courses, biostatistics, psychology and decision science, as well as in business schools. We'd be thrilled to work with experts in the biosciences, biosecurity, sociology, complexity science, or anyone interested in multidisciplinary academic efforts.

And of course, we always love UX feedback from new users of Metaculus – fresh eyes are always incredibly helpful!

Additional comments

Thanks for reading our report!

Relevant links/resources (optional)

Metaculus
Pandemic Metaculus
Economist 2021 Series
Matt Yglesias collaboration
