Leaderboard
Matbench is an automated leaderboard for benchmarking state of the art ML algorithms predicting a diverse range of solid materials' properties. It is hosted and maintained by the Materials Project.
134 total task submissions
19 algorithms
1 benchmark test suite
Leaderboard: General Purpose Algorithms on matbench_v0.1
Find more information about this benchmark on the benchmark info page.
Task name | Samples | Algorithm | Verified MAE (unit) or ROCAUC | Notes |
---|---|---|---|---|
matbench_steels | 312 | MODNet (v0.1.12) | 87.7627 (MPa) | |
matbench_jdft2d | 636 | MODNet (v0.1.12) | 33.1918 (meV/atom) | |
matbench_phonons | 1,265 | MegNet (kgcnn v2.1.0) | 28.7606 (cm^-1) | structure required |
matbench_expt_gap | 4,604 | MODNet (v0.1.12) | 0.3327 (eV) | |
matbench_dielectric | 4,764 | MODNet (v0.1.12) | 0.2711 (unitless) | |
matbench_expt_is_metal | 4,921 | AMMExpress v2020 | 0.9209 | |
matbench_glass | 5,680 | MODNet (v0.1.12) | 0.9603 | |
matbench_log_gvrh | 10,987 | ALIGNN | 0.0715 (log10(GPa)) | structure required |
matbench_log_kvrh | 10,987 | MODNet (v0.1.10) | 0.0548 (log10(GPa)) | |
matbench_perovskites | 18,928 | ALIGNN | 0.0288 (eV/unit cell) | structure required |
matbench_mp_gap | 106,113 | ALIGNN | 0.1861 (eV) | structure required |
matbench_mp_is_metal | 106,113 | CGCNN v2019 | 0.9520 | structure required |
matbench_mp_e_form | 132,752 | ALIGNN | 0.0215 (eV/atom) | structure required |
Scaled errors for regression tasks on the leaderboard plot are computed as the ratio of mean absolute error (MAE) to mean absolute deviation (MAD):
$$ \text{Scaled Error} = \frac{\text{MAE}}{\text{MAD}} = \frac{\sum_{i=1}^{N} \left| y_i - y_i^{pred} \right|}{\sum_{i=1}^{N} \left| y_i - \bar{y} \right|} $$
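As a minimal sketch of the formula above, the scaled error can be computed in plain Python. The function name `scaled_error` and the sample values are illustrative assumptions and are not part of the Matbench package. A model that always predicts the mean of the targets scores exactly 1.0, so lower values indicate an improvement over that trivial baseline.

```python
def scaled_error(y_true, y_pred):
    """Ratio of mean absolute error (MAE) to mean absolute deviation (MAD).

    The 1/N factors of MAE and MAD cancel, so plain sums are used.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    y_mean = sum(y_true) / len(y_true)
    abs_err = sum(abs(t - p) for t, p in zip(y_true, y_pred))
    abs_dev = sum(abs(t - y_mean) for t in y_true)
    if abs_dev == 0:
        raise ValueError("MAD is zero because all targets are identical")
    return abs_err / abs_dev


# Hypothetical targets and predictions, for illustration only.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 2.5, 4.0]
print(scaled_error(y_true, y_pred))
```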
Discovery Leaderboard: General Purpose Algorithms on matbench_discovery 0.1.0
Matbench Discovery is an interactive leaderboard and associated PyPI package which together make it easy to benchmark ML energy models on a task designed to closely simulate a high-throughput discovery campaign for new stable inorganic crystals. Matbench Discovery compares ML structure-relaxation methods on the WBM dataset by ranking ~250k generated structures according to predicted hull stability (42k of which are stable). Matbench Discovery is developed by Janosh Riebesell.
model | F1 | R² | DAF | Precision | Recall | Accuracy | TPR | FPR | TNR | FNR | MAE | RMSE |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Voronoi Random Forest | 0.34 | -0.32 | 1.51 | 0.26 | 0.52 | 0.66 | 0.52 | 0.31 | 0.69 | 0.48 | 0.14 | 0.21 |
BOWSR + MEGNet | 0.44 | 0.15 | 1.90 | 0.32 | 0.74 | 0.68 | 0.74 | 0.33 | 0.67 | 0.26 | 0.11 | 0.16 |
Wrenformer | 0.48 | -0.04 | 2.13 | 0.36 | 0.71 | 0.74 | 0.71 | 0.26 | 0.74 | 0.29 | 0.10 | 0.18 |
MEGNet | 0.49 | -0.35 | 2.94 | 0.51 | 0.48 | 0.83 | 0.48 | 0.10 | 0.90 | 0.52 | 0.13 | 0.21 |
CGCNN+P | 0.51 | 0.02 | 2.38 | 0.41 | 0.69 | 0.78 | 0.69 | 0.21 | 0.79 | 0.31 | 0.11 | 0.18 |
CGCNN | 0.52 | -0.61 | 2.62 | 0.45 | 0.60 | 0.81 | 0.60 | 0.15 | 0.85 | 0.40 | 0.14 | 0.23 |
M3GNet + MEGNet | 0.53 | 0.46 | 2.65 | 0.45 | 0.64 | 0.80 | 0.64 | 0.16 | 0.84 | 0.36 | 0.09 | 0.13 |
M3GNet | 0.58 | 0.59 | 2.66 | 0.45 | 0.79 | 0.80 | 0.79 | 0.20 | 0.80 | 0.21 | 0.07 | 0.12 |
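Several columns in the table above are related to one another. The sketch below uses the standard textbook definitions of the classification metrics, with "stable" assumed to be the positive class; this page does not define the columns, so treat the mapping as an assumption rather than Matbench Discovery's own implementation. The function name and the confusion-matrix counts are hypothetical, and DAF, R², MAE, and RMSE are not covered.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)  # recall and TPR are the same quantity
    fpr = fp / (fp + tn)
    return {
        "Precision": precision,
        "Recall": recall,
        "TPR": recall,
        "FNR": 1 - recall,   # FNR and TPR sum to 1
        "FPR": fpr,
        "TNR": 1 - fpr,      # TNR and FPR sum to 1
        "Accuracy": (tp + tn) / (tp + fp + tn + fn),
        "F1": 2 * precision * recall / (precision + recall),
    }


# Hypothetical counts, for illustration only.
metrics = classification_metrics(tp=80, fp=20, tn=180, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

These identities explain why, in each row of the table, TPR equals Recall and the TPR/FNR and FPR/TNR pairs each sum to 1.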
Overview
Matbench is an ImageNet for materials science: a curated set of 13 supervised, pre-cleaned, ready-to-use ML tasks for benchmarking and fair comparison. The tasks span a wide domain of inorganic materials science applications including electronic, thermodynamic, mechanical, and thermal properties among crystals, 2D materials, disordered metals, and more.
The Matbench Python package provides everything needed to use Matbench with your ML algorithm in about 10 lines of code. The website and online repository contain full result files, citations, methodologies, and code for the algorithms shown.
What can Matbench offer?
This website
- Leaderboard of results for state-of-the-art materials ML algorithms on standardized test problems
- Interactively explore and download the tasks on MPContribs-ML, a platform hosted by The Materials Project. See Benchmark Info for links to each dataset.
- Every result is backed by a peer-reviewed publication and/or a Jupyter notebook (similar to Papers With Code) showing how the results were obtained
- Glossary of all algorithms' results on the Matbench problems
The Matbench Python package
- Probe ML algorithms' strengths and weaknesses across a wide range of materials property prediction tasks
- Run a full benchmark in ~10 lines of code
- Submit results as a PR to the Matbench repo to compare with other algorithms and appear on the leaderboard
- Benchmark both general-purpose ML models and algorithms specialized for particular domains
Summary of Matbench's Tasks
Matbench's 13 tasks can be broken down into various categories. They include both small datasets (fewer than 10,000 samples) that characterize experimental materials data and larger datasets from computer modelling methods such as density functional theory (DFT).
Each task in Matbench consists of three things:
- A set of inputs: crystal structures or chemical compositions.
- A set of outputs: target properties, such as formation energy.
- A test procedure: a way to get a score for your algorithm.
The Matbench Python package provides functions for getting the first two (packaged together for each task as a dataset) as well as running the test procedure. See the How to use documentation page to get started.
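To make the three parts of a task concrete, here is a self-contained toy sketch. It deliberately does not use the Matbench API: the compositions, property values, `MeanBaseline` model, `k_fold_indices` helper, `run_task` function, and the choice of three interleaved folds scored by mean absolute error are all hypothetical stand-ins. The actual datasets and test procedure come from the Matbench Python package.

```python
from statistics import mean

# 1. Inputs: chemical compositions (hypothetical toy data).
inputs = ["Fe2O3", "NaCl", "SiO2", "TiO2", "MgO", "Al2O3"]
# 2. Outputs: a made-up target property for each input.
outputs = [2.1, 5.0, 8.9, 3.2, 7.8, 8.8]


def k_fold_indices(n_samples, n_folds):
    """Yield (train_idx, test_idx) so every sample is tested exactly once."""
    for k in range(n_folds):
        test_idx = list(range(k, n_samples, n_folds))
        train_idx = [i for i in range(n_samples) if i not in test_idx]
        yield train_idx, test_idx


class MeanBaseline:
    """Trivial model that predicts the mean of its training targets."""

    def fit(self, X, y):
        self.value = mean(y)
        return self

    def predict(self, X):
        return [self.value] * len(X)


def run_task(model_factory, inputs, outputs, n_folds=3):
    """3. Test procedure: train on each fold's training split, score on its test split."""
    fold_maes = []
    for train_idx, test_idx in k_fold_indices(len(inputs), n_folds):
        model = model_factory().fit(
            [inputs[i] for i in train_idx], [outputs[i] for i in train_idx]
        )
        preds = model.predict([inputs[i] for i in test_idx])
        fold_maes.append(mean(abs(outputs[i] - p) for i, p in zip(test_idx, preds)))
    return mean(fold_maes)  # average MAE across folds


score = run_task(MeanBaseline, inputs, outputs)
print(f"Mean fold MAE: {score:.3f}")
```

Because the folds and the scoring rule are fixed ahead of time, any two algorithms run through the same procedure receive directly comparable scores, which is the point of packaging a test procedure with each task.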
Citing Matbench
You can find details and results on the benchmark in our paper Benchmarking materials property prediction methods: the Matbench test set and Automatminer reference algorithm. Please consider citing this paper if you use Matbench v0.1 for benchmarking, comparison, or prototyping.
You can cite Matbench using this reference:
Dunn, A., Wang, Q., Ganose, A., Dopp, D., Jain, A.
Benchmarking Materials Property Prediction Methods:
The Matbench Test Set and Automatminer Reference Algorithm.
npj Computational Materials 6, 138 (2020).
https://doi.org/10.1038/s41524-020-00406-3