LumenStone Dataset

Description

The material was collected from 30 ore deposits of the CIS. Polished sections were prepared from the samples and examined under a Carl Zeiss AxioScope 40 microscope; photographs were taken with a Canon PowerShot G10 camera. The samples represent the main ore associations and are categorized by deposit genesis.

[Figure: sample source image; corresponding overlay of source image and colorized mask]

The LumenStone dataset was created to bring together high-quality geological images of different mineral associations and to provide pixel-level semantic segmentation masks for them. It consists of several sets:

  • S1: hydrothermal ore association of the Berezovskoe deposit, consisting of sphalerite, pyrite, galena, bornite, tennantite-tetrahedrite group minerals and chalcopyrite,
  • S2: association of layered ultramafic deposits (deposits of the Norilsk group), consisting of pyrrhotite, chalcopyrite, pentlandite and magnetite,
  • S3: common association of high-temperature hydrothermal ores, consisting of pyrite, arsenopyrite, covelline, bornite, chalcopyrite, magnetite (ordinary and copper-bearing) and hematite (secondary and hydrothermal) (coming soon).

Versioning system

The LumenStone dataset is actively updated and expanded, so each set is versioned for convenience. Every new version of a set adds new images and also contains all images from the previous version. Image annotations (including the number of supported mineral classes) can be modified and expanded from version to version. All outdated versions, as well as the latest ones, remain available for download.

Summary

set  version  images (train / test)  classes  magnification  image size (px)  release date  download
S1   v1       59 / 16                7        x100           3396x2547        24.02.2021    link1, link2
S2   v1       23 / 6                 5        x100           3396x2547        24.02.2021    link1, link2

Directory structure

The LumenStone dataset is split into sets, and each set into train and test subsets. Each set represents a stable mineral association that corresponds to one of the main types of deposits. Every image in the dataset has 4 representations:
  1. source image,
  2. pixel-level ground-truth semantic mask (pixel values 0, 1, 2, … correspond to classes 0, 1, 2, …),
  3. colorized mask for simple visualization,
  4. an overlay of source image and colorized mask,
which are stored in the folders imgs, masks, masks_human and masks_overlay, respectively.
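Because the files in the masks folder store class indices directly as pixel values, per-class statistics can be computed without any lookup table. The following is a minimal sketch, not part of the dataset tooling: the function names are hypothetical, and the mask is assumed to be already decoded into rows of integers by whichever image library you use. A small toy mask stands in for a real file.

```python
from collections import Counter
from typing import Dict, Iterable, Sequence


def class_pixel_counts(mask_rows: Iterable[Sequence[int]]) -> Counter:
    """Count how many pixels carry each class index in an index mask."""
    counts: Counter = Counter()
    for row in mask_rows:
        counts.update(row)  # each pixel value is a class index
    return counts


def class_fractions(mask_rows: Iterable[Sequence[int]]) -> Dict[int, float]:
    """Return the share of the mask area covered by each class."""
    counts = class_pixel_counts(mask_rows)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {cls: n / total for cls, n in sorted(counts.items())}


# Toy 3x4 mask standing in for a decoded file from the masks folder.
toy_mask = [
    [0, 0, 1, 1],
    [0, 2, 2, 1],
    [0, 0, 2, 1],
]
fractions = class_fractions(toy_mask)
```

Statistics of this kind are useful for spotting rare classes before training a segmentation model; the meaning of each index has to be taken from description.xls.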
A short geological description of each set, together with a description of all annotated mineral classes and their color bindings, can be found in the description.xls table. The current version of description.xls can be found here or here. The directory structure of the LumenStone dataset is as follows:
LumenStone
|- description.xls
|- S1
|   |- v1
|   |   |- test
|   |   |   |- imgs
|   |   |   |   |- 01.jpg
|   |   |   |   |- 02.jpg
|   |   |   |   |  . . .
|   |   |   |- masks
|   |   |   |   |- 01.png
|   |   |   |   |- 02.png
|   |   |   |   |  . . .
|   |   |   |- masks_human
|   |   |   |   |- 01.png
|   |   |   |   |- 02.png
|   |   |   |   |  . . .
|   |   |   |- masks_overlay
|   |   |   |   |- 01.jpg
|   |   |   |   |- 02.jpg
|   |   |   |   |  . . .
|   |   |- train
|   |   |   |- imgs
|   |   |   |   |- 01.jpg
|   |   |   |   |- 02.jpg
|   |   |   |   |  . . .
|   |   |   |- masks
|   |   |   |   |- 01.png
|   |   |   |   |- 02.png
|   |   |   |   |  . . .
|   |   |   |- masks_human
|   |   |   |   |- 01.png
|   |   |   |   |- 02.png
|   |   |   |   |  . . .
|   |   |   |- masks_overlay
|   |   |   |   |- 01.jpg
|   |   |   |   |- 02.jpg
|   |   |   |   |  . . .
|   |- v2
|   |   |- test
|   |   |   |- imgs
|   |   |   |  . . .
|   |   |   |- masks
|   |   |   |  . . .
|   |   |   |- masks_human
|   |   |   |  . . .
|   |   |   |- masks_overlay
|   |   |   |  . . .
|   |   |- train
|   |   |   |  . . .
|- S2
|   |- v1
|   |   |- test
|   |   |   |  . . .
|   |   |- train
|   |   |   |  . . .
|   |  . . .
|  . . .
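The layout above makes it straightforward to pair every source image with its three companion files. Below is a minimal sketch under stated assumptions: the names SamplePaths, sample_paths and iter_samples are hypothetical, file stems are assumed to match across the four folders, and the extensions (.jpg for imgs and masks_overlay, .png for masks and masks_human) are taken from the tree as shown.

```python
from pathlib import Path
from typing import Iterator, NamedTuple


class SamplePaths(NamedTuple):
    image: Path         # imgs/<stem>.jpg
    mask: Path          # masks/<stem>.png (class indices)
    mask_human: Path    # masks_human/<stem>.png (colorized mask)
    mask_overlay: Path  # masks_overlay/<stem>.jpg


def sample_paths(split_dir: Path, stem: str) -> SamplePaths:
    """Build the four paths for one image, given e.g. LumenStone/S1/v1/train."""
    split_dir = Path(split_dir)
    return SamplePaths(
        image=split_dir / "imgs" / f"{stem}.jpg",
        mask=split_dir / "masks" / f"{stem}.png",
        mask_human=split_dir / "masks_human" / f"{stem}.png",
        mask_overlay=split_dir / "masks_overlay" / f"{stem}.jpg",
    )


def iter_samples(root: Path, set_name: str, version: str,
                 split: str) -> Iterator[SamplePaths]:
    """Yield the paths of every image found in <root>/<set>/<version>/<split>/imgs."""
    split_dir = Path(root) / set_name / version / split
    for image in sorted((split_dir / "imgs").glob("*.jpg")):
        yield sample_paths(split_dir, image.stem)


# Example: paths for image 01 of the S1 v1 test subset.
example = sample_paths(Path("LumenStone") / "S1" / "v1" / "test", "01")
```

A typical call would be iter_samples(Path("LumenStone"), "S1", "v1", "train"); the function only builds paths and does not check that the companion files exist.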

Download the Dataset

Please download the required sets and versions of the LumenStone dataset using the links in the Summary section. The all-in-one version of the LumenStone dataset, including all sets and versions, is available for download here or here.

Data Usage Agreement

You are free to use the provided data in your own research work. If you intend to publish research work that uses this dataset, you must cite the references listed in the Bibliography where appropriate.

Contact

For questions about the LumenStone dataset, please contact Alexander Khvostikov: khvostikov@cs.msu.ru.

Our team

The LumenStone dataset was collected, prepared and annotated by the Laboratory of Mathematical Methods of Image Processing, Faculty of Computational Mathematics and Cybernetics, Lomonosov Moscow State University, and the Department of Geochemistry and Economics of Mineral Resources, Faculty of Geology, Lomonosov Moscow State University:

Alexander Khvostikov
khvostikov@cs.msu.ru
ORCID: 0000-0002-4217-7141
Laboratory of Mathematical Methods of Image Processing, Faculty of Computational Mathematics and Cybernetics, Lomonosov Moscow State University

Dmitry Korshunov
Dmit0korsh@gmail.com
ORCID: 0000-0002-8500-7193
Department of Geochemistry and Economics of Mineral Resources, Faculty of Geology, Lomonosov Moscow State University

Andrey Krylov
kryl@cs.msu.ru
ORCID: 0000-0001-9910-4501
Professor, Head of the Laboratory of Mathematical Methods of Image Processing, Faculty of Computational Mathematics and Cybernetics, Lomonosov Moscow State University

Mikhail Boguslavsky
mikhail@geol.msu.ru
ORCID: 0000-0002-1582-8033
Department of Geochemistry and Economics of Mineral Resources, Faculty of Geology, Lomonosov Moscow State University

Acknowledgement

In 2019, work on LumenStone was carried out with financial support from the Innovation Promotion Foundation, project UMNIK 14582GU/2019.

In 2021, work on LumenStone was carried out with financial support from the school “Brain, Cognitive Systems, Artificial Intelligence”, Lomonosov Moscow State University.

Bibliography

2020

A. Kochkarev, A. Khvostikov, D. Korshunov, A. Krylov, M. Boguslavskiy. “Data balancing method for training segmentation neural networks”. In: CEUR Workshop Proceedings, Vol. 2744. 2020, pp. 1–10.

D. M. Korshunov, A. V. Khvostikov, A. V. Kochkarev, M. A. Boguslavskiy, A. S. Krylov. “Using deep learning algorithms for segmentation and analysis of minerals in polished-section images”. In: New Insights into Ore Formation Processes: Proceedings of Young Scientists Dedicated to the 90th Anniversary of IGEM RAS. 2020, pp. 66–68. (In Russian.)