LumenStone Dataset

Description

The material was collected from 30 ore deposits across the CIS. Polished sections were prepared from the samples and examined with a Carl Zeiss AxioScope 40 microscope; images were captured with a Canon PowerShot G10 camera. The samples represent the main ore associations and are categorized by deposit genesis.

Figure: sample source image with the corresponding pixel-level mask.

The LumenStone dataset was created to bring together high-quality geological images of different mineral associations for use in various geological image analysis tasks. It comprises the following sets:

  • S1: An association of hydrothermal ore from the Berezovskoe deposit, consisting of sphalerite, pyrite, galena, bornite, tennantite-tetrahedrite group minerals, and chalcopyrite. Intended for segmentation; pixel-level masks are provided.
  • S2: An association of layered ultramafic deposits (deposits of the Norilsk group), consisting of pyrrhotite, chalcopyrite, pentlandite, and magnetite. Intended for segmentation; pixel-level masks are provided.
  • S3: A common association of high-temperature hydrothermal ores, consisting of pyrite, arsenopyrite, covelline, bornite, chalcopyrite, magnetite (ordinary and copper-bearing), and hematite (secondary and hydrothermal) (coming soon). Intended for segmentation; pixel-level masks are provided.
  • V1: A specialized set featuring the same samples as S1, imaged under varying conditions. Intended for developing and testing color adaptation methods.
  • P1: A collection of images of polished sections, designed for constructing panoramic microscopic images. Each sample is represented by 20-25 partially overlapping images.
  • P2: A collection of images of polished sections, designed for constructing large panoramic microscopic images. Each sample is represented by 36-100 partially overlapping images.
  • ICM1: A collection of images of polished sections with text-based tag annotations.

Petroscope Package

All image processing and analysis methods that our team has developed for microscopic images of polished sections are collected in the open-source Python package petroscope. It is free to use and covers geological image analysis tasks such as segmentation and color adaptation.

Versioning system

The LumenStone dataset is actively updated and expanded, so each set is versioned for convenience. Every new version of a set adds new images and also contains all images from the previous version. Image annotations may be modified and extended from version to version. All versions, both outdated and current, remain available for download.
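The guarantee that a new version contains every image of the previous one can be checked locally after downloading two versions of a set. The following is a minimal sketch, not part of the dataset tooling: the function names, the list of image extensions, and the idea of comparing by file name only (so that an image moved between folders still counts as present) are all assumptions made for illustration.

```python
from pathlib import Path


def image_names(root, extensions=(".jpg", ".jpeg", ".png")):
    """Collect the file names of all images found anywhere under `root`.

    Only the file name is kept (not the folder), which is an assumption:
    it treats an image that moved between folders as still present.
    """
    root = Path(root)
    return {
        p.name
        for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in extensions
    }


def missing_from_new_version(old_root, new_root):
    """Return sorted names of images in the old version but not in the new one."""
    return sorted(image_names(old_root) - image_names(new_root))
```

An empty result from `missing_from_new_version("S1_v1", "S1_v2")` (the folder names here are hypothetical) would indicate that the newer download is a superset of the older one, at least by file name.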

Summary

| set | ver. | number of images | task | magn. | resolution | release date | download |
|---|---|---|---|---|---|---|---|
| S1 | v1 | 59 train, 16 test | segmentation (7 classes) | x50 | 3396x2547 | 24.02.2021 | images + masks + visualizations (535 MB) |
| S1 | v2 | 64 train, 20 test | segmentation (7 classes) | x50 | 3396x2547 | 15.06.2025 | images + masks + visualizations (646 MB) |
| S2 | v1 | 23 train, 6 test | segmentation (5 classes) | x50 | 3396x2547 | 24.02.2021 | images + masks + visualizations (242 MB) |
| S2 | v2 | 37 train, 12 test | segmentation (5 classes) | x50 | 3396x2547 | 15.06.2025 | images + masks + visualizations (419 MB) |
| S3 | v1 | 27 train, 8 test | segmentation (9 classes) | x50 | 3396x2547 | 05.12.2024 | images + masks + visualizations (261 MB) |
| S3 | v2 | 33 train, 14 test + XPL rotations | segmentation (9 classes) | x50 | 3396x2547 | 30.12.2025 | images + masks + visualizations (5.2 GB) |
| V1 | v1 | 30 (10 images x 3 variations) | color adaptation (3 variations) | x50 | 3396x2547, 4272x2848 | 05.12.2024 | images (3 variations) (100 MB) |
| P1 | v1 | 765 (33 panoramas) | panorama stitching | x50 | 3396x2547 | 05.12.2024 | source images (3 GB); preprocessed images (1.7 GB); panoramas (proposed method) (0.7 GB) |
| P1 | v2 | 1552 (62 panoramas) | panorama stitching | x50 | 3396x2547 | 30.12.2025 | source images (5.9 GB) |
| P2 | v1 | 671 (8 panoramas) | panorama stitching (big panoramas) | x50 | 3396x2547 | 30.12.2025 | source images (2.6 GB) |
| ICM1 | v1 | 1590 | text description | x50 | 1132x1132 | coming soon | |
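For planning experiments it can help to see the train/test proportions of the segmentation sets at a glance. The sketch below copies the train and test counts of S1, S2 and S3 from the summary and derives the total and the test share for each version. The `SplitInfo` class and `format_row` helper are hypothetical names introduced here, and the additional XPL rotation images of S3 v2 are not counted because the summary gives no number for them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SplitInfo:
    """Train/test image counts for one version of a segmentation set."""
    name: str
    version: str
    train: int
    test: int

    @property
    def total(self):
        return self.train + self.test

    @property
    def test_share(self):
        return self.test / self.total


# Counts copied from the summary; XPL rotations of S3 v2 are not included.
SEGMENTATION_SETS = [
    SplitInfo("S1", "v1", 59, 16),
    SplitInfo("S1", "v2", 64, 20),
    SplitInfo("S2", "v1", 23, 6),
    SplitInfo("S2", "v2", 37, 12),
    SplitInfo("S3", "v1", 27, 8),
    SplitInfo("S3", "v2", 33, 14),
]


def format_row(s):
    """Render one line such as 'S1 v1: 75 images, 21.3% test'."""
    return f"{s.name} {s.version}: {s.total} images, {s.test_share:.1%} test"


if __name__ == "__main__":
    for split in SEGMENTATION_SETS:
        print(format_row(split))
```

With these counts the test share lies between roughly one fifth and one third of each set, and it grows slightly from v1 to v2 in all three sets.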

Data Usage Agreement

You are free to use the provided data in your own research. If you publish work that uses this dataset, you must cite the relevant references listed in the Bibliography.

Contact

For questions about the LumenStone dataset, please contact Alexander Khvostikov: khvostikov@cs.msu.ru.

Our team

The LumenStone dataset was collected, prepared and annotated by the Laboratory of Mathematical Methods of Image Processing, Faculty of Computational Mathematics and Cybernetics, Lomonosov Moscow State University, and the Department of Geochemistry and Economics of Mineral Resources, Faculty of Geology, Lomonosov Moscow State University:

Alexander Khvostikov
khvostikov@cs.msu.ru
ORCID: 0000-0002-4217-7141
Laboratory of Mathematical Methods of Image Processing, Faculty of Computational Mathematics and Cybernetics, Lomonosov Moscow State University

Dmitry Korshunov
Dmit0korsh@gmail.com
ORCID: 0000-0002-8500-7193
Department of Geochemistry and Economics of Mineral Resources, Faculty of Geology, Lomonosov Moscow State University

Dmitry Sorokin
dsorokin@cs.msu.ru
ORCID: 0000-0003-3299-2545
Laboratory of Mathematical Methods of Image Processing, Faculty of Computational Mathematics and Cybernetics, Lomonosov Moscow State University

Andrey Krylov
kryl@cs.msu.ru
ORCID: 0000-0001-9910-4501
Laboratory of Mathematical Methods of Image Processing, Faculty of Computational Mathematics and Cybernetics, Lomonosov Moscow State University

Mikhail Boguslavsky
mikhail@geol.msu.ru
ORCID: 0000-0002-1582-8033
Department of Geochemistry and Economics of Mineral Resources, Faculty of Geology, Lomonosov Moscow State University

Acknowledgement

In 2019, work on LumenStone was carried out with financial support from the Innovation Promotion Foundation, project UMNIK 14582GU/2019.

In 2024-2025, work on LumenStone is carried out with financial support from the Russian Science Foundation, grant 24-21-00061.

Bibliography

  • Khvostikov A., Nikolaev G., Krasnova S. et al. From pixels to panoramas: A deep learning pipeline for mineral image analysis. 2025 Fourteenth International Conference on Image Processing, Theory, Tools & Applications (IPTA). IEEE, 2025. P. 1–6.
  • Korshunov D.M., Khvostikov A.V., Nikolaev G.V., Sorokin D.V., Indychko O.I., Boguslavskii M.A., Krylov A.S. From visual diagnostics to deep learning: automatic mineral identification in polished section images. Mining Science and Technology (Russia). 2025. Vol. 10, no. 3. P. 232–244.
  • Indychko O., Korshunov D., Khvostikov A. Using uncertainty to expand training sets for mineral segmentation in geological images. The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences. 2025. P. 123–129.
  • Nikolaev G., Korshunov D., Khvostikov A. Automatic stitching of panoramas for geological images of polished sections. ISPRS Annals of Photogrammetry, Remote Sensing and Spatial Information Sciences. 2024. Vol. 48, no. W5.
  • Razzhivina D.I., Korshunov D.M., Boguslavsky M.A., Khvostikov A.V., Sorokin D.V. Registration and segmentation of PPL and XPL images of geological polished sections containing anisotropic minerals. Computational Mathematics and Modeling. 2024.
  • Indychko O.I., Khvostikov A.V., Korshunov D.M. et al. Color adaptation in images of polished sections of geological specimens. Computational Mathematics and Modeling. 2023. Vol. 33, no. 4. P. 487–500.
  • Khvostikov A.V., Korshunov D.M., Krylov A.S., Boguslavskiy M.A. Automatic identification of minerals in images of polished sections. The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences. 2021. Vol. 44. P. 113–118.