Open DM3 Images, Spectra, Spectrum-Images and Image-Stacks with pyTEMlib

Chapter 1: Introduction


part of

MSE672: Introduction to Transmission Electron Microscopy

Spring 2026
by Gerd Duscher

Microscopy Facilities
Institute of Advanced Materials & Manufacturing
Materials Science & Engineering
The University of Tennessee, Knoxville

Background and methods for the analysis and quantification of data acquired with transmission electron microscopes.


Reading a dm file and translating the data into a Python dictionary of sidpy datasets.

The data can be stored in a pyNSID-style HDF5 file to be compatible with all packages in the pycroscopy ecosystem.

Because many other packages and programs for TEM data manipulation are based on the HDF5 file format, it is relatively easy to convert back and forth between them.

Import packages for figures and

Check Installed Packages

import sys
import importlib.metadata
def test_package(package_name):
    """Test if package exists and returns version or -1"""
    try:
        version = importlib.metadata.version(package_name)
    except importlib.metadata.PackageNotFoundError:
        version = '-1'
    return version

# pyTEMlib setup ------------------
if test_package('pyTEMlib') < '0.2026.1.0':
    print('installing pyTEMlib')
    !{sys.executable} -m pip install --upgrade pyTEMlib
# ------------------------------
print('done')
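
The check above compares version strings directly, which orders them character by character, so '0.2026.10.0' would sort before '0.2026.2.0'. A minimal sketch of a numeric comparison follows. The helpers version_tuple and is_older are hypothetical names for illustration, not part of pyTEMlib, and the sketch assumes purely numeric, dot-separated version strings.

```python
def version_tuple(version):
    """Convert a dotted version string such as '0.2026.1.3' into a tuple of ints.

    Parsing stops at the first non-numeric part, so the '-1' marker for a
    missing package becomes an empty tuple, which sorts before any real version.
    """
    parts = []
    for part in version.split('.'):
        if not part.isdigit():
            break
        parts.append(int(part))
    return tuple(parts)


def is_older(installed, required):
    """Return True if the installed version is numerically older than required."""
    return version_tuple(installed) < version_tuple(required)


# Plain string comparison gets this case wrong; the tuple comparison does not.
print('0.2026.10.0' < '0.2026.2.0')            # lexicographic comparison
print(is_older('0.2026.10.0', '0.2026.2.0'))   # numeric comparison
print(is_older('-1', '0.2026.1.0'))            # package not installed
```

In the installation cell, the condition could then read is_older(test_package('pyTEMlib'), '0.2026.1.0').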

Load the plotting and figure packages

Note for Google Colab

Restart the session from the Runtime menu before continuing.

%matplotlib widget
import matplotlib.pylab as plt
import numpy as np
import sys
import os

import pyTEMlib

if 'google.colab' in sys.modules:
    from google.colab import output
    output.enable_custom_widget_manager()
    from google.colab import drive
    drive.mount("/content/drive")

# For archiving reasons it is a good idea to print the version numbers out at this point
print('pyTEM version: ',pyTEMlib.__version__)
__notebook__='CH1_04-Reading_File'
__notebook_version__='2026_01_09'
pyTEM version:  0.2026.1.3

Open a file

This function opens an HDF5 file in the pyNSID style, which enables you to keep track of your data analysis.

Please see the Installation notebook for installation.

We want to consolidate files that belong together into one dataset. For example, a spectrum-image dataset consists of:

  • Survey image

  • EELS spectra

  • Z-contrast image acquired simultaneously with the spectra

So load the top-level dataset first; in the example above, that is the survey image.

Please note that the plotting routine of matplotlib was introduced in the Matplotlib and Numpy for Micrographs notebook.

Use the file p1-3-hr3.dm3 from the example_data directory for a practice run.

# ------ Input ------- #
load_example = True
# -------------------- #

if load_example and 'google.colab' in sys.modules:
    if not os.path.exists('./AL-DFoffset0.00.dm3'):
        !wget  https://github.com/gduscher/MSE672-Introduction-to-TEM/raw/main/example_data/AL-DFoffset0.00.dm3
        !wget  https://github.com/gduscher/MSE672-Introduction-to-TEM/raw/main/example_data/p1-3-hr3.dm3
        
# Open file widget and select file which will be opened in code cell below
if not load_example:
    drive_directory = pyTEMlib.file_tools.get_last_path()
    file_widget = pyTEMlib.file_tools.FileWidget(drive_directory)
if load_example:
    file_name = '../example_data/p1-3-hr3.dm3'
    if 'google.colab' in sys.modules:
        file_name = './p1-3-hr3.dm3'
    datasets = pyTEMlib.file_tools.open_file(file_name)
    main_dataset = datasets[list(datasets.keys())[0]]
else:
    main_dataset = file_widget.selected_dataset
    datasets = file_widget.datasets

view = main_dataset.plot()
chooser = pyTEMlib.file_tools.ChooseDataset(datasets)
chosen_dataset = chooser.dataset
v = chosen_dataset.plot()

Use the selection tool to the left of the image to select a part of the image and then a part of the colorbar.

Data Structure

The data reside in a Python dictionary; the selected sidpy dataset is the one we name main_dataset.

The main_dataset has additional information stored as attributes, which can be accessed by name.

print(main_dataset)
main_dataset
print(f'size of current dataset is {main_dataset.shape}')

Among the attributes of main_dataset are two dictionaries:

  • metadata

  • original_metadata

which contain additional information about the data.

print('title: ', main_dataset.title)
print('data type: ', main_dataset.data_type)

for key in datasets:
    print(key)
    print(datasets[key].original_metadata.keys())
    
main_dataset.metadata  

Data Structure

The datasets variable is a dictionary (like a directory in a file system) which contains datasets.

Below I show how to access one of those datasets with a pull-down menu.

chooser = pyTEMlib.file_tools.ChooseDataset(datasets)
current_dataset = chooser.dataset
view = current_dataset.plot()

An important attribute of current_dataset is original_metadata, which holds all the original metadata of your file. This is usually a long list for dm3 files.

current_dataset.original_metadata.keys()

The original_metadata attribute stores all information from the original file.

No information gets lost.

for key,value in current_dataset.original_metadata.items():
    print(key, value)
print(current_dataset.h5_dataset)    
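
Printing every item of a long metadata dictionary quickly becomes hard to read. The sketch below flattens a nested dictionary into dotted keys so that individual entries are easier to find. It assumes the metadata is a nested dictionary; flatten_metadata is a hypothetical helper, not a pyTEMlib or sidpy function, and the example keys and values are invented stand-ins for a real original_metadata.

```python
def flatten_metadata(metadata, parent_key=''):
    """Flatten a nested dictionary into a single level with dotted keys."""
    flat = {}
    for key, value in metadata.items():
        full_key = f'{parent_key}.{key}' if parent_key else str(key)
        if isinstance(value, dict):
            # Recurse into sub-dictionaries and merge their flattened entries
            flat.update(flatten_metadata(value, full_key))
        else:
            flat[full_key] = value
    return flat


# Invented stand-in for current_dataset.original_metadata
example_metadata = {
    'Microscope': {'Voltage': 200000.0, 'Mode': 'IMAGING'},
    'Acquisition': {'Exposure': 0.5, 'Binning': {'x': 1, 'y': 1}},
    'Name': 'example',
}

flat = flatten_metadata(example_metadata)
for key in sorted(flat):
    print(key, flat[key])
```

With the notebook's dataset, the call would be flatten_metadata(current_dataset.original_metadata), assuming that attribute behaves like a plain dictionary.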

Any Python object provides documentation through the built-in help function.

help(current_dataset)

All attributes of a Python object can be viewed with the dir command.

As above: too much information for normal use, but it is there if needed.
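
One way to make the output of dir more manageable is to hide the names that start with an underscore. The following is a minimal sketch; public_attributes is a hypothetical helper and the Example class is only a stand-in for a dataset object.

```python
def public_attributes(obj):
    """Return the attribute names of obj that do not start with an underscore."""
    return [name for name in dir(obj) if not name.startswith('_')]


class Example:
    """Stand-in object with two public attributes (not a sidpy dataset)."""

    def __init__(self):
        self.title = 'demo'
        self.metadata = {}


names = public_attributes(Example())
print(names)
```

For the dataset in this notebook, the equivalent call would be public_attributes(current_dataset).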

Adding Data

To add another dataset that belongs to this measurement, we will use h5_add_channel from file_tools in the pyTEMlib package.

Here is how we add a channel.

We can also add a new measurement group (add_measurement in pyTEMlib) for similar datasets.

This is equivalent to making a new directory in a file structure on your computer.

datasets['Copied_of_Channel_000'] = current_dataset.copy()

We use the functions above to add the content of a (random) data file to the current file.

This is important if, for example, you want to add a Z-contrast or survey image to a spectrum image.

Therefore, these functions enable you to collect the data from different files that belong together.

datasets.keys()

Adding additional information

Similarly, we can add a whole new measurement group or a structure group.

This function will be contained in the KinsCat package of pyTEMlib.

If you loaded the example image, it contains graphite and ZnO, both viewed along the [1,1,1] zone axis.

import ase

graphite = pyTEMlib.crystal_tools.structure_by_name('Graphite')
print(graphite)
current_dataset.structures['Crystal_000'] = graphite

zinc_oxide = pyTEMlib.crystal_tools.structure_by_name('ZnO')
current_dataset.structures['ZnO'] = zinc_oxide

Keeping Track of Analysis and Results

Notebooks are notorious for getting confusing, especially if one uses different notebooks for different tasks but stores the results in the same file.

If you like a result of your calculation, log it.

Use the datasets dictionary to add an analysed and/or modified dataset. Make sure the metadata contain all the necessary information, so that you will know later what you did.

The convention in this class will be to call the dataset Log_000.

new_dataset = current_dataset.T
new_dataset.metadata = {'analysis': 'Nothing', 'name': 'Nothing'}
datasets['Log_000'] = new_dataset
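
To avoid overwriting an earlier log entry by accident, the next free key of the form Log_NNN can be generated automatically. The sketch below assumes only that datasets behaves like a Python dictionary; next_log_key is a hypothetical helper, not a pyTEMlib function, and the string values are placeholders for real datasets.

```python
def next_log_key(datasets, prefix='Log'):
    """Return the first unused key of the form '<prefix>_NNN'."""
    index = 0
    while f'{prefix}_{index:03d}' in datasets:
        index += 1
    return f'{prefix}_{index:03d}'


# Placeholder dictionary standing in for the notebook's datasets
example_datasets = {'Channel_000': 'image', 'Log_000': 'transposed image'}

key = next_log_key(example_datasets)
example_datasets[key] = 'another result'
print(key)
```

In the notebook, this would be used as datasets[next_log_key(datasets)] = new_dataset.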

An example for a log

We log the Fourier transform of the image we loaded.

First, we perform the calculation.

fft_image = current_dataset.fft().abs()
fft_image = np.log(60+fft_image)

view = fft_image.plot()

Now that we like this result, we log it.

Please note that saving only the Fourier transform array would not be enough, because we also need the scale and other metadata.

fft_image.title = 'FFT Gamma corrected'
fft_image.metadata = {'analysis': 'fft'}
datasets['Log_001'] = fft_image

view = fft_image.plot()
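
To illustrate why the scale matters, here is a minimal sketch in plain NumPy of what has to be kept alongside a log-scaled Fourier transform: the reciprocal-space axes that follow from the pixel size. The sidpy dataset carries this information for you; log_fft_with_scale, the synthetic image, the pixel size of 0.05 nm, and the offset of 1.0 inside the logarithm are all assumptions made for this illustration.

```python
import numpy as np


def log_fft_with_scale(image, pixel_size):
    """Return the log-scaled FFT magnitude and its reciprocal-space axes.

    pixel_size is the real-space sampling (for example nm per pixel);
    the returned axes are in inverse units (for example 1/nm).
    """
    # Shift the zero frequency to the center of the array
    magnitude = np.abs(np.fft.fftshift(np.fft.fft2(image)))
    log_magnitude = np.log(1.0 + magnitude)
    # Frequency axes with the same shift applied, one per image dimension
    u = np.fft.fftshift(np.fft.fftfreq(image.shape[0], d=pixel_size))
    v = np.fft.fftshift(np.fft.fftfreq(image.shape[1], d=pixel_size))
    return log_magnitude, u, v


# Synthetic image standing in for a micrograph
image = np.random.default_rng(0).random((64, 32))
log_magnitude, u, v = log_fft_with_scale(image, pixel_size=0.05)

# Keep the data, the scale, and a note on the analysis together
result = {'data': log_magnitude, 'u': u, 'v': v,
          'units': '1/nm', 'analysis': 'fft'}
print(result['data'].shape, u[1] - u[0], v[1] - v[0])
```

The spacing of each reciprocal axis is 1 / (number of pixels × pixel size), so without the pixel size the positions of spots in the Fourier transform could not be converted to spatial frequencies.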

We added quite a few datasets to our dictionary.

Let’s have a look

chooser = pyTEMlib.file_tools.ChooseDataset(datasets)
view = chooser.dataset.plot()

Save Datasets to hf5_file

Write all datasets to one h5_file, which we then close immediately.

h5_group = pyTEMlib.file_tools.save_dataset(datasets, filename='./nix.hf5')

Close the file

h5_group.file.close()

Open h5_file

Open the h5_file that we just created

datasets2= pyTEMlib.file_tools.open_file(filename='./nix.hf5')

chooser = pyTEMlib.file_tools.ChooseDataset(datasets2)

Short check if we got the data right:

We print the tree and plot the data.