# First, import the modules we'll be using later. All of this Python code was written
# by a multitude of other people and will do an enormous amount of work "behind the scenes".
# These are some of the standard packages used by most Python programs
import os, re, json, hashlib
from datetime import datetime
from pathlib import Path
# The next packages are specialized ones for math, drawing, and handling image files.
import numpy as np
import matplotlib.pyplot as plt
import tifffile as tiff
#import h5py # Used for HDF5, if we decide to actually use that.
from scipy import ndimage
from scipy.signal import fftconvolve
#Finally, Hyperspy is a package that provides algorithms specifically for microscopy
import hyperspy.api as hs
#for interactive use inside Jupyter output cells:
from ipywidgets import interact, widgets
%matplotlib widget
print("done.")
Foundations of TEM Data Management and Analysis — CITEAM 2026
A hands-on, one-day workshop · April 2026 · University of Maryland
The 2026 CITEAM workshop is geared toward cyberinfrastructure (CI) training for the community of scientists and researchers using the new generation of Transmission Electron Microscope (TEM) instruments at UMD. The training materials and associated activities are designed to lower the barrier to using the instruments, data, software tools, and computing resources at every step of your scientific inquiry — from data collection, to analysis, to storing and publishing data — encompassing end-to-end TEM Data Management.
This edition includes a keynote talk from Professor Ichiro Takeuchi, Materials Science and Engineering Department Chair at the University of Maryland, who will discuss the importance of open science practices in modern Materials Science.
The day is structured as a mix of (1) lectures introducing the materials, (2) hands-on sessions where participants work through end-to-end examples side-by-side with the workshop instructors, and (3) a site visit to the instrument itself. Instructors will hold office hours in the afternoon to address specific questions and concerns.
How this notebook is organized
This notebook is your hands-on companion for the workshop. It mirrors the flow of the day and is broken into six numbered sections. Each section follows the same shape:
- a short What you’ll learn box at the top,
- one or more TEM context callouts pointing to Prof. Salamanca’s parallel slide deck for the microscopy background,
- the actual code cells you will run, with narrative explanations in between,
- and instructor cues flagging spots where the lead instructor will pause to demo, debug, or open a discussion.
Instructor lineup
- Programming and data-handling modules (Sections 0.1.1 – 0.1.5): Erik Scott (RENCI) — will lead the Python / HyperSpy walkthroughs.
- TEM physics and microscopy context (woven through every section): Prof. Lourdes Salamanca Ribas (UMD) — running a parallel slide deck. Watch for the TEM context callouts in this notebook.
- FFT module (Section 0.1.6): Dr. Alexander Hall and Professor John Cumings (UMD) — will lead the reciprocal-space, masking, and inverse-FFT exercises at the end of the day.
- Data Management: Professor Eva M. Campo (UMD)
- Coding Assistance: Mr. Karan Vahi (USC) and Dr. Anirban Mandal (RENCI)
What you’ll learn today
By the end of this session, you will be able to:
- Open a single TIFF micrograph and a stack of TIFF frames, and inspect their basic image properties and metadata.
- Generate, save, and share thumbnails for browsing or publication.
- Read and interpret intensity histograms; apply contrast and gamma corrections.
- Apply Gaussian denoising and discuss the noise-vs-feature trade-off.
- Detect drift between frames in a stack, register the stack with phase correlation, and combine frames to improve signal-to-noise.
- Calibrate axes (pixels → nm) and measure features interactively.
- Compute a 2D FFT of a region, mask features in reciprocal space, and reconstruct a filtered image with the inverse FFT.
Why both “the hard way” and HyperSpy? Throughout the notebook we deliberately do small things twice — once with low-level NumPy / SciPy and once with HyperSpy — so you build intuition for what the libraries are doing on your behalf.
TEM glossary — quick reference
Keep this open in another tab during the workshop. It is intentionally light-touch — LSR’s slides are the authoritative source for the physics behind these terms.
| Term | What it means in this notebook |
|---|---|
| TEM | Transmission Electron Microscope. A high-energy electron beam passes through a thin specimen; the transmitted electrons form an image with sub-nanometer resolution. |
| STEM | Scanning TEM. The beam is focused to a probe and rastered across the sample; signal is collected at each scan position. |
| ADF | Annular Dark Field detector. Collects electrons scattered to high angles — sensitive to atomic number (Z-contrast). The example data here is ADF1. |
| JEM-ARM200F | The JEOL aberration-corrected (S)TEM at UMD that produced the example files. |
| Micrograph | A single image acquired by the microscope. |
| Stack / frame stack | A sequence of fast exposures of the same region. Summed (after alignment) to improve signal-to-noise. |
| Drift | Slow, mostly-monotonic motion of the sample under the beam between frames (thermal, mechanical, charging). |
| Jitter | Fast, frame-to-frame random shifts (vibration, scan instability). |
| Pixel size / scale | Real-world distance (in nm) represented by one pixel. Read from $$SM_MICRON_BAR and $$SM_MICRON_MARKER in the sidecar .txt metadata file. |
| TIFF | Tagged Image File Format. Lossless, supports rich metadata — the standard for raw scientific imaging. |
| HyperSpy | The Python library at the heart of this workshop. Originally built for EELS; now a general-purpose multi-dimensional signal toolkit for microscopy. |
| Real space | The image as you see it — intensity vs. position. |
| Reciprocal space / Fourier space | The Fourier transform of the real-space image. Periodic features (lattices) show up as discrete spots. |
| FFT | Fast Fourier Transform — the algorithm we use to move between real and reciprocal space. |
| Power spectrum | The squared magnitude of the FFT — what we visualize when we “look at the FFT”. |
TEM context — LSR: Please pitch in here with the conceptual overview of TEM vs. STEM, the ADF detector geometry, and what makes the JEM-ARM200F at UMD special. The notebook will refer back to this terminology throughout.
Section 0.1.1 — First Steps - LSR for science, ES for computer sci
What you’ll learn
- How to point Python at a TIFF micrograph and a folder of frames.
- The minimum imports and helpers for the rest of the day.
- Two ways to open the same image — the hard way (raw tifffile) and the easy way (HyperSpy).
- How a TIFF “thumbnail” gets generated and why we save them as JPEG.
TEM context — LSR: Slide pitch-in here — what is the user actually looking at? A short orientation to the JEM-ARM200F at UMD, the ADF1 detector, and the kind of specimen featured in the SiGe and TbFeO3 example datasets.
Instructor cue — ES (RENCI): This is the warm-up. Goal is to get every laptop importing the same packages and pointing at the same data. Surface the package-management story (standard library vs. PyPI), and flag that we’ll keep coming back to NumPy, Matplotlib, SciPy, and HyperSpy for the rest of the day.
We will open a single TIFF micrograph and a stack of TIFF frames, inspect basic image properties, and generate a quick JPEG thumbnail for browsing and sharing. This is the first step in virtually every microscopy workflow.
We’ll take each part of the process step by step so we can see exactly how things work.
The first step is to tell the Python interpreter that we’re going to use several packages, both from the standard Python library and from PyPI, the public repository of hundreds of thousands of open-source packages.
Pointing at our data
We’re going to refer to a handful of input files many times. We could type out those filenames everywhere, but that is annoying and error-prone. Instead, we set short variable names once. If the data moves, we update one place and everything below still works.
You may need to scroll horizontally to read the full path.
Instructor cue — ES: Bridge to the Data Management topic later in the day — specifically, the practice of encoding metadata into long, structured filenames. Also a good place to introduce named parameters in Path.mkdir() — we’ll see them everywhere in HyperSpy.
# Now, let's set some variables to the filenames where our images are stored.
# We'll be using these a LOT today, so defining them this way does two things. First, it
# saves us a lot of typing. Secondly, if we move our files or want to change which ones we're
# working with, we have to make only one change here and everything below here
# will still work as normal.
#Alex:
#SINGLE_TIFF_PATH = "/Users/alexh/Desktop/20251024/JEM-ARM200FImage_20251024_1518_44_ADF1_ImagePanel1.tif" # e.g., "data/single/micrograph_01.tif"
#STACK_FOLDER_PATH = "/Users/alexh/Desktop/20251024/JEM-ARM200FImage_20251024_1518_51_ADF1_ImagePanel1" # e.g., "data/stack_01_frames/"
#OUTPUT_DIR = "outputs_0_2"
#Erik
#SINGLE_TIFF_PATH = "/Users/escott/projects/working-CITEAM/SiGe/JEM-ARM200FImage_20251210_1036_43_ADF1_ImagePanel1/JEM-ARM200FImage_20251210_1036_46_ADF1_1_ImagePanel1.tif" # e.g., "data/single/micrograph_01.tif"
#STACK_FOLDER_PATH = "/Users/escott/projects/working-CITEAM/SiGe/JEM-ARM200FImage_20251210_1036_43_ADF1_ImagePanel1" # e.g., "data/stack_01_frames/"
OUTPUT_DIR = "outputs_0_1"
SINGLE_TIFF_PATH = "SiGe/JEM-ARM200FImage_20251210_1036_43_ADF1_ImagePanel1/JEM-ARM200FImage_20251210_1036_46_ADF1_1_ImagePanel1.tif" # e.g., "data/single/micrograph_01.tif"
STACK_FOLDER_PATH = "SiGe/JEM-ARM200FImage_20251210_1036_43_ADF1_ImagePanel1" # e.g., "data/stack_01_frames/"
# Others can go here. We'll reduce this to one set of values once we know the exact configuration of
# the lab desktops.
# Create the output directory. If it already exists, that's fine, just go on. If need be,
# create any parent directories.
Path(OUTPUT_DIR).mkdir(parents=True, exist_ok=True)
print("done.")
Reading an image — the hard way
In day-to-day use, especially on the web, we usually see three image formats: JPEG, PNG, and GIF. Professional graphics users and scientists tend to use TIFF. While there are more than two dozen ways to represent data inside a TIFF, the most common reason to use the format is so an image can be stored without losing any pixel information — what we call lossless. The pixel data may be stored uncompressed or with a lossless compression scheme; either way, what you read back is exactly what was written.
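To make "lossless" concrete, here is a minimal sketch using only NumPy and the standard-library zlib module. It is not how tifffile reads or writes TIFFs; the array is made-up data and the 8-bit squeeze simply stands in for any lossy step.

```python
import zlib
import numpy as np

# A small synthetic 16-bit "micrograph" (made-up values, not workshop data).
rng = np.random.default_rng(0)
original = rng.integers(0, 65536, size=(64, 64), dtype=np.uint16)

# Lossless compression: compress the raw bytes, then restore them.
packed = zlib.compress(original.tobytes())
restored = np.frombuffer(zlib.decompress(packed), dtype=np.uint16).reshape(original.shape)

# A lossy step, for contrast: squeeze 16-bit data into 8 bits and back.
lossy = (original // 256).astype(np.uint8).astype(np.uint16) * 256

print("lossless round trip identical:", np.array_equal(original, restored))
print("lossy round trip identical:   ", np.array_equal(original, lossy))
```

The point is that compression and loss are separate questions: the first round trip returns every pixel unchanged, the second does not.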
Among the thousands of Python packages, several handle reading and writing images. We’ll use the tifffile package to read one of our input files. This is enormously easier than parsing the TIFF byte-stream by hand.
In the next code cell we define a small function that collects a few parameters from a TIFF image and returns them as a dictionary.
# Let's load an image WITHOUT using Hyperspy to get an appreciation for just how much work Hyperspy does for us.
# We'll define a function called "image_stats()" that will collect some basic information about the image and
# return it as a dictionary. Then we'll actually load the image, call that function, and print out the results as JSON.
def image_stats(img2d):
    img = img2d.astype(float)
    return {
        "shape": list(img.shape),
        "dtype": str(img2d.dtype),
        "min": float(np.min(img)),
        "max": float(np.max(img)),
        "mean": float(np.mean(img)),
        "std": float(np.std(img)),
        "p1": float(np.percentile(img, 1)),
        "p50": float(np.percentile(img, 50)),
        "p99": float(np.percentile(img, 99))
    }
print("done.")
Now we actually open the TIFF file and print out information about it — things like the 1st, 50th, and 99th-percentile pixel values.
Instructor cue — ES: More named parameters. They’re going to be all over HyperSpy later today.
img = tiff.imread(SINGLE_TIFF_PATH)
img = img.astype(float)
print("Single image stats:", json.dumps(image_stats(img), indent=2))
Displaying the image
It’s almost always a good idea to display an image at various points in your workflow — if for no other reason than to make sure your code is doing something that plausibly could be what you intended.
Instructor cue — ES: cmap='gray' is worth pausing on. Why grayscale for raw micrographs? When would we deliberately switch to a perceptually-uniform colormap?
# To see the results of our handiwork so far, let's plot that image into our notebook here.
plt.figure(figsize=(5,5))
plt.imshow(img, cmap='gray')
plt.title("0.1.1 Single TIFF")
plt.axis("off")
plt.show()
Making and saving a thumbnail
Just for fun (and because it’s useful) let’s make a thumbnail image and save it. In fact, let’s save it as a JPEG. JPEGs are compressed — sometimes highly compressed — and basically unsuitable for doing science with. On the other hand, they are one of the standard formats for the web, and producing them this way for publication or sharing is routine.
We’ll define save_jpeg_thumbnail() and use it on our image.
Instructor cue — ES:
- Why programmers write function names with () at the end.
- Step through this one with the debugger! The Python debugger inside JupyterLab is challenging — that is itself worth showing.
- Notice where print() output ends up.
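Before reading the thumbnail function, it may help to see the size arithmetic on its own. The sketch below mirrors the "keep every step-th pixel" logic used by save_jpeg_thumbnail(); the helper name thumbnail_shape and the example image sizes are made up for illustration.

```python
import numpy as np

def thumbnail_shape(height, width, max_size=512):
    """Step size and resulting shape when keeping every `step`-th pixel."""
    scale = max(height, width) / max_size
    step = int(np.ceil(scale)) if scale > 1 else 1
    # Slicing a dummy array is the simplest way to get the exact result.
    dummy = np.zeros((height, width), dtype=np.uint8)
    return step, dummy[::step, ::step].shape

for h, w in [(2048, 2048), (1000, 1500), (300, 400)]:
    step, shape = thumbnail_shape(h, w)
    print(f"{h}x{w}: keep every {step} pixel(s) -> {shape}")
```

Because the step is rounded up to a whole number, the thumbnail can end up smaller than max_size. Skipping pixels also does no averaging, so fine periodic detail can alias; that is usually acceptable for a browsing thumbnail.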
# And finally, for this section, create a "thumbnail" image and save it as a JPEG (instead of a TIFF file).
# This will create a function called "save_jpeg_thumbnail()" to do the actual work.
def save_jpeg_thumbnail(img2d, out_path, p_low=1, p_high=99, max_size=512):
    img = img2d.astype(float)
    # Stretch the contrast between the chosen percentiles and clip to the range 0..1
    lo, hi = np.percentile(img, [p_low, p_high])
    img = np.clip((img - lo) / (hi - lo + 1e-12), 0, 1)
    # If the image is larger than max_size, keep only every "step"-th pixel in each direction
    H, W = img.shape
    scale = max(H, W) / max_size
    if scale > 1:
        step = int(np.ceil(scale))
        img = img[::step, ::step]
    plt.imsave(out_path, img, cmap='gray')
thumb_path = os.path.join(OUTPUT_DIR, Path(SINGLE_TIFF_PATH).stem + "_thumb.jpg")
save_jpeg_thumbnail(img, thumb_path)
print("Saved thumbnail:", thumb_path)
Doing it the easy way — HyperSpy
That was a decent amount of work to open a file, plot it, and save a thumbnail. Fortunately, the HyperSpy package — which we will lean on heavily today — collapses most of it into a single line. It originated as a tool for EELS analysis but has grown into a general-purpose package for microscopy.
# Now, let's do this again using Hyperspy. Hyperspy knows how to deal with multiple input file types,
# including HDF5 and the Digital Micrograph file format, and can read literally dozens of image file formats.
print("=========")
print("Loading an image using Hyperspy")
s = hs.load(SINGLE_TIFF_PATH)
print("loaded successfully!")
Plotting with HyperSpy
HyperSpy provides its own plot() function with a number of useful options. Two we’ll use repeatedly are cmap (color map) and colorbar (intensity scale on/off).
Instructor cue — ES:
- Look at the variable s in the debugger and see what’s going on (spoiler: it’s a list, because TEM Center / Digital Micrograph appended a thumbnail).
- Walk through the controls on the top-left corner of the image — zoom, zoom history (back / forward), and the home button to reset.
#We can play with cmap settings available in hyperspy, one particularly useful one is 'plasma'
s[0].plot(cmap='plasma')
We also do not typically label the axes when looking at real-space images. We can suppress them with axes_ticks (boolean) and remove the intensity scale with colorbar.
# We also do not typically label our axes when looking at real-space images; we can turn that off with the axes_ticks boolean variable (True/False).
# While we are at it, we can remove the intensity scale using the colorbar boolean
s[0].plot(cmap='plasma', axes_ticks=False, colorbar=False)
TIFF files, TEM Center, and a closer look
We won’t take a deep dive into the TIFF format and all the ways it has been used (and abused!) over the years, but we’ll peek inside one of our files and see some of the metadata stored there. TIFF allows a rich collection of metadata: some of it is universal (image dimensions, pixel format) and some domain-specific. Microscopy files carry information about how the microscope was set up. Geographic files carry latitude and longitude. You have to know a little about where your image comes from to make sense of its metadata.
Instructor cue — ES: This is the spot to haul out tiffinfo, show what is actually in the file, and observe that a sidecar .txt file with metadata also exists. As a bonus, look at the contents of that .txt file — especially the $$SM_MICRON_BAR and $$SM_MICRON_MARKER fields. We’ll use them later to calibrate the pixel scale.
# When saving in TEM Center, a thumbnail image is also saved
s[1].plot(colorbar = False)
Take a look at the metadata that HyperSpy is managing for us:
print(s[0].metadata)
print("===============")
print(s[0].original_metadata)
Section 0.1.2 — Histograms and Contrast - LSR and ES
What you’ll learn
- How to read the intensity histogram of a micrograph.
- Why the choice of histogram binning matters.
- How to clip extreme intensity tails (1st / 99th percentile) for display.
- What gamma correction does, and when to use it for display (never for quantitative analysis).
TEM context — LSR: Slide pitch-in here on detector response and dynamic range — why a linear detector plus nonlinear human vision means we need contrast tools, and what a saturated pixel means physically on an ADF detector.
In this section we plot histograms, inspect intensity distributions, and apply contrast adjustment plus optional gamma correction. We’re looking for areas of interest and/or concern in our images — correcting for nonlinear response (in both the instrument and human vision) and detecting pixels that are “down in the noise” or “full-scale high”.
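One quick way to look for "full-scale high" or "down in the noise" pixels is to count how many sit at the extremes of the integer type. The sketch below assumes an integer-valued image and assumes saturation shows up at the dtype's maximum; a real detector may clip at a different level. The helper name count_extremes and the tiny test frame are made up for illustration.

```python
import numpy as np

def count_extremes(img):
    """Count pixels sitting at the lowest and highest values the dtype allows."""
    info = np.iinfo(img.dtype)          # only meaningful for integer images
    at_floor = int(np.count_nonzero(img == info.min))
    at_ceiling = int(np.count_nonzero(img == info.max))
    return at_floor, at_ceiling

# Made-up 16-bit frame with three saturated pixels and one at zero.
frame = np.full((4, 4), 1200, dtype=np.uint16)
frame[0, :3] = 65535
frame[3, 3] = 0
print(count_extremes(frame))   # (1, 3)
```

A nonzero ceiling count is a hint to check the acquisition settings before trusting intensities in that region.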
We’ll use HyperSpy’s plot_histograms() and experiment with a few binning parameters.
# First, we'll just take the defaults
hs.plot.plot_histograms(s[0])
Different binning strategies
The choice of histogram bins changes what features pop out. Below we try the square-root rule (the one Excel uses, by the way) and Knuth’s rule.
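To get a feel for how much the rule matters, the sketch below asks NumPy how many bins several rules would pick for the same data. The intensities are synthetic. NumPy does not offer Knuth's rule, so it is left out here; HyperSpy's own bin selection may also differ in detail from NumPy's.

```python
import numpy as np

rng = np.random.default_rng(1)
pixels = rng.normal(loc=1000.0, scale=50.0, size=10_000)   # synthetic intensities

bin_counts = {}
for rule in ["sqrt", "sturges", "scott", "fd"]:
    edges = np.histogram_bin_edges(pixels, bins=rule)
    bin_counts[rule] = len(edges) - 1
    print(f"{rule:8s} -> {bin_counts[rule]} bins")
```

The square-root and Sturges rules depend only on the number of samples, while Scott and Freedman-Diaconis ("fd") also look at the spread of the data, which is why they can give quite different answers.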
# Let's try some different options for binning.
# We could try the square root method (the one Excel uses, BTW)
hs.plot.plot_histograms(s[0], legend=["sqr root binning"], bins="sqrt", max_num_bins=2048)
# Or Knuth's algorithm...
hs.plot.plot_histograms(s[0], legend=["Knuth's binning"], bins="knuth", max_num_bins=2048)
HyperSpy’s plotting again, with intent
HyperSpy has its own plotting functions. Let’s plot the sample image with all defaults — and then, in the next cell, deliberately modify the image data and watch the histogram change.
Instructor cue — ES: These steps are fairly trivial and we have time. We’ve treated NumPy in a casual way so far — this is a good place to backfill: ndarray vs. Python list, dtypes (float vs. uint16), broadcasting, and why np.percentile doesn’t loop in Python.
Pop quiz from the original code: do you remember how to set the colormap?
# There are many more binning algorithms available - see https://hyperspy.org/hyperspy-doc/current/reference/api.plot/index.html
# Pop Quiz: do you remember how to set the colormap?
s[0].plot()
Now: clip to the 1st / 99th percentile, visualize the histogram of the clipped image, then apply a gamma correction (gamma = 0.7) and look once more.
Try it: change gamma to 0.25, 0.5, 1.0, 1.5, and 2.0 and re-run. Why might it be a good idea to label every plot you produce?
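Gamma correction is just a power law applied to intensities that have already been scaled to the range 0 to 1. A tiny sketch on five made-up gray levels shows the effect before we apply it to a whole image:

```python
import numpy as np

levels = np.array([0.0, 0.25, 0.5, 0.75, 1.0])   # normalized intensities

def apply_gamma(values, gamma):
    """Raise normalized intensities to the power gamma."""
    return np.clip(values ** gamma, 0, 1)

for gamma in [0.5, 1.0, 2.0]:
    print(f"gamma={gamma}:", np.round(apply_gamma(levels, gamma), 3))
```

Black (0) and white (1) never move. A gamma below 1 lifts the mid-tones, a gamma above 1 pushes them down, and gamma = 1 leaves the image unchanged.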
p_low, p_high = 1, 99
lo, hi = np.percentile(s[0].data, [p_low, p_high])
print("lo =", lo, " hi =", hi)
img_cs = np.clip((s[0].data - lo) / (hi - lo + 1e-12), 0, 1)
s[0].data = img_cs
hs.plot.plot_histograms(s[0], legend=["auto binning, clipped"], bins=16)
# Finally, let's adjust the contrast of the image by changing the image's "gamma". This is a mapping from the
# linearity of the detector to the non-linearities of the screen we are looking at combined with the
# non-linearities of human vision.
gamma = 0.7 # try 0.25, 0.5, 1.0, 1.5, and 2
img_out = np.clip(img_cs ** gamma, 0, 1)
s[0].data = img_out
hs.plot.plot_histograms(s[0], legend=["auto, clipped, gamma"], bins=16)
s[0].plot() # why might it be a good idea to label all of your plots?
Section 0.1.3 — Noise Correction (Gaussian Blur) - LSR and ES
What you’ll learn
- What a Gaussian filter is and how it works on an image.
- How to apply one with SciPy ndimage, and how varying sigma trades noise reduction against feature preservation.
- An alternative, decomposition-based route to noise reduction via HyperSpy’s decomposition().
- When denoising is appropriate (display, presentation) and when it is not (quantitative analysis).
TEM context — LSR: Slide pitch-in here — sources of noise in (S)TEM imaging (shot noise, readout noise, scan jitter, drift), and the trade-offs of denoising before measurement vs. before publication.
Instructor cue — UMD folk / LSR: What even IS a Gaussian filter? — this is a natural place for a short slide pause to introduce the convolution / kernel idea before we run the code.
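As a companion to that slide pause, here is a minimal NumPy-only sketch of the kernel idea: build a 1D Gaussian, then convolve along rows and along columns. The truncate=4.0 cutoff and the mirrored edges are choices made here to resemble SciPy's defaults; treat any match with ndimage.gaussian_filter as approximate, and the function names as illustrative.

```python
import numpy as np

def gaussian_kernel_1d(sigma, truncate=4.0):
    """Normalized 1D Gaussian weights out to `truncate` standard deviations."""
    radius = int(truncate * sigma + 0.5)
    x = np.arange(-radius, radius + 1)
    weights = np.exp(-0.5 * (x / sigma) ** 2)
    return weights / weights.sum()

def gaussian_blur(img, sigma):
    """Blur rows, then columns; a 2D Gaussian separates into two 1D passes."""
    k = gaussian_kernel_1d(sigma)
    pad = len(k) // 2
    # Mirror the image at its edges so border pixels have neighbors to average.
    padded = np.pad(img.astype(float), pad, mode="symmetric")
    rows = np.apply_along_axis(np.convolve, 1, padded, k, mode="same")
    both = np.apply_along_axis(np.convolve, 0, rows, k, mode="same")
    return both[pad:pad + img.shape[0], pad:pad + img.shape[1]]

# A single bright pixel spreads into a small Gaussian blob.
impulse = np.zeros((21, 21))
impulse[10, 10] = 1.0
blob = gaussian_blur(impulse, sigma=1.0)
print(np.round(blob[10, 8:13], 4))
```

Each output pixel is a weighted average of its neighborhood, with weights falling off with distance. A larger sigma widens the neighborhood, which suppresses more noise and also smears real features.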
Instructor cue — ES: Six teaching beats hide inside this one cell:
1. deepcopy vs. copy vs. =.
2. Callback to the NumPy discussion above.
3. Semicolon syntax (...; plt.imshow(...); ...) and why we usually avoid it.
4. for loops over lists and sets — “pythonic” is (mostly) “functional”.
5. SciPy is largely a layer over NumPy.
6. Abstraction is our main tool in computing.
# We've messed with the data in "s", specifically s[0], quite a lot. Depending on what we
# might have done during the break, there's a good chance it's a mess. So let's re-load it!
s = hs.load(SINGLE_TIFF_PATH)
import copy
img = copy.deepcopy(s[0].data)
# We'll clip as before to handle the extreme ends of the range
lo, hi = np.percentile(img, [1, 99])
vis = np.clip((img - lo) / (hi - lo + 1e-12), 0, 1)
# Now we'll use SciPy's "ndimage" package to apply a Gaussian filter to the underlying data
sigma = 0.5 # try 0.5, 1.0, 2.0
dn = ndimage.gaussian_filter(vis, sigma=sigma)
plt.figure(figsize=(5,5)); plt.imshow(vis); plt.title("0.1.3 Before"); plt.axis("off"); plt.show()
plt.figure(figsize=(5,5)); plt.imshow(dn); plt.title(f"0.1.3 After Gaussian (sigma={sigma})"); plt.axis("off"); plt.show()
# Let's try successively larger values of sigma - notice the list syntax for for loops
for sigma in [0.25, 0.5, 1.0, 2.0]:
    dn = ndimage.gaussian_filter(vis, sigma=sigma)
    plt.figure(figsize=(5,5)); plt.imshow(dn); plt.title(f"0.1.3 After Gaussian (sigma={sigma})"); plt.axis("off"); plt.show()
Now, the easier way — with HyperSpy
We can implement noise-filtering algorithms ourselves, or we can let HyperSpy do the heavy lifting. Below we use HyperSpy’s decomposition() (taking all defaults — quite a lot is going on under the hood; ask ES to flag what’s worth digging into).
# We can implement the noise filtering algorithms of our choice, or we can
# let Hyperspy do the hard work for us...
# start with a clean copy again:
tiffFilesWildcard = STACK_FOLDER_PATH + "/*.tif"
print("loading all the files matching", tiffFilesWildcard)
imgStack = hs.load(tiffFilesWildcard, stack=True)
imgStack[0].change_dtype('float')
imgStack[0].decomposition() # taking ALL the defaults - a lot to understand here.
imgStack[0].plot()
Section 0.1.4 — Drift Correction on a Stack - LSR and ES
What you’ll learn
- Why drift correction is necessary in the first place.
- How to load a stack of frames with HyperSpy.
- How to estimate, and then correct, the inter-frame shift using phase correlation.
- Why summing aligned frames gives a sharper result than summing raw frames.
- How to set physical units on the axes (pixels → nm) using the sidecar .txt metadata.
TEM context — LSR:
Loose analogy: amateur astronomers stack many short exposures to image the International Space Station, combining motion-blurred frames into one sharper composite (see the BBC Sky at Night write-up). The math we use here is the same family.
Instructor cue — ES??: Take a look at imgStack in the debugger and walk through what it consists of — it’s the first time we see HyperSpy’s Signal type acting on a stack rather than a single image.
Loading the stack
The first requirement of drift correction is to have more than one image. So far we’ve worked with a single frame, but the example data ships with a directory of ten of them. We re-point our path variables at the TbFeO3 stack and load every TIFF in the folder with one HyperSpy call.
# The first requirement of drift correction is to have more than one image.
# So far we've just been working with the one, but we have a directory with ten of them.
# Let's use them. It's mildly tedious to do this by hand, but again Hyperspy makes life easy.
SINGLE_TIFF_PATH = "TbFeO3/JEM-ARM200FImage_20251024_1518_51_ADF1_ImagePanel1/JEM-ARM200FImage_20251024_1518_54_ADF1_1_ImagePanel1.tif" # e.g., "data/single/micrograph_01.tif"
STACK_FOLDER_PATH = "TbFeO3/JEM-ARM200FImage_20251024_1518_51_ADF1_ImagePanel1" # e.g., "data/stack_01_frames/"
SINGLE_TIFF_METADATA = "TbFeO3/JEM-ARM200FImage_20251024_1518_51_ADF1_ImagePanel1/JEM-ARM200FImage_20251024_1518_54_ADF1_1_ImagePanel1.txt" # e.g., "data/single/micrograph_01.txt"
tiffFilesWildcard = STACK_FOLDER_PATH + "/*.tif"
print("loading all the files matching", tiffFilesWildcard)
imgStack = hs.load(tiffFilesWildcard, stack=True)
#print(imgStack[0])
#imgStack[0]
print('The type of "imgStack" is', type(imgStack))
print('The length of that list (aka "array") is', len(imgStack))
print('A compact representation of "imgStack" is', imgStack[0])
print("=================")
print("The basic metadata for the first image is:")
print("=================")
print(imgStack[0].metadata)
Use the slider to flip through the frames in the stack:
# Here you can look at any image in the stack using the slider
imgStack[0].plot(navigator='slider')
Why bother stacking?
We acquire multiple fast exposures and sum them to improve signal-to-noise. plot() will normalize, so what we’ll really see is the average.
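The signal-to-noise benefit is easy to check on synthetic data. The sketch below assumes independent Gaussian noise in each frame and no drift, which real stacks only approximate; the "feature" is a made-up bright square.

```python
import numpy as np

rng = np.random.default_rng(42)
truth = np.zeros((64, 64))
truth[24:40, 24:40] = 1.0                      # a made-up bright square "feature"

n_frames = 10
noise_sigma = 1.0
frames = truth + rng.normal(0.0, noise_sigma, size=(n_frames, 64, 64))

single_error = np.std(frames[0] - truth)            # noise in one frame
mean_error = np.std(frames.mean(axis=0) - truth)    # noise after averaging

print(f"one frame:  {single_error:.3f}")
print(f"{n_frames} frames: {mean_error:.3f}  (expect about {noise_sigma / np.sqrt(n_frames):.3f})")
```

Averaging N frames with independent noise reduces the noise by roughly the square root of N, so ten frames buy a little more than a factor of three. Summing instead of averaging changes the overall scale but not the signal-to-noise ratio.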
# What's the purpose of the image stack? We can acquire multiple fast exposures and sum them to improve signal-to-noise.
# plot() will normalize, so we get an average really.
summed_image = imgStack[0].sum()
summed_image.plot()
hs.plot.plot_histograms(summed_image,bins='scott')
Signal looks better. But should we assume each frame is perfectly aligned with the next? (Spoiler: no, drift is the rule, not the exception.) Let’s ask HyperSpy to estimate the shift between frames.
# Look! We improved the signal!
# But should we assume each image in the stack is perfectly aligned with the next? (No drift?)
imgStack[0].estimate_shift2D()
Aligning the stack
Based on the output above, that’s not a safe assumption. Let’s align the frames using a phase-correlation method.
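For intuition about what phase correlation does, here is a bare-bones NumPy sketch that recovers a known, whole-pixel shift between two synthetic frames. It is not HyperSpy's implementation, which adds its own filtering and options; the function name and the test drift of (5, -3) pixels are made up for illustration.

```python
import numpy as np

def phase_correlation_shift(reference, moved):
    """Estimate the integer (row, col) shift that maps `reference` onto `moved`."""
    F_ref = np.fft.fft2(reference)
    F_mov = np.fft.fft2(moved)
    cross_power = F_mov * np.conj(F_ref)
    cross_power /= np.abs(cross_power) + 1e-12   # keep the phase, drop the magnitude
    correlation = np.fft.ifft2(cross_power).real
    peak = np.unravel_index(np.argmax(correlation), correlation.shape)
    # Peaks past the halfway point correspond to negative shifts (FFT wrap-around).
    return tuple(int(p - n) if p > n // 2 else int(p)
                 for p, n in zip(peak, correlation.shape))

rng = np.random.default_rng(7)
reference = rng.random((128, 128))
moved = np.roll(reference, shift=(5, -3), axis=(0, 1))   # known, made-up drift
print(phase_correlation_shift(reference, moved))          # (5, -3)
```

A shift in real space becomes a pure phase ramp in Fourier space, so after normalizing away the magnitudes the inverse transform collapses to a single sharp peak at the shift. Real frames do not wrap around at the edges and contain noise, so the peak is broader and windowing or filtering usually helps.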
# Based on the above output, it doesn't seem like a safe assumption...
# Let's try to align them using a phase correlation method
# Link: https://en.wikipedia.org/wiki/Phase_correlation
imgStack[0].align2D()
Now check how much estimated shift remains after alignment:
#Now see how much estimated shift remains:
imgStack[0].estimate_shift2D()
Sum the aligned stack
We have a stack of frames that are now individually aligned to each other, but they’re still independent images until we combine them. Sum them up.
# We have a stack of independent images, aligned to each other, but they're still independent images
# until we combine them.
summed_image_aligned = imgStack[0].sum()
summed_image_aligned.plot()
hs.plot.plot_histograms(summed_image_aligned,bins='scott')
Plot both side-by-side — it should be obvious which one is sharper.
# By plotting both together, it should be easy to determine which is sharper
hs.plot.plot_images([summed_image,summed_image_aligned])
Setting physical units on the axes
Set units and a scale on the axes so the image carries real-world distance information rather than pixel indices.
# set units and scale for the axes
summed_image_aligned.axes_manager.gui()
summed_image_aligned.plot()
Smoothing for presentation (not analysis)
For presentation, we sometimes want to smooth the summed image — e.g., with a Gaussian filter. This is acceptable for slides and figures. It is not acceptable as a step before quantitative analysis.
# For presentation we sometimes want to "smooth" an image, such as with a gaussian filter
# Note that this can be helpful for presentations, it should not be used for analysis!
from scipy.ndimage import gaussian_filter
s_out = summed_image_aligned.map(gaussian_filter, sigma=1.1, inplace=False)
s_out.plot()
hs.plot.plot_histograms(s_out, bins = 'scott')
Other denoising algorithms via HyperSpy
HyperSpy ships several denoising algorithms beyond a simple Gaussian filter. Here we re-load a fresh stack, convert it to floating-point so the math is well-defined, and run decomposition() with all defaults. Plenty to unpack under the hood — ask ES to flag the bits worth a deeper dive.
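To hint at what a decomposition can do for noise, here is a NumPy-only sketch of the underlying idea: unfold the stack into a (frames × pixels) matrix, take an SVD, and rebuild the data from the strongest component only. This illustrates the principle on made-up data and is not a reproduction of HyperSpy's internals. In HyperSpy, as we understand it, the denoised signal is rebuilt after decomposition() with get_decomposition_model(), and the number of components is chosen by inspecting the results.

```python
import numpy as np

rng = np.random.default_rng(3)
n_frames, h, w = 10, 32, 32

# Made-up stack: one fixed pattern whose brightness varies per frame, plus noise.
pattern = np.sin(np.linspace(0, 4 * np.pi, h))[:, None] * np.ones((1, w))
brightness = np.linspace(0.8, 1.2, n_frames)
clean = brightness[:, None, None] * pattern
noisy = clean + rng.normal(0.0, 0.5, size=clean.shape)

# Unfold to (frames, pixels), decompose, and keep only the strongest component.
matrix = noisy.reshape(n_frames, -1)
U, S, Vt = np.linalg.svd(matrix, full_matrices=False)
k = 1   # number of components kept; choosing this is the hard part in practice
denoised = ((U[:, :k] * S[:k]) @ Vt[:k]).reshape(noisy.shape)

rms_before = np.sqrt(np.mean((noisy - clean) ** 2))
rms_after = np.sqrt(np.mean((denoised - clean) ** 2))
print("RMS error before:", rms_before)
print("RMS error after: ", rms_after)
```

The approach works here because the true signal is shared across frames while the noise is not. Keeping too few components discards real signal, and keeping too many lets the noise back in.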
# Noise reduction via HyperSpy's various algorithms, not just a gaussian filter
# Get a fresh copy of the original stack
freshStack = hs.load(tiffFilesWildcard, stack=True)
# it's currently full of 16 bit integers (0 to 65535). We need floating-point values for doing actual math. So we convert:
freshStack[0].change_dtype('float64')
freshStack[0].decomposition() # taking ALL the defaults - a lot to understand here.
freshStack[0].plot()
imgStack[0].axes_manager.gui()
imgStack[0].plot()
imgStack[0].calibrate(interactive=True)
Reading the scale from the sidecar .txt metadata
The GUI is fine for one-off use, but for any kind of repeatable workflow we want to read the scale information directly from the sidecar .txt file that the microscope writes alongside each TIFF, and then use those numbers to set the HyperSpy axis scales.
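The arithmetic behind the scale is simply "physical length of the marker divided by its length in pixels". The sketch below is a slightly more defensive variant that keeps track of the unit suffix. The value format ("10nm"), the unit table, and the helper names are assumptions for illustration, not a specification of the sidecar file; the example numbers are chosen only because they reproduce the 0.0390625 nm per pixel used elsewhere in this notebook.

```python
import re

# Conversion factors to nanometers for the unit suffixes this sketch recognizes.
TO_NM = {"nm": 1.0, "um": 1000.0}

def parse_marker(text):
    """Split a value such as '10nm' into number and unit; return nanometers."""
    match = re.fullmatch(r"\s*([0-9]*\.?[0-9]+)\s*([A-Za-z]+)\s*", text)
    if match is None:
        raise ValueError(f"cannot parse marker value: {text!r}")
    value, unit = float(match.group(1)), match.group(2)
    if unit not in TO_NM:
        raise ValueError(f"unrecognized unit: {unit!r}")
    return value * TO_NM[unit]

def nm_per_pixel(marker_text, bar_pixels):
    """Scale = physical length of the marker divided by its length in pixels."""
    return parse_marker(marker_text) / float(bar_pixels)

print(nm_per_pixel("10nm", 256))   # 0.0390625
```

Raising an error on an unknown unit is a deliberate choice: a silently wrong scale is much harder to notice later than a failed cell.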
#Alternative: read the metadata from the text file
def string_to_float_iterative(s):
    # Keep only the digits and the decimal point. Any unit suffix is discarded,
    # so the "nm" label we set below assumes the marker value is given in nm.
    numeric_string = ''.join([c for c in s if c.isdigit() or c == '.'])
    return float(numeric_string)
print(SINGLE_TIFF_METADATA)
with open(SINGLE_TIFF_METADATA) as f:
    for line in f:
        lineList = line.split()
        if len(lineList)==0:
            break   # stop reading at the first blank line
        if lineList[0] == "$$SM_MICRON_BAR":
            smMicronBar = float(lineList[1])
        if lineList[0] == "$$SM_MICRON_MARKER":
            smMicronMarker = string_to_float_iterative(lineList[1])
try:
    scale=smMicronMarker/smMicronBar
except (NameError, ZeroDivisionError):
    # NameError: one of the two fields was never found. ZeroDivisionError: the bar length was 0.
    print("A METADATA PROBLEM HAPPENED!")
    scale=-1000000.0
print ("Metadata text file suggests scale =", scale)
imgStack[0].axes_manager[1].name = "Distance"
imgStack[0].axes_manager[1].units = "nm"
imgStack[0].axes_manager[1].scale = scale
imgStack[0].axes_manager[1].offset = 0
imgStack[0].axes_manager[2].name = "Distance"
imgStack[0].axes_manager[2].units = "nm"
imgStack[0].axes_manager[2].scale = scale
imgStack[0].axes_manager[2].offset = 0
imgStack[0].plot()
print(imgStack[0].calibrate(interactive=True))
Section 0.1.5 — Measure Distances (pixels → nm) Interactively - LSR and ES
What you’ll learn
- How to set the HyperSpy axes_manager so that distances on screen are reported in nanometers, not pixels.
- How to use HyperSpy’s interactive calibrate() tool to measure features directly on the image.
TEM context — LSR: Slide pitch-in here on what we are actually measuring: lattice spacings, step heights, defect sizes — and how the scale bar in the file metadata gets set in the first place. Useful place to remind students that sub-nanometer precision in display depends on sub-nanometer precision in calibration.
There are several ways to measure features on a micrograph. We could measure in pixels and convert later, or we can tell HyperSpy our image scale once and have it report measurements directly in our chosen units. We’ll do the latter.
Instructor cue — ES: Run the cell below and three buttons appear underneath: Axis 1, Axis 2, Axis 3. Axis 1 is just the number of images in the stack (10). Axis 2 is width and Axis 3 is height. Click each of the last two, set units to nm, and the scale to 0.0390625 (~25.6 px/nm).
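The conversion that the axes manager applies for us is simple enough to check by hand. The sketch below uses the scale quoted in this section and assumes square pixels with the same scale on both axes; the helper name and pixel coordinates are made up.

```python
import math

NM_PER_PIXEL = 0.0390625          # scale quoted in this section

def distance_nm(p0, p1, scale=NM_PER_PIXEL):
    """Straight-line distance in nm between two (x, y) pixel coordinates."""
    dx = (p1[0] - p0[0]) * scale
    dy = (p1[1] - p0[1]) * scale
    return math.hypot(dx, dy)

print(1 / NM_PER_PIXEL)                       # pixels per nanometer
print(distance_nm((100, 200), (164, 248)))    # 64 px across and 48 px down
```

The two points are 80 pixels apart (a 64-48-80 right triangle), which at this scale is 3.125 nm. Any error in the scale carries straight through to every measurement.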
# First, we need to open the Hyperspy "axes manager" and enter the values for the scale.
# Run this code cell, and three clickable buttons will appear below: "Axis 1", "Axis 2", and "Axis 3".
# Axis 1 is just the number of images in the stack - 10. Axis 2 is the width axis and axis 3 is the height axis.
# Click on each of those two, enter "nm" for your units, and .0390625000 for the scale. In other words,
# each pixel represents .039-ish nanometers, or 25.6 pixels per nanometer
imgStack[0].axes_manager.gui()
While the GUI works for one-off use, it is easier (and more reproducible) to read the scale from the sidecar .txt files. We did exactly that in the previous section.
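One way to keep the scripted route tidy is to wrap the repeated axis assignments in a small helper that also refuses an implausible scale. The sketch below is an assumption, not part of the workshop code: `apply_pixel_scale` is a hypothetical name, and the stand-in axis objects exist only so the example runs without HyperSpy. With a real signal you would pass the two image axes, for example `imgStack[0].axes_manager.signal_axes`.

```python
from types import SimpleNamespace


def apply_pixel_scale(axes, scale_nm_per_px, name="Distance"):
    """Set name, units, scale and offset on each axis-like object.

    Hypothetical helper. A non-positive (or NaN) scale is rejected, so a
    fallback value such as -1000000.0 never ends up as a calibration.
    """
    if not scale_nm_per_px > 0:
        raise ValueError(f"Implausible scale: {scale_nm_per_px!r} nm/px")
    for axis in axes:
        axis.name = name
        axis.units = "nm"
        axis.scale = scale_nm_per_px
        axis.offset = 0
    return axes


# Stand-in objects so the sketch runs on its own; they only mimic the
# attributes that the helper sets.
width_axis, height_axis = SimpleNamespace(), SimpleNamespace()
apply_pixel_scale([width_axis, height_axis], 0.0390625)
print(width_axis.units, width_axis.scale, height_axis.scale)
```

Failing loudly on a bad scale is a design choice: a wrong calibration silently propagates into every later measurement, while an exception stops the notebook at the cell that needs attention.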
# Now we can plot the data. Take a look at the scale mark in the bottom left corner of the image.
imgStack[0].plot()
Once the axes have a real-world scale assigned, we can use the calibrate() tool to measure distances between features directly on the image:
# We don't have to hold a ruler up to the screen. HyperSpy provides a handy "calibrate()" function that lets us
# draw a line, and it reports the length in our chosen scale units. (If we hadn't set the axes scales above, this
# would just return the number of pixels and we would have to do something sensible with it.)
imgStack[0].plot()
imgStack[0].calibrate(interactive=True)
Section 0.1.6 — FFT Analysis - LSR and AH
What you’ll learn
- How to extract a 1D line profile through a region of interest.
- How to define a rectangular ROI and compute its 2D FFT.
- What the power spectrum of a periodic image looks like.
- How to mask features in reciprocal space using polygon ROIs and reconstruct a filtered image with the inverse FFT.
TEM context — LSR: This is the heart of the day from a microscopy standpoint — please pitch in with the slide on reciprocal space, lattice spacing, and what the FFT of a real-space lattice tells us. The FFT is not just a “fancy filter”: for crystalline samples, the spots in reciprocal space are the measurement, and masking + inverse FFT lets us isolate which periodicities contribute to the visible image.
Instructor cue — AH & JC: Suggested beats:
- The 1D line profile (Line2DROI) as a sanity check on real-space periodicity before we ever transform.
- The rectangular ROI → FFT workflow, including why we use apodization=True (windowing kills edge artifacts) and shift=True (puts DC in the center of the plot).
- The polygon-mask demo: pick the diffraction spots that correspond to a single set of lattice planes, mask everything else, inverse-FFT, and show that the resulting real-space image highlights just those planes.
- The cell labelled "# This will error out, perhaps because the smaller N-gon is folded?": this is a known wrinkle worth addressing live.
A 1D line profile across the image
Before we go to 2D Fourier space, let’s pull a single line through our image and inspect its intensity profile. If the sample is crystalline, periodic oscillations in the profile already hint at the lattice spacing.
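To make the idea concrete without the interactive widget, here is a minimal NumPy-only sketch on a synthetic image. Everything in it is an assumption for illustration: the function names, the nearest-pixel sampling (HyperSpy's ROI may interpolate differently and supports a line width), and the test pattern with an 8 px fringe period.

```python
import numpy as np


def line_profile(image, start, end):
    """Nearest-pixel intensity profile from start=(x, y) to end=(x, y).

    Returns the sampled intensities and the sample spacing in pixels.
    """
    (x0, y0), (x1, y1) = start, end
    length_px = float(np.hypot(x1 - x0, y1 - y0))
    n_samples = int(np.ceil(length_px)) + 1
    xs = np.linspace(x0, x1, n_samples)
    ys = np.linspace(y0, y1, n_samples)
    cols = np.clip(np.rint(xs).astype(int), 0, image.shape[1] - 1)
    rows = np.clip(np.rint(ys).astype(int), 0, image.shape[0] - 1)
    return image[rows, cols], length_px / (n_samples - 1)


def dominant_period(profile, step):
    """Period of the strongest non-DC Fourier component, in units of step."""
    spectrum = np.abs(np.fft.rfft(profile - profile.mean()))
    freqs = np.fft.rfftfreq(profile.size, d=step)
    strongest = np.argmax(spectrum[1:]) + 1  # skip the DC term
    return 1.0 / freqs[strongest]


# Synthetic "lattice": vertical fringes with a period of 8 pixels.
yy, xx = np.mgrid[0:256, 0:256]
image = np.cos(2 * np.pi * xx / 8.0)

profile, step_px = line_profile(image, (0, 128), (255, 128))
period_px = dominant_period(profile, step_px)
period_nm = period_px * 0.0390625  # scale from the calibration section
print(f"period: {period_px:.2f} px = {period_nm:.4f} nm")
```

On real data the profile is noisy and the line rarely runs exactly perpendicular to the fringes, so treat a number obtained this way as a rough estimate of the spacing.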
s[0].plot()
# Line2DROI arguments: start point (x1, y1), end point (x2, y2), and line width
line_profile = hs.roi.Line2DROI(400, 250, 220, 600, 100)
line0 = line_profile.interactive(s[0])
hs.plot.plot_spectra(line0)
Define a rectangular ROI and compute its FFT
Now we draw a rectangular region on the real-space image and compute the 2D FFT of just that region. apodization=True applies a window function to suppress edge artifacts; shift=True puts the DC component in the center of the plot.
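What those two options do can be sketched in plain NumPy. This is an illustration under stated assumptions, not HyperSpy's implementation: it uses a Hann window as one common choice of apodization, subtracts the mean before windowing (my choice, to suppress the DC peak), and runs on a synthetic region with an 8 px fringe period.

```python
import numpy as np


def windowed_power_spectrum(region):
    """Power spectrum of a 2D region with a Hann window and centered DC."""
    rows, cols = region.shape
    # Apodization: taper the region smoothly to zero at its edges.
    window = np.outer(np.hanning(rows), np.hanning(cols))
    tapered = (region - region.mean()) * window
    # Shift: move the zero-frequency term to the center of the array.
    spectrum = np.fft.fftshift(np.fft.fft2(tapered))
    return np.abs(spectrum) ** 2


# Synthetic region: vertical fringes, 8 px period, 128 x 128 pixels.
yy, xx = np.mgrid[0:128, 0:128]
region = np.cos(2 * np.pi * xx / 8.0)

power = windowed_power_spectrum(region)
peak_row, peak_col = np.unravel_index(np.argmax(power), power.shape)

center = 128 // 2                      # DC position after fftshift
cycles = abs(int(peak_col) - center)   # cycles across the region width
spacing_px = 128 / cycles              # real-space period in pixels
print(f"peak at row {peak_row}, {cycles} cycles -> spacing {spacing_px} px")
```

The distance of a spot from the center is a spatial frequency, so its reciprocal is the real-space spacing. Multiply by the nm/px scale to express it in nanometers.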
s[0].plot()
roi = hs.roi.RectangularROI()
sliced_signal = roi.interactive(s[0])
s_fft = hs.interactive(sliced_signal.fft, apodization=True, shift=True, recompute_out_event=None)
s_fft.plot(power_spectrum=True)
Mask features in reciprocal space
Define two polygon ROIs in the FFT plot. The plan: select specific spots (or sets of spots) in reciprocal space, mask everything else, and inverse-FFT to see which real-space periodicities they correspond to.
p_roi = hs.roi.PolygonROI([(0.2, 0.4), (0.45, 0.45), (0.45, 0.2), (0.35, 0.35)])
p_roi2 = hs.roi.PolygonROI([(0.1, 0.2), (0.12, 0.2), (0.15, 0.1), (0.2, 0.14)])
s_fft.plot(power_spectrum=True)
p_roi.add_widget(s_fft, axes=s_fft.axes_manager.signal_axes)
p_roi2.add_widget(s_fft, axes=s_fft.axes_manager.signal_axes)
s_roi_combined = hs.roi.combine_rois(s_fft, [p_roi, p_roi2])
s_roi_combined.plot()
Build a boolean mask from the polygon ROIs and visualize it as its own 2D signal:
boolean_mask = hs.roi.mask_from_rois([p_roi, p_roi2], s_fft.axes_manager)
boolean_mask = hs.signals.Signal2D(boolean_mask)
boolean_mask.plot()
np.shape(s_fft) == np.shape(boolean_mask)
# This will error out, perhaps because the smaller N-gon is folded?
masked_signal = s_fft*boolean_mask
masked_signal.plot(power_spectrum=True)
Inverse FFT the masked region
The whole point of building a mask in reciprocal space is to reconstruct a filtered real-space image: only the periodicities we kept. Run the inverse FFT and compare to the original.
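The same mask-and-invert idea can be checked end to end on synthetic data with plain NumPy. The sketch below is only an illustration: it uses circular masks instead of polygon ROIs, and the two sets of "lattice planes" (periods of 8 px and 16 px) are made up so that the expected result is known exactly.

```python
import numpy as np

# Two superimposed periodicities standing in for two sets of lattice planes.
yy, xx = np.mgrid[0:128, 0:128]
planes_a = np.cos(2 * np.pi * xx / 8.0)    # 16 cycles along x
planes_b = np.cos(2 * np.pi * yy / 16.0)   # 8 cycles along y
image = planes_a + planes_b

# Forward FFT with the zero-frequency term moved to the array center.
spectrum = np.fft.fftshift(np.fft.fft2(image))

# Frequency index of every pixel in the shifted spectrum (0 at the center).
freq_index = np.arange(128) - 64
ky, kx = np.meshgrid(freq_index, freq_index, indexing="ij")

# Keep small discs around the two spots of planes_a, at kx = +16 and -16.
# Both members of the symmetric pair are kept so the result stays real.
mask = np.zeros(spectrum.shape, dtype=bool)
for spot_kx in (+16, -16):
    mask |= (kx - spot_kx) ** 2 + ky ** 2 <= 2 ** 2

# Undo the shift, inverse-transform, and drop the negligible imaginary part.
filtered = np.fft.ifft2(np.fft.ifftshift(spectrum * mask)).real

print("max deviation from planes_a:", np.abs(filtered - planes_a).max())
```

Because the mask keeps only the spots belonging to `planes_a`, the reconstruction reproduces that component and removes `planes_b`. On a real micrograph the separation is never this clean, since noise and neighboring spots leak into any mask of finite size.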
im_ifft = masked_signal.ifft()
im_ifft.plot()
Wrap-up & next steps
What you’ve done today
- Loaded TIFF micrographs, both directly with tifffile and (more cheaply) with HyperSpy.
- Inspected metadata, both inside the TIFF and in the sidecar .txt.
- Adjusted contrast and gamma; explored multiple histogram-binning rules.
- Reduced noise with both a hand-rolled Gaussian filter and HyperSpy’s decomposition methods.
- Loaded a stack of frames, detected drift, registered them with phase correlation, and summed them to improve signal-to-noise.
- Calibrated axes from microscope metadata so distances read in nm.
- Computed FFTs, masked features in reciprocal space, and reconstructed filtered real-space images via the inverse FFT.
Office hours, this afternoon
The instructors will be available this afternoon for office hours to address specific questions, debug local installs, and walk through your own data. Bring whichever step gave you the most trouble — that’s the one most worth talking through.
TEM context — LSR: Final wrap-up slide pitch-in here — tying the day’s hands-on workflow back to the broader TEM Data Management story and what to expect on Day 2.