Assessing the interplay of contrast, defocus, and resolution in cryo-EM: a benchmark experiment for limited dataset screening

The selection of defocus ranges for small datasets in cryo-electron microscopy (cryo-EM) is under-researched. We present a comprehensive benchmark experiment that aimed to evaluate the relationship between contrast, defocus, and resolution, particularly in the context of limited datasets. We conducted a detailed analysis of beta-galactosidase, apo-ferritin, and connexin-46/50 datasets to optimize pre-screening strategies for cryo-EM. Our approach involved classifying micrographs based on image contrast using an artificial intelligence (AI) model without considering the defocus level. This method allowed us to investigate the optimal defocus range for pre-screening in a limited dataset and its impact on the overall image processing. The micrographs were categorized into good, moderate, and bad contrast groups. Subsequent analysis revealed that, contrary to the prevailing assumption that lower contrast (associated with lower defocus) leads to higher resolution, in scenarios with limited datasets higher contrast images yield superior resolution. This finding was consistent across all three protein samples, underscoring the critical role of contrast in determining the quality of 3D reconstructions in limited datasets. This significant finding challenges conventional cryo-EM methodologies. In conclusion, our study provides new benchmarks for selecting appropriate contrast and defocus levels in cryo-EM, particularly for screening approaches that use limited datasets. This strategy promises to enhance the data quality and efficiency in structural biology research, particularly in resource-constrained scenarios.


Introduction
Structural biology is crucial for decoding the molecular mechanisms of life. A deeper understanding of biomolecular functions depends on high-resolution structural analyses. In this realm, cryo-electron microscopy (cryo-EM) is a game changer, allowing near-native observation of biological samples frozen at cryogenic temperatures (Benjin and Ling 2020). Recent advancements in cryo-EM, particularly in the improved detective quantum efficiency (DQE) of detectors (Faruqi et al. 2015) and the enhancement of image processing techniques for motion correction and heterogeneity distinction, have markedly enhanced the achievable resolution of molecular structures (Bai et al. 2015).
The evolution of detector technology, evidenced by faster speeds (Sun et al. 2021) and innovative methods, such as aberration-free image shift (AFIS) (Konings et al. 2019), has accelerated the collection of extensive data, significantly boosting throughput. Concurrently, advancements in software that facilitate on-the-fly processing have reduced manual intervention and expedited the verification of results (Mendez et al. 2023; Tegunov and Cramer 2019; Punjani et al. 2017; Kimanius et al. 2021). Nonetheless, the quality of the samples and specimens remains a pivotal factor in high-resolution reconstruction. The increased pace of data production has enabled a more efficient evaluation of sample quality.
In cryo-EM, selecting an appropriate defocus range is vital because image processing depends on capturing information across varying spatial frequencies. A low defocus is often associated with low-contrast micrographs, which are preferable for high-resolution reconstructions (Cheng 2015). However, these micrographs pose challenges in terms of particle selection, which is a critical step for a successful 3D reconstruction. There is a noticeable research gap in the selection of defocus ranges for small datasets (Pasqualetto et al. 2023; Basanta et al. 2022; Masiulis et al. 2019). In this study, we examined various small datasets, such as beta-galactosidase (Jeong et al. 2019), apo-ferritin (EMPIAR-11013 (Brown and Hanssen 2022)), and connexin-46/50 (EMPIAR-10480 (Flores et al. 2020)), to identify the most effective defocus ranges for rapid quality assessment of limited datasets.
At the same focus, the particle contrast in images is influenced by variables such as ice thickness and sample size. We categorized the micrographs based on the particle contrast to identify the optimal defocus range for swift sample quality assessment. The use of artificial intelligence (AI) for categorization ensured complete objectivity and minimized bias, with detailed information on the AI model's configuration and its validation available in Fig. S1 and the Supplementary Tables. The defocus of each micrograph was confirmed by contrast transfer function (CTF) estimation (Jeong et al. 2013). Our findings revealed that high-defocus micrographs typically achieve better resolution than low-defocus micrographs in limited datasets.

Preparation of data for AI model training
The dataset used in a previous benchmarking study that discussed data collection strategies was reused to train and test the AI model for image categorization (Jeong et al. 2019). Briefly, a solution of E. coli K12 beta-galactosidase (Sigma, G5635) was vitrified using Vitrobot Mark IV (Thermo Fisher Scientific) and transferred into a Titan Krios G2 (Thermo Fisher Scientific) equipped with an image Cs corrector, and the data were recorded using a Falcon III EC detector (Thermo Fisher Scientific) at 300 keV and a nominal magnification of 75,000× (0.864 Å/pixel). All data were collected automatically using the EPU software (Thermo Fisher Scientific) set with a dose rate of 0.68 e/pixel/s, a total dose of ~30 e/Å², and the electron counting mode. Then, the data were preprocessed, including motion correction and CTF estimation, using cryoSPARC v4 software (Punjani et al. 2017) to obtain aligned micrographs.
We acquired 368 micrographs and manually sorted them based on image contrast without considering the image defocus level, resulting in 110 good-contrast (GCM), 116 moderate-contrast (MCM), and 142 bad-contrast (BCM) micrographs. The sorted micrographs were randomly split into training (80%) and testing (20%) datasets in a stratified manner. The testing dataset was used for independent testing of the developed models and further benchmarking tests (Sect. "Image processing for benchmarking test") and was not used for training the model or for internal validation. Sample micrographs from the test dataset are available at https://github.com/yongbin98/cryoem. All the micrographs used in this study are available upon request. Further details regarding AI model training can be found in the supplementary documents [Supplementary Methods, Supplementary Tables and Fig. S1].
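The stratified 80/20 split described above can be illustrated with a minimal Python sketch. It is not the code used in the study: the function name `stratified_split`, the rounding rule, and the random seed are assumptions made for illustration, and the per-class test counts it produces need not match those of the actual test dataset.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Split indices into train/test so each class keeps ~test_fraction in test.

    Hypothetical helper for illustration; rounding and seed are assumptions.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, test_idx = [], []
    for label in sorted(by_class):
        members = by_class[label]
        rng.shuffle(members)  # random assignment within each class
        n_test = round(len(members) * test_fraction)
        test_idx.extend(members[:n_test])
        train_idx.extend(members[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Class sizes taken from the text: 110 GCM, 116 MCM, 142 BCM (368 in total).
labels = ["GCM"] * 110 + ["MCM"] * 116 + ["BCM"] * 142
train_idx, test_idx = stratified_split(labels)
```

Splitting within each class, rather than over the pooled micrographs, keeps the class proportions of the manually sorted set roughly the same in both subsets.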

Image processing for benchmarking test
The testing datasets of beta-galactosidase described above (Sect. "Preparation of data for AI model training" (Jeong et al. 2019)), apo-ferritin (EMPIAR-11013 (Brown and Hanssen 2022)), and connexin-46/50 (EMPIAR-10480 (Flores et al. 2020)) were used for the benchmarking test to investigate an optimal defocus range for the collection of a limited dataset, and overall image processing was performed using cryoSPARC v4.2.1 (Punjani et al. 2017). Preprocessing, consisting of motion correction (full-frame motion correction) and CTF estimation (CTFFIND4 (Rohou and Grigorieff 2015)), was performed for each dataset, and the resultant micrographs were categorized into three distinct classes based on the image contrast. The particles in each subset were picked in a reference-free manner (blob picking), extracted at an appropriate size for each protein, and selected by conventional 2D classification. The selected particles were reconstructed into a 3D electron density map with protein-specific symmetry imposed using the homogeneous refinement function. All image processing after the particle extraction was performed independently for each subset. The 3D electron density maps were visualized using Chimera 1.17.3 (Pettersen et al. 2004).

Benchmark using the dataset of Beta-Galactosidase
Micrographs of beta-galactosidase were classified based on the image contrast using a trained AI model (Fig. 1a, b). This classification resulted in 21 micrographs in each class, which were categorized as GCM, MCM, and BCM. The underfocus values varied between these categories, displaying a range from 1.52 to 2.71 µm in GCM, 0.84 to 2.07 µm in MCM, and 0.31 to 1.20 µm in BCM (Fig. 1c). This variation supports the widely accepted notion that higher defocus levels typically lead to increased contrast between the background and particles (Cheng 2015).
For each category, the particles selected through reference-free particle picking (blob picking) and 2D classification (Fig. 2) were used for 3D reconstruction. Both GCM and MCM contained a greater number of selected particles than BCM. This implies that images with higher contrast facilitate more effective 2D class formation, supporting the observation that high contrast is advantageous for particle visibility (Langlois et al. 2014).
To address concerns regarding potential biases from the initial model in the 3D reconstruction, our study also included an exploration of ab initio reconstruction using selected particles. This approach aims to provide a more balanced and accurate representation of data. In our comparative analysis of 3D reconstructions, we employed two distinct approaches. Initially, as shown in Fig. 3a, particles were selected based solely on their classification into high-quality 2D classes specific to each micrograph category (GCM, MCM, BCM). This method prioritizes particles that individually demonstrate the best features within their respective classes. In contrast, Fig. 3b utilizes a subset of particles termed 'intersectional particles.' These are particles that are consistently recognized across multiple categorizations, providing a cross-validated subset that ensures robustness and repeatability of features across different micrograph categories. This method aims to minimize bias and variation that may arise from class-specific artifacts or anomalies, thus ensuring a more reliable and generalized representation of the structure. The comparative analysis of the 3D reconstruction results across the categories revealed that both GCM and MCM achieved a higher resolution than BCM (Fig. 3). Given that the 2D classification of BCM (Fig. 2) resulted in fewer adequately formed classes, we conducted a 2D classification using all the selected particles from each category to optimize the utility of BCM particles. The subsequent 3D reconstructions underscored a consistent trend: both GCM and MCM outperformed BCM in terms of the number of particles used and resolution achieved.
The influence of particle number on resolution is a well-documented phenomenon in structural biology (Liao and Frank 2010). To determine whether the differences in resolution were attributable to particle count or contrast, we conducted a controlled experiment. An equal number of particles were randomly selected from each category for 3D reconstruction (Fig. S2). This approach revealed that while the resolution in the GCM and MCM diminished with a reduced particle count, it remained notably higher than that of the BCM. This finding underscores the pivotal role of image contrast in dictating resolution, overshadowing the impact of the particle number, especially in analyses involving limited datasets.
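The equal-particle-number control described above amounts to randomly subsampling every category down to a common size. The following Python sketch shows one way this could be done; the function name `equalize_particle_counts`, the choice of the smallest category as the common size, and the particle counts in the example are all assumptions for illustration and are not taken from the study.

```python
import random


def equalize_particle_counts(particles_by_category, seed=0):
    """Randomly subsample each category to the size of the smallest one.

    Hypothetical helper: using the smallest category as the common size
    is an assumption, not a detail stated in the text.
    """
    rng = random.Random(seed)
    target = min(len(ids) for ids in particles_by_category.values())
    return {
        category: rng.sample(list(ids), target)  # sampling without replacement
        for category, ids in particles_by_category.items()
    }


# Illustrative particle IDs only; these counts are not from the study.
particles = {
    "GCM": list(range(0, 5000)),
    "MCM": list(range(5000, 9000)),
    "BCM": list(range(9000, 10200)),
}
subsets = equalize_particle_counts(particles)
```

Each subset can then be refined independently, so that any remaining resolution difference between categories cannot be explained by particle number alone.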

Benchmark using the dataset of Apo-Ferritin
To ensure the consistency of our results from the beta-galactosidase dataset with those of other proteins commonly used in cryo-EM benchmarks, we extended our research to include apo-ferritin. We analyzed 43 apo-ferritin micrographs and classified them into contrast-based categories using AI. This classification process identified 21 micrographs each in the GCM and MCM categories, and only one micrograph in the BCM category. The underfocus values ranged from 0.94 to 2.41 µm for GCM, 0.59 to 1.55 µm for MCM, and a single value of 0.44 µm for BCM (Fig. S3). Owing to the limited representation of BCM in this dataset, we focused our further analysis on GCM and MCM. After 2D classification and particle selection for each category (Fig. S4), an initial model was created using the particles from all categories. The 3D reconstruction (Fig. 4a) revealed a more pronounced difference in resolution between the GCM and MCM compared to that observed in the beta-galactosidase dataset (Fig. 3a). This observed variation in resolution can be attributed to the lower and more restricted defocus range of the MCM in the apo-ferritin dataset than that in the beta-galactosidase dataset. Interestingly, the combined use of particles from both the GCM and MCM did not lead to a statistically significant improvement in resolution compared to the use of GCM particles alone (Fig. 4a). It is possible that the particles in the GCM already provided sufficient resolution and that the addition of MCM particles did not provide additional meaningful information for significant resolution enhancement.
Fig. 3 Three-dimensional reconstruction of the particles from each micrograph category. a Reconstructions derived from the selected high-quality 2D classes specific to each micrograph category, highlighting the impact of contrast quality on particle selection and resolution. b Reconstructions based on 'intersectional particles,' which are selected from a pool common to all categories, reflecting robust features validated across multiple classifications. This subset ensures a balance and reduces the influence of category-specific biases. Both a and b list the resolution, sigma value, B-factor, and the number of particles involved beneath each 3D map. Notably, the BCM category consistently shows lower resolution and poorer structural features compared to GCM and MCM, highlighting the importance of high-contrast images for detailed structural analysis. (See also Fig. S2)
To further understand the impact of the defocus range on the resolution, we conducted an additional experiment with the apo-ferritin dataset. Specifically, we selected 21 micrographs that matched the defocus range of the beta-galactosidase dataset's MCM, as depicted in Fig. 1c. This was followed by particle selection and 3D reconstruction (Fig. 4b). The experiment yielded a resolution of 3.78 Å, which intriguingly shows an improvement over the resolution obtained with GCM classified through the AI model (Fig. 4a Middle). This outcome can be attributed to the lower and narrower defocus range of the GCM in the apo-ferritin dataset compared with that in the beta-galactosidase dataset classified by the same AI model. Moreover, even when the number of particles was adjusted to match that of the GCM under the defocus conditions of the beta-galactosidase dataset's MCM (from 1955 to 1678 particles) (Fig. 4b Left), the resolution did not significantly change (Fig. 4b Right). This observation is crucial, as it highlights that, particularly in small datasets, the defocus range has a more substantial impact on resolution than the sheer number of particles. This finding underscores the significant role of defocus settings in achieving reliable and high-quality cryo-EM reconstructions, emphasizing the need for careful optimization of imaging parameters over simply maximizing particle yield.
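Selecting micrographs that fall within a given defocus window, as in the experiment above, can be sketched as a simple filter over the CTF-estimated underfocus values. In the Python sketch below, the function name `select_by_defocus`, the micrograph names, and the defocus values are hypothetical; how the 21 micrographs were chosen among those matching the window is not stated in the text, so the sketch simply keeps the first `limit` matches in input order.

```python
def select_by_defocus(micrographs, low_um, high_um, limit=None):
    """Keep micrographs whose estimated underfocus lies in [low_um, high_um].

    `micrographs` is a list of (name, underfocus_um) pairs. Taking the first
    `limit` matches is an assumption made for illustration.
    """
    selected = [m for m in micrographs if low_um <= m[1] <= high_um]
    return selected if limit is None else selected[:limit]


# Hypothetical names and underfocus values (µm); not data from the study.
example = [
    ("mic_001", 0.44),
    ("mic_002", 0.90),
    ("mic_003", 1.55),
    ("mic_004", 2.07),
    ("mic_005", 2.41),
]

# Window taken from the text: the beta-galactosidase MCM range, 0.84 to 2.07 µm.
window = select_by_defocus(example, 0.84, 2.07)
```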

Benchmark using the dataset of connexin-46/50
Building on our findings with beta-galactosidase and apo-ferritin, we extended our exploration to the role of contrast in limited datasets of 3D reconstruction by incorporating connexin-46/50, a protein complex crucial for cell gap junction communication.Connexin-46/50, differing in its structural and compositional characteristics, offers an opportunity to examine the impact of contrast on resolution in a different context.
Through manual classification of 200 connexin-46/50 micrographs based on image contrast, we observed underfocus ranges of 1 to 3 µm for GCM + MCM and 0 to 1 µm for BCM. Following blob picking, 2D classification (Fig. S5), ab initio modeling, and 3D reconstruction (Fig. 5), a clear pattern emerged: GCM + MCM consistently yielded higher resolution than their BCM counterparts. This pattern observed in the connexin-46/50 dataset resonates with our earlier observations, underscoring the significant influence of contrast on the resolution in 3D reconstructions of small datasets. Furthermore, these results reinforce our understanding of how the defocus range and particle number are critical factors in achieving optimal resolution, as observed in our experiments with beta-galactosidase and apo-ferritin. Such consistency across different protein samples supports the notion that contrast, coupled with other factors, plays a fundamental role in the quality of 3D reconstructions of limited datasets.

Discussion
The landscape of cryo-EM has been significantly reshaped by technological advancements, such as AFIS, along with improvements in detectors and microscopes, leading to a notable increase in the speed of data collection (Sun et al. 2021; Mali et al. 2021). Despite these advancements, the requirement for extensive data collection, often exceeding one day, remains a challenge, escalating the associated costs. This highlights the critical role of efficient preliminary screening. Implementing rapid, accurate, and cost-effective pre-screening methods using limited datasets is not only effective but also essential for optimizing resource utilization before proceeding with extensive data collection.
In cryo-EM, one of the primary challenges is discerning the optimal contrast for particle selection during image processing. While increasing the defocus can enhance contrast, it may also lead to distortion of information and loss of high-resolution details. Typically, a lower contrast (associated with lower defocus) is conducive to restoring a higher resolution (Cheng 2015). However, our findings provide pivotal insight into scenarios involving limited datasets. We observed that when working with a smaller number of micrographs, and thus fewer particles, a higher contrast (higher defocus) yielded better resolution than a lower contrast. This finding is particularly significant in cases with limited data availability.
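The trade-off between defocus and contrast can be made concrete with the textbook weak-phase contrast transfer function. The Python sketch below is illustrative only and is not the CTF model used by the processing software in this study: the function names are hypothetical, the sign convention (underfocus positive) is one of several in use, and the defaults of zero spherical aberration and zero amplitude contrast are simplifying assumptions.

```python
import math


def electron_wavelength_angstrom(voltage_kv):
    """Relativistic electron wavelength in Å for an accelerating voltage in kV."""
    v = voltage_kv * 1e3
    return 12.2643 / math.sqrt(v + 0.978466e-6 * v * v)


def ctf(k, underfocus_um, voltage_kv=300.0, cs_mm=0.0, amplitude_contrast=0.0):
    """Weak-phase CTF at spatial frequency k (1/Å); underfocus taken as positive.

    Sign conventions differ between packages; this one is an assumption.
    """
    lam = electron_wavelength_angstrom(voltage_kv)
    dz = underfocus_um * 1e4  # µm -> Å
    cs = cs_mm * 1e7          # mm -> Å
    chi = math.pi * lam * dz * k**2 - 0.5 * math.pi * cs * lam**3 * k**4
    w = amplitude_contrast
    return -(math.sqrt(1.0 - w * w) * math.sin(chi) + w * math.cos(chi))


def first_zero_resolution_angstrom(underfocus_um, voltage_kv=300.0):
    """Resolution (Å) of the first CTF zero, assuming Cs = 0 and no amplitude contrast.

    Under these assumptions the first zero is at chi = pi, i.e. 1/k = sqrt(lambda * dz).
    """
    lam = electron_wavelength_angstrom(voltage_kv)
    return math.sqrt(lam * underfocus_um * 1e4)


# Compare low-frequency transfer (k = 0.01 1/Å, i.e. 100 Å features)
# at a low and a high underfocus.
low_defocus_transfer = abs(ctf(0.01, 0.5))
high_defocus_transfer = abs(ctf(0.01, 2.5))
```

Under these assumptions, the transfer at low spatial frequencies grows roughly in proportion to the underfocus, which is why particles are easier to see at higher defocus, while the first zero moves to lower resolution and the function oscillates more rapidly at high spatial frequencies.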
Our study also revealed that, even with an equal number of particles across different contrast groups, high-contrast images consistently offered better resolution than their low-contrast counterparts (Figs. S2 and 5). This suggests that the acquisition of detailed high-resolution information requires a large pool of particles to facilitate effective signal differentiation and convergence. This becomes particularly relevant in situations with sparse datasets, in which the efficacy of high-resolution information for alignment and reconstruction is diminished. Consequently, during grid screening, swiftly collecting limited quantities of high-contrast data for immediate reconstruction emerges as a pragmatic and advantageous approach.

Conclusion
It is important to acknowledge certain limitations of our study. Primarily, the use of a limited number of micrographs from each protein sample may not fully represent the diversity encountered in more varied datasets. Additionally, while AI-assisted classification offers many advantages, it can also introduce biases that need to be considered, and the manual sorting process may have its own set of subjective limitations. Further research is needed to validate our findings across a broader range of proteins and sample conditions to ensure the generalizability of our conclusions. Despite these considerations, this finding has direct implications for the strategies employed in preliminary grid screening using cryo-EM. By judiciously leveraging limited datasets and focusing on the selection of high- to moderate-defocus images (roughly 1 µm or higher) rather than low-defocus images, researchers can substantially save time and reduce costs. Therefore, our study provides a benchmark for cryo-EM methodology, offering a more efficient route for preliminary sample assessment before proceeding to large-scale data collection.

AFIS
Aberration-free image shift

Fig. 5 3D reconstruction of connexin-46/50 complex across different defocus categories. a illustrates the 3D reconstructions of connexin-46/50 obtained from particles at high + moderate and low defocus (GCM + MCM and BCM), detailing the achieved resolutions and particle counts used. b offers the top-down view of the same reconstructions for a direct comparison. Reconstructions were performed with a number of particles equalized across defocus categories to assess the impact on resolution. Each reconstruction details the number of particles utilized, the resolution achieved, the B-factor, and contour levels adjusted for optimal visibility. The results emphasize the effect of defocus on resolution, with higher defocus providing more particles and improved reconstruction detail.

Fig. 1 Schematic diagram of contrast-dependent micrograph categorization. a Overall network architecture for patch image classification. b Ensemble approach for classification in testing data. c CNN-based micrograph categorization for the beta-galactosidase dataset was done on the basis of image contrast, resulting in 3 subsets of data named good, moderate and bad, respectively. Please note that this enables the trained model to perform 'unforced defocus selection' that allows certain overlapping of 'class-boundary' defocus. The results of CTF estimation (defocus estimation) for the images of each class are shown in the parentheses.

Fig. 4 Three-dimensional reconstruction of apo-ferritin. a Displays the reconstructions derived from the use of GCM (left), MCM (middle) and GCM + MCM (right) particles. b Shows reconstructions where the defocus range of 0.84 µm to 2.07 µm (corresponding to the MCM of beta-galactosidase) was applied (left) and GCM with the identical numbers of particles corresponding to MCM (right). The number of particles utilized, the resolution achieved, the B-factor, and the contour levels adjusted for optimal visibility are detailed beneath the corresponding maps.