
Image analysis for those who don't want to become an expert (but still need it)


By Bastien Molcrette, PhD, Biophysicist and Founder of Symbiophysics

Preamble
Have you ever experienced utter discouragement when confronted with an image analysis
challenge? After investing an extensive amount of time in the laboratory, meticulously refining your
experiment for months, or even years, you find yourself thwarted by the complexities of image analysis.
Despite seeking assistance from experts, their proposed solution remains incomprehensible to you, and
the mere installation, let alone utilization, of their program seems insurmountable. They fail to grasp your
requirements and advise you to redo your experiments more effectively, oblivious to the immense effort
involved.

My professional background encompasses expertise in image analysis as well as experience as
a bench biologist, granting me a comprehensive understanding of the requirements in both domains.
Recognizing the occasional gap that exists between these two communities, I was inspired to rectify this
situation and introduce my own approach to learning image processing. This method is specifically
tailored for individuals who do not aspire to become image analysis experts but necessitate its application
for their projects.

If your goal is to obtain a thorough theoretical explanation of the various mathematical operations
employed in image processing, this course may not be the right fit for you. However, if you are interested
in developing practical skills in image analysis and discovering effective strategies for handling noisy data
without dedicating extensive time to theoretical learning, then this method is exactly what you need.
Throughout the course, you will have the opportunity to work with real datasets, allowing you to practice,
encounter failures, and iterate until you arrive at solutions. The core objective of this course is to provide
you with an operational pipeline that can be replicated, modified, and tailored to your specific
requirements. By working with a diverse range of datasets, we will also explore and learn from the best
practices in image analysis.

The initial hands-on session, focused on cell counting (a crucial task in image analysis), is
provided to give you a glimpse into my approach. It marks the beginning of a series that will cover
the fundamentals of image analysis. Additional topics such as histological tissue
segmentation/quantification and time-lapse/live-imaging microscopy analysis are currently in
development and will be available shortly.

If you have any questions, feedback, or specific needs regarding image analysis or access to other
practical lessons, please feel free to reach out to us via email at contact@symbiophysics.fr. We are here
to assist you and provide the necessary support.

My name is Bastien Molcrette, PhD in Biophysics and founder of Symbiophysics, a scientific
support company for R&D biotech projects. My aim with Symbiophysics is to leverage all available
resources to maximize the potential of your biotech R&D project: Expert guidance in scientific, technical,
and strategic matters, data/image analysis, bibliographic studies, technical/scientific intelligence,
problem solving, scientific communication and writing.

Visit our website: https://symbiophysics.fr

https://symbiophysics.fr
https://www.linkedin.com/company/symbiophysics
Topic – Counting Cells, Bacteria, and other objects

Practical Cell Counting 1: General Pipeline for Counting Nuclei


ImageJ (beginner); Python (beginner)

Pre-requisites
- Download the dataset containing the images used in this lesson:
https://drive.google.com/drive/folders/1ncFd_pbXqV1ASEA-reOYVS-J8fPG5h4g?usp=sharing
- Fiji (a working version of ImageJ with useful plugins already installed):
https://imagej.net/software/fiji/downloads
- Python interpreter (optional, used to perform post-processing data analysis). I suggest installing
Spyder for this purpose if you don't already have a Python interpreter: https://docs.spyder-ide.org/current/installation.html
- Pandas Python library (optional, used for post-processing data analysis):
https://pandas.pydata.org/docs/getting_started/install.html

If you can't use or install Python, the first part of the image processing pipeline is done entirely
with ImageJ, and the cell counting results can be extracted from the csv files using Notepad or Excel.

Objectives
The purpose of this initial hands-on session is to establish a basic workflow for cell counting, a
fundamental aspect of biology projects. The session will cover a reliable workflow that can address local
brightness variations, noisy pixels, and accurately segment cell clusters. Additionally, a standard
automated cell thresholding method will be demonstrated, along with an examination of morphological
parameters. Finally, participants will learn how to batch analyze data sets and utilize an initial Python
script for automated result processing.

Prior to commencing an image analysis pipeline, it is essential to carefully examine the raw
images to anticipate any challenges that may arise during the extraction of the desired information. It is
crucial to verify that the objects of interest can be readily differentiated from the background and other
detectable objects. In our current dataset, the nuclei are clearly distinguishable, with minimal background
interference and limited intensity variations between nuclei. If feasible (in case of a limited number of
images), it is advisable to review each raw image in your dataset to ensure that they meet the necessary
quality standards for your analysis. Alternatively, select specific images on which to fine-tune your
algorithm.

For this practical lesson, we have at our disposal a high-quality dataset comprising 16 images
that exhibit minimal defects. Our primary objective is to develop a comprehensive framework for
processing microscopy images, with a specific focus on nuclei counting. Through this exercise, we will
gain a preliminary understanding of the capabilities offered by ImageJ and Python for image analysis and
data processing. In future lessons, we will tackle the complexities of working with noisy datasets and
explore strategies to overcome the inherent challenges posed by imperfect microscopy images.

The microscopy images featured in this practical lesson are derived from the research carried out
by K-W Fong et al., J Cell Biol. 2013 Oct 14; 203(1): 149–164 [1]. The dataset related to this study has been
made available on the Image Data Resource (IDR) public repository under a CC BY-NC-SA 3.0 licence [2].

[1] https://rupress.org/jcb/article/203/1/149/37478/Whole-genome-screening-identifies-proteins
[2] https://idr.openmicroscopy.org/webclient/?show=screen-253

For the ongoing practical lessons, a set of 16 microscopy images has been selected. These images
showcase fixed HeLa cells that have undergone double fluorescent staining. The staining involves
highlighting the nuclei (DAPI) as well as a flag-tagged protein (TRITC). However, our focus will solely be
on the DAPI (w1) channel.

Raw image (plate 11001_A07_s6_w1.tif, from K-W Fong et al. [3])

The first phase of our pipeline involves rectifying the brightness inconsistencies in the image. Due
to factors such as microscope settings and sample morphology, the illumination may exhibit local
variations, posing challenges in determining a threshold value for distinguishing objects of interest from
the background. A brief visual inspection of the images can provide this crucial information. Although the
illumination field in our current image appears quite uniform (considering the high quality of our dataset),
we can still identify two separate cells in the lower central portion of the image with significantly higher
intensity values, while some cells appear relatively darker.

To correct these local variations, we can use the CLAHE method [4], which increases the local
contrast. Once you have loaded your image in ImageJ (by dragging and dropping the tiff file
onto the ImageJ window), open the 'Process/Enhance Local Contrast (CLAHE)' panel. In the resulting
window, set the blocksize parameter (which defines the scale of a local region) to 200
(ensuring it exceeds the size of your objects of interest) and execute the operation.

[3] https://pubmed.ncbi.nlm.nih.gov/24127217/
[4] https://imagej.net/plugins/clahe

run("Enhance Local Contrast (CLAHE)", "blocksize=200 histogram=256 maximum=3 mask=*None*");

Tip: This grey box includes ImageJ macro code that encompasses the actions performed when utilizing
ImageJ (here the CLAHE correction). These segments will be employed later on in the process to create a
complete macro for streamlining the analysis.

After CLAHE correction


The CLAHE correction markedly strengthened the signal of the cells relative to the background.

Note: It is advisable to avoid using the CLAHE method on images that contain a low-intensity
background, a first object of medium intensity, and a second object of high intensity. Because CLAHE
boosts local contrast, it may raise the intensity of the medium object and lower that of the brighter
one, complicating the differentiation between the two objects. While this pipeline is typically
effective for most counting procedures, it is crucial to maintain a critical approach and assess the
result of each stage in the process.
To optimize the detection of the objects of interest, it is recommended to 'flatten' the background
to zero, even though the dataset already exhibits high quality. This can be accomplished with the
rolling ball method [5] available in ImageJ. Open the 'Process/Subtract Background…' panel and set
the ball's radius to 500. It is important that this value exceeds the size of the largest object of
interest in the image, to avoid unintentionally removing it during the correction. If desired, you
can preview the resulting image by enabling the 'preview' checkbox before executing the process.

run("Subtract Background...", "rolling=500");

[5] https://imagej.net/plugins/rolling-ball-background-subtraction

To eliminate individual noisy pixels, one can employ a median filter. This filter replaces each pixel
in the image with the median value of the pixels within a specified radius. To access this filter, navigate
to 'Process/Filters/Median...' and adjust the radius to 2. By using the median filter, we preserve the
integrity of the edges, which is crucial for studying the morphology of the identified objects.

run("Median...", "radius=2");

After eliminating the background and applying a median filter (with only minor adjustments
made in comparison to the initial image, as there was little noise present in the original)
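If you prefer to prototype filters in Python before moving to ImageJ, the effect of a median filter can be sketched with a NumPy-only toy implementation (an illustrative sketch, not ImageJ's implementation; for real images a library routine such as scipy.ndimage.median_filter would be used instead):

```python
import numpy as np

def median_filter_3x3(img):
    """Replace each interior pixel with the median of its 3x3 neighborhood.
    Edge pixels are left unchanged in this minimal sketch."""
    out = img.astype(float).copy()
    for y in range(1, img.shape[0] - 1):
        for x in range(1, img.shape[1] - 1):
            out[y, x] = np.median(img[y - 1:y + 2, x - 1:x + 2])
    return out

# A flat image with one 'hot' noisy pixel in the middle
img = np.full((5, 5), 10.0)
img[2, 2] = 255.0
filtered = median_filter_3x3(img)
print(filtered[2, 2])  # the spike is replaced by the neighborhood median, 10.0
```

Because the median of a neighborhood ignores the outlier value, the isolated spike vanishes while flat regions (and, on real images, sharp edges) are preserved, which is exactly why this filter is preferred over a mean filter when morphology matters.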
The image has undergone processing at this stage to minimize any noise and amplify the signal
originating from the nuclei. It is now possible to carry out image binarization to distinguish the objects of
interest (the nuclei) from the background, which will be assigned a value of zero.

To carry out this procedure, navigate to the 'Image/Adjust/Threshold...' panel. Within this panel,
you can choose the method for determining the threshold values. By setting the minimum and maximum
threshold values, you select the objects of interest based on their pixel intensity. For this
analysis, I recommend using the Huang method [6] and ensuring that the 'dark background' and 'don't
reset range' options are checked. Before applying the process, it is advisable to examine the results
obtained from other thresholding methods. Always adapt the method to the specific characteristics of
the images being processed, as a method may yield satisfactory results for one dataset but may not
perform well for another.

setAutoThreshold("Huang dark no-reset");
setOption("BlackBackground", true);
run("Convert to Mask");

[6] https://imagej.net/plugins/auto-threshold

After the binarization step (Huang method); the cells appear in white.

We have successfully isolated the cells from the background; however, cells that are touching
each other remain connected. In image analysis, an independent object is defined as a single uniform
entity enclosed by the background; under this criterion, a group of touching cells would be treated as a
single entity. To achieve a more accurate separation of individual cells, we will run a watershed
procedure, which separates each blob-like structure from its neighboring elements. The watershed
process can be initiated from the 'Process/Binary/Watershed' panel.

run("Watershed");

After watershed segmentation

The last stage of the image processing pipeline involves counting the segmented cells and
measuring various morphological characteristics such as their area, major/minor axis lengths,
and more. These measurements can be configured within the 'Analyze/Set Measurements...' panel.

In order to conduct the analysis and measurements, navigate to the 'Analyze/Analyze Particles...'
panel. Next, set the minimum size (area) for segmented objects to be classified as real objects
rather than dust. Based on our observations using the line tool in ImageJ, a typical cell is
expected to exceed 20x20 pixels, equivalent to 400 px² for the minimum area. Additionally, you have the
option to select various buttons such as display results, clear results, add to manager, exclude on edges,
and include holes. The latter two are particularly useful for excluding cells that touch the
image edges and might be partially cropped, and for filling potential holes within the
segmented objects, although holes are unlikely after the watershed process. To obtain a segmented image
displaying each independent cell with a distinct value based on its index as a detected object, choose the
'Count Masks' option in the view selection.

In addition, you have the option to set minimum and maximum values for the roundness of the
identified objects, particularly when searching for circular objects. This feature is not utilized in this
instance as we will carry out the roundness selection in the post-processing phase; however, it remains a
viable alternative.

run("Set Measurements...", "area perimeter fit shape feret's redirect=None decimal=3");
run("Analyze Particles...", "size=400-Infinity show=[Count Masks] display exclude clear include add");
run("glasbey");

After completing the analysis, it is recommended to save the result table in csv format. It is
suggested to create a folder named 'Output' where the analysis results can be stored. To prevent
accidental deletion of the raw data, it is considered a good practice to keep it separate from the analysis
results. Always remember to make copies of the raw data and store processed images/data in separate
folders.

For a clearer representation of the segmented cells post-analysis, you have the option to modify
the LUT (Look-Up Table) in order to assign a distinct color map to the image based on the object/pixels
values. Simply choose the Count masks image, locate the LUT icon in the ImageJ toolbar, and opt for the
glasbey setting: this will assign random colors to the various cells, facilitating easier differentiation.

Saving the result table in csv format marks the completion of the image processing pipeline.
Depending on the analysis objective, it may be necessary to post-process the csv files to extract
quantitative data. For a simple counting task, one can determine the number of detected cells by looking
at the number of lines in the csv files. However, in this practical lesson, we will develop a batch pipeline
to automate the previous analysis on a full dataset and illustrate how to use Python for processing the csv
files and extracting quantitative morphological parameters.

After the analysis step

The batching process with ImageJ

To run the previous analysis on the complete dataset, we will use the batch process
feature of ImageJ. Begin by opening the panel labeled ‘Process/Batch/Macro…’. In the ‘input’ section,
choose the folder that contains all the raw images you wish to process. For the ‘output’ section, select the
‘Output’ folder that you previously created. It is crucial that this folder differs from the input
folder, although it can be a subfolder within the input folder. Specify the output format as Tiff, which
matches our case. In the ‘file name contains’ field, enter a portion of the name that is common and
specific to every image file; in our case this is ‘w1.TIF’, as each image file's name ends with it.
Lastly, copy and paste the provided macro, which contains all the necessary image processing
steps, plus additional lines to obtain the name of the currently processed image and save the results
under the appropriate name. Once the parameters are configured, start the batch analysis by
clicking ‘process’. Note that the duration of the analysis may vary depending on your computer's
capabilities and the number of images to be analyzed.

name_raw = getTitle();
path_img = getDir("image");

// Pre-processing before thresholding & segmentation
run("Enhance Local Contrast (CLAHE)", "blocksize=200 histogram=256 maximum=3 mask=*None*");
run("Subtract Background...", "rolling=500");
run("Median...", "radius=2");

// Thresholding with Huang method
setAutoThreshold("Huang dark no-reset");
setOption("BlackBackground", true);
run("Convert to Mask");

// Segmentation of the cells by watershedding
run("Watershed");

// Set morphological measurement parameters and run analysis
run("Set Measurements...", "area perimeter fit shape feret's redirect=None decimal=3");
run("Analyze Particles...", "size=400-Infinity show=[Count Masks] display exclude clear include add");
run("glasbey");

// Save results table in a CSV file
folder_save = path_img + File.separator + "Output" + File.separator;
File.makeDirectory(folder_save);
saveAs("Results", folder_save + "Results_CellCounting_" + substring(name_raw, 0, lastIndexOf(name_raw, ".")) + ".csv");

After the completion of the process, check the 'Output' folder for the saved result files.

Question: What is your opinion on the results?

Segmentation results of image plate_11001_A07_s9_w1.tif

CLAHE correction of image plate_11001_A07_s9_w1.tif

The majority of the segmentation results appear satisfactory, with the exception of a few isolated
cells. However, the image plate_11001_A07_s9_w1.tif exhibits a large out-of-focus droplet-shaped
object, which is clearly mis-segmented.

The identification of these imperfections within vast datasets can prove arduous, as visually
inspecting every result is unfeasible given the dataset's size. This is especially true in our case, where the
data was automatically collected as part of an extensive screening study encompassing thousands of
images. When manually acquiring data, it is advisable to steer clear of these types of defects and aim for
a consistent background without any imperfections.

Once detected, these flawed images can be either excluded from the analysis (if the dataset is
sufficiently extensive and the excluded images do not represent a significant portion of it) or the defective
area can be cropped. It is recommended not to edit the image by erasing the defect using image editing
software, as this would result in an image transformation, making it incomparable to other images in the
dataset.

Processing the defective images is a viable option, but it would involve applying the same
processing to all images, including the ones that are not defective, in order to ensure a fair comparison.

Exercise: Update the previous pipeline to rectify the issue while preserving the cells.
Hints:
- Additional actions are unnecessary to complete this task; simply modify the values of the various
steps.
- Determine the parameters that vary between the defective object and the desired ones (the
nuclei), which will guide you on which part of the process you should adjust.
- Multiple approaches exist for resolving image analysis issues; concentrate on your goal of
identifying the most effective one.
- The best solution may not be flawless; it is essential to maintain a balance between minimizing
noise sources and preserving the essential information.
An alternative approach, which we will implement, is to filter the segmented cells during post-
processing using morphological parameters such as circularity or cell area. In a large dataset, it can be
challenging to identify defects and their quantity, and a pipeline developed on trials with a
few images may not effectively filter defects in every individual image. By analyzing the morphological
characteristics of segmented objects, we can exclude those that do not meet the expected criteria,
particularly those that are too large to be single cells or insufficiently round. Threshold values can be
estimated by examining individual cells in ImageJ and using the line tool to measure the average size of
representative cells; circularity thresholds can be determined from values in the literature [7].
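For reference, ImageJ computes circularity as 4π·Area/Perimeter², which equals 1 for a perfect circle and decreases toward 0 for elongated shapes. A quick sanity check of this criterion in Python (the function name is ours, for illustration):

```python
import math

def circularity(area, perimeter):
    """ImageJ-style circularity: 4*pi*area / perimeter**2 (1.0 = perfect circle)."""
    return 4 * math.pi * area / perimeter ** 2

r = 30.0  # radius of an ideal circular nucleus, in pixels
print(circularity(math.pi * r ** 2, 2 * math.pi * r))  # → 1.0 for an ideal circle

# An elongated 100x20 px rectangle scores much lower and would be filtered
# out by a circularity threshold such as 0.7:
print(circularity(100 * 20, 2 * (100 + 20)))
```

This is why a circularity cut discards elongated or irregular debris while keeping roughly round nuclei.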

Python post-processing of the CSV files


To accomplish this, we will utilize Python and its specialized library Pandas to handle the CSV
files. Pandas is specifically designed for managing datasets. Additionally, you have the option to process
these files using Excel or any other software capable of opening and processing CSV files.

Tip: Each functional block in the Python script serves a specific purpose, allowing for easy reuse in other
scripts with minimal adjustments required.
To begin, we need to import the Python libraries that house the functions we will be utilizing.

- NumPy is used for carrying out mathematical computations. By convention, it is imported
under the alias 'np'.
- Matplotlib is designed for visualization tasks. For our purposes, we will exclusively
use its plotting component (pyplot) to generate and display different types of figures.
- Pandas is a specialized library for managing and manipulating datasets. Its
functionality enables seamless handling of csv data files.
- Os is Python's operating system library; it provides a range of functions to efficiently
handle files, folders, and their respective paths.

[7] https://progearthplanetsci.springeropen.com/articles/10.1186/s40645-015-0078-x

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import os

Note: the Python code enclosed in this orange box can be copied and pasted into your Python interpreter.
You can then execute it by clicking the 'Run' button.
We proceed by defining the variables used in the script. The first variable holds the path
where the csv files are located. To ensure the path is read literally, an 'r' is prefixed before it [8];
in Python, string variables are enclosed in quotes. The second variable is determined automatically from
the folder path containing the csv files: it is a list of the names of all the files within
this folder. The final variable is a pandas object known as a dataframe, which is essentially a table whose
columns represent properties such as 'Area', 'Circularity', 'Perimeter', etc., and whose rows each
represent an individual cell. Currently, this dataframe is empty (data = {}), but we will populate it by
appending each table read from the csv files.

path_csv_files = r'Full path to the folder containing the csv files'  # example: r'C:\Symbiophysics\Cell_counting\Output'
list_files = os.listdir(path_csv_files)
df_results = pd.DataFrame(data={})

Tip: comments can be included in your code by using a ‘#’ symbol; any text following the ‘#’ will not be
executed.
Next, we iterate through each element of the list_files variable, which contains the names of the
files located in the folder. During this iteration, we check whether the file is a CSV file by testing
for the presence of the '.csv' substring in the file name: for example,
'Results_CellCounting_plate 11001_A07_s3_w1.csv'. If the current file is a CSV file, we read it
using the read_csv function from pandas and then merge it into the dataframe df_results using the
concat function from pandas.

Tip: in Python, indentation plays a crucial role in defining the contents of loop structures, conditional
statements (such as 'if, elif, else'), and so on.
The read_csv function in pandas requires the complete file path of the csv file as input. Within
the loop, 'i' is a string variable containing the name of the csv file, while 'path_csv_files' is another
string variable holding the path of the folder where the csv file is located. By concatenating these two
strings with a file separator '\\' (the backslash must be doubled inside the string), we obtain the
full path. Additionally, we pass the option ignore_index=True to prevent any conflicts between rows
that share the same index but come from different files.
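As a side note, hard-coding the '\\' separator ties the script to Windows; Python's os.path.join builds the same path portably on any operating system (a small optional variant of the concatenation described above, shown with a hypothetical folder name):

```python
import os

folder = 'Output'  # hypothetical folder name for illustration
filename = 'Results_CellCounting_plate 11001_A07_s3_w1.csv'

# os.path.join inserts the correct separator for the current OS
full_path = os.path.join(folder, filename)
print(full_path)
```

Using os.path.join means the same script runs unchanged on Windows, Linux, and macOS.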

The concat function in pandas is used to combine dataframes that are stored in a list, such as
[dataframe1, dataframe2, dataframe3, etc]. The square brackets [ ] indicate a list with multiple items
separated by commas. In this particular scenario, we aim to update the dataframe df_results with the data
from a csv file, which is why we use the following script.

[8] https://stackoverflow.com/questions/2241600/python-regex-r-prefix

for i in list_files:
    if '.csv' in i:
        df_results = pd.concat([df_results, pd.read_csv(path_csv_files + '\\' + i)], ignore_index=True)

We finally print the results of the analysis with the following script:

print('Number of cells = ' + str(len(df_results)))
print('Mean area = ' + str(np.round(df_results['Area'].mean(), 0)) + ' px^2')
print('STD area = ' + str(np.round(df_results['Area'].std(), 0)) + ' px^2')
fig = plt.figure()
plt.hist(df_results['Area'], 100)
plt.xlabel('Area (px^2)')
plt.ylabel('Count')

The print function displays a string: here the literal 'Number of cells = ' concatenated with
another string representing the size (length) of the dataframe df_results. This size corresponds to
the total count of individual segmented cells; it is obtained with the len function and converted
to a string with the str function.

The following two lines are almost identical, with the exception that they exhibit the average area
value of the cells and their standard deviation. These calculations were obtained using the capabilities of
pandas dataframes, specifically by applying .mean() or .std() to a subset of the dataframe, in this case
the ‘Area’ column: df_results['Area'].mean().

Another property from the dataframe could have been chosen, such as the circularity
measurements column or the major/minor axes, by modifying the column name in df_results['Area']. A
comprehensive list of all column names can be obtained by utilizing df_results.columns.

The mean or standard deviation results are rounded to the nearest unit using the round function
from the numpy library. We adopt the format np.round(variableToBeRounded, number of decimals) for this
purpose. Subsequently, we convert the rounded number to a string variable using the str function and
concatenate it with either the string variable 'Mean area = ' or 'STD area = '.
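As an optional, more compact alternative to this str concatenation, Python's f-strings can round and format the values in a single step (the numbers below are invented placeholders standing in for df_results['Area'].mean() and .std()):

```python
# Hypothetical values standing in for the dataframe statistics
mean_area = 2903.6
std_area = 912.4

# {value:.0f} rounds to the nearest integer while formatting
msg = f"Mean area = {mean_area:.0f} px^2 (STD = {std_area:.0f} px^2)"
print(msg)  # → Mean area = 2904 px^2 (STD = 912 px^2)
```

Both styles produce the same kind of message; f-strings simply avoid the repeated str() and '+' calls.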

The final section of the script utilizes the Pyplot sub-library from Matplotlib to showcase a
histogram figure of the cell area measurements. Initially, a new figure window is opened using the
plt.figure() function. Subsequently, the histogram of the area measurement column is plotted with 100
bins using the plt.hist(df_results['Area'], 100) function. The x-axis and y-axis are labeled with
plt.xlabel('Area (px^2)') and plt.ylabel('Count') respectively. It is important to note that the xlabel and ylabel
functions necessitate string variables for the axis labels.

Once we have merged all the csv files into a single table, we can now proceed to filter out the
segmented cells with abnormally large areas. Without any filtering, the average area value is approximately
2900 px², with a standard deviation of around 900 px². Hence, we will consider any object exceeding 6000
px² as aberrant cells and exclude them from the analysis.

In order to accomplish this, we can use the boolean-indexing functionality of pandas dataframes to
conveniently filter the cells. The condition df_results['Area']<6000 creates a boolean series in which
the cell indexes (rows) whose area is below 6000 px² are marked True, indicating that they meet the
condition, while the remaining cells with an area equal to or greater than 6000 px² are marked False.

Subsequently, we can use this boolean series to index the original dataframe df_results, selecting
exclusively the rows labelled True, i.e. those with an area below 6000 px². We can then perform the
desired mathematical operations, such as mean() and std(), on the selected rows.

df_results[df_results['Area']<6000]['Area'].mean()
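The boolean-mask mechanism can be checked on a small toy dataframe (the area values below are invented for illustration; your real df_results will differ):

```python
import pandas as pd

# Toy dataframe mimicking the 'Area' column of the results table
df = pd.DataFrame({'Area': [2500, 3100, 8000, 2700, 12000]})

mask = df['Area'] < 6000   # boolean Series: True where the row passes the filter
kept = df[mask]            # only the rows labelled True are retained
print(len(kept))           # 3 of the 5 toy rows survive the filter
print(kept['Area'].mean()) # mean area of the three remaining rows
```

The two oversized entries (8000 and 12000 px²) are dropped exactly as the aberrant segmented objects would be.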

Exercise: Evaluate the number of segmented elements after discarding elements with a circularity
coefficient under 0.7. Display the histogram associated with the data.
Hint: - Use df_results.columns to find the name of the column containing the circularity measurement.

Full Python script to open the CSV files and plot the histogram of the entire area measurements
(no filtering):
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import os

path_csv_files = r'Full path to the folder containing the csv files'
list_files = os.listdir(path_csv_files)
df_results = pd.DataFrame(data={})

for i in list_files:
    if '.csv' in i:
        df_results = pd.concat([df_results, pd.read_csv(path_csv_files + '\\' + i)], ignore_index=True)

print('Number of cells = ' + str(len(df_results)))
print('Mean area = ' + str(np.round(df_results['Area'].mean(), 0)) + ' px^2')
print('STD area = ' + str(np.round(df_results['Area'].std(), 0)) + ' px^2')
fig = plt.figure()
plt.hist(df_results['Area'], 100)
plt.xlabel('Area (px^2)')
plt.ylabel('Count')

Appendix - Solution to the questions

1- A simple way to eliminate the abnormal object is to decrease the rolling-ball radius in
the background removal step. Objects larger than the rolling ball have their intensity
reduced; a 50 px radius (100 px diameter) still spares the nuclei but suppresses the abnormal
object, which is larger. After the watershed process and the final particle analysis,
the benefit of this adjustment becomes clear.

(Left) Rolling-ball 500 px. (Right) Rolling-ball 50 px.

Results of segmentation with Rolling-ball 50 px

2- The expression df_results['Circ.'] > 0.7 is used to flag the cells with a circularity exceeding
0.7. When this boolean series is used to index the original dataframe df_results, only the cells
with a circularity > 0.7 and their related measurements are retained. We can then select the
circularity column itself with df_results[df_results['Circ.'] > 0.7]['Circ.'].

print('Number of cells (Circ.>0.7) = ' + str(len(df_results[df_results['Circ.']>0.7])))
fig = plt.figure()
plt.hist(df_results[df_results['Circ.']>0.7]['Circ.'], 100)
plt.xlabel('Circularity')
plt.ylabel('Count')

Congratulations on successfully completing the first practical lesson! We hope you found the
journey into the intriguing world of image analysis to be fulfilling. We strongly believe that conducting the
analysis yourself, with comprehensive and precise guidance to navigate through the process seamlessly,
is a fundamental aspect of an effective learning experience. This is the rationale behind our practical
lesson series.

If you found this lesson enjoyable and learned new concepts, we would greatly appreciate it if you
could share your feedback with us at contact@symbiophysics.fr. Your feedback will be especially
valuable if you are encountering challenges in analyzing your own images, as it will help us design
the core of our future practical lessons and provide you with the necessary tools.

This lesson has been meticulously created by Symbiophysics and graciously offered to you for
educational objectives, adhering to the CC BY-NC-SA 3.0 license.

Additional practical lessons, encompassing various applications in image analysis, can be
accessed by contacting us at contact@symbiophysics.fr or visiting our website https://symbiophysics.fr.
For further details regarding fees and the topics included, please reach out to us.

