Solar Filaments Detection Using Parallel Programming in Hybrid Architectures
All content following this page was uploaded by Andre Leon Sampaio Gradvohl on 20 October 2015.
ABSTRACT
Several projects and missions are designed strictly to observe the Sun. These projects usually produce a large amount of information embedded in images. The analysis of such information is valuable for the study and monitoring of solar storms, which can affect telecommunications, for instance. The databases of Sun images are huge: several projects are producing images of the Sun, and a considerable number of images is already stored. By combining image processing algorithms with parallel programming techniques, we can process this information faster and in larger volumes. This paper describes our parallel OpenMP-MPI hybrid solutions for processing Sun images, and the results obtained in a hybrid system, i.e. a cluster with several multi-core nodes. Specifically, we present two methods to detect and categorize solar filaments in hybrid systems: Filament Diffusion-Detection, based on graphs, and Morph Detection, based on morphological operators. The results show that Filament Diffusion-Detection detects approximately 80% of the filaments, with a 326-fold speed-up over single-threaded execution. In turn, Morph Detection detects 58% of the objects with a 54-fold increase in speed. Overall, these results show that our OpenMP-MPI combination works well for hybrid architectures, but more optimizations are needed to improve accuracy.
Keywords
solar filaments, parallel programming, hybrid systems
1. INTRODUCTION
The Sun produces its energy by fusing hydrogen atoms into helium. Several events may occur in this process. Solar storms, for example, are of high interest because they can cause problems in telecommunications, such as GPS signal interruptions and changes to satellite routes or satellite destruction. In addition, those events may cause electricity transmission problems, such as the overload of electric transmission lines by the geomagnetically induced current that the solar storm produces. These catastrophic scenarios could be avoided if the solar storm were detected early enough to turn off satellites or to reduce the energy in the transmission lines.
Solar storms are characterized by several features. One of those features is the appearance of filaments on the surface of the Sun. Those filaments are visible in Hα images. They are high-density plasma from the Sun sustained by magnetic fields. They are cooler than the Sun's surface, which is why they appear darker. Figure 1 shows an example of a solar storm event featuring a filament. Notice that the darker ribbons on the Sun are the filaments.
[...] Sun images. Considering all the missions that study the universe together, the amount of research data available by 2020 will reach 60 petabytes [1]. Processing this amount of information in a reasonable time requires a lot of computing power. Parallel processing techniques will certainly be very useful in achieving this goal [9] [10].
1.1 Related work
Solar filament detection methods have already been addressed in other works. In [8], an automatic solar filament detection algorithm is presented. The algorithm is based on image enhancement, pattern recognition and mathematical morphology methods. Also, [11] focuses on techniques that use mathematical morphology to detect solar filaments. However, these works address neither the combination of diffusion filter techniques with the interpretation of the image as a graph, nor the parallelization of the proposed methods.
The work reported in this paper describes two parallelized image processing methods with a new preprocessing phase to increase the detection rate of solar filaments. Specifically, we use diffusion filter techniques and morphological operators to increase the detection rate or the significance of the filaments detected. We also demonstrate that, when high performance computing techniques are used, we can achieve a high throughput and thus decrease the time needed to detect filaments.
The rest of the paper is organized as follows: Section 2 describes the algorithm used to detect the filaments and the diffusion filter used for preprocessing; morphological techniques are described in Section 3; the methods used to decrease the processing time are described in Section 4; Section 5 presents the results obtained; and finally, Section 6 presents the final remarks.
2. FILAMENTS DETECTION AND DIFFUSION FILTER
The filament detection algorithm used in this work was already described in [3]. We improve this method by adding a preprocessing step comprising a diffusion filter and a post-processing step that associates a different gray shade with every filament. Both steps, as well as the filament detection algorithm, use a hybrid parallel programming approach. The entire method, called Filament Diffusion-Detection (FDD), is depicted in Figure 2. Gray boxes in Figure 2 indicate the added steps.
[...] be 0, when the pixel value is less than the threshold, or 1, when the pixel value is equal to or greater than the threshold. The filaments present themselves as dark areas of the Sun in Hα images, since they have lower temperatures than the surface of the Sun, which is indicated by darker pixels.
The next step (“Filament detection”, depicted in Figure 2 and described in [3]) visits the pixels resulting from the thresholding operation. For each pixel, its spatial neighbors are visited. If such pixels are connected, they are considered part of the same filament, until there are no more neighbors. When we implemented the algorithm, we used a circular adjacency method due to the nature of filaments.
Besides the threshold for binarization, there are two other thresholds. One of them determines the minimum size of a filament. The other defines the distance within which separate filaments are considered as one.
Another added step is the association of a different gray shade with every filament, to make each filament easy to visualize.
To avoid noise in preprocessing, it is possible to apply a median filter or a Gaussian filter. However, the edges of the filaments will also be attenuated. As a result, the hit rate, i.e. the number of correctly detected solar filaments in the image, may also decrease.
2.1 Diffusion filter
In the filament segmentation process, it is important that the borders are distinct and the other areas are homogeneous. The Gaussian filter or the median filter can deteriorate the borders. To mitigate this problem, we use the diffusion filter.
Because graph theory is a well-established field, with many algorithms to solve various problems, we represent an image as a graph. By combining diffusion with the interpretation of an image as a graph, we can smooth the interior of objects and still maintain the edges. To represent images as graphs, we consider the pixels as the vertices, and the edges as the connections between them, defined by the relative positions of the pixels in the image or by per-pixel properties such as brightness. The basis of this diffusion filter was obtained from the Image Foresting Transform [2].
To determine which vertices are connected by edges, we can use the relative position between vertices, as in Equations 1 and 2, where p is the pixel and q is a pixel adjacent to p; d(p, q) is the distance between both pixels; and t is a threshold that delimits how far q can be from p and still be adjacent to it. This distance can be spatial or parametric.
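The adjacency relation described above can be illustrated with a short sketch. Equations 1 and 2 are not reproduced in this text, so the code below assumes the spatial case, with d(p, q) taken as the Euclidean distance and t as the radius of a circular neighbourhood. The function names `circular_adjacency` and `neighbours` are hypothetical and not taken from the original implementation.

```python
from math import hypot


def circular_adjacency(t):
    """Displacements (dx, dy) of every pixel q with 0 < d(p, q) <= t.

    Assumption: d(p, q) is the Euclidean distance between pixel positions.
    """
    r = int(t)
    return [
        (dx, dy)
        for dy in range(-r, r + 1)
        for dx in range(-r, r + 1)
        if (dx, dy) != (0, 0) and hypot(dx, dy) <= t
    ]


def neighbours(p, width, height, displacements):
    """Vertices q joined to p by an edge, clipped to the image bounds."""
    x, y = p
    return [
        (x + dx, y + dy)
        for dx, dy in displacements
        if 0 <= x + dx < width and 0 <= y + dy < height
    ]


four = circular_adjacency(1)      # the 4 axis-aligned neighbours
eight = circular_adjacency(1.5)   # adds the 4 diagonal neighbours
corner = neighbours((0, 0), 3, 3, four)  # neighbours of a corner pixel
```

Storing the adjacency as a list of displacements keeps the graph implicit: no edge list is materialised, and the same displacements can be reused for every pixel.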
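The binarization and filament detection steps of Section 2 can be sketched as follows. This is a minimal sequential illustration, not the parallel implementation of the paper. It assumes that filament pixels are the ones mapped to 0 (the dark pixels), that regions are grown breadth-first, and that a region smaller than `min_size` is discarded. The names `binarize`, `label_filaments`, `FOUR_NEIGHBOURS` and `filament_value` are hypothetical.

```python
from collections import deque

FOUR_NEIGHBOURS = [(0, -1), (-1, 0), (1, 0), (0, 1)]  # illustrative adjacency


def binarize(image, threshold):
    """0 where the pixel is below the threshold, 1 otherwise."""
    return [[0 if v < threshold else 1 for v in row] for row in image]


def label_filaments(binary, displacements, min_size=1, filament_value=0):
    """Grow one region per connected group of filament pixels.

    Returns (labels, count); labels holds 0 for background and for
    regions smaller than min_size, and 1..count for kept filaments.
    """
    height, width = len(binary), len(binary[0])
    labels = [[0] * width for _ in range(height)]
    count = 0
    for y in range(height):
        for x in range(width):
            if binary[y][x] != filament_value or labels[y][x] != 0:
                continue
            current = count + 1
            labels[y][x] = current
            region = [(x, y)]
            queue = deque(region)
            while queue:
                cx, cy = queue.popleft()
                for dx, dy in displacements:
                    qx, qy = cx + dx, cy + dy
                    if (0 <= qx < width and 0 <= qy < height
                            and binary[qy][qx] == filament_value
                            and labels[qy][qx] == 0):
                        labels[qy][qx] = current
                        region.append((qx, qy))
                        queue.append((qx, qy))
            if len(region) >= min_size:
                count = current
            else:
                for rx, ry in region:
                    labels[ry][rx] = -1  # visited, but too small to keep
    return [[max(v, 0) for v in row] for row in labels], count


image = [
    [9, 9, 9, 9, 9],
    [9, 1, 1, 9, 9],
    [9, 9, 9, 9, 1],
    [9, 9, 9, 9, 9],
]
binary = binarize(image, 5)
labels, count = label_filaments(binary, FOUR_NEIGHBOURS, min_size=2)
```

In this sketch, widening the adjacency (for example with a circular neighbourhood of larger radius) is one way to make nearby regions grow into a single filament, which approximates the distance threshold mentioned in the text.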
Figure 3: Circular pixels neighbourhood.
$$J[x_p, y_p] = \sum_{i=0}^{k} w[i]\, I[x_p + dx[i],\, y_p + dy[i]]. \tag{3}$$
The convolution of the nearest pixels is done with a Gaussian kernel (w), using the original image (I) to produce the processed image (J). To generate the Gaussian filter coefficients, we traverse the adjacency from pixel p through each of the displacements dx and dy that generate the adjacent pixel q, using σ as the variance and applying Equation 4.
$$w(q - p) = \exp\left(\frac{-\left(|dx + dy|^2\right)}{2\sigma^2}\right). \tag{4}$$
An example of a result produced by the diffusion filter is depicted in Figure 6. We can observe that the image is more homogeneous, but its borders are well defined. In Figure 6, a filament is detached for a better view.
Figure 6: Sun image processed by the diffusion filter with a manually detached filament.
Figure 7 presents the result of a categorization using the FDD method, after the “Diffusion Filter”, “Binarization”, “Filament detection” and “Filament colorization” steps described in Figure 2. Using this method, it is possible to observe different gray shades for each filament.
3. MORPHOLOGICAL OPERATORS
The Morph Detection (MD) method is based on morphological operators. Mathematical morphology is applied to image processing in many areas [4]. The diagram in Figure 4 describes the method. The gray box in Figure 4 indicates the added phase.
After transforming the image into a binary image with a threshold that is half the average [3] of the pixel intensities (the “Binarization” phase in Figure 4), a set of morphological operators is recursively applied to the binary image (the “Morphological closing - set of structural elements” phase in Figure 4). This set of operators is described in [11] and has already been used for the detection of solar filaments. Figure 5 shows one sample of this structuring element.
Figure 5: Example of the structuring element used [11].
Figure 8 depicts an example of the preprocessing filter applied. The Sun image has less variation. Thus, the binarization phase can be performed with more precision. However, some areas of the filament are obliterated by the smoothing process. The final result of the MD method is in Figure 9.
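As an illustration of the morphological closing used by the MD method, the sketch below applies a dilation followed by an erosion to a binary mask. It is a generic textbook closing under stated assumptions, not the operator set of [11]: the structuring element `HORIZONTAL` is a hypothetical three-pixel line, the element is assumed symmetric, 1 marks foreground, and the mask is zero-padded so that pixels on the image border are handled like interior ones.

```python
def dilate(mask, element):
    """Set every pixel reachable from a foreground pixel through the element."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if mask[y][x]:
                for dx, dy in element:
                    if 0 <= x + dx < w and 0 <= y + dy < h:
                        out[y + dy][x + dx] = 1
    return out


def erode(mask, element):
    """Keep a pixel only if the whole element fits inside the foreground."""
    h, w = len(mask), len(mask[0])
    return [
        [
            int(all(0 <= x + dx < w and 0 <= y + dy < h and mask[y + dy][x + dx]
                    for dx, dy in element))
            for x in range(w)
        ]
        for y in range(h)
    ]


def closing(mask, element):
    """Dilation followed by erosion, computed on a zero-padded copy."""
    pad = max(max(abs(dx), abs(dy)) for dx, dy in element)
    w = len(mask[0])
    blank = [0] * (w + 2 * pad)
    padded = ([blank[:] for _ in range(pad)]
              + [[0] * pad + list(row) + [0] * pad for row in mask]
              + [blank[:] for _ in range(pad)])
    closed = erode(dilate(padded, element), element)
    return [row[pad:pad + w] for row in closed[pad:pad + len(mask)]]


HORIZONTAL = [(-1, 0), (0, 0), (1, 0)]  # hypothetical structuring element
mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 0, 0, 0],
]
closed = closing(mask, HORIZONTAL)  # the one-pixel gap in the middle row is filled
```

Applying a set of structuring elements, as the text describes, could be done by calling `closing` once per element; how the original method combines the results is not specified here.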
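The weighted sum of Equation 3 with the Gaussian coefficients of Equation 4 can be sketched as below. Several points are assumptions rather than details confirmed by the text: the term |dx + dy|² is read as the squared length of the displacement vector, dx² + dy²; the weights are normalised to sum to 1; the neighbourhood includes the pixel itself; and neighbours outside the image are skipped. The sketch reproduces only the Gaussian weighting. The edge-preserving behaviour that the paper attributes to the graph interpretation is not modelled. The names `gaussian_weights`, `smooth` and `SQUARE_3X3` are hypothetical.

```python
from math import exp


def gaussian_weights(displacements, sigma):
    """One weight per displacement (dx, dy), normalised to sum to 1."""
    raw = [exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma))
           for dx, dy in displacements]
    total = sum(raw)
    return [r / total for r in raw]


def smooth(image, displacements, sigma):
    """J[x, y] = sum_i w[i] * I[x + dx[i], y + dy[i]] over in-bounds neighbours."""
    h, w = len(image), len(image[0])
    weights = gaussian_weights(displacements, sigma)
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = used = 0.0
            for (dx, dy), wi in zip(displacements, weights):
                qx, qy = x + dx, y + dy
                if 0 <= qx < w and 0 <= qy < h:
                    acc += wi * image[qy][qx]
                    used += wi
            # Renormalise by the weights actually used near the borders.
            out[y][x] = acc / used if used else float(image[y][x])
    return out


# Hypothetical 3x3 neighbourhood, centre pixel included.
SQUARE_3X3 = [(dx, dy) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
weights = gaussian_weights(SQUARE_3X3, 1.0)
flat = smooth([[5, 5, 5], [5, 5, 5], [5, 5, 5]], SQUARE_3X3, 1.0)
spike = smooth([[0, 0, 0], [0, 9, 0], [0, 0, 0]], SQUARE_3X3, 1.0)
```

A constant image is left unchanged by the filter, while an isolated bright pixel is spread over its neighbours, which is the smoothing effect the text describes for homogeneous areas.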
After that, it splits the image and sends each slice to the corresponding MPI process through message exchange; each slice is further processed using OpenMP.
Finally, it rebuilds the final image, detects split filaments and writes the image to the hard disk. To detect a split filament, we have to consider that, when one filament has pixels in two different MPI processes, those pixels should belong to a single filament.
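The splitting step can be sketched as a computation of slice boundaries. This is a sequential illustration that omits the MPI message exchange itself. It assumes a row-wise split, and `overlap` is a hypothetical parameter for the number of extra rows a slice shares with each neighbouring slice, so that neighbourhood-based filters have the pixels they need at slice borders.

```python
def split_rows(height, parts, overlap):
    """Return one (start, stop, core_start, core_stop) tuple per slice.

    Rows [core_start, core_stop) belong to the slice; rows [start, stop)
    are the rows it receives, including the overlap with its neighbours.
    """
    base, extra = divmod(height, parts)
    slices = []
    core_start = 0
    for rank in range(parts):
        core_stop = core_start + base + (1 if rank < extra else 0)
        start = max(0, core_start - overlap)
        stop = min(height, core_stop + overlap)
        slices.append((start, stop, core_start, core_stop))
        core_start = core_stop
    return slices


slices = split_rows(height=10, parts=4, overlap=1)
```

Keeping the core range separate from the received range lets each process filter its whole slice but write back only the rows it owns when the final image is rebuilt.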
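One way to reunite a filament whose pixels were labelled by two different processes is a union-find pass over the rows on each side of a slice border. The sketch below is an assumption about how such a merge could work, not the method used in the paper. Labels are hypothetical `(process, local_label)` tuples, `None` marks background, and pixels are treated as connected across the border when they touch vertically or diagonally.

```python
def find(parent, label):
    """Follow parent links to the representative label (with path halving)."""
    while parent[label] != label:
        parent[label] = parent[parent[label]]
        label = parent[label]
    return label


def merge_across_border(upper_row, lower_row):
    """Map every label on the two border rows to a shared representative.

    upper_row is the last row owned by one process and lower_row is the
    first row owned by the next one; both have the same width.
    """
    parent = {}
    for label in list(upper_row) + list(lower_row):
        if label is not None:
            parent.setdefault(label, label)
    width = len(upper_row)
    for x, a in enumerate(upper_row):
        if a is None:
            continue
        for nx in (x - 1, x, x + 1):
            if 0 <= nx < width and lower_row[nx] is not None:
                ra, rb = find(parent, a), find(parent, lower_row[nx])
                if ra != rb:
                    parent[max(ra, rb)] = min(ra, rb)
    return {label: find(parent, label) for label in parent}


upper = [("p0", 1), None, None, ("p0", 2)]
lower = [None, ("p1", 1), None, None]
mapping = merge_across_border(upper, lower)
```

Here the filament labelled `("p1", 1)` touches `("p0", 1)` diagonally and is mapped to it, while `("p0", 2)` has no neighbour across the border and stays separate.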
In the step called “analyze the image header”, image features such as bits per pixel, width and height are extracted. In the “Split image” step, calculations are made to define the segment size, the segmentation gap according to the filter to be applied (notice the white stripes in Figure 11) and the area that each process should handle (since the gap areas are shared between two adjacent MPI processes). After this step, the segments and their information are sent to a process node using MPI.
Figure 11 shows an example where we use four MPI processes. Each rectangle represents an area of operation for an MPI process, while the white stripes depict the overlapping border regions for each segment; these overlapping regions should be sent to both processes that share the border. This overlapping is required by image processing techniques that use neighborhood pixels [4], and it is a common optimization for minimizing the number of messages between the MPI nodes.
In addition to dividing the image between the MPI tasks, the program calculates the distance between each point and the center point of the Sun. If this distance is greater than a threshold, the point is not processed. This kind of calculation is possible in this scenario because the Sun always appears in the same position in different images.
5. RESULTS
The cluster used in the tests is equipped with 5 nodes, each of them with four 3.0 GHz Power 720 processors (32 cores per node), 12 gigabytes of RAM and an InfiniBand network to interconnect the nodes. It uses the AIX 6.1 operating system, IBM XL C compiler version 1.11, MPI library version 2.1, OpenMP version 3.0 and CFITSIO version 3.29 [6].
We used five images produced by the Big Bear Observatory for the tests. These images of the Sun were obtained in January 2012. For each image, we manually built a “ground truth” image, in which we marked the area of each filament with white dots, and we used these images as references. Figure 12 shows an example of a ground truth image.
In 70% of the cases where a filament appears on the Sun, a coronal mass ejection (CME) occurs [5]. A CME is one cause of solar storms. In this sense, a method that detects filaments at a rate close to this percentage may be considered effective.
Table 1 contains the hit rate for each processed image, as well as the rates of false negatives (FN) and false positives (FP), when compared with the ground truth. A 100% hit occurs when the image processed by a method is equal to the ground truth.
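A pixel-by-pixel comparison against the ground truth can be sketched as follows. The exact normalisation used in the paper is not stated, so the code assumes that the three rates are percentages of the pixels marked in either image. That choice is consistent with each row of Table 1 summing to 100% and with a 100% hit meaning that the two images are equal. The name `detection_rates` is hypothetical, and non-zero pixels are taken to mark filaments in both masks.

```python
def detection_rates(detected, truth):
    """Return (hit, false_negative, false_positive) percentages.

    Assumption: rates are normalised by the number of pixels marked as
    filament in the detection, in the ground truth, or in both.
    """
    hits = misses = false_alarms = 0
    for detected_row, truth_row in zip(detected, truth):
        for d, t in zip(detected_row, truth_row):
            if d and t:
                hits += 1
            elif t:
                misses += 1
            elif d:
                false_alarms += 1
    total = hits + misses + false_alarms
    if total == 0:
        return (100.0, 0.0, 0.0)  # both masks empty: identical images
    return tuple(100.0 * c / total for c in (hits, misses, false_alarms))


truth = [[1, 1, 1, 1, 0]]
detected = [[1, 1, 1, 0, 1]]
rates = detection_rates(detected, truth)
```

In this toy case three of the five marked pixels agree, one ground-truth pixel is missed and one detected pixel is spurious.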
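Table 1: Hit, false negative (FN) and false positive (FP) rates per image for the FDD and MD methods
          FDD                  MD
          Hits   FN    FP      Hits   FN    FP
Image 1   78%    21%   1%      50%    50%   0%
Image 2   72%    24%   4%      57%    41%   2%
Image 3   83%    16%   1%      58%    42%   0%
Image 4   87%    10%   3%      56%    40%   4%
Image 5   80%    17%   3%      70%    29%   1%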
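The summary statistics for the hit-rate columns of Table 1 can be recomputed with a few lines. The sketch assumes that the variation coefficient is the sample standard deviation divided by the mean. The variable and function names are hypothetical.

```python
from statistics import mean, stdev

# Hit rates (%) copied from Table 1.
fdd_hits = [78, 72, 83, 87, 80]
md_hits = [50, 57, 58, 56, 70]


def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the mean."""
    return 100.0 * stdev(values) / mean(values)


fdd_mean = mean(fdd_hits)                     # 80
fdd_cv = coefficient_of_variation(fdd_hits)   # about 7%
md_mean = mean(md_hits)                       # 58.2
```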
The average accuracy of the FDD method is 80% with a variation coefficient of 7%. The mean false negative rate of the FDD method was 17% and the mean false positive rate was 2%.
The mean accuracy of the MD method was 58% with a variation coefficient of 14%. The mean false negative rate of the MD method was 40% and the mean false positive rate was 1%.
Table 2 contains the hit rates for each processed image, considering that at least one point of the filament is detected.
Table 2: Results of detections in algorithms considering that at least one point of the filament is detected
Figure 13: Time used to process the FDD method using 4, 8, 32 and 64 OpenMP threads.
According to Figure 13, using 1 to 3 nodes, the performance with 64 OpenMP threads (the line with ×) or 32 OpenMP threads (the line with △) is better than with 8 OpenMP threads. However, when more than 3 nodes are used, the performance with 8 OpenMP threads is better than with 32 or 64 OpenMP threads.
Figure 14 shows one case, using 4, 8, 32 and 64 OpenMP threads per MPI process and 1 to 5 nodes, for the MD method.
Figure 14 shows that the performance using 32 OpenMP threads (the line with △) is better than with 4, 8 and 64 OpenMP threads when we use up to 3 nodes. However, when 4 or 5 nodes are used, the performance with 8 OpenMP threads is better than with 4, 32 or 64 OpenMP threads.
The behavior presented in Figures 13 and 14 is related to the size of the image segment that each MPI process receives to compute. If the received segment is large, the OpenMP threads can process it with better parallelism. However, when the image segment is small, the creation of the OpenMP threads adds more overhead.
The graph in Figure 15 shows the average time to process the five images with the FDD method when using 1 to 5 nodes to compute each image. Each node hosts 8 MPI processes, with 8 OpenMP threads for each process. When a single thread was used, the FDD method ran for 47 minutes. Therefore, the FDD method obtained a 326-fold speed-up with 5 nodes.
The speed-up was calculated from the execution time when the program uses a single thread and the execution time when the program uses all 5 nodes, each node with 8 MPI processes and each MPI process with 8 threads.
Figure 15: Time used to process the FDD method.
The graph in Figure 16 shows the average time to process the five images with the MD method when using 1, 2, 3, 4 and 5 nodes to compute each image. Each node uses 8 MPI processes with 8 OpenMP threads each. When a single thread was used, the MD method ran for 75 seconds. This means that the MD method obtained a 54-fold speed-up with 5 nodes.
We note that both methods can detect or, at least, report the existence of the filaments, despite the considerable variation in hit rates.
The FDD method has a set of parameters (e.g. pixel radius, number of closest pixels) that should be configured so that the detection is made according to the user's needs. For the MD method, on the other hand, the structuring elements have to be configured.
Therefore, the MD method could be used to evaluate whether there are filaments on the Sun and whether they are occurring at a high rate. The FDD method, in turn, can be used for the categorization of filaments.
6. CONCLUSION
Solar filaments are important because they are strongly related to CME occurrences. In this paper, we presented two detection methods and a parallel solution for hybrid shared and distributed memory architectures to improve the performance of the detection methods.
Detecting filaments with traditional sequential programming is slow because of the huge amount of data in the images. The data volume depends on the image resolution, image quantization and other features.
With the FDD method, we achieved a 326-fold speed-up over single-threaded execution, with an 80% hit rate (comparing pixel by pixel with the ground truth) and 94% accuracy (checking whether each filament is detected) on a 5-node cluster. With the MD method, a 54-fold speed-up was achieved with a 58% hit rate and 66% accuracy on a 5-node cluster.
The number of OpenMP threads should take into account the size of each image segment to be processed. When the number of segments increases, the segment size decreases. Therefore, the number of OpenMP threads should be proportional to the segment size, in order to minimize process context switching and the time spent on thread creation. We should highlight that the number of OpenMP threads should be proportional to the number of cores within the processor. Also, the number of MPI processes should be the same as the number of nodes in the cluster.
In the future, we will work on improving the proposed solution to detect other Sun features, using other kinds of images (like extreme ultraviolet or far ultraviolet), in several frequencies. We are targeting Sun features like sunspots, faculae, and flares.
Moreover, we will run more tests with different ratios between memory bandwidth, memory size, number of processes and number of threads per process to determine the overall scalability of the system.
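As a cross-check of the reported speed-ups, the parallel run times they imply can be derived from the single-thread times quoted in the text. This is plain arithmetic on the published figures (47 minutes and 326-fold for FDD, 75 seconds and 54-fold for MD). The function name `parallel_time` is hypothetical.

```python
def parallel_time(sequential_seconds, speed_up):
    """Run time implied by speed_up = sequential time / parallel time."""
    return sequential_seconds / speed_up


fdd_seconds = parallel_time(47 * 60, 326)  # FDD: 47 minutes on one thread
md_seconds = parallel_time(75, 54)         # MD: 75 seconds on one thread
```

Under these figures, the 5-node runs take roughly 8.65 seconds for FDD and 1.39 seconds for MD per image, on average.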
7. ACKNOWLEDGEMENTS
The authors gratefully acknowledge the financial support from CAPES, which provided the grant for the first author, and from FAPESP projects 2010/50646-6 and 2011/00861-0, which provided computing and research facilities. The authors also thank the Big Bear Solar Observatory at the New Jersey Institute of Technology for the images, and Ana Lucia Varbanescu, Ph.D., for the comments on this paper.
8. REFERENCES
[1] G. B. Berriman and S. L. Groom. How will astronomy archives survive the data tsunami? Communications of the ACM, 54(12):52–56, Dec. 2011.
[2] A. Falcão, J. Stolfi, and R. de Alencar Lotufo. The image foresting transform: theory, algorithms, and applications. IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(1):19–29, Jan. 2004.
[3] J. Gao, H. Wang, and M. Zhou. Development of an automatic filament disappearance detection system. Solar Physics, 205:93–103, 2002. doi:10.1023/A:1013851808367.
[4] R. C. Gonzalez and R. E. Woods. Digital image processing. Prentice-Hall, 3rd edition, 2007.
[5] L. K. Harra. Explosive events on the sun. Philosophical Transactions of the Royal Society of London. Series A: Mathematical, Physical and Engineering Sciences, 360(1801):2757–2771, 2002.
[6] W. Pence. CFITSIO, v2.0: A New Full-Featured Data Interface. In D. M. Mehringer, R. L. Plante, and D. A. Roberts, editors, Astronomical Data Analysis Software and Systems VIII, volume 172 of Astronomical Society of the Pacific Conference Series, page 487, 1999.
[7] W. D. Pence, L. Chiappetti, C. G. Page, R. A. Shaw, and E. Stobie. Definition of the Flexible Image Transport System (FITS), version 3.0. A&A, 524:A42, 2010.
[8] M. Qu, F. Y. Shih, J. Jing, and H. Wang. Automatic Solar Filament Detection Using Image Processing Techniques. Solar Physics, 228:119–135, May 2005.
[9] R. Rabenseifner. Hybrid parallel programming on HPC platforms. In 5th European Workshop on OpenMP, pages 185–194, 2003.
[10] R. Rabenseifner, G. Hager, and G. Jost. Hybrid MPI/OpenMP parallel programming on clusters of multi-core SMP nodes. In 17th Euromicro International Conference on Parallel, Distributed, and Network-based Processing, pages 427–436, May 2009.
[11] F. Shih. Image Processing and Pattern Recognition: Fundamentals and Techniques, chapter Solar image processing and analysis, pages 496–533. Wiley-IEEE Press, 1st edition, 2010.