Professional Documents
Culture Documents
Prediction of The Perceived Quality of Streak Distortions in Offset Printing With A Psychophysically Motivated Multi Channel Model
Prediction of The Perceived Quality of Streak Distortions in Offset Printing With A Psychophysically Motivated Multi Channel Model
Prediction of The Perceived Quality of Streak Distortions in Offset Printing With A Psychophysically Motivated Multi Channel Model
To cite this article: Konrad Gadzicki & Christoph Zetzsche (2013) Prediction of the perceived
quality of streak distortions in offset-printing with a psychophysically motivated multi-channel model,
Journal of Modern Optics, 60:14, 1167-1175, DOI: 10.1080/09500340.2013.809162
Article views: 24
The evaluation of printing machines poses the problem of how distortions like streaks caused by the machine can
be detected and assessed automatically. Although luminance variations in prints can be measured quite precisely, the
measured functions bear little relevance for the lightness of streaks and other distortions of prints as perceived by
human observers. First, the measurements sometimes indicate changes of luminance in regions which are perceived
as homogeneous by humans. Second, the measured strength of a distortion correlates often weakly with its perceived
strength, which is influenced by a variety of factors, like the shape of a streak’s luminance profile and the distribution
of luminance variations in its spatial surround. We have used a model of human perception, based on fundamental
neurophysiological and psychophysical properties of the visual system, in order to predict the perceptual strength of
streaks (i.e. the distortion as perceived by a human observer) from the measured physical luminance signal. For the
evaluation of the model, tests with naive and expert observers have been conducted. The results show that the model
yields a good correlation (> 0.8) to the assessments of human observers and is thus well suited for use in an automatic
evaluation system.
Keywords: image quality; offset printing; human visual system; quality control
system (HVS). This prompted the present investigation and Hardware and software for scanning and preprocessing
the associated development of the model suggested here. of the scans in the current study was provided by SID.
Several image quality models based on the properties of The scanner is a RGB scanner by Colortrac which has
the human visual system have been developed in the past been calibrated in a similar way to ICC. The reference
[3–10]. These models commonly incorporate major prop- image consisted of cyan patches of varying density since
erties like luminance invariance, sensitivity to frequency this is the only relevant colour for the purpose of evaluation.
and orientations, and masking effects. Early models used a The CIELAB values of the patches were measured with a
pointwise non-linearity or explicit gain modifications by a colorimeter and in a second step a quadratic function was
masking signal to account for masking effects. More recent fitted to adjust the transformed scanner’s elements’ RGB
models of the visual system used divisive inhibition pooling values to match the measured CIELAB values.
acting over both spatial positions and frequency-selective The prints were scanned twice in opposite direction in
channels [11]. The model suggested here takes these devel- order to counter systematic bias from sampling in one direc-
opments into account. tion. The prints contain markers which allow one to match
In addition to the aforementioned systems, there exists a the images after the scan process. The matching was not
large number of approaches which are not solely based on done with sub-pixel accuracy since this level of accuracy is
the HVS. These approaches include feature based scales, not required for detection of streaks which are much more
e.g. [12,13], multi-dimensional impairment scales, e.g. [14], coarse. The images are then preprocessed by de-screening
a spatial extension to CIELAB [15], structural similarity and correction of row and column shading, averaging over
index [16,17], image information fidelity [18] or most ap- several pixels. A side effect of the preprocessing is that two-
parent distortion [19]. Such approaches can be of special dimensional signal variations are reduced in adaption to the
advantage if one model should be applied to heterogeneous one-dimensional nature of the streak patterns.
stimulus domains. See [20] for a recent comparison of a
number of different approaches. Beside the problem of mod- 2.2. Model
elling human perception, there are also cognitive issues On the top level, the model processes two inputs in parallel.
which can influence human judgements [21]. Such issues As shown in Figure 2, signal + background and background
can be hoped to be not as critical in the specific context of alone are processed in the same manner by a system ex-
the present investigation, where trained specialists operate plained in detail below. Streaks and other distortions are
in a quite restricted domain. seen as the signal, while the background contributes to
masking effects on this signal, e.g. by high-contrast lu-
minance edges at the print borders. A local vector norm
2. System and model between the multi-channel outputs of the two pathways
The model is intended as part of a future system for the au- determines the final model output and thus the perceived
tomatic evaluation of prints produced by SID (Sächsisches strength of a local distortion. The system used for both
Institut für die Druckindustrie). It is designed to work with pathways in Figure 2 is shown in detail in Figure 3. Its first
scanned images of printings, but can also be adapted to other stage is a luminance adaptivity stage, where a multi-scale
input sources (e.g. densitometers). presentation of local contrast is computed by a nonlinear
filter decomposition. In the next stage frequency- and
orientation-selective linear band-pass filters are applied,
2.1. Input and devices leading to a further decomposition of each scale into its
The test patterns used for the evaluation of a print machine orientations. Finally, the filter outputs are normalized by
are homogeneous prints in a specific base colour, usually local gain control mechanisms which pool over space, scales
cyan, which are printed with an accurately defined machine and orientations.
configuration. Figure 1 shows an example of a print with
streak distortions.
Figure 3. Overview over the system (as used for both pathways in Figure 2). From the input (1), the contrast is computed by non-linear
ROG filters (2) and passed to a set of frequency- and orientation-selective linear filters (3). The outputs are then passed through gain
control mechanisms (4). One of these is shown in detail on the right-hand side. Hence each channel is normalized by spatial pooling over
the other channels.
For the evaluation of the model, those parameters of pass inputs with different cut-off frequencies resulting in a
the model which are expected to be specifically related to non-linear band-pass output (1).
the task were fitted with Particle Swarm optimization [22].
−1 x 2 + y2
Other parameters, like the filter bandwidths for the ROG l(x, y) ∗ 2π σi2 exp −
and Log-Gabor filters, and the orientation selectivity were 2σi2
gi+1 (x, y) = .
set to the typical values known from psychophysical and −1 x + y2
2
2
l(x, y)∗ 2π σi+1 exp − +c
neurobiological research (see below). The contrast sensi- 2
2σi+1
tivity function was fitted indirectly by the Log-Gabor filter (1)
amplitudes for different scales. A further parameter that has
been fitted is the exponent of the vector norm. In the model the sigma ratio is σi+1 = 2σi , starting with
The stages of the model are now described in detail. σ0 = 8 pixel. The constant c is set to 2.0. Figure 4 illustrates
the response of the operator. For a computationally efficient
2.2.1. Luminance invariance implementation a pyramid scheme, similar to that in the
Laplacian pyramid [23], is used to build the nonlinear rep-
The first step in the model is to transform the absolute resentation of contrast, decomposed into five spatial scales.
luminance intensities. According to Weber’s Law, the cru-
cial variable for discrimination of luminance variations is
contrast and not the absolute difference of luminance. Con-
trast is defined as the ratio between luminance and (local) 2.2.2. Frequency- and orientation-selective filters
background (average luminance of the surrounding). The The majority of cells in early visual cortex are so-called
model is intended for luminance as input but it can also be Simple and Complex Cells which are responsive to patterns
used with the L ∗ channel from a signal in CIELAB colour with a specific orientation and size. Orientation selectivity
space so that the lightness values correlate already better was the major finding of Hubel and Wiesel [24–26], and
with human perception. The current data set provides such spatial-frequency selectivity was measured later in cats [27]
L ∗ values. Although Weber’s Law is to a certain degree and primates [28]. Mathematically the receptive field of
incorporated in the L ∗ scale, our generalized model incor- such cells can be best described by Gabor functions [29],
porates a local contrast adaptation stage which is able to which offer the optimal resolution and selectivity in order to
model local effects that cannot be captured by the global represent the response of V1 cells [30,31]. The 2D kernel (2)
normalization of the CIELAB scale. of a Gabor filter consists of a complex sinusoid (3) weighted
The model uses the Ratio of Gaussian (ROG) opera- by a Gaussian function (4):
tor [3] in order to calculate the contrast values. As the
name suggests, the ROG is a divisive operation of two low- h(x, y) = g(x, y)s(x, y) (2)
1170 K. Gadzicki and C. Zetzsche
Figure 4. Response of a ratio of Gaussian operator to luminance step edges. (a) Input. (b) The ROG output represents luminance contrast.
(c) Linear DOG response shown for comparison.
⎛ 2 ⎞
log ρρ0 (φ − φ0 )2 (φ − φ0 − π )2
⎜ ⎟
Hlog (ρ, φ) = i exp ⎝−
k
⎠ − exp − −1 exp
k
(6)
2(log σρ )2 2σφ2 2σφ2
[35]. Psychophysical experiments investigating the contrast 0.5 steps with 6 meaning a severe distortion and 1 a barely
response function of neurons from cats and monkeys [36] visible one. 0 was reserved for undetected distortions, thus
introduced the Naka–Rushton function, a hyperbolic ratio, it could not be chosen by the participants directly but was
as the best fitting description for neural response to contrast. assigned during the evaluation in the case that participants
Further investigations of the response properties of neurons did not see particular distortions in certain trials. The scale
revealed a pooling effect [37–39]. Masking effects can thus used here is reversed in comparison to the EBU scale and is
be explained by a suppressive signal derived by pooling over actually the one used historically by the printing industry.
neurons, a mechanism commonly referred to as Cortical
Gain Control. It is modelled by a divisive pooling over
space and channels (7) and was included into recent visual 3.2. Experiments with naive observers
system models, e.g. [11]. In the model pooling is done over The experiments with naive observers used patterns pre-
neighbouring scales and over all orientations of these scales. sented on-screen only, in order to establish full control of
Pooling over space is performed by a convolution with a the presentation and eliminate any additional sources of
low-pass filter kernel. noise which are inevitably introduced in the full process
rs,o (x, y) p of printing and scanning of patterns. The patterns were
r̄s,o (x, y) = s+1 6 . (7) presented on analogue screens with analogue input which
cq + s̄=s−1 ō=1 (rs̄,ō ∗ ks̄,ō )(x, y)
q
allowed for a luminance resolution equivalent to roughly
The responses r are the outputs of linear filters, indexed by 10 bits. The set of patterns consisted of 30 images with
s for scale and o for orientation. As for the exponents used a total of 75 streak distortions. For each participant three
by the model, p = 2.4622 was returned by the optimization sessions were conducted. The whole set was presented dur-
and q = 2 was fixed. The constant c defines the point where ing each session. The participants marked the position of a
saturation begins which was set to 0.5 in the model. The distortion by clicking with the mouse and entered the level
convolution operation is denoted by ∗ with k(x, y) being of distortion (grade) with the keyboard. The distortions for
the kernel. The model uses a Gaussian low pass kernel the whole set were generated in a random fashion. This
with 0.0625 as cut-off frequency (Fourier domain based randomized set was used for all participants and sessions.
convolution). The generation parameters included the number, position
and type (Gauss or sharp edge) of distortions, width of the
distortion, width of the edge, and contrast.
2.2.4. Vector norm
The participants were students with no prior involvement
The final result is the local vector norm (Minkowski with the printing industry (except for being exposed to
distance) between the two multi-channel pathways: printed products like books, magazines, etc. in their daily
1/ p life). They had no experience with an evaluation of printings
d= |r̄signal+background − r̄background | p . (8) task as required in this experiment. In total 21 participants
The optimization returned p = 2.5055 as the best fitting participated in the experiment. Six of them were excluded
exponent. from the evaluation because they either did not finish all
sessions or had a significantly higher deviation of responses
from the average.
3. Methods
For the purpose of data acquisition, several experiments
have been run. The first experiment employed naive ob- 3.3. Experiments with experts from the printing industry
servers with an on-screen display of distorted patterns in This experiment involved evaluation of real printings and
order to establish a baseline for the model in a noise-free was done by experts from various companies from the print-
environment. The only source of noise in this setup was ing industry (Heidelberger Druckmaschinen, Koenig &
effectively perceptual noise of the observers. The second set Bauer, Manroland). In total, 52 printings were specially pro-
of experiments were evaluations of real printings conducted duced for the purpose of this experiment. Among them, 36
by experts from the printing industry. In this setup several contained artificially generated distortions while the
sources contributed to the noise of the system, namely the remaining 16 printings contained realistic distortions pro-
printing process itself, the scanner system, and perceptual duced by an imprecisely configured printing machine. In
noise of the observers. the case of the artificial patterns, contrast, position, width
of the distortion, and width of edges were varied.
The evaluation process was conducted under standard-
3.1. Scale ized conditions according to [41]. It requires the printed
Participants rated distortions on a qualitative grade scale matter to be illuminated by a standard illuminant D50, the
similar to the EBU scale used in television picture quality surround and backing of the matter to be neutral and matt,
assessment [40]. The grades used here ranged from 0 to 6 in and the visual environment to be arranged in a way that
1172 K. Gadzicki and C. Zetzsche
Figure 5. Correlation between model and inter-individually averaged data for (a) naive observers and (b) experts. (The colour version of
this figure is included in the online version of the journal.)
Figure 6. Correlation between model and inter-individually averaged assessments for ratings containing at least one minimum grade
response (a) and for ratings not containing the minimum grade (b). (The colour version of this figure is included in the online version of
the journal.)
4.3. Patterns at threshold of perception 4.3.1. Correlation between assessments and model with
Printings are generally noisy with regard to the luminance varying percentage of near-threshold distortions
signal. Therefore many weak streaks are difficult to detect The correlations between model and participants’ assess-
even for trained observers. This is illustrated by the fact that ments for the whole set, including the afore-mentioned dis-
naive observers who have performed the assessment task tortions near the threshold of perception, have already been
with realistic printings found approximately three streaks presented in Figure 5(b). We now analyse the influence of
per print whereas the experts found up to 25. However, those near-threshold responses on the overall performance.
almost all additional streaks found by the expert group Subsets of the data with a varying percentage of near-
were rated with 1 (barely visible).1 These distortions pose a threshold responses have been investigated with regard to
problem. While the model can generate output lower than 1 their effect on the correlation of the model predictions.
(if the signal is weak), the observers are stuck with the 1 as The criterion for assigning a particular streak to the near-
the lowest grade available (0 is reserved for non-detected threshold group was the percentage of minimum grades
streaks). Even though the signal would require a lower given to that streak. As the minimum grade was effectively
grade, the scale does not offer it and a sort of floor effect given to streaks which were barely visible or not seen at
becomes visible. all by some observers, this criterion offers a flexible way to
1174 K. Gadzicki and C. Zetzsche
determine the subset of near-threshold streaks. The strictest which require a graded evaluation the suggested model of
version of the criterion is to mark a streak as near-threshold the human visual system seems quite useful. It thus appears
if at least one minimum grade has been assigned to it by well suited as part of a system for the automatic evaluation
one of the participants. In subsequent plots assessments of of printings.
distortions are plotted in red if they contain a minimum
grade response.
Figure 6(a) shows the subset consisting of inter- Acknowledgements
individually averaged assessments, which received at least This work was supported by NCT Bremen, Sächsisches Institut
one minimum grade response. As one can see, the appar- für die Druckindustrie Leipzig and Forschungsgesellschaft Druck-
maschinen e.V. Frankfurt am Main. The authors would like to
ent variance of the model output is quite high, which is thank the accompanying working group of specialists (FGD Ar-
in correspondence to the aforementioned hypothesis that beitskreis Streifenmessung) for the fruitful discussions. Parts of
in this regime different signal levels are mapped to the this work has been presented at the 18. Workshop of the German
minimum grade. The omission of this subset, i.e. of streaks Color Group in Darmstadt, Germany in 2012.
which have received at least one minimum grade from the
evaluation, leads to an increase in the correlation for the
averaged data from 0.81 to 0.88, as shown in Figure 6(b). Note
Note that although the overall dynamic range of the data 1. Since the locations of the streaks are marked, they will typi-
is reduced, the correlation increases. This corroborates our cally receive a grade 1, even if critical testing without mark-
ings would presumably reveal that some participants are not
hypothesis that the excluded subset is not well behaved.
able to see them.
5. Conclusion
References
We have presented a model for the automatic prediction of [1] Handbuch zur technischen Abnahme von Bogenoffset-
the perceived strength of streak distortions in offset print- Rollenoffsetmaschinen; Technical Report for Bundesver-
ing. The model is based on recent neurophysiological and band Druck und Medien: Frankfurt am Main, 1996.
psychophysical results. It can describe shape effects due to [2] Technische Richtlinien Abnahme von Bogenoffsetdruck-
the shape of the luminance function of a streak and masking maschinen; Technical Report for Bundesverband Druck und
Medien: Frankfurt am Main, 2005.
effects as caused by the configuration in its spatial surround. [3] Zetzsche, C.; Hauske, G. Proc. SPIE 1989, 1077, 209–216.
Assessments on a large number of streak patterns have [4] Zetzsche, C.; Hauske, G. Principal Features of Human
been collected in experiments with naive and expert ob- Vision in the Context of Image Quality Models. In IEEE
servers. We have then shown that the model shows a good 3rd International Conference on Image Processing and
correlation in predicting the perceived strength of these its Applications; IEEE Press: Piscataway, NJ, 1989; pp
102–106.
distortions. The differences between experiments on-screen [5] Daly, S. The Visible Differences Predictor: An Algorithm
and with prints can be attributed to different participants as for the Assessment of Image Fidelity. In Digital Images and
well as the different media used for presentation. It is further Human Vision: Watson, A.B., Ed.; MIT Press: Cambridge,
possible that mechanisms for adaptation of the visual system MA, 1993; pp 179–206.
might perform differently for self-luminous or illuminated [6] Lubin, J. The Use of Psychophysical Data and Models in
the Analysis of Display System Performance. In Digital
media. Since the L ∗ values differed from luminance values Images and Human Vision: Watson, A.B., Ed.; MIT Press:
only by a constant for the current data set, specific effects Cambridge, MA, 1993; pp 163–178.
on the predictions from use of the ROG mechanism are not [7] Teo, P.C.; Heeger, D.J. Perceptual image distortion. In IEEE
to be expected. This is supported by cross-checks between International Conference on Image Processing ICIP-94,
model fits from on-screen and print experiments, which Austin, TX, Nov 13–16, 1994; IEEE Press: Piscataway, NJ,
1994; Vol. 2, pp 982–986.
showed consistent results with the on-screen data being [8] Heeger, D.J.; Teo, P.C. A Model of Perceptual Image
luminance values and the print data L ∗ . Fidelity. In International Conference on Image Processing,
The prediction of distortions close to the perceptual thresh- Washington, DC, Oct 23–26, 1995; IEEE Press: Piscataway,
old proved to be problematic, resulting in relatively large NJ, 1995; Vol. 2, pp 343–346.
variations of the model responses for barely visible distor- [9] Westen, S.J.P.; Lagendijk, R.L.; Biemond, J. Perceptual
image quality based on a multiple channel HVS model.
tions. The removal of distortions which were rated with In 1995 International Conference on Acoustics, Speech,
the minimum grade increased the correlation of the model and Signal Processing ICASSP-95, Detroit, MI, May 9–
predictions substantially, although the dynamic range of the 12, 1995; IEEE Press: Piscataway, NJ, 1995. Vol. 4, pp
data has been reduced by this removal. If specifically a mod- 2351–2354.
elling of barely visible distortions is intended, a better ap- [10] Taylor, C.C.; Pizlo, Z.; Allebach, J.P.; Bouman, C.A. Proc.
SPIE 1997, 3016, 58–69.
proach would be to perform 2AFC threshold measurements [11] Watson, A.; Solomon, J. Opt. J. Soc. Am. A 1997, 14 (9),
instead of quality judgements, and use these for appropriate 2379–2391.
model fits in the threshold range. However, for applications [12] Miyahara, M. IEEE Commun. Mag. 1988, 26 (10), 51–60.
Journal of Modern Optics 1175
[13] Miyahara, M.; Kotani, K.; Algazi, V. IEEE Trans. Commun. [27] Campbell, F.W.; Cooper, G.F.; Enroth-Cugell, C. J. Physiol.
1998, 46 (9), 1215–1226. 1969, 203 (1), 223–235.
[14] Kayargadde, V.; Martens, J.B. J. Opt. Soc. Am. A 1996, 13 [28] De Valois, R.L.; De Valois, K.K.; Ready, J.; Blanckensee, H.
(6), 1178–1188. Assoc. Res., Vision Ophthalmol. 1975, 15, 16.
[15] Zhang, X.; Wandell, B.A. SID J. 1997, 5 (1), 61. [29] Gabor, D. J. Electr. Eng., Part III 1946, 93 (26), 429–441.
[16] Wang, Z.; Simoncelli, E.P.; Bovik, A.C. Multiscale structural [30] Marcelja, S. J. Opt. Soc. Am. 1980, 70 (11), 1297–1300.
similarity for image quality assessment. In The Thirty- [31] Daugman, J.G. Vision Res. 1984, 24 (9), 891–910.
Seventh Asilomar Conference on Signals, Systems & [32] Field, D.J. J. Opt. Soc. Am. 1987, 4 (12), 2379–2394.
Computers, Pacific Grove, CA, Nov 9–12, 2003; IEEE Press: [33] De Valois, R.L.; Albrecht, D.G.; Thorell, L.G. Vision Res.
Piscataway, NJ, 2004; Vol. 2, pp 1398–1402. 1982, 22 (5), 545–559.
[17] Wang, Z.; Bovik, A.C.; Sheikh, H.R.; Simoncelli, E.P. IEEE [34] De Valois, R.L.; Yund, E.W.; Hepler, N. Vision Res. 1982,
Trans. Image Proc. 2004, 13 (4), 600–612. 22 (5), 531–544.
[18] Sheikh, H.R.; Bovik, A.C. IEEE Trans. Image Proc. 2006, [35] Legge, G.E.; Foley, J.M. J. Opt. Soc. Am. 1980, 70 (12),
15 (2), 430–444. 1458–1471.
[19] Larson, E.C.; Chandler, D.M. J. Electron, Imaging 2010, 19 [36] Albrecht, D.G.; Hamilton, D.B. J. Neurophysiol. 1982, 48
(1), pp 011006–1-011006-21. (1), 217–237.
[20] Sheikh, H.R.; Sabir, M.F.; Bovik, A.C. IEEE Trans. Image [37] DeAngelis, G.; Robson, J.; Ohzawa, I.; Freeman, R.D. J.
Proc. 2006, 15 (11), 3440–3451. Neurophysiol. 1992, 68 (1), 144–163.
[21] de Ridder, H. J. Electron Imaging 2001, 10 (1), 47–55. [38] Geisler, W.S.; Albrecht, D.G. Vision Res. 1992, 32 (8), 1409–
[22] Kennedy, J.; Eberhart, R. Particle Swarm Optimization. In 1410.
Proceedings of ICNN’95 - IEEE International Conference [39] Heeger, D.J. Vision Neurosci. 1992, 9 (2), 181–198.
on Neural Networks, Perth, Australia, Nov 27-Dec 1; [40] Recommendation ITU-R BT.500-11, Methodology for the
IEEE Service Center: Piscataway, NJ, 1995; Vol. 4, pp Subjective Assessment of the Quality of Television Pictures;
1942–1948. Technical Report for International Telecommunication
[23] Burt, P.; Adelson, E. IEEE Trans. Commun. 1983, 31 (4), Union: Geneva 2002.
532–540. [41] ISO 3664:2009–04 Graphic Technology and Photography.
[24] Hubel, D.; Wiesel, T. J. Physiol. 1959, 148 (3), 574–591. Viewing Conditions; Technical Report for ISO: Geneva
[25] Hubel, D.; Wiesel, T. J. Physiol. 1962, 160 (1), 106–154. 2009.
[26] Hubel, D.; Wiesel, T. J. Physiol. 1968, 195 (1), 215–243. [42] Versuchsdurchführung bei der Bewertung von Druckbogen;
Technical Report for FOGRA: Munich 2010.