
Index

Note: Page numbers followed by f indicate figures, t indicate tables, b indicate boxes, and np indicate
footnotes.

A

Abbe’s diffraction limit, 224
Absolute category rating (ACR). See Single stimulus (SS) methodology
Acquisition system
  depth, 90
  Fraunhofer HHI camera system, 23–26, 26f
  Hasselt University multiview camera system, 28–29, 29f
  Nagoya University multiview camera system, 23, 26f
  Poznań University of Technology multiview camera system (linear rig), 27, 27f
  Poznań University of Technology multiview camera system (modular), 27, 28f
  texture, 89
Action-reward system, 124
Actor interaction model, 411–412
Adaptive Fisher Discriminant Analysis (AFDA), 376–377
Adaptive SDPC (ASDPC), 181–182, 182f
Advanced video coding (AVC), 58
Animation Framework eXtension (AFX), 97
Attentional modulation, 148–149
Attention-deficit hyperactivity disorder (ADHD), 294–295
Audiovisual alignment technique, 414–415
Autostereoscopic displays, 94

B

Backward projection, 50
Backward warping technique, 50
Ball detection system, 123
Barrel distortion, 33
Bayes’ rule, 121–122, 411
Belief propagation method, 43
Benjamini-Hochberg correction procedure, 204
Bernoulli variable, 128–129
Better vs. worse analysis, 201–202
Bidirectional reflectance distribution function (BRDF), 92
Bidirectional 3D communication, 84
BisQue
  analysis modules
    module execution system, 357–358
    pipeline support, 357
    Python and MATLAB scripting, 357
  Connoisseur classification module, 359–361
  Connoisseur service, 359–361
  core concepts
    documents or resources, 349–350
    document versioning, 352
    metadata graphs, 351
    modules, 350, 350f
    provenance, 351
    query system, 352
  core requirements, 347–349, 348f
  feature service, 359
  micro-services
    blob service, 355
    data service, 353
    image service, 356
    module service, 354, 354f
    scalability mechanisms, 353–354
    table service, 356
  sparse images, 358–359
  underwater images, annotations
    aggregation techniques, 362
    percent cover with 100 points, 361, 362f
    semantic segmentation, 362, 363f
    uniform percent cover classification, 362, 363f
Bonferroni correction procedure, 204
Bootstrapping, 187
Bottom-up attention, 116–117
BQR query language, 353
Bradley-Terry-Luce (BTL) model, 185–186
Brightness transfer function (BTF), 371
Brown-Conrady model, 32–33, 33f

C

Camera obscura, 213
Camera parameter estimation
  checkerboard pattern, 33f
  extrinsic parameters, 34, 35f
  file format, 35
  intrinsic camera parameters, 34
  lens distortion parameters, 34
Cameras, principles of, 213


Cascade-CNN, 240
Cauchy-Green strain tensor, 272
CAVIAR4REID dataset, 373–374, 374t
Center-surround, in temporal domain, 125
Charge-coupled device (CCD) camera, 90–91
Chien model, 318, 321f
Chromatic bilateral operator, 379
Cinematic, mixed reality light field editing, 84
Cinematic principles, 403–404
Cinematic VR, 4, 105
Closed world scenario model, 367
Coded aperture systems, 217, 218f
Codesign of optics, 214
Color correction, 36–37, 37f
Community assignment, 415–418
Community leaders, 418, 419t
Compression, 54–58
  monoscopic video coding, 58–62, 63f, 64t
  multiview video coding, 62–63
  simulcast coding, 58–62
  3D video coding, 63–65
Computational photography
  breaking precepts, 213
  depth of field (DoF), 217–219
  single photodetector, 215
  space-time bandwidth product, 216–217
  spatial multiplexing, 215
  form-factors and capabilities, 214
  tractable solutions to inverse problems, 214
Confidence intervals (CI), 177–178
Connoisseur-based deep learning technique, 362
Connoisseur classification module, 359–361
Connoisseur service, 359–361
Context influence factors, 165–166
  in QoE measurement, 169
Conventional cameras, 92, 92f, 97
Conventional 2D displays, 94
Correct decision, 198
Cortical magnification, 152
Covert attention, 115
Cross-view quadratic discriminant analysis (XQDA), 370
Cumulative distribution function (CDF), 178
Cumulative Match Characteristic (CMC) curve, 366–367

D

Datasets, eye-tracking, 133–138, 134–137t
David-Fletcher-Powell (DFP) formula, 282–283
Deep learning, 359–361, 360f
Defocus blur, 217
Deformation models, 275
  geodesics, 289–291, 291f
  transformation-based representations, 276–277
Degrees of freedom (DoF), 276
Depth accuracy, 19
Depth-based image rendering (DBIR), 76–77
  MVV plus depth, 99
Depth estimation
  global stereo matching, 41–43
  local stereo matching, 39–40, 40f
  passive depth estimation techniques, 38
Depth Estimation Reference Software (DERS), 35
  MPEG vs. belief propagation, 44f
Depth image-based rendering (DIBR), 5–7
  MPEG standardization, 12, 12f
  vs. point clouds, 9–12, 11f
  principles, 7–9
  view interpolation basic principles, 8f
Depth-invariant defocus blur, 218–219
Differential mean opinion scores (DMOS), 172
Directional information, 77
Direct scaling methods, QoE
  confidence intervals calculation, 177–178
  DSCQS, 174–175
  DSIS, 174
  mean scores calculation, 177
  MUSHRA, 175–176, 176f
  SAMVIQ, 175–176, 176f
  screening, 178
  SS, 173–174
Discrete camera array, 98–99
Discriminatively trained Part-based Model (DPM), 237–238
Disparity, 7–8, 18
  effect, 99
  inter-view prediction, 62f
  max-flow/min-cut approach, 42f
Disruptions, 140–141
Distance metric, 386–387
Distance metric learning, 371
Double Stimulus Continuous Quality Scale (DSCQS), 174–175, 175f
  scoring sheet for, 175f
Double Stimulus Impairment Scale (DSIS), 174
Dual Purkinje eye trackers, 130–131

Elastic registration
  landmark-based, 279–280
  re-parameterization problem, 280–281, 281f
Elastic shape analysis, 263, 273–274, 283, 295–297
Endogenous attention, 116
Energy entropy (EE), 407
Energy peak ratio (EPR), 407
Entropy/information maximization, 123–124
Epipolar plane images (EPIs), 46–48, 48f
Epsilon-Insensitive RMSE (RMSE*), 194
Equal error rate (EER), 144f, 146
Event, defined, 403–404
Exogenous attention, 116
Expectation Maximization (EM) algorithm, 315, 315b
Extrinsic camera parameters calibration, 31–32, 31f, 35f
Eye accommodation phenomenon, 102
Eye-tracking process
  datasets, for model validation, 133–138, 134–137t
  in disease detection, 150
  disruptions analysis, 140–141
  experiment, 130–133, 131–132f
  eye movement, 130, 131f
  saccades and fixations, 138–139
  saliency maps, 139
  scan-path generation, 140
  tele-surgery, 151
  testing, 141–147
  in training of medical personnel, 151

F

Face alignment, 240–241
Face detection
  candidates evaluation, 252–253, 253t
  cascade approaches, 240
  computational problem
    image-based regression, 237
    part-based model, 237
  contributions, 238–239
  face alignment, 240–241
  fitting 3D models
    from 2D annotations, 250–251
    rigid transformation, 250, 251b
  method nomenclature, 252
  multiview models, 239
  parameter sensitive classifiers, 241
    linear model, 247
    nonlinear model, 247–248, 248f
    optimization, 249–250
    training cost function, 248–249, 249f
  with pose estimation, 241
  results
    AFW dataset, 254, 254f
    design decisions, 254–255, 255f
    detection time, 256, 256–257f
    failure modes, 255, 256f
    FDDB dataset, 254
  3D models, 240
    energy model, 242
    face 3D model, 241
    face representation, 241, 242f
    inference algorithm, 242, 243f
  3D view-based models, 240
  training dataset, 251
Face recognizer, 402, 402f
False differentiation, 198
False tie, 198
Feature Selection with Annealing (FSA) classifier, 243
Fencing view interpolation, 6f
Field of light. See Light field
Film scene, 400–401
Fixation map, 139
Fixation shift, 124
Fixed parameterization, 269–270, 280–281, 281f
FlatCam, 220–222, 221f
Focus sweep, 218–219
Forward projection, 50
Forward warping technique, 50
Fourier ptychography (FP), 222–224, 223f
Four point congruent sets (4PCS) algorithm, 279
Frame-based animation mesh compression (FAMC), 97
Frame-based model, 76–77
Frame-compatible coding, 61, 61f
Free navigation
  free viewpoint home television, 85–86, 86f
  free viewpoint sports event, 85, 85f
  360 degree camera, 85
Frobenius norm, 272

G

Generalized multidimensional scaling (GMDS), 279–280
Geodesics, 268, 282–286
  computation algorithms, 290f
  deformations, 289–291, 291–292f
  pullback metrics, 282–283
  shape spaces and metrics, 284–286
  SRNFs space, 283–284
Geodesic subspaces, 287
Global stereo matching
  belief propagation, 43

Global stereo matching (Continued)
  competing costs, 41
  graph cut, 41–43, 42f
Graph-based methods
  foreground-background segmentation, 126–127
  graph flow techniques, 126
  graph spectral methods, 128
  random walk based, 127
  salient boundary and object identification, 127
Graph-based visual saliency (GBVS), 126
Graph cut technique, 41–43, 42f
Graph spectral methods, 128
Grenander’s pattern theory, 275

H

Head mounted device (HMD), 4–5, 94
Heat kernel signatures (HKS), 279–280
Hidden Markov Models (HMMs), 128
Hidden reference (ACR-HR), 173–174
High-definition (HD) video, 56
High efficiency video coding (HEVC), 48
  bitrate reduction, 66f
Holm-Bonferroni correction procedure, 204, 204b
Holografika Super-MultiView 3D display, 6f
Holograms, 90–91
Holographic displays, 95
Human influence factors, 166
  in QoE measurement, 169–170
Hybrid approaches, 145–147

I

Image-based interpolation techniques, 85
Image-based regression, 237
Image-based rendering (IBR), 5
Image processing
  multiple objects detection
    population evaluation, 330–334
    road network detection, 334–342
  probabilistic approaches
    MCMC simulations, 305–306
    MRFs (see Markov Random Fields (MRFs))
    optimization problem, 305–306
    point process approach (see Spatial point process)
    potentials, 305
Importance maps, 119–120, 120f
Incremental Coding Length (ICL) contribution, 123
Indirect scaling methods
  Bradley-Terry-Luce model, 185–186
  pair comparison matrix, 186–187
  paired comparison, 180–182
  ranking, 179–180
  Thurstone-Mosteller model, 183–185
Inference algorithm, 242, 243f
  candidate generation, 245
  face scores
    local difference features, 245–246
    local selected features (LSF), 246
    modified LBF features, 246
    score function, 247
    special features, 246
  keypoint detection, 243, 243f
  nonmaximal suppression, 247
  3D pose candidates
    ground truth 3D poses, 244
    image-based regression, 244
    training examples, 244–245
Inferred social network. See Social network inference
Information and decision-theory models
  action-reward based, 124
  entropy/information maximization, 123–124
Infrared cameras, 90
Inpainting, 52–53
Interactive all-reality
  augmented reality surveillance with light field editing, 88
  remote surgery with glasses-free 3D display, 86–88, 87f
  surveillance with depth recovery, 86, 87f
  VR training, 88
Intrinsic parameters, 30–31, 31f
Irregular actions/behavior, detection of, 125–126
Irregular sampling, 92, 97
Ising model, 307, 321f
Iterated Conditional Mode (ICM) algorithm, 310–311, 311b
Iterative Closest Point (ICP) algorithm, 279

J

Joint Cascade, 240
Joint kernel, 409
Joint Photographic Experts Group (JPEG), 4
JPEG PLENO, 104–105

K

Karcher mean, 266–267
  SRNF inversion, 288b
Keep It Simple and Straightforward MEtric (KISSME), 370, 389
Kendall’s rank order correlation coefficient (KROCC), 195–196
Kendall’s shape space, 268–271
  morphable models, 269–270
  nonlinear nature, 270–271

Kernel Local Fisher Discriminant Analysis (KLFDA), 420–421
K-means classification algorithm, 332
Kolmogorov-Smirnov (KS) test, 142
Kullback-Leibler (KL) divergence, 142
Kurtosis, 178

L

Landmark-based elastic registration, 279–280
Large margin nearest neighbor (LMNN). See LMNN metric learning
Lens distortion, 32–33
  removal, 37, 38f
Lensless cameras, 219–222
Levenshtein distance, 143–144
Light Detection And Ranging (LIDAR), 90, 90np
Light field, 80–81
  cameras, 91, 219
  communication, 82–84
  displays, 94
  editing, 82, 84
  representation, 78–82
LiSens camera, 216f
LMNN metric learning, 387–388
Local binary features (LBF), 240
Local Binary Pattern (LBP) cascade descriptor, 401–402
Local Fisher discriminant analysis (LFDA), 369
Log-likelihood maximization, 314b
Lossless coding, 57–58

M

Mahalanobis distance matrix, 389
Mahalanobis metric learning, 386–387
Markov chain, 127
Markov Chain Monte Carlo (MCMC) simulations, 305–306
Markov Decision Process (MDP), 124
Markov Random Fields (MRFs)
  inverse problems
    image restoration problem, 315–316, 317f
    segmentation problem, 316–320, 317f
    texture modeling, 320–322, 321f
  lattice-based models and Bayesian paradigm
    modelling, 306–308
    optimization, 308–311
    parameter estimation, 311–315
Matrix bullet effect, 4f
Maximum A Posteriori (MAP), 305
Maximum Entropy Random Walker (MERW), 127
Maximum likelihood estimation (MLE), 185–186, 320–322
Maximum of the Pseudo Likelihood (MPL), 320–322
Mean opinion scores (MOS), 172
Mean score, 177
Medial atom, 277
Medial representations (M-reps), 277
Memory-based modeling, 128–129
Mesa Imaging SR4000 depth sensor, 20f
Meshes, 93
Metrics, 262
  deformation space, 276–277
  non-euclidean, statistical analysis, 286–288
  physical deformations, 271–275
  pullback, 282–283
Metropolis-Hastings algorithm, 308–309, 309b
Microlens array, 89, 91, 96
  vs. full parallax, multiview camera array image, 92f
  light field interpolation, 98–99
  refocusing problem, 99–101, 100f
Micro-services, BisQue
  blob service, 355
  data service, 353
  image service, 356
  module service, 354, 354f
  scalability mechanisms, 353–354
  table service, 356
Microsoft Kinect depth sensor, 20f
Möbius transform, 280
Modes of variation, 269–270, 287b
Monoscopic video coding
  frame-compatible coding, 61
  high definition (HD) format, 59
  milestone standards, 59f
  ultra high definition (UHD) format, 59
  video compression, 60, 60t
Montager, 359
Morphable models, 269–270
Moving Picture Experts Group (MPEG), 4, 103–104
  camera parameter file format, 36f
  DIBR, 12, 12f
  FTV free navigation scenario, 103f
  FTV super-multiview video scenario, 103f
  multiview video, 12
  VSRS configuration file, 54, 55t
MSCR operator, 382
Multicamera system
  camera arrangements, comparison, 21t, 22
  camera spacing, 22
  camera type, 22–23
  on exemplary arc, 28f
  features, 20

Multicamera system (Continued)
  indoor/outdoor usage, 22
  mobility, 22
  mutual relations, 21f
  number of cameras, 20
  processing throughput, 20–22
Multidimensional perceptual scales
  for QoE measurement, 170–172
  scales and scaling methods, 171–172, 172t
Multidimensional scaling (MDS), 292–294, 293f
Multimedia delivery, visual attention
  image re-targeting, 150
  interactive streaming, 149
  packet loss, 149–150
Multiple births and deaths algorithm, 328, 328b
  advantages, 329
  graph construction, 329, 329f
  modified, 328, 328b
Multiple comparisons, 203–204
Multiple-shot vs. multiple-shot (MvsM), 384
Multiple-shot vs. single-shot (MvsS), 384
Multiple Stimuli with Hidden Reference and Anchor (MUSHRA), 175–177, 176f
Multiview video (MVV), 76–77, 79–80, 91, 95–96
  acquisition, 23–29
    depth in stereo, 16–19
    multicamera system, 20–23
    multiview fundamentals, 12–16
  compression, 54–58
    coding, 62–63
    monoscopic video coding, 58–62
    simulcast coding, 58–62
    3D video coding, 63–65
  depth estimation
    global stereo matching, 41–43
    local stereo matching, 39–40
    multicamera depth estimation, 44–48
  DIBR, 5–7
    MPEG standardization, 12
    vs. point clouds, 9–12
    principles, 7–9
  future aspects, 65–67
  Matrix bullet effect, 4f
  preprocessing
    geometrical parameters, 29–35
    video correction, 35–37
  3D Light Field displays, SMV, 5
  view synthesis, 99
    inpainting, 52–53
    view blending, 51–52
    VSRS, 53–54
    warping, 49–51
  VR, 3D graphic representation formats, 4–5
Multiview video coding (MVC) standard, 76–77, 95
Multiview video plus depth (MVD), 13, 91

N

Nominal scale, 171
Non-euclidean metrics, statistical analysis, 286–288
Normalized disparity, 18

O

Object recognition task, 122
Oculomotor bias, 129–130
OpenGL/DirectX 3D rendering pipeline, 4
Open world scenario, 367
Optical vignetting, 100np
Ordinal scale, 171
Outlier ratio (OR), 194–195
Overt attention
  bottom-up, 116–117
  endogenous and exogenous, 116
  importance map, 119–120, 120f
  perceived importance, 118, 119f
  top-down, 116–117

P

Packets, 149–150
Pair comparison matrix (PCM), 186–187
Paired comparison (PC), 180–182
Paired comparison matrix (PCM), 180
Parameterization domain, 264
Parameterization process, shape, 264
Parameter sensitive classifiers, 241
Path straightening procedure, 282–283
Pearson correlation coefficient, 142
Pearson’s Linear Correlation Coefficient (PLCC), 192–193
Perceived interest
  attentional processes, 118
  gaze samples, 118, 119f
  ROI, 118
  top-down semantic features vs. bottom-up factors, 118, 119f
Perceptual computing, 113–114
Person re-identification. See Re-identification (re-id) system
Photoactivated localization microscopy (PALM), 224–225
Photographic light field editing, 84
Pinhole camera model, 13–16, 14f, 29, 30f
Plenoptic function, 77–79, 79f
  5D simplification, 80f
  light field, 81f

Plenoptic imaging
  acquisition, 89–91
  challenges, 105–107
  data coding, 95–98
  data rendering
    discrete camera array, 98–99
    meshes and point clouds, 98
    microlens light field, refocusing, 99–101
    MVV plus depth, 99
  display, 94–95
  free navigation, 85–86
  future aspects, 105–107
  interactive all-reality, 86–88
  light field
    communication, 83–84
    editing, 84
    representation, 78–82
  processing flow, 88f
  representation models, 91–93, 101–102
  standardization activities
    JPEG PLENO, 104–105
    MPEG FTV, 103–104
Point cloud(s), 5f, 81, 81f
  vs. DIBR, 9–12, 11f
  rendering textured meshes, 98
  with texture, 92, 97
Point cloud-based representation, 93
Point Cloud Library (PCL), 97
Potentials, 305
Potts model, 317f, 318
Poznań Blocks multiview video, 17f
Preference of Experience (PoE), 179
Preshape space, 264
PRID2011 dataset, 374–376, 375t
Principal Component Analysis (PCA), 262, 269–270
Probability density functions (PDFs), 183
Probit, 183–184
Procrustes analysis of shapes, 278–279
Projection, 287
Ptychography, 222–224
Pullback metrics, 282–283
Python and MATLAB scripting, 357

Q

Q-maps, 294–295
  SRVF and, 274–275
QoS experienced (QoSE), 164
QoS perceived (QoSP), 164
Quality of experience (QoE)
  definition, 164–166
  direct scaling methods
    confidence intervals calculation, 177–178
    DSCQS, 174–175
    DSIS, 174
    mean scores calculation, 177
    MUSHRA, 175–176, 176f
    SAMVIQ, 175–176, 176f
    screening, 178
    SS, 173–174
  indirect scaling methods
    Bradley-Terry-Luce model, 185–186
    pair comparison matrix, 186–187
    paired comparison, 180–182
    ranking, 179–180
    Thurstone-Mosteller model, 183–185
  influencing factors
    context, 165–166, 169
    human, 166, 169–170
    significance calculation, 187–190
    system, 165, 168
  multidimensional perceptual scales for, 170–172
  performance evaluation
    KROCC, 195–196
    multiple comparisons, compensation for, 203–204
    outlier ratio, 194–195
    Pearson’s linear correlation coefficient, 192–193
    RMSE*, 194
    ROC-based, 200–203
    root-mean-squared error, 193
    RP measures, 196–200
    SROCC, 195
Quality of Service (QoS), 164

R

Random shape, 287b
Random 3D model synthesis, 294, 295–296f
Random walk based method, 127
Rank-n accuracy, 366–367
Receiver operating characteristic (ROC), 143
  performance evaluation
    better vs. worse analysis, 201–202
    different vs. similar analysis, 200–201
    objective algorithms, 202–203
  in saliency maps, 142–143
Rectification, 16
Recurrent high structured patches (RHSPs), 382–384
Registration quality, shape analysis, 278–281
  deformations, 289–291, 291f
  elastic registration as re-parameterization problem, 280–281, 281f
  landmark-based elastic registration, 279–280

Re-identification (re-id) system
  CAVIAR4REID dataset, 373–374, 374t
  closed world scenario, 371
  CMC curve, 366–367
  description, 365
  direct approach class, 368
  feature extraction, 366, 370–371
  gallery set, 365–366
  learning-based approach class, 369
  matching phase, 365–366, 384–385
  metric learning, 370, 385
    efficient impostor-based, 388–389
    KISSME, 389
    LMNN, 387–388
    Mahalanobis, 386–387
  model design, 365–366
  model learning
    brightness transfer function, 371
    distance metric learning, 371
    geometric transfer function, 371
  model training, 366
  open world scenario, 367
  PRID2011 dataset, 374–376, 375t
  rank-n accuracy, 366–367
  SAIVT-SoftBio dataset, 376, 377f, 377t
  SDALF approach
    object segmentation, 378–379
    silhouette partition, 379–381
    symmetry-driven accumulation of local features, 381–384
  taxonomy, 366–368
  VIPeR dataset, 372, 373t
Relative position (RP)
  of initial gaze, 140, 141f
  of saccadic target, 140, 141f
Rendering process, 98, 263–264
Re-parameterization, 264–265
  elastic registration, 280–281, 281f
Replay Technologies, 9
Residual information based method, 129
Residual memory, 129
Resolving power (RP)
  accuracy, 196–197
  classification plots, 197–200
Reversible Jump Markov Chain Monte Carlo (RJMCMC) method, 325
  acceptance/rejection concept, 327
  birth and death kernel, 326–327
  efficiency, 326
  with mixture of kernels, 326b
  multiple births and deaths algorithm, 328, 328b
  optimization, 341
  simulated annealing framework, 341–342
Rigid classifiers, 239
Root-mean-squared error (RMSE), 193
Round trip time (RTT), 149

S

Saccades and fixations, 138–139, 142
SAIVT-SoftBio dataset, 376, 377f, 377t
Saliency maps, 119–120, 120f
  for images and videos, 139
  similarity in, 142–143
  as weighting factor of local distortions, 148
Scan-path generation, 140
Scan-path (saccadic) models
  memory-based modeling, 128–129
  oculomotor bias and memory based, 129–130
  residual information based, 129
  semantic region based, 129
SCAPE model, 275, 277
Scene, defined, 403–404
Score function, 247
SDALF matching distance, 384
Self-Assessment Manikin (SAM), 170, 170f
Semantic region based method, 129
Shape, 261
  elastic coregistration, 291–292
  general formulation, 263–267
  geodesics, 282–286
  invariance requirements, 264–265, 265f
  metrics, 271–275
  registration quality, 278–281
  representations, 263–264
  tools for manipulation, 261–262
  transformation-based representations, 275–277
    choice of template T, 275
    deformation models, 276–277
Shape atlas, 286
Shape spaces, 265
  Kendall’s shape space, 268–271
  and metrics, 267–277, 284–286
  square-root representations, 273–275
  thin shells, 272–273
Short-time energy (SE), 407
SHREC07 watertight 3D model benchmark, 292–294
Silhouette partition, 379–381
Simulated annealing algorithm, 310, 310b
Simulcast coding, 56. See also Monoscopic video coding
Simulcasting configuration, 97
Single-shot vs. single-shot (SvsS), 384–385

Single stimulus (SS) methodology, 173–174
Singular Value Decomposition (SVD), 269–270
Site entropy rate (SER), 127
Social affinity, 415
Social communities, 396–397
Social group, 396
Social network graph, 410–411
Social network inference
  actor affinity, 418
  actor recognition, 401–402
  affinity matrices, 415
  audiovisual alignment, 414–415
  communities
    actor interaction model, 411–412
    assignment to actors, 412–413, 415–418
    leader estimation, 413
    social network graph, 410–411
  community leaders, 418, 419t
  computed communities, 396–397
  dataset, 414, 414t
  grouping
    auditory features, 407–409
    cinematic principles, 403–404
    grouping criteria, 409
    visual features, 404–406, 406–408f
    weak, 403
  latent features, 418–421, 421t
  network evolution, 396
  relations, 399t
  single group, 398
  sociology domain, 399
  video shot segmentation, 400–401
Social relations, 396
Space-time bandwidth product, 216–217
Sparse coding strategy, 123
Spatial covering operator, 379
Spatial multiplexing camera, 215
Spatial point process
  modeling
    marked point process, 324
    Markov process, 323
    point process, 323
  optimization, 325–329
Spatio-temporal computational models
  center-surround in temporal domain, 125
  irregular actions/behavior, detection of, 125–126
Spearman’s rank order correlation coefficient (SROCC), 195
Spectral flux (SF), 409
SPHARM approach, 294–295
Square design PC (SDPC), 181
Square error (MSE) measure, 415
Square root maps (SRM), 274
Square-root representations, shape space, 273–275
Square root velocity function (SRVF), 273–274, 284, 296f
  inversion, 267f, 288–289, 296f
  and Q-maps, 274–275
  space, 283–284
State-of-the-art techniques, 262–263, 278, 297
Stel Component Analysis (SCA), 378–379
Stereo displays, 94
Stereo matching
  border-aware window shapes, 41f
  matching windows in, 40f
Stereoscopy, 151–152
Stochastic optical reconstruction microscopy (STORM), 224–225
Stratified Procrustes Analysis, 251
Structured illumination microscopy (SIM), 225
Structured light projectors, 90
Subdiffraction imaging, 224
Subjective Assessment of Multimedia Video Quality (SAMVIQ), 175–177, 176f
Super-multiview (SMV) video, 5, 83, 83f, 103
Symmetry-based Descriptor of Accumulated Local Features (SDALF), 366
  object segmentation, 378–379
  silhouette partition, 379–381
  symmetry-driven accumulation
    MSCR operator, 382
    RHSP generation, 382–384
    weighted histograms, 381–382
Synthetic aperture radar (SAR) techniques, 222–224
System influence factors, 165
  in QoE measurement, 168

T

Target map, 122
Texture synthesis inpainting, 54f
3D camera system, 19
3D displays, 151–152
3D graphics pipeline rendering techniques, 85
3D-HEVC standard, 64, 76–77, 96, 103–104
Three-dimensional (3D) games, 3–4
3D Light Field, SMV, 5
3D lossless compression, 57–58
3D mesh coding (3DMC) tools, 5f, 12, 93, 97
3D video coding, 56, 63–65
360 degree camera, 85, 89, 90f

Thurstone-Mosteller (TM) model, 183
Time-of-flight (ToF), 86–87
  Lidar scanner, 9
  range imaging cameras, 90
Top-down attention, 116–117
Top-down computational attention models
  driving, 123
  gaming, 123
  object recognition task, 122
  sports, 123
  visual search task, 121–122
Traditional cameras, 89
Transformation-based representations, shape, 275–277
  choice of template T, 275
  deformation models, 276–277
Transient attention. See Endogenous attention
Troy
  actor recognition, 402f
  adversarial scene, 404, 405f
  communities in, 396–397
  entropy histogram, 407f
  friendly and adversarial social relationship, 408f
  histogram computation, 406f
  social network graph, 397, 397f

V

Variance, defined, 286
Velocity profiles, 138
Video attention deviation (VAD), 149
Video coding
  milestone generations, 59f
  rate-distortion lines, 57f
Video correction, 35
  color correction, 36–37, 37f
  lens distortion removal, 37
Video shot segmentation, 400–401
View blending, 51–52, 52f
View interpolation, 48
View synthesis
  and virtual navigation, 49f
  HEVC, 48
  inpainting, 52–53
  view blending, 51–52
  view interpolation, 48
  warping, 49–51
  VSRS, 53–54
View synthesis reference software (VSRS), 35, 53–54
Viola-Jones face detector, 122
VIPeR dataset, 372, 373t
Virtual Reality (VR), 4, 152
  caves, 94
  challenges, 66
  cinematic, mixed reality light field editing, 84
  multiview video, 4–5
  3D graphic representation formats, 4–5
Visual attention
  applications
    immersive media, 151–152
    in medicine, 150–151
    in multimedia delivery, 149–150
    quality assessment, 147–149
  covert, 115
  eye-tracking
    datasets, for model validation, 133–138, 134–137t
    disruptions analysis, 140–141
    experiment, 130–133, 131–132f
    saccades and fixations, 138–139
    saliency maps, 139
    scan-path generation, 140
    testing, 141–147
  graph-based methods
    foreground-background segmentation, 126–127
    graph flow techniques, 126
    graph spectral methods, 128
    random walk based, 127
    salient boundary and object identification, 127
  information and decision-theory models
    action-reward based, 124
    entropy/information maximization, 123–124
  in multimedia, 113–114
  overt
    bottom-up, 116–117
    endogenous and exogenous, 116
    importance map, 119–120, 120f
    perceived importance, 118, 119f
    top-down, 116–117
  scan-path (saccadic) models
    memory-based modeling, 128–129
    oculomotor bias and memory based, 129–130
    residual information based, 129
    semantic region based, 129
  spatio-temporal computational models
    center-surround in temporal domain, 125
    irregular actions/behavior, detection of, 125–126
  top-down
    driving, 123
    gaming, 123
    object recognition task, 122
    sports, 123
    visual search task, 121–122
  vision science to engineering, 114–115

Visualization process, 263–264
Visual search task, 121–122
Visual short term memory (VSTM), 125
VSRS. See View synthesis reference software (VSRS)

W

Willmore energy, 272
Wireless mobile camera module, 28f

X

XPath, 353
XQDA. See Cross-view quadratic discriminant analysis (XQDA)

Z

Zero crossing rate (ZCR), 409
