2007, Video Recog. Systems - Rail & Transit


Video Recognition Systems

Video Technology for security: Problems and Solutions

Dr. Dmitry Gorodnichy
http://vrs.iit.nrc.ca
www.perceptual-vision.com

“Computer Vision allows computers to see.
Perceptual Vision allows computers to understand what they see.”

Rail and Urban Transit Security Workshop
Montreal, November 2007

Outline

Intro: Video Technology (VT)
– History of VT
– NRC/IIT Video Recognition Systems project
– VT within GoC (VT4NS meeting)
– Important VT facts: What vendors don’t tell
Part 1: On Intelligent Surveillance
– Need-to-know facts: status quo, real challenges, real solutions
• On “motion-detection”
• Next-generation surveillance: object-detection based
– Introducing ACE Surveillance™ (Annotated Critical Evidence)
– Case study/Demo: NRC Commissioners use ACE Surveillance™
Part 2: On Video-based Face Recognition
– Very different from Photograph-based:
• Different constraints, Different approaches, Different applications
– Solutions/Demo: Recognizing actors in movies

History of Video

First video (motion picture): XX century
when it became possible to display video frames fast (>12 fps)
Analog video systems (capture/storage)
Digital video systems
Video recognition systems: XXI century
when it became possible to process video frames fast (>12 fps)
Approaches developed: Computer Vision, Pattern Recognition, Video Recognition
3. Video Recognition Systems (Dmitry Gorodnichy) 4. Video Recognition Systems (Dmitry Gorodnichy)
Video Recognition – area of XXIst century

• Aka
- Video Analysis and Content Extraction (VACE),
- Intelligent Video, Smart Video, …
- Perceptual Vision
• IS NOT about capturing data (better lenses, grabbers, coders, transmitters), but about understanding captured data (better theory)
• Very young area and IS NOT:
• Pattern (Face) Recognition & Machine Learning
• Computer Vision & Image Processing
• Neurobiology & Biological Vision
– But requires a mixture of expertise in all of the above!

Video Recognition Systems

• Started 2001 within NRC/IIT
– Formerly, as Perceptual Vision project
• Do both research / services and development / licensing
– Worked on Canadarm2 (2001-2)
– Known for Nouse® (Nose as mouse) tool for disabled (2003-7)
• Emphasis on Security & Surveillance since 2004
– Intelligent surveillance
– Face recognition from Video
• Work with Industry, Academics & OGDs:
– Esp. CBSA, RCMP, DRDC.
• Partner of USA DTO / VACE program (Disruptive Technology Office / Video Analysis and Content Extraction) ← handout

VRS Role

[Diagram: VRS within NRC/IIT. Labels: Knowledge/Discovery (impact papers, conferences: CVPR, CRV); Technology/Services; Social values (Health, Media, Education); driver; Universities, OGDs, Industry; NRC: Acoustics / IMS, Flight Facility; Other clients: CBSA, RCMP, DRDC, DND/CF, CSIS, PPTC, CIC, PCO, PSERC, TC, CATSA; CPFC]

VRS expertise

• Object detection and tracking
– Automated Teleoperator
– ACE Surveillance™
• Faces in Video
– Face detection, tracking
– Face recognition from Video
• Other
– Image Search (Roth)
– Marker-based tracking (Fiala)
Events organized

Since 2004:
IEEE-archived Intern. Workshops on Video Processing and Recognition (VideoRec’08 - in Windsor, May 27-30, 2008)
Goal: Focus academic effort on newly emerged area.

Ottawa, June 5, 2007:
First federal departments meeting on Deploying Video Technologies for National Security (VT4NS’07)
Goal: Discuss the ways to synchronize the effort in developing VT solutions and setting VT standards for the new century within GoC.

VT4NS’07 attendees

** USA DTO/VACE (Disruptive Technology Office/Video Analysis and Content Extraction)
** NRC/IIT/Video Recognition Systems
+ NRC/Administrative Services and Property Management Branch / Security Operations
* NRC/ Institute for Aerospace Research/Flight Research Laboratory
* CRC (Industry of Canada, Communications Research Centre)/Advanced Video Systems
** CRIM (Computer Research Institute of Montreal)
+ CBSA (Canada Border Services Agency)/Laboratory and Scientific Services Directorate
+* RCMP/ Surveillance Technology Section / Covert Video (CV), Remote Sensing Technologies (RST) and Special Purpose Vehicle (SPV) units
+ RCMP/ Technical Security Branch
+* DRDC/Automated Intelligent Systems/UAV
+* DRDC/Network Information Operations Section
+* DRDC/Centre for Operations Research & Analysis (CORA)
+* CPRC (Canadian Police Research Center)
+ Transport Canada / Security Technology / Security and Emergency Preparedness
+ Office of the Privacy Commissioner of Canada
+ DND/Forces (several depts.)
Legend: * VT developers, + VT users

VT4NS Links
(to help translation only)

• NRC-Administrative Services and Property Management Branch (NRC-ASPM) Security Operations (link: http://www.nrc-cnrc.gc.ca/institutes/aspm_e.html),
• NRC-Institute for Aerospace Research (NRC-IAR) Flight Research Laboratory (link: http://iar-ira.nrc-cnrc.gc.ca/flight_main_e.html).
• Communications Research Centre Canada (CRC) Advanced Video Systems (link: http://www.crc.ca/en/html/crc/home/research/broadcast/advanced_video),
• Canada Border Services Agency (CBSA) Laboratory and Scientific Services Directorate (link: http://www.cbsa-asfc.gc.ca/media/facts-faits/035-eng.html),
• Royal Canadian Mounted Police (RCMP) Surveillance Technology Section (link: http://www.rcmp-grc.gc.ca/bc/lmd/surrey/content/services/fis_e.htm),
• RCMP Technical Security Branch (link: http://www.rcmp-grc.gc.ca/tsb/)
• Defence Research & Development Canada (DRDC) Automated Intelligent Systems (link: http://www.drdc-rddc.gc.ca/researchtech/tis/activ2_e.asp),
• DRDC Network Information Operations Section (link: http://www.ottawa.drdc-rddc.gc.ca/html/NIO-102-section_e.html),
• DRDC Centre for Operations Research & Analysis (link: http://www.drdc-rddc.gc.ca/researchcentres_e.asp),
• Transport Canada Security Technology (link: http://www.tc.gc.ca/en/menu.htm),
• Office of the Privacy Commissioner of Canada (link: http://www.privcom.gc.ca/),
• several National Defense and the Canadian Forces (DND/Forces) departments (link: http://www.forces.gc.ca/site/home_e.asp),
• Computer Research Institute of Montreal's Vision and Imaging Group (In French only) (link: http://www.crim.ca/fr/index.html),
• Canadian Police Research Center (link: http://www.cprc.org/)

VT4NS Report

• No national / regional VT program yet.
• Decisions influenced by vendors / short-term solutions
→ No national standards for capturing / saving video data.
• E.g. over 30 different video systems deployed in Ottawa
→ No policy to handle evidence:
• E.g. is data original, not altered
• Many local initiatives, not coordinated
– City of Calgary (traffic abnormalities detection with CCTV cams)
– Cornwall Canada US border* (DVR). Pilot project #1 “port-runner”
– Ottawa/Montreal Airports* (CCTV, PTZ DVR), …
• This is about to be changed (2007)
• Follow the USA DTO/VACE model
Facts to know
(what VT vendors may not tell)

1. Video capture is no longer expensive or bad
– Composite video/RCA (CCTV analog)
– USB2 cams and digitizers
– Firewire cams
– Wireless & IP cameras
– Multi-channel framegrabbers for CCTV
2. Beware of “high resolution” cameras
1. It’s unlikely to be the real resolution
2. It doesn’t help make video more “intelligent”
3. It’s Intelligence that’s missing

Problems

1. Environment/Setup – light/weather, field of view …
2. Objects/Activities – non-collaborative actions
3. Misconceptions (in interest of vendors)
1. The more, the better – NO
2. A human can see, so the system will (one day) – NO
3. “Baggage of the past”: using old tools for NEW problems
4. Real-time constraint – for “alarm” systems
5. Resolution
1. Video image is small (720x480 NTSC)
2. Objects occupy small part: <1/8 of image
But is resolution really a problem?

More on resolutions & formats: video sources

• NTSC:
– Vert. res. (fixed): 487 active lines (interlaced) out of 525
– Horiz. res. (variable): 330 (TV), 210 (VHS) - Due to fuzziness!
– 60 half-frames / second
• VCD: 320x240 mpeg1 – for TV (VHS tape)
• DVD: 720x240 mpeg2 – most suited for digital recordings
• HDTV: 1920x1080, but… for humans it’s sound (DolbyDigital) not video that makes the difference!

Recorded from TV
(320 x 240 video)

Humans watch movies on TV without a problem…
Despite “bad” resolution + orientation, expression, occlusion (Faces are 30x30 pixels!)
You don’t have a problem recognizing people & activities

[Video demo]

Yet computers fail… - Is something wrong with computer approaches?
(Even on a studio taken video with perfect FOV and lighting!)
Intelligent Surveillance: problems & solutions

Two Big problems

1. Storage space consumption
• Typical assignment: 2-16 cameras, 7 or 30 days of recording, 2-10 Mb per min.
⇒ 1.5 GB per day per camera / 20 - 700 GB total !

2. Data management and retrieval
• London bombing video backtracking experience:
“Manual browsing of millions of hours of digitized video from thousands of cameras proved impossible within time-sensed period”
[by the Scotland Yard trying to back-track the suspects]
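
The totals quoted under problem 1 can be checked with simple arithmetic. The sketch below takes the slide's 1.5 GB per camera per day as given and multiplies it out for the smallest and the largest assignment mentioned; the function name is illustrative only.

```python
def total_storage_gb(cameras: int, days: int, gb_per_camera_day: float = 1.5) -> float:
    """Archive size in GB for a recording assignment (1.5 GB/camera/day is the slide's figure)."""
    return cameras * days * gb_per_camera_day


smallest = total_storage_gb(cameras=2, days=7)    # 2 cameras kept for one week
largest = total_storage_gb(cameras=16, days=30)   # 16 cameras kept for one month
print(f"{smallest:.0f} GB to {largest:.0f} GB")
```

This gives 21 GB and 720 GB, which matches the rounded "20 - 700 GB total" on the slide.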

Main bottleneck

This is now affordable:
• “highest picture quality and resolution”,
• “complete Pan/Tilt control”,
• “powerful 44X Zoom”,
• “total remoteness”,
• “multi-channel support of up-to 32 cameras”,
• “extra fast capture of 240 fps”
but…
• you may just not have time to browse it all in order to detect the important information.

Intelligent Surveillance Objective

To replace / assist human personnel
To make video data manageable (esp. for long-term monitoring)
To make surveillance affordable: time-wise, space-wise

“For video surveillance to be operational, it is critical to store only that video data which is useful, i.e. the data containing new evidence”.

1. Evidence = objects, events of interest
2. New = succinct and non-redundant.
Possible only with video recognition!
Misconception about “Motion-based” capture

• Term “Motion-based” is coined to make people believe that video recognition is happening, which is not!
• It’s actually illumination-change-based, as it uses simple pixel brightness comparison:
$|B_{ij}(t) - B_{ij}(t-1)| > N$ for $K$ pixels $\Rightarrow$ “alarm”
– Which often happens not because of motion!
• Light changes
• Noise
– Especially: Outdoors & in long-term monitoring

Noise & changes in video (demo)

– Changing light / weather (esp. in 24/7 monitoring)
• Wind, precipitations
– Against sun/light, out of focus, blurred, thru glass
• Reflections, diffraction, optical interferences
– Image transmission, compression losses
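
The brightness-comparison rule behind “motion-based” capture can be sketched in a few lines of plain Python to show why it reacts to light changes rather than to motion. This is a minimal illustration, not any vendor's implementation: "for K pixels" is read here as "at least K pixels", and the frame size, the thresholds `n` and `k`, and the pixel values are all made up.

```python
from typing import List

Frame = List[List[int]]  # grayscale frame: rows of 0-255 brightness values


def change_alarm(prev: Frame, curr: Frame, n: int, k: int) -> bool:
    """True if at least k pixels changed brightness by more than n between two frames."""
    changed = sum(
        1
        for row_prev, row_curr in zip(prev, curr)
        for b_prev, b_curr in zip(row_prev, row_curr)
        if abs(b_curr - b_prev) > n
    )
    return changed >= k


# A static 8x8 scene in which nothing moves.
scene = [[100] * 8 for _ in range(8)]

# The same static scene after the light drops: every pixel is 30 levels darker.
darker = [[b - 30 for b in row] for row in scene]

# A small bright object (2x2 pixels) that really did enter the scene.
with_object = [row[:] for row in scene]
for i in (3, 4):
    for j in (3, 4):
        with_object[i][j] = 200

light_alarm = change_alarm(scene, darker, n=20, k=10)         # alarm without any motion
object_alarm = change_alarm(scene, with_object, n=20, k=10)   # real object, no alarm
print(light_alarm, object_alarm)
```

With these values the global light change raises the alarm while the small moving object does not, which is the failure mode the slide describes.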


Next-generation surveillance

Solution:
- Do as much as possible Video Recognition in real-time BEFORE saving video !
- Object-based surveillance (not change-based) !

Example: ACE Surveillance™ technology

ACE Surveillance™ (Annotated Critical Evidence)

Definition: Critical Evidence Snapshot (CES) - video snapshot that provides information that is both useful and new.

Definition: ACE Surveillance - surveillance that deals with extraction and manipulation of Annotated Critical Evidence.

- Based on recent advances in object detection / tracking.
- Replaces video clips with annotated JPG images
– Compresses 1 Gb of video into 2 Mb of easy to browse still images (can hold several years of evidence on a single computer).
– Shown annotations: size, velocity, colour of detected objects.
- Enables new Zoom-on-Evidence™ browsing
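
The "several years on a single computer" claim follows from the quoted compression figures. The sketch below works it through under two assumptions that are not stated on the slide: "Gb" and "Mb" are read as gigabytes and megabytes in decimal units, and the raw rate is the 1.5 GB per camera per day quoted earlier in the talk.

```python
MB_PER_GB = 1000  # assumption: decimal units

video_gb, snapshots_mb = 1, 2                    # figures from the slide
ratio = video_gb * MB_PER_GB / snapshots_mb      # compression ratio

raw_gb_per_camera_day = 1.5                      # storage estimate from earlier in the talk
ace_mb_per_camera_day = raw_gb_per_camera_day * MB_PER_GB / ratio
ace_gb_per_camera_year = ace_mb_per_camera_day * 365 / MB_PER_GB

print(f"ratio {ratio:.0f}:1, {ace_mb_per_camera_day:.0f} MB/day, "
      f"{ace_gb_per_camera_year:.2f} GB/year per camera")
```

Under these assumptions one camera needs roughly 1.1 GB per year instead of 1.5 GB per day.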
Object Detection and Tracking results

• Tested 24/7 in many outdoor and indoor setups
• On ordinary computer, in real-time with up to 8 cameras.
• Demo: 24 hours of monitoring outdoors (long-term, low traffic) ← video
• Demo: Hockey players tracking indoor (high-traffic, multiple fast-moving objects) ← video

Motion-based capture

• Many captured snapshots are useless: either noise or redundant
• Without visual annotation, motion information is lost.
• Hourly distribution of snapshots is not very useful

ACE Capture

• Each captured snapshot is useful.
• Object location and velocity shown using graphical annotation
• Hourly distribution of snapshots is indicative of what happened in each hour, provides good summarization of activities over long period of time.

ACE Surveillance Applications / Limitations

Ready to
1. For existing CCTV systems
• Works with stationary cameras only
2. For security desks with a computer
• Up to 8 cameras on a single (3GHz / 2Gb RAM) pc

NRC commissioner example:
– Installed: January 2007.
– Archive of more than 6 months of evidence data.
– 2 entrances (ADT-installed CCTV cams) + USB webcam at the desk
– Became an indispensable daily routine
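
The hourly distribution of snapshots mentioned for ACE Capture is a plain histogram over capture times. The sketch below shows one way to compute it; the timestamps are invented for illustration and the function is not part of ACE Surveillance.

```python
from collections import Counter
from datetime import datetime
from typing import Iterable, List


def hourly_distribution(timestamps: Iterable[datetime]) -> List[int]:
    """Number of snapshots captured in each hour of the day (index 0-23)."""
    counts = Counter(ts.hour for ts in timestamps)
    return [counts.get(hour, 0) for hour in range(24)]


# Hypothetical capture times for one day at one entrance.
snapshots = [
    datetime(2007, 6, 5, 8, 2),
    datetime(2007, 6, 5, 8, 47),
    datetime(2007, 6, 5, 12, 15),
    datetime(2007, 6, 5, 17, 30),
    datetime(2007, 6, 5, 17, 31),
]

histogram = hourly_distribution(snapshots)
busiest_hour = max(range(24), key=histogram.__getitem__)  # first hour with the highest count
print(histogram, busiest_hour)
```

The histogram is only a useful summary when each snapshot is real evidence, which is the contrast the slides draw with motion-based capture.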
Example: Monitoring in XX-th century

A dedicated officer has to look at the monitors at all times.
If he is away / looked elsewhere, an event may pass unnoticed.

Example: Monitoring in XXI-st century

• In real-time mode: watch closely if alarm sounds.
• If away from desk: Last captured CES shows whether anything happened. Then play-back all CES-es.
• In archival mode: “zoom on evidence” – zoom on a day, on hour, then on event - point and click (for high res as needed)

Zoom-on-the-evidence™ Browsing

[Screenshots: Back Door Entry and Delivery Entry, on week-day and on week-end]

Demo

• Monitoring NRC premises with ACE Surveillance
• Browsing data with Zoom-on-evidence ACE Browser

Future trends

• In software (video recognition algorithms):
– Better object detection & tracking
• For complex motions
• For moving cameras
– Better annotation: activity recognition
• In hardware:
– Smart PTZ cams: PTZ on objects
– Smart IP cams: send only when/what is needed
– Video + hi/res photocamera / other sensors
– Synchronized cameras
• In mentality / logistics:
– More inter-department VT initiatives
– More constrained/proper setups and tasks

Video-based Face Recognition: problems & solutions

Why in video?

Video-based information is
- most available
- least intrusive
Video provides:
- soft biometrics
- identification at distance
+
Ready infrastructure (CCTV already used everywhere for surveillance)

Biometric modalities summary

Hierarchy of affordability / applicability of different biometrics modalities
(from NATO Biometrics workshop, Ottawa, Oct. 2004)

[Diagram: hierarchy running from “Public level / Unconstrained environment” to “Detainee’s level / Controlled environment”, with “CCTV →” and “Passport →” marked; axis labels: # bio-measurements, # registered ids]
Current situation: computers fail, humans succeed

Face recognition systems performance
(from biometrics forums, NATO Biometrics workshop, Ottawa, Oct. 2004)

[Bar chart: recognition performance (0 to 100) by humans and by computers, in photos and in video]

While humans easily recognize a person in video (with faces < 40 pixels!), computer performance on video is much worse than that on photos!

Intentional misconception?

Over last 5 years over $XX.XXX.XXX already spent on applying face recognition to video data…

And what?

Face Recognition Vendor Test (www.frvt.org) is still seen: “in making the video data of better quality”

Approaching NEW problem with OLD tools ?
Instead of developing approaches which can deal with low-resolution data

Important

Photographic facial data and video-acquired facial data are two different image-based modalities
– different nature of data
– different biometrics
– different approaches
– different testing benchmarks

Face recognition in video requires video-based framework

Photos vs Video

Photos:
- High spatial resolution
- No temporal knowledge
E.g. faces:
1. in controlled environment (similar to fingerprint registration)
2. “nicely” forced-positioned
3. 60 pixels IOD

Video:
- Low spatial resolution
- High temporal resolution (Individual frames of poor quality)
E.g. faces:
1. in unconstrained environment (in a “hidden” camera setup)
2. don’t look into camera, don’t even face camera
3. 10-20 pixels IOD

(IOD = intra-ocular distance)

Yet, for humans, video (even of this “low” quality) is often even more informative than a photograph !
Canonical Face Model used in passports

Adopted by ICAO’02 for passport-type documents (used in Canada, USA, EU)
• One picture per person
• IOD=60 (Width=120 pixels)
Used
- To store faces in databases
- In recognition algorithms

• But can it be used for video?
• Should it be used ?

Face Detection results

• Psychological study: people recognize faces starting from IOD > 10 pixels
• Good news (2002): computers can also detect faces
– with i.o.d >= 10 pixels
– in poor illumination,
– with different orientations: ±45°
– different facial expressions
– motion rejects with spuriously detected faces

Canonical face model suitable for video-based recognition

Proposed in 2004:
• IOD=12 pixel
• 24 x 24 size is sufficient
• Is much easier to extract from video (by computers).
• Eyes can be automatically aligned.
• Many of these can be extracted for the same person, as s/he is being tracked.
[Figure: face template with dimension labels i.o.d., 2·IOD and 24]

Proper approaches

1. Work on low-res images (as long as i.o.d.>10)
2. Accumulate facial data over time

Good video-based recognition implies accumulation of data over time!
Anything based on a single frame won’t be good.

Types of multi-frame facial data fusion
• Super-resolution
• Neuro-biological (synaptic adaptation)
• 3D face models
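
To illustrate why accumulating over time beats a single-frame decision, the sketch below averages per-frame match scores over a tracked face. It is deliberately the simplest possible fusion (score averaging) and is not one of the fusion types listed above; the identities and scores are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable


def accumulate_scores(frames: Iterable[Dict[str, float]]) -> Dict[str, float]:
    """Average per-identity match scores over all frames of one tracked face."""
    totals: Dict[str, float] = defaultdict(float)
    count = 0
    for scores in frames:
        count += 1
        for identity, score in scores.items():
            totals[identity] += score
    if count == 0:
        return {}
    return {identity: total / count for identity, total in totals.items()}


# Hypothetical per-frame match scores for one tracked person.
track = [
    {"alice": 0.55, "bob": 0.45},
    {"alice": 0.40, "bob": 0.60},  # a poor frame: taken alone it points to the wrong person
    {"alice": 0.70, "bob": 0.30},
    {"alice": 0.65, "bob": 0.35},
]

fused = accumulate_scores(track)
best = max(fused, key=fused.get)
single_frame_best = max(track[1], key=track[1].get)
print(best, single_frame_best)
```

The second frame alone would pick the wrong identity, while the accumulated scores do not.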
VRS technology
(person annotation in TV programs)

Problem: Recognize faces in 160x120 mpeg1 video
Demo: 98% real-time recognition of four people in a video clip.
Approach: neural network based accumulation of low-res facial data

Discussion

• IOD < 10, body/gait biometrics should be used
• IOD > 11, “some” face recognition from video can be performed:
– Accumulation over time techniques may, in some cases (many snapshots under good angles), enable “1 to many” ICAO identification.
– Sufficient for many “1 to few” recognition tasks.
Good applications:
- Monitoring of limited-access premises
- Multiple-camera person tracking and backtracking,
- Verification (e.g. with access card)
Bottleneck:
- Angle of view, quality of video (see Introduction)
- General “1 to many (>1000)” recognition is still unresolved
Future Trend:
- Forced face registration (as in check-ins / “hidden” eye-level cameras)
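
The IOD thresholds in the Discussion can be read as a simple decision rule on the measured eye distance. The sketch below assumes eye centres are already available as pixel coordinates from some face detector; the function names and the sample coordinates are illustrative, and the slide does not say what to do between 10 and 11 pixels.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def iod_pixels(left_eye: Point, right_eye: Point) -> float:
    """Distance between the two eye centres, in pixels."""
    return math.hypot(right_eye[0] - left_eye[0], right_eye[1] - left_eye[1])


def suggested_modality(iod: float) -> str:
    """Apply the thresholds from the Discussion slide."""
    if iod < 10:
        return "body/gait biometrics"
    if iod > 11:
        return "face recognition from video"
    return "unspecified on the slide"


far_face = iod_pixels((100.0, 80.0), (103.0, 84.0))   # a distant face
near_face = iod_pixels((40.0, 52.0), (52.0, 52.0))    # a face at the proposed IOD of 12 pixels

print(far_face, suggested_modality(far_face))
print(near_face, suggested_modality(near_face))
```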