Download as pdf or txt
Download as pdf or txt
You are on page 1of 42

Pusat Pengajian Sains Ukur dan Geomatik

Universiti Teknologi MARA


40450 Shah Alam
Selangor Darul Ehsan

Compiled by Maisarah Abdul Halim


Overview:
•GIS & GIS Applications
•Big Data
•Internet of Things
•Sample Analytics
G I S eographic nformation

• A geographic information system (GIS)


ystems

lets us visualize, question, analyze, and


interpret spatial data to understand
relationships, patterns, and trends
• Spatial is special
• Almost everything that happens,
happens somewhere
GIS
APPLICATIONS
Big Data- What is Big Data?
• ‘Big-data’ is similar to ‘Small-data’, but bigger
• …but having data bigger consequently requires
different approaches:
◦ techniques, tools & architectures
• …to solve:
◦ New problems…
◦ …and old problems in a better way.
Characterization of Big Data – 3 V’s
Extremely large data sets that may be
analysed computationally to reveal
patterns, trends, and associations,
especially relating to human behaviour and
interactions

Source: http://www.igi-global.com/dictionary/big-data/39008
Volume (Scale)
• Data Volume
• 44x increase from 2009 2020
• From 0.8 zettabytes to 35zb
• Data volume is increasing exponentially

Exponential increase in collected/generated


data
4.6 billion
30 billion RFID tags camera
12+ TBs today phones world
of tweet data (1.3B in 2005) wide
every day

100s of
millions of
GPS
data every day

enabled
? TBs of

devices sold
annually

25+ TBs of
log data every 2+ billion
day people on the
Web by end
2011

76 million smart meters


Velocity (Speed)

• Data is begin generated fast and need to be processed fast


• Online Data Analytics
• Late decisions ➔ missing opportunities
• Examples
• E-Promotions: Based on your current location, your purchase history, what you like ➔
send promotions right now for store next to you

• Healthcare monitoring: sensors monitoring your activities and body ➔ any abnormal
measurements require immediate reaction
Real-time/Fast Data

Mobile devices
(tracking all objects all the time)

Social media and networks Scientific instruments


(all of us are generating data) (collecting all sorts of data)

Sensor technology and networks


(measuring all kinds of data)

• The progress and innovation is no longer hindered by the ability to collect data
• But, by the ability to manage, analyze, summarize, visualize, and discover knowledge from
the collected data in a timely manner and in a scalable fashion
Variety (Complexity)
• Relational Data (Tables/Transaction/Legacy Data)
• Text Data (Web)
• Semi-structured Data (XML)
• Graph Data
• Social Network, Semantic Web (RDF), …

• Streaming Data
• You can only scan the data once

• A single application can be generating/collecting


many types of data

• Big Public Data (online, weather, finance, etc)

To extract knowledge➔ all these types of


data need to linked together
Additional V’s :
• Veracity: much of geospatial big data are from unverified sources with low or
unknown accuracy, level of accuracy varies depending on data sources, raising
issues on quality assessment of source data and how to “statistically” improve
the quality of analysis results.
• Visualization: provides valuable procedures to impose human thinking into
big data analysis. Visualizations help analysts identifying patterns (such as
outliers and clusters).
• Visibility: the emergence of cloud computing and cloud storage has made it
possible to now efficiently access and process geospatial big data in ways that
were not previously possible.
Big Data- Big Analytics
• Complex math operations (machine learning, clustering,
trend detection, ….)
• Mostly specified as linear algebra on array data
• Integrated with complex analytics
• Specified as arrays, not tables
• The challenges include capture, curation, storage,
search, sharing, transfer, analysis, and visualization.
Figures for one day of NYTimes’ Website
• 50Gb of uncompressed log files
• 10Gb of compressed log files
• 0.5Gb of processed log files
• 50-100M clicks
• 4-6M unique users
• 7000 unique pages with more then 100 hits
• Index size 2Gb
• Pre-processing & indexing time
• ◦ ~10min on workstation (4 cores & 32Gb)
• ◦ ~1hour on EC2 (2 cores & 16Gb)
Tools typically used in Big Data Scenario
• NoSQL
◦ DatabasesMongoDB, CouchDB, Cassandra, Redis, BigTable, Hbase,
Hypertable, Voldemort, Riak, ZooKeeper
• MapReduce
◦ Hadoop, Hive, Pig, Cascading, Cascalog, mrjob, Caffeine, S4, MapR,
Acunu, Flume, Kafka, Azkaban, Oozie, Greenplum
• Storage
◦ S3, Hadoop Distributed File System
• Servers
◦ EC2, Google App Engine, Elastic, Beanstalk, Heroku
• Processing
◦ R, Yahoo! Pipes, Mechanical Turk, Solr/Lucene, ElasticSearch,
Datameer, BigSheets, Tinkerpop
Internet of Things (IoT)
• Devices that collect and
transmit data over the
internet.
• Include technologies such as
smart homes, intelligent
transportation and smart
cities
• MIMOS Berhad (Malaysia's
national R&D centre in ICT
under purview of the
Malaysian Ministry of
Science, Technology and
Innovation (MOSTI) ) has
developed a Strategic Image Source: http://www.wordstream.com/blog/ws/2015/01/09/the-internet-of-things
Roadmap for IoT in Malaysia http://mimos.my/iot/roadmap2.html
Source: http://www.redtone.com/malaysia-unveils-iot-roadmap-expects-us11b-income-boost/
Source: http://2011trabzon.com/wp-content/uploads/2016/09/internet-of-things-examples-aiulgfp0.jpg
Your phone has been recording exactly where
you’ve been and how long you spent there.

https://www.buzzfeed.com/jimwaterson/your-iphone-knows-exactly-where-youve-been-and-this-is-how-
t?utm_term=.fmlGLZyWd#.np6dm85VQ
http://www.pcworld.com/article/2907061/4-ways-your-android-device-is-tracking-you-and-how-to-stop-it.html
Ok Google / Siri

Google Analytics tracks and reports website traffic.

• “Google Suggest” or “Autocomplete”


• Based on popularity
Source: http://paultan.org/
Data-Driven Geography
• More than 80% of the data kept by organizations worldwide
has a location component.
• By combining location-related data with other data,
organizations can gain critical insights, make better decisions,
and optimize important processes and applications.
• A data-driven geography may be emerging in response to the
wealth of georeferenced data flowing from sensors and
people in the environment.
Sources of geographically (and often
temporally) referenced data:
• Location-aware technologies such as the Global Positioning System
and mobile phones
• In-situ sensors carried by individuals in phones, attached to vehicles,
embedded in infrastructure, remote sensors by airborne and satellite
platforms, radiofrequency identification (RFID) tags attached to
objects
• Georeferenced social media
GIS and BIG DATA???
• Traditional GIS systems are often insufficient
for meaningful interpretation
• Collection of Geospatial Big Data
• Geographically-enabled social media
- Facebook, Twitter, Instagram
• As a geospatial expert, we are very interested in the location
component of data
• Able to answer and forecast questions such as WHERE? WHEN?
– Spatio-temporal analytics
WHY?
• Taps huge datasets for policy measures – GIS tools for Big Data
processing facilitate deep insights and predictive modelling for policy
making in health care, crime detection, disaster response and more.

• Supports spatial analysis of unstructured data in real-time – Maps


integrate unstructured data (e-mails, blogs, social media content, in-
store sensor data, meteorological data, driving times, etc.) in real-time.
This is useful for location analysis in retail, finance and insurance.

• Integrates multiple data layers for a complete picture – Huge amount


of data is pulled from different formats, devices or systems and given a
geographic context for a complete picture or analysis.
(cont..)
• Empowers Business Intelligence (BI) approach to businesses – The
convergence of Big Data and mapping leads to deeper insights,
profitability, lesser time-to-market, improved customer engagement
and better ROI.

• Enables spatiotemporal queries over big geospatial data – Case-by-


case query processing and data mining of huge spatiotemporal data is
possible for different projects.
SAMPLE ANALYTICS
Areas where geospatial technology has applied Big Data for enhanced analysis:
• Climate modelling and analysis
• Location analytics
• Retail and E-commerce
• Intelligence gathering
• Terrorist financing
• Aviation industry
• Disease surveillance
• Disaster response and early warnings
• Political campaigns and elections
• Banking
• Insurance and Fraud analysis
Greetings from
London
• Mapping of how you’d
say ‘hello’ in the most
frequently spoken
language aside from
English
• Each ‘hello’ has been
scaled to show the
percentage of people
in each area who use it
Source: http://mappinglondon.co.uk/2016/greetings-from-london/
• Vessels tracking and identification of illegal vessels/ immigrants
• Spatio-temporal analysis for congestion detection --making use of
voluminous traffic data to help alleviate congestion
3D Visualisation of traffic data
• California based data artist
Washington DC
Eric Fischer used data form
Twitter and Gnip to create
these heatmaps of major
cities around the world
illustrating the great divide
between where the
tourists and locals.
• Red represents tweets from
tourists while blue
symbolizes local tweets.
Source: http://www.rsvlts.com/2015/03/04/twitter-heatmaps/
Space-time scan statistics for crime pattern detection
Travel mode detection using GPS tracks of two months period
Conclusion
• 3 V’s of Big data- Velocity, Volume,
and Variety
• Mapping and analysis become
further complicated with the
explosion of disruptive
technologies like the cloud,
embedded sensors, mobile and
social media.
• IoT is a convergence of smart
devices that generate data
through internet.
• Data has spatial component
• How can you make use of Big Data
in your organization?
References/ Further Reading
• GIS in the Era of Big Data [ https://cybergeo.revues.org/27647]
• Looking Forward Again: Four Thoughts on the Future of GIS in 2015 and Beyond
[ http://www.esri.com/esri-news/arcwatch/0215/four-thoughts-on-the-future-
of-gis-in-2015-and-beyond ]
• Empowering GIS big data [ https://www.gislounge.com/empowering-gis-big-
data/ ]
• Miller, H. J., & Goodchild, M. F. (2015). Data-driven
geography. GeoJournal, 80(4), 449-461.
• IoT Strategic Roadmap
[http://www.mimos.my/iot/National_IoT_Strategic_Roadmap_Summary.pdf]
• Big Data Tutorial –[ http://www.planet-
data.eu/sites/default/files/presentations/Big_Data_Tutorial_part4.pdf ]

You might also like