
UCRL-PRES-148116

The Earth System Grid


Presented by
Dean N. Williams

PIs: Ian Foster (ANL); Don Middleton (NCAR); and Dean Williams (LLNL)
http://www.earthsystemgrid.org

Presented at:
The “EO GRID” Workshop
Frascati, Italy

May 6, 2002 Earth System Grid - Williams


Earth System Grid (ESG): Overview
• Funded by the Scientific Discovery through Advanced Computing (SciDAC) program, ESG seeks a new paradigm for the climate change community: a move from centralized to distributed data sharing.

• Enables geographically distributed teams of researchers to gain knowledge and understanding from massive climate data holdings effectively and rapidly.

• Multiple interfaces to ESG will allow researchers to focus on science rather than on issues of data receipt, format, and data set manipulation.



ESG: Why Is ESG Important to the U.S. Climate Change Program?
• Climate model output and quality observations are vital to providing timely assessments of climate change and its impacts.
• Recent U.S. and IPCC assessment efforts made it clear that the lack of access to model simulations is a major problem for future assessments.
• Access to retrospective climate data (input and output) is needed to create a feedback mechanism that ties researchers directly back to quality control and diagnostics of models.
• Researchers require access to "format independent" climate and observational data for case studies and training.
• In the U.S., climate simulation can be viewed as a systems problem that requires multiple agencies and institutions to work together.



ESG: U.S. Collaborations & Development
ANL: Computational grids & grid-based applications
LBNL: Climate storage facility
LLNL: Model diagnostics & inter-comparison
USC/ISI: Computational grids & grid-based applications
NCAR: Climate change prediction and scenarios
LANL: Next generation coupled models & computing
ORNL: Climate storage & computational resources



ESG: Requirements & Priority Matrix

                                  ESG Developer  ESG Administrator  ESG User
ESG Services
  Framework                       H              H                  H
  Automatic Installation          L              L                  H
Distributed Computing
  Authorization & Authentication  H              H                  M
  Registration                    H              H                  L
  Event Services                  L              L                  M
  Task Management                 L              L                  L
  Logging Services                L              H                  H
Data Systems
  Search and Discovery            M              H                  H
  Data movement (transport)       L              H                  H
  Metadata framework              H              H                  M
  Collaboratories                 M              L                  H
Tools
  Analysis                        M              M                  H
  Visualization                   L              L                  H
  Collaboration                   M              M                  H

L = LOW, M = MEDIUM, H = HIGH


ESG: U.S. Department of Energy (DOE) Next Generation Internet (NGI) Project
• ESG-I (past):
  - Focused on developing techniques for high-speed data movement between sites and users (e.g., GridFTP, the secure, highly efficient file transfer service developed by ANL as part of Globus)
  - Developed replica catalogs for keeping track of data locations
  - Developed request managers for coordinating multiple transfers
  - Developed a grid-enabled version of LLNL's data analysis package
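
The interplay between a replica catalog and a request manager can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the ESG-I implementation: the class names, the in-memory dictionary, the example host names, and the per-host cost table are all hypothetical stand-ins (the cost table plays the role of whatever network estimate a real request manager would consult).

```python
from dataclasses import dataclass, field
from urllib.parse import urlparse


@dataclass
class ReplicaCatalog:
    """Maps a logical file name to the physical URLs that hold a copy."""
    entries: dict = field(default_factory=dict)

    def register(self, logical_name, url):
        self.entries.setdefault(logical_name, []).append(url)

    def lookup(self, logical_name):
        return list(self.entries.get(logical_name, []))


class RequestManager:
    """Plans one transfer per requested file, picking the cheapest replica."""

    def __init__(self, catalog, site_cost):
        self.catalog = catalog
        self.site_cost = site_cost  # hypothetical cost per host; lower is better

    def _cost(self, url):
        # Hosts without a known cost are treated as the worst choice.
        return self.site_cost.get(urlparse(url).hostname, float("inf"))

    def plan(self, logical_names):
        plan = {}
        for name in logical_names:
            replicas = self.catalog.lookup(name)
            if not replicas:
                raise KeyError(f"no replica registered for {name}")
            plan[name] = min(replicas, key=self._cost)
        return plan


# Illustrative data only: two copies of the same logical file.
catalog = ReplicaCatalog()
catalog.register("run1.nc", "gsiftp://storage-a.example.org/data/run1.nc")
catalog.register("run1.nc", "gsiftp://storage-b.example.org/data/run1.nc")

manager = RequestManager(
    catalog,
    {"storage-a.example.org": 5.0, "storage-b.example.org": 2.0},
)
transfer_plan = manager.plan(["run1.nc"])
```

Separating the catalog (where copies live) from the manager (which copy to fetch) mirrors the division of labor in the bullets above; the actual transfer step is deliberately left out of the sketch.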
ESG: ESG-I Architecture
[Figure: ESG-I architecture diagram. Labeled components: PCMDI application (client) with disk cache; LDAP metadata catalog; LDAP replica catalog; Request Manager; Network Weather Service; CORBA; GridFTP, GSI-pftpd, and HRM servers with disk caches and tape systems. Labeled sites: SDSC, ANL, LBNL-PDSF, ISI, NCAR, LBNL-Clipper.]



ESG: ESG-I Team Presented Their Work at Supercomputing 2001
[Figure: SC '01 demonstration setup. Labeled components: network cloud; SC '01 with local disks; RAID; LDAP server and metadata catalog (several instances); LLNL; LBNL and ANL, each with a tape system and a parallel disk system.]



ESG: DOE SciDAC Project
• ESG-II (present):
  - Building upon the substantial work of ESG-I
  - Grid-wide services supporting authentication, authorization, data discovery, and user-specified analysis
  - Metadata services supporting remote data browsing, querying, accessing, displaying, etc.
  - Filtering services performing intelligent model-specific analysis before delivering the results to the user
  - Integrate next-generation data analysis and visualization applications (such as ongoing work at LLNL and NCAR), web-based data portals and other thin clients supporting the Distributed Oceanographic Data System (DODS), and collaborative problem-solving environments.
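
The idea behind a filtering service, reducing data near where it is stored so that only the result travels to the user, can be illustrated with a toy reduction. The sketch below is an assumption-laden example rather than an ESG-II component: the function name, the nested-list grid, and the sample values are hypothetical, and the mean is deliberately unweighted (a real climate analysis would normally weight grid cells by area).

```python
def regional_mean(field, lats, lons, lat_range, lon_range):
    """Average field[i][j] over the grid points inside the requested box.

    field is indexed as field[lat_index][lon_index]; the ranges are
    inclusive (min, max) pairs in the same units as lats and lons.
    """
    selected = [
        field[i][j]
        for i, lat in enumerate(lats)
        if lat_range[0] <= lat <= lat_range[1]
        for j, lon in enumerate(lons)
        if lon_range[0] <= lon <= lon_range[1]
    ]
    if not selected:
        raise ValueError("no grid points fall inside the requested region")
    return sum(selected) / len(selected)


# Illustrative 3 x 4 grid; the values carry no physical meaning.
lats = [-10.0, 0.0, 10.0]
lons = [0.0, 90.0, 180.0, 270.0]
field = [
    [1.0, 2.0, 3.0, 4.0],
    [5.0, 6.0, 7.0, 8.0],
    [9.0, 10.0, 11.0, 12.0],
]

# Only this single number would need to be delivered to the user.
result = regional_mean(field, lats, lons, (-5.0, 15.0), (80.0, 200.0))
```

Here twelve grid values are reduced to one before delivery; at the scale of real model output, the same pattern is what would keep large transfers off the network.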



ESG: ESG-II Architecture



ESG: Metadata Services
[Figure: layered metadata services architecture, top to bottom]
ESG CLIENTS API & USER INTERFACES: Publishing; Analysis & Visualization; Search & Discovery; Administration; Browsing & Display
HIGH LEVEL METADATA SERVICES: Metadata Extraction; Metadata Annotation; Metadata & Data Registration; Metadata Browsing; Metadata Query; Metadata Aggregation; Metadata Validation; Metadata Display; Metadata Discovery
CORE METADATA SERVICES: Metadata Access (update, insert, delete, query); Service Translation Library
METADATA HOLDINGS: Data & Metadata Catalog; Dublin Core, COARDS, and Comments holdings (databases and XML files); mirror
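
The four core access operations named in the figure (update, insert, delete, query) can be illustrated with a minimal in-memory sketch. Everything here is an assumption made for illustration: the class name, the dictionary storage, the dataset identifiers, and the record values are hypothetical, and only the field names (`title`, `creator`, `format`) are borrowed from the Dublin Core element set. It does not represent the actual ESG metadata service interface.

```python
class MetadataAccess:
    """In-memory stand-in for a metadata holding keyed by dataset identifier."""

    def __init__(self):
        self._records = {}

    def insert(self, identifier, record):
        if identifier in self._records:
            raise KeyError(f"{identifier} already exists")
        self._records[identifier] = dict(record)

    def update(self, identifier, **changes):
        self._records[identifier].update(changes)

    def delete(self, identifier):
        del self._records[identifier]

    def query(self, **criteria):
        """Return identifiers whose records match every given field exactly."""
        return sorted(
            identifier
            for identifier, record in self._records.items()
            if all(record.get(key) == value for key, value in criteria.items())
        )


# Illustrative records only; identifiers and values are made up.
holdings = MetadataAccess()
holdings.insert("dataset_a", {"title": "Example run A", "creator": "Site 1", "format": "netCDF"})
holdings.insert("dataset_b", {"title": "Example run B", "creator": "Site 2", "format": "netCDF"})

holdings.update("dataset_b", creator="Site 1")
same_creator = holdings.query(creator="Site 1", format="netCDF")

holdings.delete("dataset_a")
remaining = holdings.query(format="netCDF")
```

The higher-level services in the figure (browsing, discovery, validation, and so on) could be layered on top of a small interface like this one, which is one way to read the split between the core and high-level tiers.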



ESG: Collaboration Network
[Figure: data consumers and data producers linked through the grid and network infrastructure. Labeled elements: ESG services (information, replica, metadata, community authorization (CAS)); computational resources; online storage systems.]
ESG: Example of a Web-based Data Portal
(currently serving 40+ simulations of AMIP, CMIP, and PCM data, and growing)



ESG: Example of a Client Application



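ESG: Example of a Script Access
• The next-generation language, Python, is used to access the Earth System Grid at LLNL:

import cdms

# Open the demo database through its LDAP URL
db = cdms.open("ldap://localhost:389/database=demo,ou=PCMDI,o=LLNL,c=US")
# Open the NCEP reanalysis dataset by name
f = db.open("ncep_reanalysis_mo")
# Read the variable 'ts' from the dataset
ds = f('ts')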



ESG: Concluding Statements
• ESG is a highly collaborative effort that will give users fast, application-independent access to storage facilities holding petabytes of raw or processed data.

• Payoffs of this distributed collaborative infrastructure would include:
  - Distributed data sharing
  - Simplified discovery of climate data
  - Large-scale climate data processing and analysis
  - Increased collaboration among climate research scientists
  - Support for climate assessments and estimates of future climate variability and trends

• For more information on ESG, visit our website at: http://www.earthsystemgrid.org
