Scientific Discovery
This work is licensed under a Creative Commons
Attribution 3.0 United States License.
Tony Hey: An Introduction
Commander of the British Empire
A Digital Data Deluge in Research
Data collection
Sensor networks, satellite
surveys, high throughput
laboratory instruments,
observation devices,
supercomputers, LHC
Data processing,
analysis, visualization
SensorMap
Functionality: Map navigation
Data: sensor-generated temperature, video
camera feed, traffic feeds, etc.
Archiving
Digital repositories,
libraries, preservation,
Scientific visualizations
NSF Cyberinfrastructure report, March 2007
Emergence of a Fourth Research Paradigm
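1. Experimental research
2. Theoretical research
3. Computer simulation of natural phenomena
4. Data-intensive science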
Astronomy has been one of the first disciplines to embrace
data-intensive science with the Virtual Observatory (VO),
enabling highly efficient access to data and analysis tools
at a centralized site. The image shows the
Pleiades star cluster from the Digitized Sky Survey
combined with an image of the moon,
synthesized within the WorldWide Telescope service.
Science must move from data to
information to knowledge
With thanks to Jim Gray
http://research.microsoft.com/fourthparadigm/
"The impact of Jim Gray's thinking is continuing to
get people to think in a new way about how data
and software are redefining what it means to do
science."
Bill Gates, Chairman, Microsoft Corporation
"One of the greatest challenges for 21st-century
science is how we respond to this new era of
data-intensive science. This is recognized as a new
paradigm beyond experimental and theoretical
research and computer simulations of natural
phenomena, one that requires new tools,
techniques, and ways of working."
Douglas Kell, University of Manchester
"The contributing authors in this volume have
done an extraordinary job of helping to refine an
understanding of this new paradigm from a
variety of disciplinary perspectives."
Gordon Bell, Microsoft Research
Listed 7 key areas for action by Funding Agencies:
1. Fund both development and support of software
tools
2. Invest at all levels of the funding pyramid
3. Fund development of generic Laboratory
Information Management Systems
4. Fund research into scientific data management,
data analysis, data visualization, new algorithms
and tools
Remaining three key areas for action relate to
the future of Scholarly Communication and
Libraries:
5. Establish Digital Libraries that support the other
sciences like the NLM does for Medicine
6. Fund development of new authoring tools and
publication models
7. Explore development of digital data libraries
that contain scientific data (not just the
metadata) and support integration with
published literature
Developing a Sustainable
e-Infrastructure
Accelerating time to insight
with Advanced Research Tools and Services
Our goal is to accelerate research by collaborating with
academic communities to use advanced computer
science research technologies
Aim to help scientists spend less time on IT issues and
more time on science by creating open tools and
services based on Microsoft platforms and productivity
software
Data Acquisition and Modeling
The Swiss Experiment
Powerful Software Improves Environmental
Forecasting
Environmental scientists face many challenges
in monitoring and understanding our planet's
changing climate. Through an international
collaboration called the Swiss Experiment,
environmental scientists and computer science
experts are deploying advanced sensor networks
and data management tools to improve
environmental monitoring and forecasting.
Life Under Your Feet
Researchers at The Johns Hopkins University
are deploying large arrays of wireless soil
sensors in a variety of environmental settings,
including a park, an urban forest and a wetland.
The networks enable scientists to monitor
ecological changes on an unprecedented scale
and offer insights into hydrology, greenhouse
gases and the activity of organisms in the soil.
Collaboration and Visualization
Research Information Centre
Collaboration and information sharing among
researchers are among the most important but
challenging aspects of scientific research. In
recent years, scientists have begun using
virtual research environments to exchange
information with colleagues in specific areas of
study. Microsoft Research and The British
Library are teaming up to build the Research
Information Centre.
SciScope Speeds Data Retrieval from
Multiple Repositories
For environmental scientists and engineers,
finding and retrieving relevant data can be a
daunting and tedious task. Microsoft Research is
developing an online search engine called
SciScope that enables researchers to search
multiple data repositories simultaneously and
retrieve information in a consistent format.
Analysis and Data Mining
Trident
A Scientific Workflow Workbench Brings Clarity
to Data
Scientists at the University of Washington are
working with Microsoft External Research to
demonstrate how marrying visualization and
workflow technologies can allow researchers to
better manage, evaluate and interact with even
the most complex scientific data sets.
PhyloD
Statistical tool used to analyze DNA of HIV from
large studies of infected patients
Typical job: 10 to 20 CPU hours, with extreme
jobs requiring 1K to 2K CPU hours
Very CPU efficient
Requires a large number of test runs for a
given job (1 to 10M tests)
Highly compressed data per job (~100 KB
per job)
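A job profile like the one above (many independent test runs, small input data) lends itself to being split into batches that can run on separate workers. The following is a minimal sketch of that idea only; `split_tests` and the batch size are hypothetical and are not taken from PhyloD itself.

```python
def split_tests(total_tests, batch_size):
    """Yield (start, stop) index ranges that together cover all test runs."""
    for start in range(0, total_tests, batch_size):
        yield start, min(start + batch_size, total_tests)


# Hypothetical job: 1M test runs, handed out in batches of 50K.
batches = list(split_tests(1_000_000, 50_000))
print(len(batches), batches[0], batches[-1])
```

Each range can be handed to a different worker, since no batch depends on another.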
Disseminate and Share
Chem4Word
Chemistry Drawing in Word
Created in collaboration with University of
Cambridge; Peter Murray-Rust, et al.
Intent: Recognizes
chemical dictionary and
ontology terms
Author/edit 1D and 2D chemistry.
Change chemical layout styles.
Data: Semantics
stored in Chemistry
Markup Language
<?xml version="1.0"?>
<cml version="3" convention="orgsynthreport" xmlns="http://www.xmlcml.org/schema">
  <molecule id="m1">
    <atomArray>
      <atom id="a1" elementType="C" x2="2.9149999618530273" y2="0.7699999809265137"/>
      <atom id="a2" elementType="C" x2="1.5813208400249916" y2="1.5399999809265137"/>
      <atom id="a3" elementType="O" x2="0.24764171819695613" y2="0.7699999809265134"/>
      <atom id="a4" elementType="O" x2="1.5813208400249912" y2="3.0799999809265137"/>
      <atom id="a5" elementType="H" x2="4.248679083681063" y2="1.5399999809265137"/>
      <atom id="a6" elementType="H" x2="2.914999961853028" y2="0.7700000190734864"/>
      <atom id="a7" elementType="H" x2="4.248679083681063" y2="1.907348645691087E8"/>
      <atom id="a8" elementType="H" x2="1.0860374036310796" y2="1.5399999809265132"/>
    </atomArray>
    <bondArray>
      <bond atomRefs2="a1 a2" order="1"/>
      <bond atomRefs2="a2 a3" order="1"/>
      <bond atomRefs2="a2 a4" order="2"/>
      <bond atomRefs2="a1 a5" order="1"/>
      <bond atomRefs2="a1 a6" order="1"/>
      <bond atomRefs2="a1 a7" order="1"/>
      <bond atomRefs2="a3 a8" order="1"/>
    </bondArray>
  </molecule>
</cml>
Intelligence: Verifies validity of
authored chemistry
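To illustrate what storing semantics in Chemistry Markup Language makes possible, here is a minimal sketch that reads atoms and bonds from a CML fragment with Python's standard library and runs one simple consistency check. The shortened sample, the function names and the check itself are assumptions for illustration; this is not how Chem4Word validates chemistry.

```python
import xml.etree.ElementTree as ET

CML_NS = {"cml": "http://www.xmlcml.org/schema"}

# Shortened, rounded sample in the same shape as the slide's CML fragment.
SAMPLE = """<cml version="3" xmlns="http://www.xmlcml.org/schema">
  <molecule id="m1">
    <atomArray>
      <atom id="a1" elementType="C" x2="2.915" y2="0.770"/>
      <atom id="a2" elementType="C" x2="1.581" y2="1.540"/>
      <atom id="a3" elementType="O" x2="0.248" y2="0.770"/>
      <atom id="a4" elementType="O" x2="1.581" y2="3.080"/>
    </atomArray>
    <bondArray>
      <bond atomRefs2="a1 a2" order="1"/>
      <bond atomRefs2="a2 a3" order="1"/>
      <bond atomRefs2="a2 a4" order="2"/>
    </bondArray>
  </molecule>
</cml>"""


def read_molecule(cml_text):
    """Return ({atom id: element}, [(atom id, atom id, bond order)])."""
    root = ET.fromstring(cml_text)
    molecule = root.find("cml:molecule", CML_NS)
    atoms = {
        atom.get("id"): atom.get("elementType")
        for atom in molecule.iterfind("cml:atomArray/cml:atom", CML_NS)
    }
    bonds = []
    for bond in molecule.iterfind("cml:bondArray/cml:bond", CML_NS):
        first, second = bond.get("atomRefs2").split()
        bonds.append((first, second, int(bond.get("order"))))
    return atoms, bonds


def dangling_bond_refs(atoms, bonds):
    """List atom ids that a bond refers to but the atom array does not define."""
    return sorted({ref for a, b, _ in bonds for ref in (a, b) if ref not in atoms})


atoms, bonds = read_molecule(SAMPLE)
print(atoms, bonds, dangling_bond_refs(atoms, bonds))
```

An empty result from `dangling_bond_refs` means every bond points at a declared atom, which is one of the simplest checks machine-readable chemistry allows.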
Relationships: Navigate and
link referenced chemistry
Disseminate and Share
Ontology Plug-In for Word
Services: Ontology
download web service
John Wilbanks
Phil Bourne
Lynn Fink
Intent: Term
recognition
& disambiguation
Relationships:
Ontology
browser
Archiving and Preservation
Zentity
A semantic computing platform to store
and expose relationships between digital
assets
Default web UI with CSS
support and custom ASP.NET
controls
Native support for RSS, OAI-PMH, OAI-ORE,
AtomPub and SWORD
Flexible data model
enables many scenarios
and can be easily extended
over time
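As an illustration of what OAI-PMH support means for a repository client, the sketch below builds a standard `ListRecords` harvesting request. The base URL is a placeholder and `list_records_url` is a hypothetical helper; nothing here describes Zentity's actual endpoint layout.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL for a repository endpoint."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec is not None:
        # Optional selective harvesting by set.
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


# Placeholder endpoint, not a real repository.
url = list_records_url("https://repository.example.org/oai")
print(url)
```

`oai_dc` (unqualified Dublin Core) is the metadata format every OAI-PMH repository is required to offer, which is why it is used as the default here.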
Archiving and Preservation
oreChem: the Chemical Semantic Web
Semantic storage
Mashup (reuse) data
Compound document authoring
[Diagram labels: experiments, text, documents, measurements, data, scientists, molecules]
Network analysis is of growing
importance in academic,
commercial, and Internet
social media contexts
Existing Social Network Tools
are challenging for many
novice users
Tools like Excel are widely
used
Leveraging a spreadsheet as a
host for Social Network
Analysis lowers barriers to
network data analysis and
display
Leverage spreadsheet for storage of edge and vertex data
Apply dynamic filters to the data
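The two points above, keeping edges as spreadsheet-style rows and filtering them dynamically, can be sketched in a few lines. The rows, column names and functions below are made up for illustration and do not reflect the tool's actual worksheet layout.

```python
from collections import Counter

# Each row mirrors one line of a hypothetical "edges" worksheet.
edges = [
    {"source": "ann", "target": "bob", "weight": 3},
    {"source": "ann", "target": "cat", "weight": 1},
    {"source": "bob", "target": "cat", "weight": 2},
    {"source": "cat", "target": "dan", "weight": 1},
]


def filter_edges(rows, min_weight):
    """Keep only the edge rows whose weight meets the threshold."""
    return [row for row in rows if row["weight"] >= min_weight]


def degrees(rows):
    """Count how many edge rows touch each vertex."""
    counts = Counter()
    for row in rows:
        counts[row["source"]] += 1
        counts[row["target"]] += 1
    return dict(counts)


strong = filter_edges(edges, min_weight=2)
print(degrees(edges))
print(degrees(strong))
```

Recomputing a metric such as degree after each filter change is what makes the filtering feel dynamic to the user.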
Intent: Insert Creative Commons
licenses from within Office 2007
Services: Integrates with
Creative Commons Web API
to create new licenses
Relationships: license information
stored as RDF/XML within the
document OOXML
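For readers unfamiliar with license metadata expressed as RDF/XML, the sketch below builds a small statement using the Creative Commons `cc:` vocabulary. How the add-in actually embeds this inside the OOXML package is not described here, so the structure and the `license_rdf` helper are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
CC = "http://creativecommons.org/ns#"


def license_rdf(license_url):
    """Serialize a minimal RDF/XML statement: this work has the given license."""
    ET.register_namespace("rdf", RDF)
    ET.register_namespace("cc", CC)
    root = ET.Element(f"{{{RDF}}}RDF")
    # An empty rdf:about refers to the enclosing document itself.
    work = ET.SubElement(root, f"{{{CC}}}Work", {f"{{{RDF}}}about": ""})
    ET.SubElement(work, f"{{{CC}}}license", {f"{{{RDF}}}resource": license_url})
    return ET.tostring(root, encoding="unicode")


xml_text = license_rdf("http://creativecommons.org/licenses/by/3.0/us/")
print(xml_text)
```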
http://ccaddin2007.codeplex.com
Statistical tool used to analyze DNA of HIV from
large studies of infected patients
PhyloD was developed by Microsoft Research and
has been highly impactful
Small but important group of researchers
100s of HIV and HepC researchers actively use it
1000s of research communities rely on these results
Cover of PLoS Biology,
November 2008
PhyloD now ported as Windows Azure Cloud Service
Cloud enables agile deployment of scalable scientific services
Courtesy of Roger Barga
Science pipeline for download, initial processing
and reduction of satellite imagery. Developed by
MSR, UVa, UCB.
Dramatically lowers resource and complexity
barriers to use satellite imagery for terrestrial
hydrology and geoscience.
Common imagery location determination and
upload from diverse sources
Common reprojection and harmonization to
produce science-ready imagery with the same
length, time and quality attributes
Optional scientist-provided reduction algorithm
(.NET, Java, or MatLab)
On-demand scalability beyond local desktop or
cluster
In use now to compute 10-year continental-scale
water balance for North America. Per year:
500 GB (~60K files) upload of 9 different source
imagery products from 15 different locations
400 GB reprojected harmonized imagery
consuming ~3500 CPU hours
5 GB reduced science result leveraging reported
field data aggregates consuming ~60 CPU hours
Additional science requests pending:
Expanding above to Europe
Additional source imagery products and
formats
Catharine van Ingen (MSR), Jie Li, Marty Humphreys
(UVA), Youngryel Ryu (UCB), Deb Agarwal (BWC/LBL)
[Pipeline diagram labels: Source Imagery Download Sites, Source Metadata, Scientist, AzureMODIS Service Web Role Portal, Request Queue, Data Collection Stage, Reprojection Queue, Reprojection Stage, Reduction Queue, Analysis/Reduction Stage, Scientific Results]
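The queue-and-stage structure named in the diagram can be sketched as follows. This is a single-process illustration of the pattern only: the stage functions are stand-ins supplied by the caller, and nothing here reflects the actual Azure implementation.

```python
from queue import Queue


def run_pipeline(requests, reproject, reduce_fn):
    """Pass each request through a reprojection queue, then a reduction queue."""
    reprojection_queue = Queue()
    reduction_queue = Queue()
    results = []

    # Collection stage: accepted requests are queued for reprojection.
    for request in requests:
        reprojection_queue.put(request)

    # Reprojection stage: harmonize each item and queue it for reduction.
    while not reprojection_queue.empty():
        reduction_queue.put(reproject(reprojection_queue.get()))

    # Analysis/reduction stage: apply the scientist-provided reduction.
    while not reduction_queue.empty():
        results.append(reduce_fn(reduction_queue.get()))

    return results


# Stand-in stage functions operating on hypothetical tile names.
results = run_pipeline(
    ["tile_a", "tile_b"],
    reproject=lambda tile: tile.upper(),
    reduce_fn=lambda tile: tile + "!",
)
print(results)
```

Because the stages only communicate through queues, each one could be scaled out to more workers independently, which is the property the slide highlights as on-demand scalability.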
Led by Newcastle University, UK (Paul Watson),
project supported by ER
Investigating applicability of commercial clouds for scientific
research
Build a working prototype for use cases in
chemoinformatics
Uses Microsoft technologies to build science-related
services (Windows Azure, Silverlight)
Built initial proof of concept
Silverlight UI for basic Quantitative Structure-Activity
Relationship (QSAR) modeling
Demonstrated ability to scale QSAR computations
in Windows Azure
Data/information is interconnected
through machine-interpretable
information (e.g.
paper X is about star Y)
Social networks are a special case
of data meshes
A knowledge ecosystem:
A richer authoring experience
An ecosystem of services
Semantic storage
Open, Collaborative,
Interoperable, and Automatic
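A statement such as "paper X is about star Y" is commonly modeled as a subject, predicate, object triple. The sketch below shows that idea with plain tuples; the identifiers and the `match` helper are invented for illustration and do not refer to any specific system in this deck.

```python
# Hypothetical statements, each stored as (subject, predicate, object).
triples = {
    ("paperX", "isAbout", "starY"),
    ("paperX", "writtenBy", "authorA"),
    ("paperZ", "isAbout", "starY"),
    ("authorA", "knows", "authorB"),
}


def match(store, subject=None, predicate=None, obj=None):
    """Return the triples matching the given fields; None acts as a wildcard."""
    return sorted(
        t for t in store
        if (subject is None or t[0] == subject)
        and (predicate is None or t[1] == predicate)
        and (obj is None or t[2] == obj)
    )


about_star_y = match(triples, predicate="isAbout", obj="starY")
print(about_star_y)
```

The `knows` statement shows why social networks fit the same model: a relationship between people is just another kind of triple.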
Attribution: Chris Bizer
Vision of Future Research
e-Infrastructure using
Client + Cloud resources
[Diagram labels:]
visualization and analysis services
scholarly communications
search
books
citations
domain-specific services
blogs & social networking
reference management
instant messaging
identity
project management
mail
notification
document store
storage/data services
knowledge management
knowledge discovery
compute services
virtualization
The site contains access and downloads of relevant open tools and
resources for the worldwide academic research community. Examples of
our open tools and services:
Plug-ins for Office
Ontology Add-in for Word
Article Authoring Add-in for Word
Chem4Word: Chemistry Drawing in Word
Microsoft Biology Foundation (MBF)
Enables and accelerates fundamental advances in biology
F#
Collaboration with the academic and research community on F#'s typed functional and
object-oriented programming on the .NET platform
Software Engineering Tools
Spec#: Program verifier for C# extended with design by contract
VCC: Program verifier for Concurrent C
PEX: automatic unit testing tool for .NET
CHESS: Unit testing tools for concurrent Win32 executable and .NET
Microsoft Research
http://research.microsoft.com
Microsoft Research downloads: http://research.microsoft.com/research/downloads
Microsoft External Research
http://research.microsoft.com/externalresearch
Science at Microsoft
http://www.microsoft.com/science
CodePlex
http://www.codeplex.com
The Faculty Connection
http://www.microsoft.com/education/facultyconnection
MSDN Academic Alliance
http://msdn.microsoft.com/en-us/academic