
Data goodness

Mostly in black and white
By Dom

You must love your data!


Lost data:
Current imaging data in BRIC cost ~5.1M, just for scanning costs! (2011)

no research
no publications
no jobs
no PhDs!
Sad Dom

Look after your data!


It looks after you
Happy Dom

Data Storage
Home directories:
ISIS home, U Home
Not for large amounts of imaging data

Projects directory:
ISIS, V:
Big stuff goes here

If you require large amounts of space (e.g. > 50 GB):
LET ME KNOW IN ADVANCE!
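
To find out whether a folder is over the 50 GB mark before asking for space, something like the sketch below could be used. The function names are made up for illustration, and reading "50 GB" as decimal gigabytes (50 x 10^9 bytes) is an assumption.

```python
import os

# Assumption: "50 GB" is read as decimal gigabytes.
THRESHOLD_BYTES = 50 * 10**9


def folder_size_bytes(folder):
    """Add up the sizes of all regular files below `folder`."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(folder):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symbolic links so linked data is not counted here.
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total


def needs_advance_notice(folder, threshold=THRESHOLD_BYTES):
    """Return True if the folder is larger than the threshold."""
    return folder_size_bytes(folder) > threshold
```

Calling `needs_advance_notice("my_experiment")` on a hypothetical folder returns True or False; it only measures what is already on disk, so planned future data still needs estimating by hand.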

Server goodness
Why is the server a good place to store data?
Mirroring and parity: if some errors occur, data can be easily recovered

BACKUPS:
Tape backups, daily, 1 month retention
If you have funding, processed data can be mirrored off site
Raw data is always mirrored off site (ECDF) by default

Desktop PCs
Not reliable: no mirroring, no parity. If some errors occur, data is lost (often all of it)
Network backups often fail: machines turned off, network busy
Moving to a new system when I get time!

Data love
Curation: Do this as you work!
Plan your data use

Use meaningful folder names
Make 'README.txt' files with dates, names of students/employees involved, references to software, scripts and versions, purpose of experiment/processing
Be tidy with your data: tidy up occasionally
Friday afternoon: quick tidy up
Big tidy up at end of experiment/project/phase/year
BE CAREFUL, don't rush
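
A README.txt with the fields listed above could be generated by a small script, so that every folder gets the same layout. This is only a sketch: the function name `write_readme`, its arguments and the field labels are assumptions made for illustration, not a required format.

```python
from datetime import date
from pathlib import Path


def write_readme(folder, people, purpose, software):
    """Write a README.txt into `folder` and return its path.

    `software` maps a tool or script name to the version used.
    """
    lines = [
        f"Date: {date.today().isoformat()}",
        f"People: {', '.join(people)}",
        f"Purpose: {purpose}",
        "Software and scripts:",
    ]
    # One line per tool, sorted by name so the file is stable between runs.
    lines += [f"  {name} {version}" for name, version in sorted(software.items())]
    readme = Path(folder) / "README.txt"
    readme.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return readme
```

For example, `write_readme("my_experiment", ["A. Student"], "Pilot processing", {"my_script.py": "v2"})` would write the file into a hypothetical existing folder. Writing it by hand is just as good, as long as it gets done.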

Data, spreadsheets, databases


Anonymisation *** Repatriation keys ***

Code and Scripts


Coding:
Testing
Make sure that the software you are using does exactly what you think it does! Check every step for every image!

Do not use hard-coded paths


Use versioning software (ECDF)
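
One possible way to avoid hard-coded paths is to read the data root from an environment variable and build everything else relative to it. The sketch below assumes Python; the variable name `PROJECT_DATA` and both function names are hypothetical.

```python
import os
from pathlib import Path


def project_root(env_var="PROJECT_DATA", default=None):
    """Return the data root from an environment variable, or a default."""
    value = os.environ.get(env_var, default)
    if value is None:
        raise RuntimeError(f"Set {env_var} to the folder that holds the data")
    return Path(value)


def image_path(root, subject, filename):
    """Build a path relative to the root instead of spelling it out in full."""
    return Path(root) / subject / filename
```

With this, a script calls `image_path(project_root(), "sub01", "scan.nii")` and keeps working when the data moves to another drive or server: only the environment variable changes, not the code.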

Safe data is Happy data!
