
Cheatsheet
dvc.org/doc

Getting started

Install DVC
    pip install "dvc[<dependency>]"
Optionally install a remote storage dependency: s3, azure, gdrive, gs, oss, ssh, or all.

Initialize
    dvc init
Use -f to overwrite an existing DVC cache.

Troubleshooting
    dvc doctor -v
We have a Troubleshooting section in the docs. You can also get help on Discord.


Remotes

Add a remote
    dvc remote add <name> <url>

Modify a remote
    dvc remote modify <name> <option> <value>

Push to and pull from remote
    dvc push
    dvc pull

Fetch from remote
    dvc fetch
Downloads data from the remote like dvc pull but doesn't place the data in the workspace.


Data versioning

Start tracking files
    dvc add <file/directory>
    git add . && git commit

Update tracked files
    dvc add <file/directory>
    dvc push (if using a remote)
    git add . && git commit

Switch data version
    git checkout <commit>
    dvc checkout

Show status of tracked files
    dvc data status

Show differences between commits
    dvc diff

Remove unused files from cache
    dvc gc -w
dvc gc needs a scope flag; -w keeps only the data used in the current workspace.


File structure

DVC moves files under its control to .dvc/cache. It then creates .dvc files for each tracked directory and file. The files in your workspace are replaced with reflinks to the cache where the file system supports them; otherwise DVC falls back to copies by default.

Inside the cache, DVC uses its own structure based on file hashes. This lets it avoid file duplication.


External data

Both import and get download the data. import also tracks it with DVC.

Download from DVC project
    dvc import <url> <path>
    dvc get <url> <path>

Download from URL (e.g. S3)
    dvc import-url <url> <out>
    dvc get-url <url> <out>


Pipelines

Pipelines are defined in dvc.yaml and parameters in params.yaml.

Create a new pipeline
    dvc stage add <...>
Or edit dvc.yaml.

Add a stage
    dvc stage add -n <name> -d <dependency> -o <output> -p <parameter> <command to execute>
Or edit dvc.yaml.

View pipeline DAG
    dvc dag

Reproduce pipeline
    dvc repro
Use -f to rerun the entire pipeline, even for stages that have not changed.


Experiments

Run a new experiment
    dvc exp run -S '<param>=<value>'
Use --queue to add the experiment to the queue instead of running it.

Run experiment queue
    dvc queue start

Show experiment table
    dvc exp show

Apply experiment to workspace
    dvc exp apply <exp>

Create branch from experiment
    dvc exp branch <exp> <branch>

Push to and pull from remotes
    dvc exp push <git remote> <exp>
    dvc exp pull <git remote> <exp>

Remove experiment
    dvc exp remove <exp>

Show and compare metrics or plots
    dvc metrics show
    dvc metrics diff <exp1> <exp2>
    dvc plots show
    dvc plots diff <exp1> <exp2>
Use --open to open the plots in the browser.

⭐ Also try the DVC extension for Visual Studio Code for easier experiment management!
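
The dvc stage add template from the Pipelines section can be hard to read with every placeholder still in place. Below is a minimal sketch with the placeholders filled in. The stage name (train), the files (train.py, data/train.csv, model.pkl) and the parameter (train.lr) are hypothetical and only serve as illustration. By default the script just prints the commands it would run; set DVC=dvc to run them for real inside an initialized DVC project.

```shell
#!/bin/sh
# Sketch: a filled-in `dvc stage add` call followed by `dvc repro`.
# All file, stage and parameter names are hypothetical.
# Dry run by default: commands are printed, not executed.
# Use `DVC=dvc sh this_script.sh` to execute them.
DVC="${DVC:-echo dvc}"

# Stage "train": depends on train.py and data/train.csv, reads the
# train.lr parameter from params.yaml, and produces model.pkl.
add_train_stage() {
    $DVC stage add -n train \
        -d train.py -d data/train.csv \
        -p train.lr \
        -o model.pkl \
        python train.py
}

# Reproduce the pipeline; stages whose inputs are unchanged are skipped
# unless -f is given.
reproduce() {
    $DVC repro
}

add_train_stage
reproduce
```

Running the stage command writes the stage into dvc.yaml, which is why the cheatsheet offers editing dvc.yaml by hand as an equivalent route.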

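The File structure note says the cache is organized by file hashes and that this avoids duplication. The toy script below illustrates that idea only; it is not DVC's implementation, and the cache_add helper and directory layout are assumptions made for the example. It assumes md5sum is available (on macOS the equivalent tool is md5).

```shell
#!/bin/sh
# Toy content-addressed cache, for illustration only (not DVC's code).
set -eu

workdir=$(mktemp -d)
cache="$workdir/cache"

# Store a file under cache/<first 2 digest chars>/<remaining chars>
# and print the path of the cached object.
cache_add() {
    digest=$(md5sum "$1" | cut -d' ' -f1)
    prefix=$(printf '%s' "$digest" | cut -c1-2)
    rest=$(printf '%s' "$digest" | cut -c3-)
    mkdir -p "$cache/$prefix"
    # Identical content maps to the same path, so it is stored only once.
    [ -e "$cache/$prefix/$rest" ] || cp "$1" "$cache/$prefix/$rest"
    printf '%s\n' "$cache/$prefix/$rest"
}

printf 'same content\n' > "$workdir/a.txt"
printf 'same content\n' > "$workdir/b.txt"
printf 'other content\n' > "$workdir/c.txt"

path_a=$(cache_add "$workdir/a.txt")
path_b=$(cache_add "$workdir/b.txt")
path_c=$(cache_add "$workdir/c.txt")

object_count=$(find "$cache" -type f | wc -l | tr -d ' ')
echo "files added: 3, objects stored: $object_count"
```

Because a.txt and b.txt have the same content, they resolve to the same cache path, so three added files result in two stored objects.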