Grant's Photo File Naming Scheme
Introduction
One of my main goals for this file naming scheme is that regular people who don't have
access to my database, or don't know how to use it, can at least glean enough
information from the file name to find any images that are important to them. I know that
much of this information will be redundant with what can be gleaned from the folder
structure, the database, and the file's metadata. I am trying to cover all of my bases
here rather than put all my eggs in one basket.
The system is designed with the Joliet CD file system in mind. This means that the file
name itself is limited to 64 characters and the total path + file name can't be longer than
255 characters when it is burned to CD. (It can be longer on the hard disk as long as you
plan to reduce the number of levels of subfolders when you actually burn it to the CD.)
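As a rough sketch of how these two limits could be checked before burning, here is a small Python function. The function name, the constant names, and the sample path are made up for illustration; the numbers are just the 64 and 255 character limits quoted above.

```python
from pathlib import PurePosixPath

MAX_NAME_CHARS = 64   # file-name limit quoted in the text
MAX_PATH_CHARS = 255  # path + file-name limit quoted in the text


def joliet_problems(disc_path):
    """Return a list of complaints about a path as it would appear on the CD."""
    problems = []
    name = PurePosixPath(disc_path).name  # last component, i.e. the file name
    if len(name) > MAX_NAME_CHARS:
        problems.append(f"file name is {len(name)} characters (limit {MAX_NAME_CHARS})")
    if len(disc_path) > MAX_PATH_CHARS:
        problems.append(f"full path is {len(disc_path)} characters (limit {MAX_PATH_CHARS})")
    return problems


# Hypothetical path as it would be laid out on the disc.
print(joliet_problems("Photos/1977/1977-06-~~_CM000000123_Picnic at lake_00_!Origin.jpg"))
```

An empty list means the path fits within both limits. Anything else is a list of reasons to shorten the name or flatten the folders first.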
Another requirement that must be met is that the serial number must be unique in and of
itself. This is because for many scanned images the date will not be known at the time of
the scan. If you start the sequence of serial numbers over for each day then there are
bound to be conflicts when you finally do assign a date to some scans.
• YYYY-MM-DD_CM#########_Description / People_HhHh_UseCode.extxx
  1234567890123456789012345678901234567890123456789012345678901234
           1         2         3         4         5         6
○ YYYY-MM-DD_ = Date
This is the date the picture was actually taken. The way I see it, the date I scanned the
image is irrelevant to some other person coming along later.
11 characters including the '_'.
If I don't know the full date then the unknown characters will be filled in with the tilde
(~) character. This is the little squiggly line that is on the top of the key to the left of
the '1' (one) key (at least on American keyboards).
Some people have suggested that they use the numeral zero when they don't know
part of the date. This can cause confusion when you know the photo was taken
some time in the 70s, say, but not the exact year. Using the zero system you would
enter the year as 1970 which would make most people think it was actually taken in
1970 rather than maybe 1973 or 1977. I chose the tilde for several reasons. Some
computers display a series of dashes as one continuous line so it is hard to count
how many dashes there are. Since the tilde makes a curve you can tell exactly how
many of them there are. Also, it is very close to the symbol used in mathematics that
means an approximation. In fact that symbol is just two tildes stacked on top of one
another.
Some may argue that a tilde is not a recommended character to use in file names.
But it is not forbidden in any current operating system. So I don't need to hear your
arguments against it. If you don't like it, use something else.
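To show that the tilde convention stays machine-readable, here is a minimal Python sketch that turns a tilde-padded date into the earliest and latest dates it could stand for. The function name is hypothetical. It also assumes the month and day are each either fully known or fully tilded, while the year may mix digits and tildes (as in 197~).

```python
import calendar
import re
from datetime import date

# Assumption: month and day are all digits or all tildes; the year may mix them.
DATE_RE = re.compile(r"^([0-9~]{4})-([0-9]{2}|~~)-([0-9]{2}|~~)$")


def date_range(stamp):
    """Return (earliest, latest) dates consistent with a tilde-padded stamp."""
    m = DATE_RE.match(stamp)
    if m is None:
        raise ValueError(f"not a YYYY-MM-DD stamp: {stamp!r}")
    year, month, day = m.groups()
    first_year = max(int(year.replace("~", "0")), 1)  # datetime has no year 0
    last_year = int(year.replace("~", "9"))
    first_month, last_month = (1, 12) if month == "~~" else (int(month),) * 2
    if day == "~~":
        first_day = 1
        last_day = calendar.monthrange(last_year, last_month)[1]  # days in that month
    else:
        first_day = last_day = int(day)
    return date(first_year, first_month, first_day), date(last_year, last_month, last_day)


print(date_range("197~-~~-~~"))  # some time in the 70s
print(date_range("1984-02-~~"))  # February 1984, day unknown
```

With the zero convention the first example would be indistinguishable from a photo actually taken in 1970. With tildes the range comes out as the whole decade.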
I thought about not including the day since most people don't need to know the exact
day a picture was taken but without it people might get confused. Most people can
spot a date with the full year, month, and day quite easily. I did not include the hour,
I have been reading many posts about image file organization. As a result, I started thinking
about how I would keep track of all the image files I will likely be generating. These files will
be coming directly from my digital camera, from my scanner, and as a result of modifications
made on the way to an ideal final image. It is possible for these modifications to follow
multiple possible paths as I experiment with what is an appropriate final adjustment. In
addition, many images may be the starting point for many different final images, such as
when I decide to crop down to a single subject within the image and prep that for printing or
display. There could be an entire hierarchy of images based on other images as I step through
my work flow as well as multiple branches coming off from various points as I think of new
things to do with existing images that are themselves results of various stages of modification.
I quickly realized it would be imperative for me to use a concise, consistent naming system to
keep track of all the different variations. I have searched through old messages in almost a
dozen newsgroups, Googled till I was blue in the face, and even asked graphic artists I know. It
seems that no one has such a system.
Almost everyone just seems to guess their way through it as they go along, using incredibly
long file names, and trying to describe all their modifications within the name itself.
However, this leaves them with a directory full of randomly named files and no easy way to
sort them by hierarchy. Some people use deeply nested subfolders with descriptively named
directories. But this leaves them with a dozen or so identically named files and an incredibly
long path name. This then causes them trouble when they go to archive these off onto CD
because there are limits on the sub-folder depth and path name length.
So, I decided to put my incredible skills in organizational system design (yes, that is sarcasm)
to the test and see if I could come up with something. Two days, one white board, one Tablet
PC, and six mochas later, I think I have a solution. I have spent the last few days just trying to
figure out how to describe it clearly.
Important Note:
I completely understand that many programs allow for lossless processing of images,
including many different branches of edits. However, not everyone has these
programs, and not everything you may want to do will be possible within a single
program with these features. Therefore sometimes you may still need to save separate
copies of files. This system is to allow you to keep better track of those files.
This scheme is not meant to replace the entire file name. The characters comprising this
encoded data are meant to be inserted somewhere near the end of the file name. The main
part of the file name would still use whatever system you use to serialize your original images.
I will call that part of the name your 'BaseName'. The BaseName will be the same for all images
that are the result of modifications to the same original image file.
The System:
The structure of the file name will be as follows...
BaseName_HhHhHh_UseCode.ext
Where...
• HhHhHh_ = the hierarchy code. This may be as long as necessary.
• UseCode = the use code (or a special modification of a final image for a particular
use such as simple resizing for e-mail or web use). This portion will only be
inserted into the file name for these special purpose files and only if you choose
to save them at all. You can use any codes you like and as many characters as you
like.
• .ext = the regular file extension. It's OK if it is longer than 3 characters.
The original file will be named 'BaseName_00_!Origin.ext'. The 00 and the exclamation point
cause it to sort to the top of the list of all the files based upon it. The 'Origin' just reminds you
or any casual observer that this is the original image file.
Naturally, this system won't work for 8.3 character DOS based file names. All you can squeeze
into that small space is a straight sequence of serial numbers anyway.
The hierarchy code consists of pairs of characters. The first character in a pair is a number and
the second character is a letter. For instance '1A' or '3H'. The number indicates the branch or
path followed from the original or parent image and the letter indicates how many sequential
steps along in the work flow it is. Each time you branch off from an existing sequence you add
an additional pair of characters. You will see how you can do a heck of a lot of variations and
still end up with hierarchy codes that are only 6 characters long.
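The pair structure is easy to take apart mechanically. Here is a minimal Python sketch, with a made-up function name, that splits a hierarchy code into its pairs and reports each one as a branch number and a step number (A = 1, B = 2, and so on). It assumes digits 1 through 9 for the branch and upper-case letters for the step, and treats 00 as the original.

```python
import re

# Assumption: branch is a digit 1-9, step is an upper-case letter.
PAIR_RE = re.compile(r"([1-9])([A-Z])")


def decode(code):
    """Turn a hierarchy code into a list of (branch number, step number) pairs."""
    if code == "00":
        return []  # the original file has no edits behind it
    pairs = PAIR_RE.findall(code)
    if not pairs or "".join(d + s for d, s in pairs) != code:
        raise ValueError(f"not a hierarchy code: {code!r}")
    return [(int(d), ord(s) - ord("A") + 1) for d, s in pairs]


print(decode("3H"))      # third branch, eighth step
print(decode("2B1B1A"))  # three pairs, so two branch points after the first sequence
```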
For instance, let's say you are starting out with your original image file
(BaseName_00_!Origin.jpg) and you normally do several different things to it to start getting it
ready for use. You convert it to a .TIF file (BaseName_1A_Lossles.tif), then you despeckle it
(BaseName_1B_Despekl.tif), and finally you adjust the color (BaseName_1C_Color.tif). You can also
choose to omit these intermediate use codes, leaving file names such as "BaseName_1A_.tif". Just
be sure to leave in the trailing underscore, so automated matching routines and regular
expressions can match filenames more easily.
When you are satisfied, you create (BaseName_1C_!Final.tif). Later, once you are
finished with everything, you may choose to delete all the intermediate files, and maybe even
the original. That's up to you.
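Here is one way such a regular expression might look, as a Python sketch rather than the definitive pattern. The group names and the function name are made up. It also assumes that use codes contain no underscore or period, and that the hierarchy code is either 00 or a run of digit-letter pairs.

```python
import re

# base _ hierarchy _ usecode . ext
# Assumptions: use codes contain no "_" or "."; branch digits are 1-9, steps A-Z.
NAME_RE = re.compile(
    r"^(?P<base>.+)_(?P<hier>00|(?:[1-9][A-Z])+)_(?P<use>[^._]*)\.(?P<ext>[^.]+)$"
)


def parse_name(file_name):
    """Split a file name into base, hier, use and ext, or return None if it doesn't fit."""
    m = NAME_RE.match(file_name)
    return m.groupdict() if m else None


print(parse_name("BaseName_00_!Origin.jpg"))
print(parse_name("BaseName_1A_.tif"))   # empty use code, trailing underscore kept
print(parse_name("BaseName_1A.tif"))    # trailing underscore dropped: no match
```

The last call shows why the trailing underscore matters. Without it, the same pattern can no longer tell where the hierarchy code ends.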
Now this may not be your normal work flow and some of these steps could even be done just
before using the image without even keeping the resulting image permanently. That is not the
point.
Why use two characters instead of one? You will see in just a second.
Let's say that later you are either unsatisfied with the results of the first sequence of
operations or just want to take a different tack. So you decide to go back to the original file and
make a different sequence of adjustments from there. You do your modifications and name the
files with hierarchy codes 2A, 2B, 2C and so on. This would be a different branch of the
hierarchy. You could follow your muse along yet another path and create 3A and 3B as well.
But tomorrow you have second thoughts. You're thinking that the 2B file might have been
better than the 1B file so you open that one and start modifying from there. You save files
2B1A, 2B1B, and 2B1C. Upon reflection, that 2B1C file just isn't quite right so you go back to
2B1B, do it just a little differently and save 2B1B1A then finally 2B1B1B.
                          00
                          |
         +----------------+-------------------------+
         |                |                         |
         1A               2A                        3A
         |                |                         |
1B2A --- 1B --- 1B1A      2B --- 2B1A               3B
|        |      |         |      |
1B2B     1C     1B1B      2C     2B1B --- 2B1B1A
|               |         |      |        |
1B2C            1B1C      2D     2B1C     2B1B1B
Naturally, you must be viewing the above using Courier font for it to look right. I threw in yet
another branch on the left of 1B to further illustrate the idea of multiple possible branches
from each point in the tree. Of course it is hard to illustrate more than two branches for each
point in a text file but you can see that you could have as many branches from any one point as
there are different codes for the first character.
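The naming rule itself can be sketched in a few lines of Python. This is only my reading of the rule, and the function name is made up: continue the sequence with the next letter if that name is still free, otherwise start a new branch by appending the lowest unused digit followed by 'A'. It assumes branch digits 1 through 9 only.

```python
ORIGIN = "00"
BRANCH_DIGITS = "123456789"  # assumption: no letters used as branch characters


def next_code(parent, existing):
    """Suggest a hierarchy code for a new file saved from `parent`."""
    if parent != ORIGIN and parent[-1] != "Z":
        step_on = parent[:-1] + chr(ord(parent[-1]) + 1)
        if step_on not in existing:
            return step_on  # next sequential step on the same path
    prefix = "" if parent == ORIGIN else parent
    for digit in BRANCH_DIGITS:
        branch = f"{prefix}{digit}A"
        if branch not in existing:
            return branch  # first free branch off the parent
    raise ValueError(f"no free branch left under {parent!r}")


# Replay the editing session described in the text.
codes = set()
for parent in ["00", "1A", "1B", "00", "2A", "2B", "00", "3A",
               "2B", "2B1A", "2B1B", "2B1B", "2B1B1A"]:
    new = next_code(parent, codes)
    codes.add(new)
    print(parent, "->", new)
```

Going back to 2B after 2C already exists produces 2B1A. Going back to 2B1B after 2B1C exists produces 2B1B1A, which matches the walkthrough.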
If you really get crazy you can even go past 9 and use letters for the first character as well but
then it would be harder to read. If you do this you might want to use upper case for the first
character and lower case for the second character in each pair.
BaseName_00_!Origin.jpg
BaseName_1A_.tif
BaseName_1B_.tif
BaseName_1B1A_.tif
BaseName_1B1B_.tif
BaseName_1B1C_.tif
BaseName_1B2A_.tif
BaseName_1B2B_.tif
BaseName_1B2C_.tif
BaseName_1B2D_.tif
BaseName_2A_.tif
BaseName_2B_.tif
BaseName_2B1A_.tif
BaseName_2B1B_.tif
BaseName_2B1B1A_.tif
BaseName_2B1B1B_!Final.tif
BaseName_2B1B1B_EMail.JPG
BaseName_2B1B1B_P9x13.TIF
BaseName_2B1B1B_Screen.JPG
BaseName_2B1B1B_Web-S.GIF
BaseName_2B1C_.tif
BaseName_2C_.tif
BaseName_2D_.tif
BaseName_3A_.tif
BaseName_3B_.tif
Notice how you can actually see the branching structure just by
looking at the file names.
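Because the structure is all in the names, a script can rebuild it too. Here is a minimal Python sketch (3.8 or later) that pulls the hierarchy code out of each file name and prints an indented outline of which file was derived from which. The function names are made up, and the parent rule is my reading of the scheme: the previous letter in the same sequence, or the code minus its last pair when the letter is 'A'.

```python
import re
from collections import defaultdict

# Assumption: the first _code_ found in the name is the hierarchy code.
HIER_RE = re.compile(r"_(00|(?:[1-9][A-Z])+)_")


def parent_of(code):
    """Return the code this one was derived from, or None for the original."""
    if code == "00":
        return None
    head, branch, step = code[:-2], code[-2], code[-1]
    if step > "A":
        return head + branch + chr(ord(step) - 1)  # previous step, same path
    return head or "00"                            # first step of a new branch


def outline(file_names):
    """Return indented lines, one per hierarchy code, nested under its parent."""
    codes = sorted({m.group(1) for n in file_names if (m := HIER_RE.search(n))})
    children = defaultdict(list)
    for code in codes:
        children[parent_of(code)].append(code)
    lines = []

    def walk(code, depth):
        lines.append("  " * depth + code)
        for child in children[code]:
            walk(child, depth + 1)

    walk("00", 0)
    return lines


names = ["BaseName_00_!Origin.jpg", "BaseName_1A_.tif", "BaseName_1B_.tif",
         "BaseName_1B1A_.tif", "BaseName_2A_.tif", "BaseName_2B_.tif",
         "BaseName_2B1A_.tif"]
print("\n".join(outline(names)))
```

Note that a file whose parent has been deleted will not show up in this outline, since the sketch only follows links that still exist.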
After you have stewed on all these edits a while you may decide you
don't need to keep all these huge image files around forever. You can just
delete all of the dead end branches if you want to. You could even
delete all the steps up to your final image. If you
didn't want to be reminded that you had been so darn wishy-washy you
could even rename the file restarting the hierarchy code at 1A or just
replacing it leaving a file name like BaseName_00_!Final.tif with the
additional special purpose formats renamed similarly. (Note: You still keep the _00
part to make it easier to use the same regular expressions for all of your file names.)
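If you want help deciding what counts as a dead end, here is a minimal Python sketch. The function names are made up, and it only lists candidates; it deletes nothing. It keeps the original, the final image, and every step that led to it, and reports everything else.

```python
def lineage(code):
    """List every code on the path from the first edit up to `code`, inclusive."""
    if code == "00":
        return []
    path, prefix = [], ""
    for i in range(0, len(code), 2):
        branch, last_step = code[i], code[i + 1]
        # Every step from A up to the letter actually reached on this path.
        for step in range(ord("A"), ord(last_step) + 1):
            path.append(prefix + branch + chr(step))
        prefix += branch + last_step
    return path


def dead_ends(final_code, all_codes):
    """Return the codes that are not the original and not on the way to the final."""
    keep = set(lineage(final_code)) | {"00"}
    return sorted(c for c in all_codes if c not in keep)


all_codes = ["00", "1A", "1B", "1C", "2A", "2B", "2B1A", "2B1B",
             "2B1B1A", "2B1B1B", "2B1C", "2C", "3A", "3B"]
print(lineage("2B1B1B"))
print(dead_ends("2B1B1B", all_codes))
```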
I posted this in another forum and all I got was abuse. People have a tendency to
make fun of me for making up complicated systems when they think they aren't
needed. They start giving me lectures about the KISS principle and Occam's Razor.
Well, I can also quote Einstein who said that something should be as simple as
possible but no simpler.
Please remember, I am a 60-year-old former computer consultant who has studied
database normalization, quantum physics, complexity theory, and chaos theory, as
well as invented a new form of geometry and laser. I don't need the lectures. I have
done my share of simplifying other systems that people thought were simple because
it didn't take very many words to describe them but, in reality, had grown to a huge,
unmanageable mess because their few words did not cover the realities of life. (Kind
of like that sentence.) Ask any software developer and they will tell you, in order to
make a program simple to use it must be more complicated under the surface.
There are going to be people replying to this message saying they just use a simple
system of typing out every modification they have made in the file name. It sounds
good but it quickly gets to where you can't tell which file is a modification of which
previous file. Others will tell me that they just keep track of everything in one
Photoshop .PSD file. Well, that's great if you can afford Photoshop. Even if I could,
there is a huge learning curve to using all the layers to control the modifications
rather than just save multiple files. Besides, not everything can be done in Photoshop.
Other people will tell me to just create different folders or categories for the different
stages of my work. It would be great if I knew in advance that I would always make
the exact same series of edits to all of my images in lock-step fashion. But it also
sounds pretty boring. I needed a system that would allow me to keep track of the
order in which files were modified and which files were based on which other ones
no matter where my creative thoughts took me. And I have found one.
I am not trying to tell anyone that they have to use my system just by making this
post so I would appreciate it if people didn't try to tell me I was doing something
intrinsically wrong by using it. Besides, I worked really hard on it so be nice.
People often think that just driving down the straightest road is the simplest way to
get to Rome. What they forget is they are only able to do that because someone else
took a century's worth of technology, used it to manufacture thousands of incredibly
complicated parts, that all work together in precision harmony, burning gas that took
millions of years to create and another complicated refinery to produce, driving on a
road that took dozens of complicated machines which are even larger and have more
parts than the car itself, just to be able to drive down the road. So don't talk to me
about keeping things simple.
That is what I am designing my file name system for: those people who may
come along later and won't have access to my database and won't know how to build
one from my metadata. The data will be there so a more astute individual will be able
to use it. But I want anyone who can boot up a computer to at least be able to glean a
little bit of information from the folder structure and file names alone.
For important family pictures I will probably also put text files that have even more
information right in with the picture files. I may eventually write a script that will do
it automatically. Then I will have the best of both worlds and have all my bases
covered too.
Copyright:
This entire document is copyright 2020 by Grant S. Robertson. It may not be
reprinted, posted to a web site, stored in any electronic archive other than this
forum, taught in any class or seminar, nor may it be incorporated into any software
product, without my prior written permission. Individuals are hereby granted the
right to use this system for their personal filing needs and to tell their friends. In
other words, I am happy to share this idea with fellow photographers but if you are
going to be making money off of it then I get a cut. As people who generate
intellectual property every time they press the shutter release, I imagine you will
understand.