
File Name Scheme for Digital Images

Wednesday, June 10, 2020


12:41 PM

Introduction
One of my main goals for this file naming scheme is to let regular people, who don't have
access to my database or don't know how to use it, glean enough information from the file
name to find any images that are important to them. I know that much of this information
will be redundant with what can be gleaned from the folder structure and with what will be
in both the database and the file's metadata. I am trying to cover all of my bases here
rather than put all my eggs in one basket.

The system is designed with the Joliet CD file system in mind. This means that the file
name itself is limited to 64 characters and the total path + file name can't be longer than
255 characters when it is burned to CD. (It can be longer on the hard disk as long as you
plan to reduce the number of levels of subfolders when you actually burn it to the CD.)
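
The two limits above are easy to check before burning a disc. Here is a minimal sketch in Python, assuming paths are written relative to the root of the CD with forward slashes. The function name and the constants are my own for illustration; the numbers are simply the 64 and 255 quoted above.

```python
from pathlib import PurePosixPath

MAX_NAME = 64   # file-name limit quoted above
MAX_PATH = 255  # path + file name limit quoted above


def joliet_problems(path: str) -> list[str]:
    """Return a list of limit violations for one path relative to the CD root."""
    problems = []
    name = PurePosixPath(path).name  # last component, i.e. the file name itself
    if len(name) > MAX_NAME:
        problems.append(f"file name is {len(name)} characters (limit {MAX_NAME})")
    if len(path) > MAX_PATH:
        problems.append(f"full path is {len(path)} characters (limit {MAX_PATH})")
    return problems


if __name__ == "__main__":
    for candidate in ["Photos/2020/" + "a" * 61 + ".jpg", "Photos/2020/short.jpg"]:
        print(candidate, joliet_problems(candidate) or "OK")
```

An empty list means the path fits within both limits.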

Another requirement that must be met is that the serial number must be unique in and of
itself. This is because for many scanned images the date will not be known at the time of
the scan. If you start the sequence of serial numbers over for each day then there are
bound to be conflicts when you finally do assign a date to some scans.

Here is the basic structure of the system:

• YYYY-MM-DD_CM#########_Description / People_HhHh_UseCode.extxx
  1234567890123456789012345678901234567890123456789012345678901234
           1         2         3         4         5         6
○ YYYY-MM-DD_ = Date
▪ This is the date the picture was actually taken. The way I see it, the date I scanned the
image is irrelevant to some other person coming along later.
▪ 11 characters including the '_'.
▪ If I don't know the full date then the unknown characters will be filled in with the tilde
(~) character. This is the little squiggly line at the top of the key to the left of the
'1' (one) key (at least on American keyboards).
Some people have suggested using the numeral zero when they don't know part of the
date. This can cause confusion when you know the photo was taken some time in the
'70s, say, but not the exact year. Using the zero system you would enter the year as
1970, which would make most people think it was actually taken in 1970 rather than
maybe 1973 or 1977. I chose the tilde for several reasons. Some computers display a
series of dashes as one continuous line so it is hard to count how many dashes there
are. Since the tilde makes a curve you can tell exactly how many of them there are.
Also, it is very close to the symbol used in mathematics that means an approximation.
In fact that symbol is just two tildes stacked on top of one another.
Some may argue that a tilde is not a recommended character to use in file names. But
it is not forbidden in any current operating system. So I don't need to hear your
arguments against it. If you don't like it, use something else.
I thought about not including the day since most people don't need to know the exact
day a picture was taken but without it people might get confused. Most people can spot
a date with the full year, month, and day quite easily. I did not include the hour,
minute, and second because it would make the file name a lot longer without conveying
much additional information that a regular person would care about. Nor can I really
rely on the time to give me unique file names as other people have suggested. Within
just a couple of years I fully expect to get a Digital SLR capable of taking 8 or more
pictures per second. Yes, I could also put the hundredths of a second in the file name
but that would really make the file name long (a minimum of 8 characters longer). And it
still wouldn't solve the problem of having unique numbers for scanned, older images
without making up a unique time to assign to each image that was taken on the same day.
That would get really tedious really fast.
○ CM#########_ = Serial Number of the image.
□ 12 characters including the '_', for a total of 23 so far.
□ The serial number has the following format:
▪ C = a one character code representing the Class of the image.
Think of these as a series of tests, rather than distinct categories. The first 'test' to
match is the one that is used.
□ M = the image was created by Me.
□ F = the image was not created by me and is a picture of part of my Family.
□ A = the image was not created by me and is a picture of (or by) one of my friends or
Associates.
□ O = the image is something Other than any of the above.
Other letters could optionally be used here to indicate other major classes such
as: W = Work, P = Personal, W = Wedding. Do not use these letters in place of
keywords; use them only to group major categories of images. Because I do not do
professional "Wedding Photography" I would not use that Class code even if I did
happen to take some pictures at a wedding. I would just use keywords for that.
▪ M = a one character code representing the Media involved
□ D = Digital camera
□ I have decided that a regular person would not care which digital camera was
used to take the picture so it is not in the filename.
There is no special letter for panoramas, unless they are composites of
multiple other images (see below).
□ O = 360° image.
□ This is the capital letter 'O'. It was chosen to represent viewing all the way
around in a circle.
□ This will be especially important because all 360° images either have multiple
components or have multiple steps in the processing, to get from what the
camera outputs to what viewing software can display. The stage of said
process is encoded in the UseCode part of the file name. This is so all of the
first part of the file name (up to but not including the Hierarchy Code) can be
used to match up all the stages (or versions) of the same file.
□ Q = 360° Video.
□ This is similar to a 360° image, but with something extra. (OK, so I had to get
creative with the letter codes.)
□ S = Screenshot
□ N = scanned Negative
□ It is safe to assume that a regular person will really want to know if a physical
copy of the image is available and whether that physical copy is a slide,
negative, or print.
□ T = scanned Transparency
□ Since slide pages won't fit in the cases I use for negative pages they have to be
stored in a different place.
□ I decided to use 'S' for Screenshots rather than for slides, since there will be far
more of those.
□ P = scanned Print (when no negative or slide is available)
□ C = Composite image (created from one or more of the other images)
□ The notes for the image in the database will indicate which other images were
used to create this image.
□ V = Video
□ This includes screen-capture videos (because they are truly just videos), unless
they contain additional data like mouse movements, in which case use a…
□ W = Screen-Capture Video that includes additional information, such as from
Camtasia.
A 'W' is a 'V' with something extra.
□ A = Audio
□ G = Graphics
□ Including animations.
 ######## = 9 digit serial number.
□ This number will be unique among all of my entire system. This way I will be able
search on that one short string and find all images based on that one original image.
□ I know nine digits is a lot. You will see why I used so many in a minute.
▪ From Digital Camera:
□ Simple serial number.
▪ Slides and Negatives taken by me:
□ Since the slides and negatives that I already have are marked with the roll and
negative number I have decided to stick with that system. For these images the
serial number will take a special form of 'r#####n##' or 'r#####s##'.
▪ Where:
◊ r = roll
◊ n = negative
◊ s = slide
Note: These are lower case letters for a reason. This allows them to
stand out from other letters that are codes for specific things.
This gives 4 digits for the roll number which will cover 9,999 rolls. More
than I expect to take in my lifetime. 2 digits are required for the negative
number within the roll. I do not include the little 'a' that is sometimes on
the edge of the film. I thought about only using 3 digits for the roll number
but it is pretty likely that I will take more than 999 rolls in my
whole life, especially if I have kids. My girlfriend suggested that I didn't
need the little 'r' and 'n' in there but I want it to be easy for regular people
to figure out.
Yes, I know that this is a little dated. I actually created this system
when I still shot analog, but was planning for the future.
▪ All others (including slides and negatives taken by others)
□ Will simply have a serial number assigned to them.
▪ There are 9 digits available because I wanted to keep all the serial numbers the
same length. This gives 999,999,999 + (9,999 x 2 x 99) different images, far more
than I can possibly imagine using.
○ Description_: (Optional)
▪ Generally a maximum of 20 characters, for a total of 44 characters.
▪ The Description is for things that would be important for a regular person to know,
like the name of the person in the picture. I won't be putting any description at all (in
the file name) for most of the pictures I take. That will be taken care of by the
categories and description fields in the database. But, for those pictures where some
family member would really want to know who is in the picture, I will put that in the
filename. If it's a group picture just put the name of the group like "Leirer Sisters" or
"Ruskin High Science Club".
▪ This section gets whatever is left over. That would generally be about 20-25
characters. I have limited this to 20 characters in my automatic naming settings.
However, if the following Hierarchy Code is shorter, you could add a couple of
characters to this part. As you can see it is still not enough for a full description but it is
better than nothing.
○ HhHh_ = Hierarchy Code
▪ The hierarchy code (optional) is discussed separately, next. Normally it will not be
more than 4 to 8 characters while you are working on the various modifications of the
image. Once you are finished you will likely be able to reduce those down to just 2
characters. You might have different crops or use various filters or blur the background
on one version and want to keep all the variations. You could then just reduce the
Hierarchy Codes to something like 1D, 2C, and 3B for those files rather than keep all of
the extra letters and all of the unneeded intermediate versions of the image.
▪ The original file gets a hierarchy code of 00 so it will sort at the top properly.
▪ This section will likely only take up 3 characters once you have finished editing and
shortened this section, for a total of 47 so far. However, during processing and before
archiving, these codes may have as many hierarchy levels as necessary.
○ UseCode
□ The Use Code is a short code indicating the file's intended use.
□ 4 to 7 characters for a grand total of 54 so far.
□ Here are a few examples. Of course you can make up your own codes or system.
□ !Origin = Original file.
□ !Final = The final version of the full original file after all color and exposure
corrections have been done.
□ P4x6 (A resolution suitable for printing as a 4" x 6" print.)
□ P8x10 (A resolution suitable for printing as a 8" x 10" print.)
□ P13x19 (A resolution suitable for printing as a 13" x 19" print.)
□ Screen (Suitable for viewing on a standard computer screen.)
▪ This also covers the version that some programs display on screen instead of
processing the full version of the file.
□ Web-S
□ Web-M
□ Web-L
○ File Extension
▪ The file extension is almost always going to be 3 characters after the period, but up to
five characters are allowed after the period, for a grand total of up to 60 characters.
• Conclusion
○ In previous posts I have discussed using both letters and numbers for serial numbers to
reduce the number of characters required for a given number of different combinations.
However, I fear that this may confuse future users. Plus, I could not find a single
automated renaming system that would allow anything other than incremental digits for a
serial number, so I gave up on that idea.
○ Unfortunately, this system has several problems. I had wanted to keep it short (the
filename, not the post) but I'm using up almost the whole darn 64 characters! Even if I
don't enter a description I am up to 40 characters for many file names.
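
To show how the pieces fit together, here is a rough Python sketch that pulls a file name apart into the fields described above. It is only an illustration under some assumptions of mine: the serial number is always 9 characters, the hierarchy code is always present, the description contains no underscores, and the function and variable names are made up for this example. The letter meanings are copied from the lists above.

```python
import re

CLASS_CODES = {"M": "created by me", "F": "family",
               "A": "friends or associates", "O": "other"}
MEDIA_CODES = {"D": "digital camera", "O": "360 image", "Q": "360 video",
               "S": "screenshot", "N": "scanned negative",
               "T": "scanned transparency", "P": "scanned print",
               "C": "composite", "V": "video",
               "W": "screen-capture video with extra data",
               "A": "audio", "G": "graphics"}

# Assumed layout: date_CMserial_[description_]hierarchy_usecode.ext
NAME_RE = re.compile(
    r"^(?P<date>[0-9~]{4}-[0-9~]{2}-[0-9~]{2})_"
    r"(?P<cls>[A-Z])(?P<media>[A-Z])(?P<serial>[0-9a-z]{9})_"
    r"(?:(?P<description>[^_]*)_)?"
    r"(?P<hierarchy>00|(?:[0-9][A-Z])+)_"
    r"(?P<use>[^_.]*)"
    r"\.(?P<ext>[^.]{1,5})$"
)


def parse_name(filename: str) -> dict:
    """Split a file name into its fields and decode the one-letter codes."""
    match = NAME_RE.match(filename)
    if match is None:
        raise ValueError(f"not in the expected layout: {filename!r}")
    fields = match.groupdict()
    fields["class_meaning"] = CLASS_CODES.get(fields["cls"], "unknown")
    fields["media_meaning"] = MEDIA_CODES.get(fields["media"], "unknown")
    # A tilde stands for an unknown digit, so the year becomes a range.
    year = fields["date"][:4]
    fields["year_range"] = (int(year.replace("~", "0")),
                            int(year.replace("~", "9")))
    return fields


if __name__ == "__main__":
    info = parse_name("197~-~~-~~_FP000000042_Leirer Sisters_00_!Origin.tif")
    for key, value in info.items():
        print(f"{key}: {value}")
```

A date such as 197~-~~-~~ comes back with a year range of 1970 to 1979, which is exactly the "some time in the '70s" meaning the tilde is supposed to carry.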


Grant's Image Hierarchy File Name Encoding Scheme
Wednesday, June 10, 2020
12:41 PM

I have been reading many posts about image file organization. As a result, I started thinking
about how I would keep track of all the image files I will likely be generating. These files will
be coming directly from my digital camera, from my scanner, and as a result of modifications
made on the way to an ideal final image. It is possible for these modifications to follow
multiple possible paths as I experiment with what is an appropriate final adjustment. In
addition, many images may be the starting point for many different final images, such as
when I decide to crop down to a single subject within the image and prep that for printing or
display. There could be an entire hierarchy of images based on other images as I step through
my work flow as well as multiple branches coming off from various points as I think of new
things to do with existing images that are themselves results of various stages of modification.
I quickly realized it would be imperative for me to use a concise, consistent naming system to
keep track of all the different variations. I have searched through old messages in almost a
dozen newsgroups, Googled till I was blue in the face, and even asked graphic artists I know. It
seems that no one has such a system.

Almost everyone just seems to guess their way through it as they go along, using incredibly
long file names, and trying to describe all their modifications within the name itself.
However, this leaves them with a directory full of randomly named files and no easy way to
sort them by hierarchy. Some people use deeply nested subfolders with descriptively named
directories. But this leaves them with a dozen or so identically named files and an incredibly
long path name. This then causes them trouble when they go to archive these off onto CD
because there are limits on the sub-folder depth and path name length.

So, I decided to put my incredible skills in organizational system design (yes, that is sarcasm)
to the test and see if I could come up with something. Two days, one white board, one Tablet
PC, and six mochas later, I think I have a solution. I have spent the last few days just trying to
figure out how to describe it clearly.

The Goal: An encoding scheme that would...


1. Clearly indicate where in the hierarchy a particular image file belonged.
2. Use as few characters as possible.
3. Allow all related images to be stored in the same directory.
4. Sort the files properly when using normal ASCII file name sorting.
5. Be flexible enough that one could pick any image in the hierarchy and branch off
with further modifications without screwing up the coding scheme.

Important Note:
I completely understand that many programs allow for lossless processing of images,
including many different branches of edits. However, not everyone has these programs,
and not everything you may want to do will be possible within a single
program with these features. Therefore sometimes you may still need to save separate
copies of files. This system is to allow you to keep better track of those files.

This scheme is not meant to replace the entire file name. The characters comprising this
encoded data are meant to be inserted somewhere near the end of the file name. The main
part of the file name would still use whatever system you use to serialize your original images.
I will call that part of the name your 'BaseName'. The BaseName will be the same for all images
that are the result of modifications to the same original image file.


Also, because many image editing programs now allow for lossless rotation of .JPG files, that
first step of rotating to the proper orientation will not be considered a modification of the
original file. It will still maintain its status as the 'original'. Just be sure your software really
does true lossless rotation and doesn't delete any metadata in the process.

The System:
The structure of the file name will be as follows...

BaseName_HhHhHh_UseCode.ext

Where...
• HhHhHh_ = the hierarchy code. This may be as long as necessary.
• UseCode = the use code (or a special modification of a final image for a particular
use such as simple resizing for e-mail or web use). This portion will only be
inserted into the file name for these special purpose files and only if you choose
to save them at all. You can use any codes you like and as many characters as you
like.
• .ext = the regular file extension. It's OK if it is longer than 3 characters.

The original file will be named 'BaseName_00_!Origin.ext'. The 00 and the exclamation point
cause it to sort to the top of the list of all the files based upon it. The 'Origin' just reminds you
or any casual observer that this is the original image file.
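
Because the hierarchy code and use code are always the last two underscore-separated pieces before the extension, a file name can be taken apart from the right-hand end. Here is a small Python sketch of that idea. The function names are my own, and it assumes the use code itself never contains an underscore or a period.

```python
def split_name(filename: str) -> tuple[str, str, str, str]:
    """Split 'BaseName_HhHh_UseCode.ext' into (base, hierarchy, use_code, ext)."""
    stem, dot, ext = filename.rpartition(".")
    if not dot:
        raise ValueError(f"no file extension in {filename!r}")
    parts = stem.rsplit("_", 2)  # split from the right: the BaseName may contain '_'
    if len(parts) != 3:
        raise ValueError(f"expected BaseName_Hierarchy_UseCode in {filename!r}")
    base, hierarchy, use_code = parts
    return base, hierarchy, use_code, ext


def build_name(base: str, hierarchy: str = "00", use_code: str = "", ext: str = "tif") -> str:
    """Assemble a file name; an empty use code still leaves its underscore in place."""
    return f"{base}_{hierarchy}_{use_code}.{ext}"


if __name__ == "__main__":
    print(split_name("BaseName_00_!Origin.jpg"))
    print(build_name("BaseName", "1C", "!Final"))
```

Splitting from the right means the BaseName can be as long as you like and can contain its own underscores without confusing the split.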

Naturally, this system won't work for 8.3 character DOS based file names. All you can squeeze
into that small space is a straight sequence of serial numbers anyway.

The Hierarchy Code:


(Not quite as interesting as The Bible Code or The Da Vinci Code but useful
nonetheless.)

The hierarchy code consists of pairs of characters. The first character in a pair is a number and
the second character is a letter. For instance '1A' or '3H'. The number indicates the branch or
path followed from the original or parent image and the letter indicates how many sequential
steps along in the work flow it is. Each time you branch off from an existing sequence you add
an additional pair of characters. You will see how you can do a heck of a lot of variations and
still end up with hierarchy codes that are only 6 characters long.

For instance, let's say you are starting out with your original image file
(BaseName_00_!Origin.jpg) and you normally do several different things to it to start
getting it ready for use. You convert it to a .TIF file (BaseName_1A_Lossles.tif), then
you despeckle it (BaseName_1B_Despekl.tif), and finally you adjust the color
(BaseName_1C_Color.tif). You can also choose to omit these intermediate use codes,
leaving file names such as "BaseName_1A_.tif". Just be sure to leave in the trailing
underscore, so automated matching routines and regular expressions can match filenames
more easily.

When you are satisfied, you create (BaseName_1C_!Final.tif). Later, once you are
finished with everything, you may choose to delete all the intermediate files, and maybe even
the original. That's up to you.

Now this may not be your normal work flow and some of these steps could even be done just
before using the image without even keeping the resulting image permanently. That is not the
point. You are sometimes going to have several sequential steps where you desire to save
image files as you go for whatever reason. These are just examples for purposes of
explanation.

Why use two characters instead of one? You will see in just a second.

Let's say that later you are either unsatisfied with the results of the first sequence of
operations or just want to take a different tack. So you decide to go back to the original
file and make a different sequence of adjustments from there. You do your modifications and
name the files with hierarchy codes 2A, 2B, 2C and so on. This would be a different branch
of the hierarchy. You could follow your muse along yet another path and create 3A and 3B as
well.

So what's up with the multiple pairs?


Imagine that after thinking about things you decide that your first path was the right one but
you just didn't adjust the color correctly for the 1C file. So you decide to redo it but you aren't
ready to commit to deleting the 1C file yet. You open up the 1B file and make your more
enlightened color modifications then save the file as BaseName_1B1A_.tif. This is the first side
branch based on the 1B file and it is the first step along the new sequence. Now we could think
of the 1C file that was originally based on the 1B file as the first branch but it is simpler to
think of it as the same trunk growing straight up (or root growing straight down). You like
how the image is working out so you go on to create 1B1B and 1B1C. See how you can tell
exactly what step along which of the first level branches these images are based on just by
looking at the hierarchy code embedded in their file name?

But tomorrow you have second thoughts. You're thinking that the 2B file might have been
better than the 1B file so you open that one and start modifying from there. You save files
2B1A, 2B1B, and 2B1C. Upon reflection, that 2B1C file just isn't quite right so you go back
to 2B1B, do it just a little differently and save 2B1B1A then finally 2B1B1B.
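
The two moves in the walkthrough above (take the next step on the same branch, or start a new side branch from an existing file) can be written down as two tiny Python functions. This is only a sketch with names I made up. It assumes branch numbers 1 through 9 and step letters A through Z, and it treats the original's code, 00, as the root.

```python
import string


def next_step(code: str) -> str:
    """Next sequential step on the same branch, e.g. '1B' -> '1C'."""
    letter = code[-1]
    if letter not in string.ascii_uppercase or letter == "Z":
        raise ValueError(f"cannot step past {code!r}")
    return code[:-1] + chr(ord(letter) + 1)


def new_branch(parent: str, existing: set[str]) -> str:
    """First unused side branch off `parent`, e.g. '1B' -> '1B1A', then '1B2A'."""
    prefix = "" if parent == "00" else parent  # branches off the original start fresh
    for digit in "123456789":
        candidate = f"{prefix}{digit}A"
        if candidate not in existing:
            return candidate
    raise ValueError(f"more than nine branches from {parent!r}")


if __name__ == "__main__":
    used = {"1A", "1B", "1C"}
    print(new_branch("00", used))   # a second path from the original
    print(new_branch("1B", used))   # a side branch from the 1B file
    print(next_step("2B1B1A"))      # one more step along an existing side branch
```

The `existing` set is just the hierarchy codes already in use for that BaseName, so the function can skip branch numbers that are taken.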

Eureka! That is perfect. So you rename this file to BaseName_2B1B1B_!Final.TIF to
indicate that it is the final version and you don't think it needs any further
adjustments. Next you create several different files for different possible uses and
insert whatever use codes you like, naming them things like BaseName_2B1B1B_Web-S.GIF
for web sites, BaseName_2B1B1B_EMail.JPG for e-mail, BaseName_2B1B1B_Screen.JPG for
sending to family, and BaseName_2B1B1B_P9x13.TIF for large size printing. Notice how you
don't need to create new hierarchy codes when all you have done is generate a different
version of the file for a specific use. It is still essentially the same image, just at a
different resolution or file format. The "!" is to keep the primary version of this final
file sorted before its derivatives. You will have your own standards as to what sizes are
appropriate for which uses.

Here is how the hierarchy looks conceptually:

                         00
                         |
         +---------------+-------------------------+
         |               |                         |
         1A              2A                        3A
         |               |                         |
1B2A --- 1B --- 1B1A     2B --- 2B1A               3B
|        |      |        |      |
1B2B     1C     1B1B     2C     2B1B --- 2B1B1A
|               |        |      |        |
1B2C            1B1C     2D     2B1C     2B1B1B
|
1B2D

Naturally, you must be viewing the above using Courier font for it to look right. I threw in yet
another branch on the left of 1B to further illustrate the idea of multiple possible branches
from each point in the tree. Of course it is hard to illustrate more than two branches for each
point in a text file but you can see that you could have as many branches from any one point as
there are different codes for the first character.

If you really get crazy you can even go past 9 and use letters for the first character as well but
then it would be harder to read. If you do this you might want to use upper case for the first
character and lower case for the second character in each pair.

Here is how the files will sort in your file listing:

BaseName_00_!Origin.jpg
BaseName_1A_.tif
BaseName_1B_.tif
BaseName_1B1A_.tif
BaseName_1B1B_.tif
BaseName_1B1C_.tif
BaseName_1B2A_.tif
BaseName_1B2B_.tif
BaseName_1B2C_.tif
BaseName_1B2D_.tif
BaseName_1C_.tif
BaseName_2A_.tif
BaseName_2B_.tif
BaseName_2B1A_.tif
BaseName_2B1B_.tif
BaseName_2B1B1A_.tif
BaseName_2B1B1B_!Final.tif
BaseName_2B1B1B_EMail.JPG
BaseName_2B1B1B_P9x13.TIF
BaseName_2B1B1B_Screen.JPG
BaseName_2B1B1B_Web-S.GIF
BaseName_2B1C_.tif
BaseName_2C_.tif
BaseName_2D_.tif
BaseName_3A_.tif
BaseName_3B_.tif

Notice how you can actually see the branching structure just by
looking at the file names.
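
One caution if you sort these names in a script rather than a file manager. The listing above is the order you get when the underscore sorts ahead of the digits. In a strict character-code comparison the underscore (0x5F) comes after the digits and capital letters, so BaseName_1B1A_.tif would land before BaseName_1B_.tif. Here is a hedged Python sketch of a sort key that reproduces the listing's order by comparing the hierarchy code pair by pair. The function name is mine, and it assumes the BaseName_Hierarchy_UseCode.ext layout described earlier.

```python
def hierarchy_sort_key(filename: str):
    """Sort key that orders files by BaseName, then hierarchy pairs, then use code."""
    stem, _, ext = filename.rpartition(".")
    base, hierarchy, use_code = stem.rsplit("_", 2)
    # '1B2A' -> ('1B', '2A'); a shorter tuple sorts ahead of its own branches.
    pairs = tuple(hierarchy[i:i + 2] for i in range(0, len(hierarchy), 2))
    return (base, pairs, use_code, ext.lower())


if __name__ == "__main__":
    names = [
        "BaseName_1B1A_.tif", "BaseName_2A_.tif", "BaseName_1B_.tif",
        "BaseName_00_!Origin.jpg", "BaseName_1C_.tif", "BaseName_1B2A_.tif",
        "BaseName_1A_.tif",
    ]
    for name in sorted(names, key=hierarchy_sort_key):
        print(name)
```

Comparing tuples of pairs puts a parent file immediately ahead of its own side branches, which is the property the listing relies on.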

After you have stewed on all these edits a while you may decide you
don't need to keep all these huge image files around forever. You can just
delete all of the dead end branches if you want to. You could even
delete all the steps up to your final image. If you
didn't want to be reminded that you had been so darn wishy-washy you
could even rename the file restarting the hierarchy code at 1A or just
replacing it leaving a file name like BaseName_00_!Final.tif with the
additional special purpose formats renamed similarly. (Note: You still keep the _00
part to make it easier to use the same regular expressions for all of your file names.)
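
If you do decide to prune, the hierarchy code of the final image already tells you which files led up to it. Here is a short Python sketch of that bookkeeping, with function names of my own choosing. It only works out which codes are on the path to the final image and which are dead ends; it does not delete or rename anything.

```python
import string


def lineage(final_code: str) -> list[str]:
    """Every hierarchy code on the path from the original ('00') to `final_code`."""
    if not final_code or len(final_code) % 2:
        raise ValueError(f"hierarchy codes are made of pairs: {final_code!r}")
    path = ["00"]
    prefix = ""
    for i in range(0, len(final_code), 2):
        branch, last_letter = final_code[i], final_code[i + 1]
        # Steps A, B, ... up to and including the letter used in this pair.
        stop = string.ascii_uppercase.index(last_letter) + 1
        for letter in string.ascii_uppercase[:stop]:
            path.append(f"{prefix}{branch}{letter}")
        prefix = final_code[: i + 2]
    return path


def dead_ends(all_codes: list[str], final_code: str) -> list[str]:
    """Codes that are not on the path to the final image."""
    keep = set(lineage(final_code))
    return [code for code in all_codes if code not in keep]


if __name__ == "__main__":
    print(lineage("2B1B1B"))
    print(dead_ends(["00", "1A", "1B", "2A", "2B", "2B1A", "2B1B",
                     "2B1B1A", "2B1B1B", "2B1C", "3A"], "2B1B1B"))
```

Whether you then delete only the dead ends, or the intermediate steps as well, is still up to you.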


Alternative systems:
There are going to be some people who read this and are tempted to ask why I didn't just use
one additional character for each additional branch away from the original. It would certainly
make the file names even shorter. That's why it took two days. I just couldn't come up with
any scheme that didn't break as soon as you wanted to add more branches or more steps along
an existing branch. If you can come up with something let me know. I'd love to see it.


Comments
Wednesday, June 10, 2020
12:42 PM
I know it is a rather complicated description and looks like a complex scheme. But I
worked hard to figure out something that would keep track of the order in which
various versions of files were created just by their sort order in the folder. I wanted
to avoid using a series of folders that I would have to move files around in. Please
also remember that all these extra characters will only be kept around until I am
finished working on an image. Then they will be stripped out or reduced to whatever
is required at that time.

I posted this in another forum and all I got was abuse. People have a tendency to
make fun of me for making up complicated systems when they think they aren't
needed. They start giving me lectures about the KISS principle and Occam's Razor.
Well, I can also quote Einstein who said that something should be as simple as
possible but no simpler.

Please remember, I am a 60 year old former computer consultant who has studied
database normalization, quantum physics, complexity theory, and chaos theory, as
well as invented a new form of geometry and laser. I don't need the lectures. I have
done my share of simplifying other systems that people thought were simple because
it didn't take very many words to describe them but, in reality, had grown to a huge,
unmanageable, mess because their few words did not cover the realities of life. (kind
of like that sentence) Ask any software developer and they will tell you, in order to
make a program simple to use it must be more complicated under the surface.

There are going to be people replying to this message saying they just use a simple
system of typing out every modification they have made in the file name. It sounds
good but it quickly gets to where you can't tell which file is a modification of which
previous file. Others will tell me that they just keep track of everything in one
PhotoShop .PSD file. Well, that's great if you can afford PhotoShop. Even if I could,
there is a huge learning curve to using all the layers to control the modifications
rather than just save multiple files. Besides, not everything can be done in PhotoShop.

Other people will tell me to just create different folders or categories for the different
stages of my work. It would be great if I knew in advance that I would always make
the exact same series of edits to all of my images in lock-step fashion. But it also
sounds pretty boring. I needed a system that would allow me to keep track of the
order in which files were modified and which files were based on which other ones
no matter where my creative thoughts took me. And I have found one.

I am not trying to tell anyone that they have to use my system just by making this
post so I would appreciate it if people didn't try to tell me I was doing something
intrinsically wrong by using it. Besides, I worked really hard on it so be nice.

People often think that just driving down the straightest road is the simplest way to
get to Rome. What they forget is they are only able to do that because someone else
took a century worth of technology, used it to manufacture thousands of incredibly
complicated parts, that all work together in precision harmony, burning gas that took
millions of years to create and another complicated refinery to produce, driving on a
road that took dozens of complicated machines which are even larger and have more
parts than the car itself, just to be able to drive down the road. So don't talk to me
about keeping things simple.


If I gave them instructions for building an ox cart, many people would tell me it is too
complicated. They would say they can just get in their car and drive. But the ox cart
will work without any gasoline or highways. And it will handle the basics of what
most normal people will really need. It will get them to Rome and carry their things.
If you don't have a car and you don't know how to build one then an ox cart is really
quite simple.

That is what I am designing my file name system for. For those people who may
come along later and won't have access to my database and won't know how to build
one from my metadata. The data will be there so a more astute individual will be able
to use it. But I want anyone who can boot up a computer to at least be able to glean a
little bit of information from the folder structure and file names alone.

For important family pictures I will probably also put text files that have even more
information right in with the picture files. I may eventually write a script that will do
it automatically. Then I will have the best of both worlds and have all my bases
covered too.

Copyright:
This entire document is copyright 2020 by Grant S. Robertson. It may not be
reprinted, posted to a web site, stored in any electronic archive other than this
forum, taught in any class or seminar, nor may it be incorporated into any software
product, without my prior written permission. Individuals are hereby granted the
right to use this system for their personal filing needs and to tell their friends. In
other words, I am happy to share this idea with fellow photographers but if you are
going to be making money off of it then I get a cut. As people who generate
intellectual property every time they press the shutter release, I imagine you will
understand.

You can contact me at grantsr@gmail.com for permission to republish this information.
