DupScout Duplicate Files Finder
Flexense Ltd.
DupScout
Duplicate Files Finder
User Manual
Version 3.2
Sep 2011
Product Overview
In today's world of high-speed Internet, desktop computers and laptops are constantly flooded with documents, digital images, music and video files. People frequently download identical files from different web sites, wasting storage space on duplicated content. Over time, computers tend to collect large numbers of duplicate files scattered over multiple directories or disks under different file names, which makes them quite difficult to detect.
DupScout is a free, fast and easy-to-use duplicate files finder utility allowing one to detect and clean up duplicate files on disks, network shares and NAS storage devices. The user is provided with the ability to search one or more directories, disks or network shares for duplicate files, select original files that should be kept in place and clean up duplicates, thus freeing up wasted storage space.
In addition, power computer users and IT professionals are provided with an advanced product version, named DupScout Pro, which is capable of processing significantly larger amounts of files, allows one to replace duplicate files with links, provides the user with the ability to detect duplicate files among specific file types, adds multi-threaded duplicate files detection mode, provides multiple performance tuning options, allows one to export HTML, Excel CSV and text reports and finally enables execution of user-defined commands using direct desktop shortcuts.
On the 'Welcome' screen press the 'Next' button. Read the end-user license agreement and press the 'I Agree' button if you agree with the license terms or the 'Cancel' button to stop the installation process.
Select the destination directory, press the 'Install' button and wait for the installation process to complete. That's all you need to do to install the DupScout duplicate files finder utility on your computer.
Depending on the number of files that should be searched, the duplicate files detection process may take from a couple of seconds for tens of files to a few hours for large file systems containing millions of files. During the duplicate files detection process, DupScout will display the process dialog showing the total number of processed files, the number of detected duplicate files and the amount of wasted storage space. Once the detection process is completed, DupScout will display the list of all detected duplicate file sets sorted by the amount of wasted storage space. For each duplicate file set, DupScout shows the name of the currently selected original file, the currently selected cleanup action, the number of duplicate files in the set, the size of each file and the amount of storage space wasted by the duplicate files.
Sometimes there may be thousands of duplicate files. In order to help the user concentrate on the duplicate files wasting significant amounts of storage space, DupScout by default shows the top 1000 duplicate file sets sorted by the amount of wasted storage space. In order to change the default number of displayed duplicate file sets, open the profile dialog, select the 'Advanced' tab and set the 'Max Dup File Sets' option to an appropriate value.
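The ranking described above can be illustrated with a minimal Python sketch. It is not DupScout's implementation: the `DupSet` structure, the `top_sets` function and the formula for wasted space (every copy beyond the original counts as wasted) are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class DupSet:
    """One set of identical files (hypothetical structure, not DupScout's)."""
    file_size: int    # size of each file in the set, in bytes
    file_count: int   # number of identical files, original included (assumption)

    @property
    def wasted_bytes(self) -> int:
        # Assumption: every copy beyond the one kept as the original is wasted.
        return self.file_size * (self.file_count - 1)


def top_sets(sets, max_dup_file_sets=1000):
    """Sort sets by wasted space (largest first) and keep the top N."""
    ranked = sorted(sets, key=lambda s: s.wasted_bytes, reverse=True)
    return ranked[:max_dup_file_sets]
```

The `max_dup_file_sets` parameter plays the role of the 'Max Dup File Sets' option: raising it shows more of the smaller sets, lowering it keeps the list focused on the largest sources of waste.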
In order to select an appropriate duplicates cleanup action, select one or more duplicate file sets, press the right mouse button and select the required duplicates cleanup action. By default, DupScout selects the oldest file in each duplicate file set as the original file. In addition, the user is provided with the ability to select any arbitrary file in each file set as the original file.
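As an illustration of the default rule, the following Python sketch picks the oldest file in a set as the original. The manual does not state which timestamp DupScout compares, so the use of the last modification time here is an assumption, and `pick_original` is a hypothetical helper name.

```python
import os


def pick_original(paths):
    """Return (original, duplicates), treating the oldest file as the original.

    Assumption: "oldest" is judged by last modification time; the manual
    does not say which timestamp DupScout actually uses.
    """
    ordered = sorted(paths, key=lambda p: os.stat(p).st_mtime)
    return ordered[0], ordered[1:]
```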
The set dialog shows all duplicate files in a set and allows one to manually select a cleanup action and the original file for the set. In order to manually change the original file for a duplicate files set, click on the set item in the set list, select the file that should be set as the original file, press the right mouse button and select the 'Set as Original' menu item.
The actions preview dialog will display a list of all cleanup actions that should be performed and allow one to select or unselect each specific cleanup action. After carefully reviewing all the selected cleanup actions, press the 'Execute' button to clean up all the selected duplicate files.
Typically, there are lots of duplicate files in the Windows system directory, which are critical for the proper operation of the operating system. All duplicate files located in the Windows system directory and other application specific directories cannot be removed and it is highly recommended to avoid touching these files.
By default, DupScout categorizes all files by the file extension and shows a list of all types of detected file extensions sorted by the amount of used disk space. For each category, DupScout shows the number of files, the amount of used disk space and the percentage of the used disk space relative to other file categories. Use the 'Categories' combo box to categorize files by the file type, last access time, last modification time or creation time.
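The default categorization by file extension can be sketched in Python as follows. This is an illustrative sketch only: the input format (a list of path and size pairs), the `categorize_by_extension` name and the lower-casing of extensions are assumptions, not documented DupScout behavior.

```python
import os
from collections import defaultdict


def categorize_by_extension(files):
    """Group (path, size) pairs by file extension.

    Returns a list of (extension, file_count, bytes_used, percent_of_total)
    tuples sorted by bytes_used, largest first.
    """
    totals = defaultdict(lambda: [0, 0])   # extension -> [count, bytes]
    for path, size in files:
        # Assumption: extensions are compared case-insensitively.
        ext = os.path.splitext(path)[1].lower() or "(no extension)"
        totals[ext][0] += 1
        totals[ext][1] += size
    grand_total = sum(used for _, used in totals.values()) or 1
    rows = [(ext, count, used, 100.0 * used / grand_total)
            for ext, (count, used) in totals.items()]
    return sorted(rows, key=lambda row: row[2], reverse=True)
```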
One of the most useful features of DupScout is the ability to browse duplicates by one or more specific file categories using file filters. For example, in order to see all files that were accessed 2-3 months ago, select the access time-based file categorization mode and double-click on the 'Files Last Accessed 2-3 Months Ago' file category. DupScout will filter the currently displayed list of duplicates and show all sets that were accessed 2-3 months ago.
By default, the charts dialog shows the amount of wasted disk space and the number of duplicates for the currently selected second-level file category. For example, in order to open a pie chart showing the amount of wasted disk space per extension, select the 'Categorize by Extension' second-level file category and open the charts dialog.
In addition, the charts dialog provides the user with the ability to copy the displayed chart image to the clipboard allowing one to easily integrate DupScout charts into user's documents and presentations. In order to customize the chart's description, press the 'Options' button and specify a custom chart date, time or title.
On the 'Report' dialog, enter the report title, specify the file name to save the report to and select one of the following report formats: HTML, Excel CSV or ASCII text. By default, DupScout will save a duplicate files report containing the top 1000 duplicate file sets sorted by the amount of wasted storage space.
In order to export a full report containing all detected duplicate file sets, enter an appropriate number of duplicate file sets to export on the right side of the report format selector. Keep in mind that reports for large file systems containing millions of files may be very large and difficult to open using standard tools especially when exported to the HTML format.
The report database dialog displays reports that were submitted to the database and allows one to search reports by the report title, host name, date or directories that were processed. For each report in the database, DupScout displays the report date, time, host name, directories that were processed, the number of files and the amount of storage space the report refers to, and the report title. In order to open a report, just click on the report item in the report database dialog.
In order to connect DupScout to an SQL database, the user is required to define an ODBC data source on the computer where DupScout is installed and to specify the ODBC data source in the DupScout options dialog. Open the options dialog, select the 'Database' tab, enable the ODBC interface and specify a valid user name and password to connect DupScout to an SQL database. In order to export a report to an SQL database, press the 'Save' button on the results dialog and select the 'SQL Database' format. In addition, the user is provided with the ability to use the command line utility, which is available in DupScout Ultimate, to export reports to an SQL database.
In order to analyze reports from multiple hosts, the user needs to connect DupScout to an SQL Database, perform duplicate files search on multiple hosts using the DupScout GUI application or the DupScout command line utility and submit reports from all hosts to the SQL database. Once reports from all hosts are in the database, open the Database dialog and press the Hosts button to open the Hosts Statistics dialog.
The simplest way to submit reports from multiple servers or desktop computers is to use the DupScout command line utility to detect duplicate files on all required hosts through the network. In order to simplify submission of reports to the SQL database, the command line utility may be executed on the same host on which the SQL database is installed. In this case, the user needs to specify one or more network shares to be processed and the host name to be set for each report.
Another option is to execute the command line utility on each specific host, save duplicate files reports and later submit report files from all hosts to the SQL database using the DupScout GUI application. In this case, there is no need to set the host name, which will be set automatically to the name of the host the command line utility is executed on.
In order to analyze duplicate files per user, connect DupScout Ultimate to an SQL Database and submit reports containing duplicates owned by multiple users to the SQL database using the DupScout GUI application or the DupScout command line utility. Once reports are in the database, open the Database dialog and press the Users button to open the Users Statistics dialog.
The simplest way to submit reports from multiple servers or desktop computers is to use the DupScout command line utility to detect duplicate files on all required hosts through the network. In order to simplify submission of reports to the SQL database, the command line utility may be executed on the same host on which the SQL database is installed. In this case, the user needs to specify one or more network shares to be processed and the host name to be set for each report.
Another option is to execute the command line utility on each specific host, save duplicate files reports and later submit report files from all hosts to the SQL database using the DupScout GUI application. In this case, there is no need to set the host name, which will be set automatically to the name of the host the command line utility is executed on.
In order to display a history chart, save a series of reports to an SQL database, open the SQL reports dialog and press the 'History' button. A series of reports may be exported to an SQL database manually using the DupScout GUI application or automatically using the DupScout command line utility.
The DupScout command line utility allows one to detect duplicate files in one or more disks or directories and save a report to an SQL database. In order to generate reports for multiple servers or desktop computers through the network, the user needs to specify one or more network shares that should be processed using the UNC notation and set an appropriate host name for each report saved to the database.
Finally, the command line utility may be used in conjunction with the standard Windows task scheduler to periodically detect duplicate files on one or more servers or desktop computers, save reports to a centralized SQL database and generate history charts showing how the number of duplicate files and the wasted disk space change over time. The history charts dialog displays the list of available charts, the list of host computers on which the charts were generated and extended statistical information for each chart. The user is provided with the ability to filter charts by the host name, location, report label, etc., allowing one to select an appropriate history chart. In addition, the charts dialog allows one to change the chart's title and footer and to export the chart's image to the clipboard, making it very easy to integrate DupScout history charts into the user's custom reports and presentations.
By default, DupScout keeps all reports in the reports directory or the SQL database. In order to enable automatic report management, open the 'Options' dialog, select the 'Reports' tab and change the 'Report Files' or 'Report Database' options to appropriate values.
The 'Report Files' option is applicable to HTML, text, Excel CSV, XML and DupScout native reports saved to a reports directory or to the user's home directory using the DupScout command line utility. After saving each new report, DupScout will check if there are too many reports of the same type (HTML, XML, CSV, etc.) in the reports directory and delete old reports according to the user-specified configuration.
The 'Report Database' option is applicable to reports submitted to an SQL database using the DupScout GUI application or the DupScout command line utility. After saving each new report to the database, DupScout will check if there are too many reports from the same host computer for the same set of disks or directories and delete old reports according to the user-specified configuration. For example, if reports from two different servers are submitted to the same SQL database, DupScout will keep the last X reports for each server in the database.
The 'File Categories' option allows one to enable or disable exporting of file categories to HTML, text, Excel CSV and XML reports. Second-level file categories are available when reports are saved manually using the DupScout GUI application. Automatically generated reports and reports saved using the DupScout command line utility are always saved without file categories. When the 'File Categories' option is enabled, the DupScout GUI application will save second-level file categories to HTML, text, Excel CSV and XML reports.
The 'Compressed Reports' option allows one to save automatically generated HTML, text, Excel CSV and XML reports as compressed archive files.
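The "keep the last X reports for each server" rule can be sketched in Python as shown below. The tuple layout and the `reports_to_delete` name are hypothetical, and the sketch groups reports by host name only, whereas the manual says DupScout also takes the set of processed disks or directories into account.

```python
from collections import defaultdict


def reports_to_delete(reports, keep_last):
    """Return the reports that fall outside the newest `keep_last` per host.

    `reports` is an iterable of (host_name, timestamp, report_id) tuples;
    this layout is illustrative and not DupScout's database schema.
    """
    by_host = defaultdict(list)
    for report in reports:
        by_host[report[0]].append(report)
    stale = []
    for host_reports in by_host.values():
        newest_first = sorted(host_reports, key=lambda r: r[1], reverse=True)
        stale.extend(newest_first[keep_last:])
    return stale
```

Because each host is trimmed independently, a server that reports rarely does not lose its history to a server that reports often.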
In order to add one or more duplicates removal actions, open the profile dialog, select the 'Actions' tab and press the 'Add' button. By default, the 'Action' dialog shows basic options allowing one to select the original file detection mode and the removal action that should be used for all successfully matched duplicate file sets.
More advanced options may be enabled by pressing the 'More Options' button, which is located in the bottom-left corner of the dialog. In the advanced mode, the dialog allows one to define one or more file matching rules that should be used in order to detect the type of duplicate files that should be processed by this specific duplicates removal action. In order to apply different duplicate files removal actions for different types of files, specify multiple, rule-based removal actions and select an appropriate actions mode. In the 'Select Actions' mode, DupScout will scan the specified input disks or directories, select the defined removal actions for all duplicate file sets matching the specified rules and show an actions preview dialog allowing one to review the selected actions before execution. Another option is to set the actions mode to 'Execute' and to use the DupScout command line utility to execute the specified duplicate files removal actions fully automatically in an unattended mode.
Multiple UNC path names (separated by the semicolon character) may be entered into the directories entry located under the main toolbar or permanently specified in the profile dialog. Duplicate files detected using UNC path names will be prefixed with an appropriate server/share name according to the location of each specific duplicate file. When working with UNC path names, it is important to keep in mind that all cleanup actions will be performed using UNC path names and the user should have appropriate permissions on each specific network share and/or NAS storage device.
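To illustrate the semicolon-separated format, the Python sketch below splits such an entry and extracts the server/share prefix of each path using the standard `pathlib` module. The function name and the example server names in the usage note are hypothetical; this is not how DupScout itself parses the entry.

```python
from pathlib import PureWindowsPath


def split_unc_list(entry):
    """Split a semicolon-separated list of UNC paths into (share, path) pairs."""
    result = []
    for raw in entry.split(";"):
        raw = raw.strip()
        if not raw:
            continue  # tolerate empty items such as a trailing semicolon
        # For a UNC path, `drive` holds the \\server\share prefix.
        result.append((PureWindowsPath(raw).drive, raw))
    return result
```

For example, `split_unc_list(r"\\nas01\media;\\srv2\docs\2011")` yields one pair per path, with `\\nas01\media` and `\\srv2\docs` as the respective share prefixes.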
In order to add one or more directories to the exclude list, press the 'Manage Profile' button to open the profile dialog, select the 'Exclude' tab and press the 'Add' button. All files and subdirectories located in the specified exclude directory will be excluded from the duplicate files detection process. Keep in mind that exclude directories are case sensitive and should be specified with the same case as stored on the disk. To remove a directory from the exclude list, select it and press the 'Delete' button.
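The effect of case-sensitive exclude directories can be shown with a short Python sketch. The matching logic below is an assumption made for illustration (a component-wise prefix comparison); the manual states only that the case must match what is stored on the disk.

```python
from pathlib import PureWindowsPath


def is_excluded(file_path, exclude_dirs):
    """True if file_path lies inside any exclude directory (case-sensitive)."""
    file_parts = PureWindowsPath(file_path).parts
    for directory in exclude_dirs:
        dir_parts = PureWindowsPath(directory).parts
        # Comparing the raw string components keeps the match case-sensitive,
        # unlike PureWindowsPath equality, which ignores case.
        if file_parts[:len(dir_parts)] == dir_parts:
            return True
    return False
```

With this rule, an exclude entry of `C:\data\temp` does not cover files stored under `C:\Data\Temp`, which is why the entry should be typed exactly as the directory appears on the disk.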
In order to add one or more file matching rules, open the profile dialog, select the 'Rules' tab and press the 'Add' button. On the 'Rules' dialog, select an appropriate rule type and specify all the required parameters. During the duplicates detection process, DupScout Pro will process all the entered input directories and apply the specified file matching rules to all the existing files. Files not matching the specified rules will be excluded from the duplicate files detection process, so the results list will contain only files matching the user-specified rules.
In order to enable multi-threaded duplicate files detection for a profile, open the profile dialog, select the 'Performance' tab and set an appropriate number of processing threads. Take into account that multi-threaded duplicate files detection capabilities are optimized for powerful multi-core/multi-CPU systems when processing large amounts of files located on fast storage devices and it is not recommended to use it on single-core/single-CPU computers.
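The idea of hashing files on several threads can be sketched in Python with the standard `concurrent.futures` module. This is a generic illustration rather than DupScout's implementation; the function names and the default of four threads are assumptions.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hash_files(paths, threads=4):
    """Hash files in parallel; returns {path: hex_digest}."""
    paths = list(paths)
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # pool.map preserves input order, so zip pairs each path with its digest.
        return dict(zip(paths, pool.map(sha256_of, paths)))
```

As with the product option, extra threads pay off mainly when the storage device can serve several reads at once; on a slow disk they mostly compete for the same I/O.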
In order to customize the duplicates detection process, open the profile dialog and select the 'Advanced' options tab. The advanced options tab allows one to control the default report title, the type of the signature used to detect duplicate files, the maximum number of duplicate file sets to report about and the file scanning filter, which may be used to limit the duplicate files detection process to specific file types.
Report Title - this parameter sets the default report title to use when exporting HTML, Excel CSV or text reports.
Signature Type - this parameter sets the type of the algorithm that should be used to compare files: MD5, SHA1 or SHA256. The SHA256 algorithm is the most reliable one and it is used by default. The MD5 and SHA1 algorithms are significantly faster, but less reliable.
Max Dup File Sets - this parameter controls the maximum number of duplicate file sets displayed in the results list. After finishing the search process, DupScout will sort all the detected duplicate file sets by the amount of the wasted storage space and display the top X duplicate file sets as specified by this parameter (default is 1000).
File Scanning Filter - this parameter (DupScout Pro only) allows one to specify a file scanning filter to be used during the duplicate files search. The file scanning filter provides the user with the ability to limit the duplicates search process to a specific file type or a custom file set matching the specified file scanning filter. For example, in order to search for duplicate JPEG images only, set the file scanning filter to '*.jpg'. This file scanning filter will match all files with the extension JPG (JPEG Images) and skip all other files.
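How a signature type and a file scanning filter might work together can be sketched in Python. This is a generic signature-based approach, not DupScout's actual algorithm: the pre-grouping by file size, the case-insensitive wildcard match and the `find_duplicates` name are all assumptions made for illustration.

```python
import fnmatch
import hashlib
import os
from collections import defaultdict


def find_duplicates(root, signature="sha256", scan_filter="*"):
    """Return lists of paths under `root` whose contents hash identically."""
    by_size = defaultdict(list)
    for folder, _dirs, names in os.walk(root):
        for name in names:
            # Assumption: the filter ignores case, so '*.jpg' matches 'A.JPG'.
            if fnmatch.fnmatchcase(name.lower(), scan_filter.lower()):
                path = os.path.join(folder, name)
                by_size[os.path.getsize(path)].append(path)

    by_digest = defaultdict(list)
    for size, paths in by_size.items():
        if len(paths) < 2:
            continue  # a file with a unique size cannot have a duplicate
        for path in paths:
            digest = hashlib.new(signature)  # 'md5', 'sha1' or 'sha256'
            with open(path, "rb") as fh:
                for chunk in iter(lambda: fh.read(1 << 20), b""):
                    digest.update(chunk)
            by_digest[(size, digest.hexdigest())].append(path)
    return [sorted(group) for group in by_digest.values() if len(group) > 1]
```

Calling `find_duplicates(root, scan_filter="*.jpg")` restricts the search to JPEG files in the way the example above describes, while `signature="md5"` trades some reliability for speed.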
The simplest way to add a new profile is to press the 'Add' button located on the right side of the profile combo box. The same may be done on the profiles dialog, which may be accessed using the 'Profiles' button located on the main toolbar. The profiles dialog shows all the defined user profiles and allows one to add new profiles, edit profiles and delete profiles. In addition, the user is provided with the ability to associate a keyboard shortcut with each user-defined profile. Finally, DupScout Pro allows one to create a direct desktop shortcut for each profile, which may be used to find duplicate files in the directories specified in the profile with a single mouse click.
In order to edit a profile, click on the profile item in the profiles dialog. Select a profile item and press the 'Delete' button to delete the profile from the product configuration. All the user-defined profiles listed on the profiles dialog are stored in the user-specific product configuration file, which may be exported for backup purposes and later used to restore the product configuration on the same or another computer.
The 'General' tab allows one to control the following options:
Show Main Toolbar - enables or disables the main toolbar.
Always Show Profile Dialog Before Start - instructs DupScout to always show the profile dialog before starting the duplicate files search process.
Auto-Close Successfully Completed Tasks - select this option to automatically close the process dialog and show the duplicate files list.
Automatically Check For Product Updates - select this option to instruct DupScout to automatically check for available product updates.
Show Scanning Access Denied Errors - select this option to see error messages when DupScout is prevented from scanning files in a directory.
Process System Files - select this option to detect duplicate files among system files.
Abort Operation On Critical Errors - by default, DupScout tries to process as many files as possible, logging non-fatal errors in a process log. Select this option to instruct DupScout to abort the operation when encountering a critical error.
The 'Shortcuts' tab provides the user with the ability to customize keyboard shortcuts. Click on a shortcut item to edit the currently assigned key sequence. Press the 'Default Shortcuts' button to reset all keyboard shortcuts to default values.
The 'Proxy' tab provides the user with the ability to configure the HTTP proxy settings. DupScout uses the HTTP protocol in order to inquire whether there is a new product version available on the web site.
The first (default) GUI layout displays large toolbar buttons with descriptive text labels under each button and shows the directories entry and the profiles combo box under the main toolbar. The second GUI layout displays small toolbar buttons with descriptive text labels beside each button and shows the directories entry and the profiles combo box under the main toolbar.
The third GUI layout displays small toolbar buttons without descriptive text labels and shows the directories entry and the profiles combo box as a single toolbar.
In order to manually verify that the currently installed product version is up-to-date, select menu 'Help - Check For Updates' on the main menu bar. The update manager will connect to the update server and check if there is a newer version of the product available for download. If there is a new product version available, the update dialog will show the version of the new product update and two links: the 'Release Notes' link and the 'Install' link. Click on the 'Release Notes' link to see more information about new features and bug-fixes provided by this specific product version. Click on the 'Install' link to download and install the new product version.
After clicking on the 'Install' link, please wait while the update manager downloads the new product version to the local disk. The update package will be downloaded to a temporary directory on the system drive and automatically deleted after the update manager finishes updating the product.
After the download is completed, close all open DupScout applications and press the 'Ok' button when ready. If one or more DupScout applications are open during the update, the operation will fail and the whole update process will need to be restarted from the beginning. After finishing the update process, DupScout will show a message box confirming that the operation completed successfully.
After finishing the purchase process, wait for the following two e-mail messages: the first one with a receipt for your payment and the second one with an unlock key. If you do not receive your unlock key within 24 hours, please check your spam box for e-mail messages originating from support@flexense.com and, if it is not there, contact our support team.
After you receive your unlock key, start the DupScout GUI application and press the 'Register' button located in the top-right corner of the window.
On the register dialog, enter your name and the received unlock key and press the 'Register' button to finish the registration procedure.
64-Bit Operating Systems
Windows XP 64-Bit
Windows Vista 64-Bit
Windows 7 64-Bit
Windows Server 2003 64-Bit
Windows Server 2008 64-Bit
Windows Storage Server 64-Bit
System Requirements
Minimal System Configuration
Supported Operating System
1 GHz or better CPU
512 MB of system memory
25 MB of free disk space
Recommended System Configuration
Supported Operating System
2+ GHz single-core or dual-core CPU
1 GB of system memory
25 MB of free disk space
* Product features, prices and license terms are subject to change without notice.