Datafeeds S3 Download Guide

Downloading Amazon Associates Data Feeds


Copyright 2008 Amazon Web Services LLC or its affiliates.
AMAZON and AMAZON.COM are registered trademarks of Amazon.com, Inc. or its Affiliates. All other trademarks are the
property of their respective owners.
2008 Amazon Web Services LLC or its affiliates. All rights reserved.
This book was last updated 2008-08-28.

Downloading Data Feeds


Topics
Workflow Overview
Listing Data Feeds
Downloading Data Feeds
Reporting the Download Status
This section describes how to list the Data Feeds you can download and how to download them.

Workflow Overview
The workflow for downloading a Data Feed is:
1. List the Data Feeds that you are permitted to download.
After you register for Data Feeds, the Associates team specifies the Data Feeds you can download.
Listing the Data Feeds you can download is typically a one-time event. For more information, go to
Listing Data Feeds.
2. Download the selected Data Feeds in some automated fashion, for example, using crontab to run a
script.
While you can manually download Data Feeds, the more practical solution is to create a script in the
computer language of your choosing and to use an automated scheduler, such as crontab, to run the
script. The script should look for Data Feed updates multiple times a day. This document describes
the commands you use to download Data Feeds.
3. Optionally, include in the script a report to Amazon Associates on whether the Data Feed download
succeeded or failed.
Whether you are listing or downloading Data Feeds, you must include your login name and password in
your request. These values are assigned to you by Amazon Associates when you sign up to become
a Data Feeds customer. To sign up to receive Data Feeds, contact Customer Support, as listed on the
Associates web site.
Note
The new delivery mechanism will not work with login names and passwords that are
linked to older FTP accounts. People with older FTP accounts will be sent new login
names and passwords.

Listing Data Feeds


To list the Data Feeds that you are permitted to download, use the following command:
curl --user [username:password] --digest -k "https://assoc-datafeeds-na.amazon.com/datafeed/listFeeds?format=text"

In the real request, substitute your user name and password, separated by a colon, for the bracketed
placeholder. These values are provided by Amazon when you sign up for Data Feeds; they identify
you and the Data Feeds you are permitted to download.
Note
For UNIX, use curl version 7.18 or later. For Microsoft Windows, use curl version 7.18.1
or later.
Here is a sample response.
us_ecs_ce.xml.gz Thu Feb 21 18:37:00 PST 2008 "ac87f4105c47d5bfecdc7d0c315bf0d8" 764372624
us_ece_software.xml.gz Thu Apr 17 19:32:43 PDT 2008 "2349e52cfd8beb98fa4d56096e6ddfd4" 2518414
us_ecs_books.xml.gz Fri Apr 18 01:03:11 PDT 2008 "50a9bd8e23b5d45b5704e73ef948fed6" 187255128

This response shows that the user is able to download three Data Feeds. The response format is:
DataFeed_filename Date_generated "MD5_ID" File_size_in_bytes
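
The response format above lends itself to simple field extraction. The following is a minimal sketch, not part of the Data Feeds service: the function name parse_feed_list is hypothetical, and the sketch assumes that each record arrives on a single line with the file name first, the quoted MD5 value next to last, and the size last.

```shell
#!/bin/bash
# parse_feed_list (hypothetical helper): reads listFeeds text output on stdin
# and prints "filename md5 size" for each record.
# Assumption: one record per line, laid out as
#   DataFeed_filename Date_generated "MD5_ID" File_size_in_bytes
parse_feed_list() {
    awk 'NF >= 3 {
        md5 = $(NF - 1)
        gsub(/"/, "", md5)      # strip the surrounding double quotes
        print $1, md5, $NF
    }'
}
```

For example, piping the saved output of a listFeeds request with format=text into parse_feed_list yields one line per Data Feed that a script can loop over with read.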

Note
MD5 is a hash (checksum) function used to verify the integrity of a transferred file. The 128-bit
hash value is calculated from everything in the file, so the receiver can calculate the MD5 value
of the downloaded file independently. By comparing your result with the MD5 value in the
response, you can determine whether the file is complete or whether it is partial or
corrupt.

Locales
The URL in the request varies slightly by locale, as shown in the following table.

Locale               URL
North America (na)   curl --user username:password --digest -k "https://assoc-datafeeds-na.amazon.com/datafeed/listFeeds?format=text/html"
Europe (eu)          curl --user username:password --digest -k "https://assoc-datafeeds-eu.amazon.com/datafeed/listFeeds?format=text/html"
Japan (fe)           curl --user username:password --digest -k "https://assoc-datafeeds-fe.amazon.com/datafeed/listFeeds?format=text/html"
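
Because only the locale code in the host name changes, a script can derive the base URL from the code. The following sketch assumes the host name pattern shown in the table; the function name feeds_base_url is hypothetical.

```shell
#!/bin/bash
# feeds_base_url LOCALE (hypothetical helper)
# Prints the Data Feeds base URL for a locale code: na, eu, or fe.
feeds_base_url() {
    case "$1" in
        na|eu|fe)
            printf 'https://assoc-datafeeds-%s.amazon.com/datafeed\n' "$1"
            ;;
        *)
            echo "unknown locale: $1" >&2
            return 1
            ;;
    esac
}

# Usage:
# curl --user username:password --digest -k "$(feeds_base_url eu)/listFeeds?format=text"
```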

The listFeeds command returns the names of one or more files that the Associate can download, for example:
us_ecs_ce.xml.gz

As the file extensions indicate, the content of this Data Feed file is XML and the file is
compressed with gzip.
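
Since the file is gzipped, a script can check that a downloaded archive is readable before decompressing it. The following is a minimal sketch that assumes the standard gzip utility is installed; check_gzip is a hypothetical helper name.

```shell
#!/bin/bash
# check_gzip FILE (hypothetical helper)
# Returns 0 if FILE is a complete, readable gzip archive, nonzero otherwise.
check_gzip() {
    gzip -t "$1" 2>/dev/null
}

# Usage: decompress only if the archive passes the integrity test.
# check_gzip us_ecs_ce.xml.gz && gzip -dc us_ecs_ce.xml.gz > us_ecs_ce.xml
```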

Data Formats
In listFeeds requests, the final parameter, format, specifies the format of the returned list.
The default is text/html, so if you do not include the format parameter, the list is returned as
text/html. Other choices include:
text: Typically used as a human-readable form, as opposed to the XML.
text/xml
The complete list of possible requests to list Data Feeds is:
curl --user [username:password] --digest -k "https://assoc-datafeeds-na.amazon.com/datafeed/listFeeds"
curl --user [username:password] --digest -k "https://assoc-datafeeds-na.amazon.com/datafeed/listFeeds?format=text/html"
curl --user [username:password] --digest -k "https://assoc-datafeeds-na.amazon.com/datafeed/listFeeds?format=text/xml"
curl --user [username:password] --digest -k "https://assoc-datafeeds-na.amazon.com/datafeed/listFeeds?format=text"

Now that you have the complete list of all Data Feeds that you are allowed to download, you can select
the ones you want.

Downloading Data Feeds


Topics
Writing a Script to Download Data Feeds
Using the Data Feed Download Application
Sample Response
Amazon Associates requires you to update most information on a daily basis. Some data needs to be
refreshed less frequently. Consult your Account Manager to determine the correct refresh rate for each
kind of downloaded data. Because Data Feed files can be multiple gigabytes in size, select only the
Data Feeds you need so that you minimize the download time.
To download a Data Feed, you can write a script that uses the getFeed command or you can use the
Amazon Associates Download Application, which is discussed in Using the Data Feed Download
Application.

Writing a Script to Download Data Feeds


The following is a sample shell script that downloads a Data Feed, us_ecs_baby.xml, and retries the
download automatically if necessary.
#!/bin/bash
# remove the old feed file if it already exists
rm -f us_ecs_baby.xml.gz
# the number of times you want to retry
retries=5
# the status of the download (nonzero means not yet successful)
success=1
while [ "$retries" -ne 0 ] && [ "$success" -ne 0 ]
do
    # -C - resumes an interrupted download on the next attempt
    curl --location --user [username:password] -C - --digest -k \
        "https://assoc-datafeeds-na.amazon.com/datafeed/getFeed?filename=us_ecs_baby.xml" \
        -o us_ecs_baby.xml.gz
    success=$?
    retries=$((retries - 1))
done

Substitute your Data Feed user name and password for username:password. This script uses the "-C -"
option of curl to resume the download if it was interrupted on a previous attempt. The URL in the
request varies slightly by locale, as described by the table in the previous section.
The filename parameter is set to the name of the Data Feed that you want to download. We recommend
that you retry the download three to five times if the exit status of the curl command shows an error.
In this request, the results of the command are stored in the file us_ecs_baby.xml.gz.

Automatically Downloading Data Feeds


While you can run the script above manually, it is more practical to use a scheduler, such as crontab.
The following script shows one way of automatically retrieving Data Feeds.
#!/bin/bash
# the feed file we want to download
filename=us_ecs_baby.xml
# the number of times you want to retry
retries=5
# remove the file if it already exists
rm -f "$filename"
# the status of the download (nonzero means not yet successful)
success=1
while [ "$retries" -ne 0 ] && [ "$success" -ne 0 ]
do
    curl --location --user [username:password] -C - --digest -k \
        "https://assoc-datafeeds-na.amazon.com/datafeed/getFeed?filename=$filename" \
        -o "$filename"
    success=$?
    retries=$((retries - 1))
done
# compute status to report: 1 if curl exited with status 0, otherwise 0
if [ "$success" -eq 0 ]; then reportSuccess=1; else reportSuccess=0; fi
# the URL is quoted so that the shell does not interpret the & character
curl --user [username:password] --digest -k \
    "https://assoc-datafeeds-na.amazon.com/datafeed/reportStatus?filename=$filename&success=$reportSuccess"
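
A script that runs several times a day does not need to download a Data Feed again if it has not changed. One possible approach, sketched below, is to remember the MD5 value that listFeeds reported at the last successful download and to compare it with the current value. The helper names feed_changed and record_md5 and the local state file with lines of the form "filename md5" are assumptions made for this illustration; they are not part of the Data Feeds service.

```shell
#!/bin/bash
# feed_changed FILENAME NEW_MD5 STATE_FILE (hypothetical helper)
# Returns 0 if NEW_MD5 differs from the MD5 recorded for FILENAME in
# STATE_FILE, or if nothing is recorded yet; returns nonzero if unchanged.
feed_changed() {
    fc_old=$(awk -v f="$1" '$1 == f { print $2 }' "$3" 2>/dev/null)
    [ "$fc_old" != "$2" ]
}

# record_md5 FILENAME NEW_MD5 STATE_FILE (hypothetical helper)
# Replaces the recorded MD5 for FILENAME; call it after a successful download.
record_md5() {
    touch "$3"
    awk -v f="$1" '$1 != f' "$3" > "$3.tmp"    # keep the other feeds' entries
    printf '%s %s\n' "$1" "$2" >> "$3.tmp"
    mv "$3.tmp" "$3"
}

# Usage:
# if feed_changed us_ecs_ce.xml.gz "$new_md5" ./feed_state; then
#     ...download the feed, then on success...
#     record_md5 us_ecs_ce.xml.gz "$new_md5" ./feed_state
# fi
```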

Using the Data Feed Download Application


Amazon Associates provides a client application that automatically downloads a Data Feed when data
has changed. The application contains retry logic and reports the status of the download. Download
status is discussed in Reporting the Download Status.
To obtain the Download Application
1. Go to http://associates.datafeeds.s3.amazonaws.com/samples/s3_download_client.zip.
2. Download the file, s3_download_client.zip, and unzip it.
The .zip file contains the client application, fetch-feeds.pl, and a file, feeds_file, of sample data that
you can use to test the client application.
To run the Download Application
On the command line, run fetch-feeds.pl, as shown in the following example.
% fetch-feeds.pl --input ./feeds_file --md5-file ./MD5 \
    --region NA --user myLoginName --dir ./ResultsDir \
    --pass myPassword

All of these arguments are required and are described in the following table.

Argument                Description
--dir <dir>             Directory where feeds will be stored
--input <filename>      Path to the file containing the list of feed names to be downloaded, one per line
--md5-file <filename>   Path to the file where MD5 checksums will be stored
--pass <pass>           Password for logging on to the Associates S3 Proxy
--region <region>       Must be one of the following: NA, EU, FE
--user <user>           User name for logging on to the Associates S3 Proxy
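
The input file lists one feed name per line, so it can be generated from a script. The sketch below only illustrates that layout: write_feeds_file is a hypothetical helper, and it assumes that feed names are written exactly as listFeeds returns them, which this guide does not state.

```shell
#!/bin/bash
# write_feeds_file OUTPUT NAME... (hypothetical helper)
# Writes each feed name on its own line, the layout described for --input.
# Assumption: feed names are given as listFeeds returns them.
write_feeds_file() {
    wf_out=$1
    shift
    printf '%s\n' "$@" > "$wf_out"
}

# Usage:
# write_feeds_file ./feeds_file us_ecs_ce.xml.gz us_ecs_books.xml.gz
```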

Sample Response
The following is a sample response after the getFeed command completes.
us_ecs_ce.xml.gz
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 2459k  100 2459k    0     0   453k      0  0:00:05  0:00:05 --:--:--  767k

This result shows that the transfer is 100% complete, that all 2459k bytes were received, that the
average download rate was 453k bytes per second, and that the download took 5 seconds.

Reporting the Download Status


The reportStatus command is optional; it enables you to report to Amazon Associates whether or not
you successfully downloaded the Data Feed. Use the final parameter in the command, success, to
report that the Data Feed downloaded correctly (1) or that it didn't (0).
curl --user [username:password] --digest -k "https://assoc-datafeeds-na.amazon.com/datafeed/reportStatus?filename=us_ecs_baby.xml.gz&success=1"

Although including this command in your script is optional, it helps Amazon Associates determine the
health of the service.
Here is a sample response returned upon the report of a success.
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<HTML>
<HEAD><TITLE>Report Status Completed</TITLE></HEAD>
<BODY> status updated </BODY>
</HTML>
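
In a script, the value of success usually comes from the exit status of the download command, which can be any nonzero number on failure. The following sketch maps an exit status to the 1 or 0 that reportStatus expects and builds the request URL. The function name report_status_url is hypothetical, and the North America host is assumed; substitute the host for your locale.

```shell
#!/bin/bash
# report_status_url FILENAME EXIT_STATUS (hypothetical helper)
# Prints the reportStatus URL with success=1 when EXIT_STATUS is 0,
# and success=0 for any other exit status.
# Assumption: the North America (na) host is used.
report_status_url() {
    if [ "$2" -eq 0 ]; then rs_flag=1; else rs_flag=0; fi
    printf 'https://assoc-datafeeds-na.amazon.com/datafeed/reportStatus?filename=%s&success=%s\n' \
        "$1" "$rs_flag"
}

# Usage, immediately after the download command:
# curl --user username:password --digest -k "$(report_status_url us_ecs_baby.xml.gz "$?")"
```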
