Download as docx, pdf, or txt
Download as docx, pdf, or txt
You are on page 1of 6

Can Magento really handle over 500,000 products? If yes, how?

When choosing an ecommerce platform for a large product catalog, merchants naturally want to be sure
whether Magento is the right choice for them, so the question arises: How many products can Magento
actually manage? It is hard to give a general answer but it is definitely possible to provide some practical
guidelines.

In an effort to give some useful hints, let us try to find an answer to some more precise questions:

  How many products the different versions of Magento can manage easily by default, hosted on an
average server, with and without extra optimizations?
 What makes the difference between Magento versions in terms of catalog performance?
 What are the main bottlenecks for a big catalog in Magento, and how can Magento be scaled to handle
even more products?
 What are the common mistakes that one can make when handling a huge Magento catalog?

How many products the different versions of Magento can manage easily by
 
default, hosted on an average server, with and without extra scaling?
 Magento CE 1.9.x safely manages ~10,000 – 25,000 products in most cases, without much extra care*. On the
other hand it can manage 100,000 – 200,000 products, or even more via heavy, system wide scaling, resource
tuning and code optimization.

 Magento EE 13.x. 14.x safely manages ~100,000 – 200,000 products in most cases, without much extra
care*, and even 400,000 – 500,000 or even more with proper scaling, optimization and server resources.
 Magento 2 CE safely manages ~ 100,000 – 200,000 products in most cases, without much scaling and
extra care, and even 400,000 – 500,000 or even more with proper scaling, optimization and server
resources.
 Magento 2 EE is designed to be able to manage even more**, depending on some highly advanced
enterprise features like database sharding, job queues, advanced mysql and web server topologies and
proper resources.

 All these are very rough numbers, of course, referring to a catalog with a few attributes, categories and
simple products. The figures are based on Magento’s own performance tests and our own experiences. The
figures may vary greatly by the server setup, software and resources, too.

 What is also very important to note is that due to Magento’s database design, there are some especially
massive aspects that act as multiplication factors to get to the real number of products Magento really
works with in the background.

 Some of most important elements that have a huge impact are:


 number of Magento shops/languages
 number of product attributes
 number of categories and depth of the category tree
 number of configurable/bundle products
 number of customer groups with different product prices

Magento Development Tips (https://aionhill.com/category/magento-development-tips) Page |1


 All this means that as far as catalog management is concerned, a Magento catalog with a few thousand
products in a heavyweight catalog setup, for example ~20 languages, diverse attributes and category
structure, mostly with configurable products and multiple customer price groups, may equal to a Magento
catalog with 100,000+ simple products in a single store.

 These are the features that make Magento really flexible, but they come at a cost.

What makes the difference between these Magento versions in term of catalog
 
performance?
 Magento CE 1.x, including 1.9.x and
o – indexing, especially URL and search indexes are not optimized for large catalogs  (this also
applies to EE up to 1.12.x).
o – No Full Page Cache (FPC) is available by default
o + some frontend optimizations like javascript + css merging, CDN support
 Magento 1.x EE
o + FPC is available, which speeds up catalog browsing and saves a lot of server resources
o + From version 1.13+.x incremental indexing is introduced whereby products that were
changed or added will be re-indexed in the background by cron jobs.
o + From version 1.13+.x full reindex processes are highly improved as well and work well even
for large catalogs.
o + Solr search engine is available by default
 Magento 2 CE
o + Inherits incremental indexing feature from Magento EE 1.13+.x  
o + Inherits FPC from EE 1.13+.x and Varnish frontend cache is added as a choice of FPC. The
requests that are served by the Varnish cache never need to reach the Magento application
servers, which reduces the load on the web nodes while dramatically improving the response
time.
o + Browser cache is utilized for session data caching
o + Checkout process is improved greatly
o + Async order and product updates
o + Client side optimizations like minification, js resources bundling, caching static content, image
compression
o + PHP 7 is supported by default. PHP 7 may have even 200% performance gain over PHP 5.x by
itself.
 Magento 2 EE
o + Has all the features of Magento 2 CE
o + Solr (2.0) and Elasticsearch (2.1) for search
o + Database sharding separating catalog, checkout and order business domains  is available
o + Mysql cluster and Multi-master mysql setup is supported
o + Job queues introduced for advanced background data processing (deferred stock update as the
first implementation)

What are the main bottlenecks for a big catalog in Magento and how can
 
Magento be scaled to handle even more products?
 
Magento Development Tips (https://aionhill.com/category/magento-development-tips) Page |2
 Hosting – a big catalog obviously needs more resources, a VPS with plenty of memory and a multicore
processor is a must, a multi-node server setup may be needed.
 Server software – Nginx as a web server and PHP 7 and Mysql 5.6 or equivalent (Percona/MariaDb)
are highly recommended, even for Magento 1. Fine tuning of these pieces of server software for
Magento is essential in this case.
 Product Import – optimized and tailored product import and update is one of the key elements in this
case. Tools that enable batch database updates are extremely helpful. Any method or tool that uses
single product updates is a huge potential bottleneck. Magmi is one of the best choices here.
 Some other rules of thumb:
o Save only what has changed
o Use dedicated resources for import processes
o Separate price, stock and basic product data import
o Re-index only products and product data that needs to be reindexed    
 Indexing  – Indexing in Magento is a second step of saving product data and it is the trickiest part
when having a huge catalog. It contains of a series of processes to copy product data from database
tables optimized for data storage to tables optimized for different aspects of frontend data access. Since
indexing is “only” needed for Magento frontend features, it is possible to separate indexing from product
save or import. In the newer versions of Magento 1 EE and Magento 2 CE and EE incremental
background indexing is introduced, which speeds up working in the admin but may still not be ideal for
large volume product updates.  
o Indexing in Magento 1 CE is one of the greatest bottlenecks.
 URL indexing tends to provide the most issues, partly because it can be bloated to
millions of records, partly because it runs for a long time and in Magento 1 CE it is not
optimized for big catalogs.
 Beyond a certain catalog size and number of Magento stores the increased overhead of
product flat indexes will outweigh its benefits so it tends to be better not to turn on this
feature for a big catalog.
 Catalog search
o Default MYSQL fulltext search is resource greedy, indexing and the actual search on the
frontend tends to be slow and its accuracy and the relevance of the result set is fairly poor. So it
has to be replaced, even in Magento 1. There are some good, even free alternatives for Magento
1 to replace MYSQL search with Solr, Elasticsearch or Sphinx. Magento 1 EE has Solr and
Magento 2 from 2.1 on Elasticsearch by default. Some 3rd party Elasticsearch and Solr 3rd
extensions can replace the default MYSQL layered navigation engine at the same time, which
can be a huge benefit.
 Full Page Cache
o Full Page Cache (FPC) is a mechanism whereby html pages generated by the server
softwares are cached as a whole. Next time the same web page is required, the cached version
is returned without the need to regenerate the content. In Magento 1 CE has no FPC by default
and in Magento 1 EE caching is managed by Magento code itself. It saves a lot of resources and
results in greater speed, but the best way to do FPC is by implementing caching in a layer in
front of Magento without the need to touch Magento at all when a cached content is served. This
is accomplished by Varnish caching in Magento 2. Though Magento 1 CE has no FPC and
Varnish, there are good extensions to implement these features, too.
 Application Cache
o Magento heavily relies on different types of config and application caches, Redis, a memory
based, scalable application cache is highly recommended with tag management. The extension to
handle Redis is built into Magento starting from the latest version of Magento 1 CE.

Magento Development Tips (https://aionhill.com/category/magento-development-tips) Page |3


 

To sum it up for different Magento versions: 

 Common bottlenecks are product import and indexing in the backend, search and layered
navigation in the frontend. Checkout and Order management are also important factors, but these are
more strongly related to the number of visitors and transactions.
 There is a big chance that batch product import is to be tailored and optimized in all Magento
versions – feature rich and optimized product import is still to be solved even in Magento 2 EE.
 Magento 1 CE is the platform least prepared to handle big catalogs, but through heavy scaling, proper
hosting and 3rd party extensions that provide better indexing, search and caching, it can be considerably
upscaled. Indexing may still remain a bottleneck, however.
 Magento 1 EE has a bunch of optimizations already in place, to implement Varnish cache and fine tune
the server architecture are the best candidates to scale it up even further.
 Magento 2 CE is designed in a way that it should be prepared to serve middle-sized businesses, too, just
like Magento 1 EE. Functionality wise it lacks enterprise features, such as store credits, better CMS
management etc., but as far as performance is concerned it is on pair with Magento 1 EE – or even better
due to Varnish. An obvious way to scale it up is to utilize built in optimization options, fine tune server
architecture and resources and implement Elasticsearch or Solr for catalog search.
 Magento 2 EE targets enterprises even beyond the middle-size range and aims to offer a highly scalable
architecture that utilizes the benefits of cloud computing.

What are the common mistakes that one can make when having a huge
 
Magento catalog?
 In addition to failing to missing the points above, there are some additional mistakes that one can make
when having a huge Magento catalog. 

 Underscaling – Maybe the most important advice here is that it is very important to build a live
Magento store in a way that it has plenty of system reserves and there is a proven way to scale it up
quickly if needed. Performance tests throughout the development cycles are very useful so that the limits
of the system are known.
 Too many / poor quality 3rd party or custom extensions  –  3rd party extensions are not always built
for and tested to be performant for large catalogs and even a small oversight in the extension design can
result in performance disasters.  
 Lack of proper monitoring – System monitoring is essential so that issues and bottlenecks can be
discovered and eliminated in time.

 Some technical details behind the hints


 In the section below we try to provide some details on Magento concepts and terms for the better understanding
of the subject.

 Data Storage in Magento

 One point is that Magento is very complex and its features are virtually maxing out the limits of a
PHP/MYSQL based system. Products are really massive objects, with many aspects to them from its really
versatile attribute scheme through prices, images, categories, product options to different product relations.
Magento Development Tips (https://aionhill.com/category/magento-development-tips) Page |4
What is more, all these properties may have as many different layers of values as many stores and languages are
used. On the top of that, these aspects are potentially extended by a number of 3rd party extensions. All this
means that saving and updating products in the database is a complex action.

 Indexes – Preparing for Magento’s Frontend Data Access

 The other point is that the structure of the database tables where product data is primarily saved is optimized
for flexible data storage but, due to this complexity and the nature of relational databases, not optimized for the
different types of data retrieval at the same time. To save the day, Magento introduced the so-called index tables
that are populated by the process called indexing.

Indexing in Magento: is a process whereby data is copied from database tables optimized for data storage to
tables optimized for frontend data access
.
Similar useful articles

 ERP Magento
 Magento Project
 Business Management Tips

Most indexes, like URL, category/product relation, price, and stock and attribute indexes, as well as search
index are pretty much essential for a stock Magento to work. There are some other indexes that are optional
such as catalog flat and product flat indexes, which flatten the EAV and multi store data to dedicated store
tables and single product rows.

Indexing in Magento has two faces. On the one hand, it enables Magento’s most powerful features to work and
optimizes data access. On the other hand, it makes even more complex, time consuming and resource greedy to
store product data in a usable way for the frontend. 

 Product Attributes Index – By default, it is essential so that layered navigation works. It copies
product/attribute options data to a table structure that is optimal for finding products based on the
different attribute options they have. The number of attribute options are multiplied by the number of
stores/languages Magento has.
 Product Price Index – By default, It is essential so that sorting and filtering by product prices work.
Any Tier prices, configurable, bundle and products complicate this calculation to a great extent since the
minimal price of complex product types depends on the price of the constituent products. The number of
Magento websites and price groups act as further complicating factors here.
 Catalog URL Rewrites Index – By default, it is essential so that SEO friendly URLs and redirection
from old URLs to new ones work. The depth and size of the category structure, the number of old URLs
and stores/languages has a huge affect on this table. The size of this table can be blown up to millions of
records and keep continuously growing even in the case of a catalog with a few thousand products and
few languages. In Magento 1 CE and Magento 1 EE up to 12.x this is definitely the single most
problematic indexer as far as big catalogs are concerned.
 Category Products Index – optimizes product filtering based on categories by storing catalog-product
relations in a separate table. Again, it is an essential index for the frontend. The fact whether layered
navigation is used for a category (“is anchor”) or not has an effect on these product relations.
 Catalog Search Index (fulltext) – It is essential for default MYSQL-based search to work. It merges
the text of product attributes and option labels of individual products so that it is searchable by the

Magento Development Tips (https://aionhill.com/category/magento-development-tips) Page |5


MYSQL fulltext engine. Here again, as many stores there are, so many index rows are created for a
single product.
 Stock Status Index – This is for calculating the fact whether product is purchasable in Magento, which
may be governed by the mixture of some global, website and product level values and settings.
 Product Flat Data Index – This index is optional and was added at a later point in the evolution of
Magento to speed up product listing/sorting and spare server resources. The way it works is to copy
product attribute values which normally could only be retrieved by huge queries joining multiple tables
into a flat structure with only one record per product and store.

 Database design 

 Magento EAV
o EAV stands for Entity-Attribute-Value. Is a dynamic attribute management pattern which
enables adding/removing/modifying product attributes like colour, manufacturer etc., without
changing the structure of the database tables. This is a very powerful and user friendly feature
and Magento has a number of configurational options for custom attributes out of the box.
 Magento multi store
o It is a unique Magento feature that a single Magento database can manage multiple stores in a
way that attributes can have dedicated store level values that may override default values.
This, again, is a highly user friendly approach and makes it easy, for example, to create different
language versions of the same catalog only by changing the description of the products and label
of the product options.
 Magento indexing
o Indexing in Magento is a process whereby data is copied from database tables optimized for data
storage to tables optimized for frontend data access. Different indexes are used for different
types of data access. Indexing processes are built to speed up the shop and generally provide a
huge benefit, however make data storage more complex and resource greedy. Especially in the
case of full reindexing, which is sometimes inevitable, and any Magento shop should be prepared
to be able to perform a full index rebuild, it may take a long time and require huge resources to
reindex all product data.    

Resources:

 Benchmark Performance Test Results for Magento Enterprise Edition 1.14.1


 Serving 4 million page requests an hour with Magento Enterprise
 Optimizing Magento for Peak Performance
 *Magento Enterprise Edition (EE) 1.13 Benchmarking Guide
 **Magento 2.0 – Site Performance and Scalability Optimizations

 SummaryMagento has gone through a great evolution since the beginning and by now with Magento 2 out it is
certainly not only the most feature rich open source online store system, but it is probably the most performant
as well. If it gets the care and resources it needs it can manage huge catalogs with great success.

Magento Development Tips (https://aionhill.com/category/magento-development-tips) Page |6

You might also like