Unlocking Magento Performance: A Dive into Query Optimization and the Power of DISTINCT

E-commerce stores depend on fast, well-optimized platforms, and in Magento that often comes down to how efficiently queries execute. This article examines a recurring performance issue tied to the caching of popular search terms and the cost of the DISTINCT operator. It walks through the proposed solutions and shows how query structure can substantially affect Magento's performance.

The Genesis of the Performance Issue in Magento

The performance issue in Magento, a popular e-commerce platform, surfaced most visibly with the upgrade to Magento 2.3. The upgrade led to an unexpected increase in database CPU load, largely attributed to one frequent and time-consuming query. That query selected distinct counts from the 'search_query' table based on certain conditions, the main culprit being the 'num_results > 0' condition. The search_query table could grow to hundreds of thousands of entries, and the larger the table, the longer the query took to complete, putting a strain on the database.

As a result, the Magento admin panel could become unresponsive, even with Elasticsearch in use and search suggestions disabled. The issue was reproduced on several versions of Magento, including the 2.4-develop branch, and traced to the use of distinct counts on a table with millions of records. Ironically, the caching of popular search terms, originally introduced to improve performance, turned out to be the root of the problem.

Decoding the Controversy of the DISTINCT Operator

At the center of the controversy was the DISTINCT operator. This operator is used in SQL queries to return only distinct (different) values, and in this context it was applied to select distinct counts from the 'search_query' table. Deduplication has to examine every matching row, so on a table with millions of records the operation became slower as the table grew.

The DISTINCT operator was found to be redundant because of the unique constraint already present on 'query_text' and 'store_id'. That constraint guarantees that a given search term appears at most once per store, so a query filtered to a single store has no duplicate terms for DISTINCT to remove. The operator therefore added nothing to the query results while adding overhead that increased execution time and hurt site performance.
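The redundancy can be checked on a small scale. The following is a minimal sketch that rebuilds a simplified search_query table in SQLite through Python's sqlite3 module. The schema is reduced to the columns named in this article, the sample rows are invented, and the SELECT only approximates the kind of query described here; it is not copied from Magento's source.

```python
import sqlite3

# Simplified stand-in for Magento's search_query table (assumed schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE search_query (
        query_id    INTEGER PRIMARY KEY,
        query_text  TEXT    NOT NULL,
        store_id    INTEGER NOT NULL,
        num_results INTEGER NOT NULL DEFAULT 0,
        UNIQUE (query_text, store_id)
    )
    """
)
INSERT_SQL = (
    "INSERT INTO search_query (query_text, store_id, num_results) VALUES (?, ?, ?)"
)
conn.executemany(
    INSERT_SQL,
    [
        ("red shoes", 1, 12),
        ("red shoes", 2, 7),       # same text in another store: allowed
        ("blue hat", 1, 3),
        ("unicorn saddle", 1, 0),  # no results, excluded by the filter below
    ],
)

# The unique constraint rejects a second "red shoes" row for store 1.
try:
    conn.execute(INSERT_SQL, ("red shoes", 1, 5))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True


def matching_terms(distinct, store_id=1):
    """Return the search terms with results for one store, sorted by text."""
    keyword = "DISTINCT " if distinct else ""
    sql = (
        f"SELECT {keyword}query_text FROM search_query "
        "WHERE store_id = ? AND num_results > 0 ORDER BY query_text"
    )
    return [row[0] for row in conn.execute(sql, (store_id,))]


with_distinct = matching_terms(distinct=True)
without_distinct = matching_terms(distinct=False)
print(duplicate_rejected, with_distinct == without_distinct)
```

Because the constraint blocks duplicate (query_text, store_id) pairs at insert time, both variants return the same rows for a single store; DISTINCT only adds work.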

The Role of 'search_query' Table in Performance Degradation

The 'search_query' table in Magento plays a significant role in the performance degradation issue. It is responsible for storing the search terms used by customers on the online store. While this feature can be useful for tracking popular search terms and predicting future trends, it can result in a massive table if the store attracts a high volume of searches.

The table's size can reach hundreds of thousands of entries, and this large size directly affects the performance of the queries run on it. More specifically, the issue arises when Magento attempts to select distinct counts from this table under the 'num_results > 0' condition. This operation becomes particularly slow due to the large table size, and the performance degradation is more pronounced with a high search term cardinality.

The combination of a large 'search_query' table, the use of the DISTINCT operator, and the condition 'num_results > 0' contributes to the performance issue. Even with the unique constraint on the 'query_text' and 'store_id' columns, the query still takes a considerable amount of time to execute, slowing down both the Magento admin panel and the front-end user experience.

Exploring Potential Fixes to the Magento Query Conundrum

The crux of the performance issue in Magento lies in the use of the DISTINCT operator in queries against the search_query table. The table, which can reach hundreds of thousands of entries, is queried for distinct counts based on certain conditions, the main culprit being the num_results > 0 condition. The larger the search_query table, the longer these queries take, which raises database CPU load and makes the Magento admin panel less responsive.

A potential solution lies in eliminating the DISTINCT operator from these queries, specifically within the Magento\Search\Model\ResourceModel\Query\Collection class. The DISTINCT operator, in this case, does not add any value to the query results – a redundancy that becomes apparent when considering the unique constraint already placed on the query_text and store_id columns in the search_query table.

Removing the DISTINCT operator may initially have met resistance, mainly because of concerns about the accuracy of search results. However, the modification turned out to improve query performance significantly without detracting from search result accuracy, making it a compelling fix to the Magento query conundrum.

Impact of DISTINCT Operator Removal on Query Performance

The impact of removing the DISTINCT operator from the search_query queries is notable. Execution times have been reported to decrease dramatically, from several seconds to mere milliseconds. This reduction in query execution time directly translates to reduced database CPU load, a more responsive Magento admin panel, and overall, a smoother, more efficient Magento experience.
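Reported timings like these are worth verifying against your own data. The sketch below shows one way to compare the two query shapes with a small timing helper. It runs against an in-memory SQLite table filled with synthetic rows, so the absolute numbers say nothing about MySQL or a production Magento database; the helper name time_query, the schema, and the queries are assumptions made for illustration.

```python
import sqlite3
import time


def time_query(conn, sql, params=(), repeats=5):
    """Return the best wall-clock time in seconds over `repeats` runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        conn.execute(sql, params).fetchall()
        best = min(best, time.perf_counter() - start)
    return best


# Synthetic stand-in for search_query (assumed, simplified schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE search_query (
        query_id    INTEGER PRIMARY KEY,
        query_text  TEXT    NOT NULL,
        store_id    INTEGER NOT NULL,
        num_results INTEGER NOT NULL,
        UNIQUE (query_text, store_id)
    )
    """
)
conn.executemany(
    "INSERT INTO search_query (query_text, store_id, num_results) VALUES (?, ?, ?)",
    ((f"term {i}", 1, i % 5) for i in range(50_000)),
)

base = "FROM search_query WHERE store_id = ? AND num_results > 0"
timings = {
    "with DISTINCT": time_query(conn, f"SELECT DISTINCT query_text {base}", (1,)),
    "without DISTINCT": time_query(conn, f"SELECT query_text {base}", (1,)),
}
for label, seconds in timings.items():
    print(f"{label}: {seconds * 1000:.2f} ms")
```

Taking the best of several runs reduces noise from caching and scheduling. To evaluate the real fix, the same comparison would need to be run against a copy of the actual database.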

But like any significant modification, the removal of the DISTINCT operator needs to be evaluated in context. For instance, it's crucial to consider users' specific Magento versions and their search term cardinality. High search term cardinality could exacerbate the performance issue and may require additional measures, like asynchronous insertion of search terms in batches or limiting the number of inserted terms.
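As a rough illustration of the batching idea, the sketch below collects searches in memory and writes them in one transaction per batch. The class SearchTermBuffer, the popularity column, and the upsert rule are assumptions made for this example, not Magento code. It covers only the batching half; in a real deployment the flush would more likely run from a queue consumer or a scheduled job.

```python
import sqlite3
from collections import Counter


class SearchTermBuffer:
    """Hypothetical helper: buffers searches and writes them in batches."""

    def __init__(self, conn, batch_size=100):
        self.conn = conn
        self.batch_size = batch_size
        self.pending = Counter()  # (query_text, store_id) -> hits since last flush
        self.results = {}         # (query_text, store_id) -> latest num_results

    def record(self, query_text, store_id, num_results):
        key = (query_text, store_id)
        self.pending[key] += 1
        self.results[key] = num_results
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self):
        rows = [
            (text, store, self.results[(text, store)], hits)
            for (text, store), hits in self.pending.items()
        ]
        with self.conn:  # one transaction per batch
            self.conn.executemany(
                """
                INSERT INTO search_query
                    (query_text, store_id, num_results, popularity)
                VALUES (?, ?, ?, ?)
                ON CONFLICT (query_text, store_id) DO UPDATE SET
                    num_results = excluded.num_results,
                    popularity  = popularity + excluded.popularity
                """,
                rows,
            )
        self.pending.clear()
        self.results.clear()


conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE search_query (
        query_id    INTEGER PRIMARY KEY,
        query_text  TEXT    NOT NULL,
        store_id    INTEGER NOT NULL,
        num_results INTEGER NOT NULL DEFAULT 0,
        popularity  INTEGER NOT NULL DEFAULT 0,
        UNIQUE (query_text, store_id)
    )
    """
)

buffer = SearchTermBuffer(conn, batch_size=2)
buffer.record("red shoes", 1, 12)
buffer.record("red shoes", 1, 12)  # same key: still one pending entry
buffer.record("blue hat", 1, 3)    # second distinct key triggers a flush
buffer.record("red shoes", 1, 11)
buffer.flush()                     # write whatever is left

stored = conn.execute(
    "SELECT query_text, num_results, popularity FROM search_query ORDER BY query_text"
).fetchall()
print(stored)
```

Repeated searches for the same term collapse into a single row update per batch, which is where the reduction in write load would come from. The upsert syntax used here requires SQLite 3.24 or newer.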

The Road Ahead: Future Prospects for Magento Performance Enhancement

The DISTINCT operator issue underscores the importance of query optimization in improving Magento's performance. However, as we move forward, it's clear that the journey towards performance enhancement is far from over.

Several prospective solutions could be explored further, such as adding an index on the search_query table, specifically on the store_id and num_results columns. While initial testing suggests this could resolve the issue, the impact on write loads demands further evaluation.
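The effect of such an index can be inspected through the query plan. The sketch below does this in SQLite with Python's sqlite3 module, using a simplified schema, a hypothetical index name, and a count query that only approximates the one discussed in this article. On MySQL the corresponding check would use EXPLAIN, and the planner may choose differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE search_query (
        query_id    INTEGER PRIMARY KEY,
        query_text  TEXT    NOT NULL,
        store_id    INTEGER NOT NULL,
        num_results INTEGER NOT NULL DEFAULT 0,
        UNIQUE (query_text, store_id)
    )
    """
)

COUNT_SQL = "SELECT COUNT(*) FROM search_query WHERE store_id = ? AND num_results > 0"


def query_plan(sql, params):
    """Return SQLite's plan description for a query as a single string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column holds the detail text


plan_before = query_plan(COUNT_SQL, (1,))

# Hypothetical index name; the columns are the ones suggested above.
conn.execute(
    "CREATE INDEX idx_search_query_store_results "
    "ON search_query (store_id, num_results)"
)
plan_after = query_plan(COUNT_SQL, (1,))

print("before:", plan_before)
print("after: ", plan_after)
```

Putting store_id first lets the equality filter narrow the index range before the num_results > 0 comparison is applied. Every insert and update on search_query also has to maintain the new index, which is the write-load cost mentioned above.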

Eliminating the tracking of search terms entirely is another avenue that has been suggested, although this may have broader implications for user experience and site functionality that would need to be carefully considered.

Future Magento releases can be expected to keep addressing performance issues like this one. As developers and users alike unlock Magento's full potential, the DISTINCT operator saga serves as a reminder of the power of query optimization and the importance of continuous innovation in the ever-evolving world of e-commerce.

In conclusion, the exploration of Magento's performance issue underscores the intricate interplay between database design, query optimization, and operational efficiency in e-commerce platforms.

  • The unnecessary DISTINCT operator, the 'num_results > 0' condition, and the size of the 'search_query' table were together found to be the main culprits behind the problem.

  • The proposed solution of removing the DISTINCT operator showed promising results, remarkably reducing the query execution time and thereby enhancing the overall Magento experience.

  • However, the journey doesn't end here. Future prospects for Magento performance enhancement include adding an index on the 'search_query' table, or possibly eliminating the tracking of search terms altogether.

It's intriguing to contemplate how the continued evolution of Magento will drive further advancements in e-commerce performance optimization, ultimately delivering a seamless and efficient online shopping experience. It is this relentless pursuit of improvement, as exemplified by the DISTINCT operator saga, which will propel Magento and similar platforms into the future of e-commerce.