Turbocharging Your Magento 2.3: An Insider's Guide to Navigating Database CPU Load Increases and Optimizing Search Term Performance

In e-commerce, efficient database management is essential. This article examines the increased database CPU load and the search term performance problems reported in Magento 2.3, looks closely at the query responsible, and explains why removing its DISTINCT operator helps.

Understanding the Performance Issue: Peeling Back the Layers of Magento 2.3

The upgrade to Magento 2.3 brought a number of changes, and not all of them were welcome. The most prominent problem is a significant increase in database CPU load. As numerous users have reported, the new version introduced a query that slows the system down considerably and causes high CPU usage when working in the Magento 2 admin panel.

The query in question, SELECT DISTINCT COUNT(*) FROM search_query AS main_table WHERE (main_table.store_id = 1) AND (num_results > 0), arrived with the 2.3.0 update, which added the Popular Search Term Cache. A query that was expected to finish within a reasonable amount of time has become a bottleneck that puts a lot of strain on the database, especially on a live site with about 2.7 million search terms.

The larger the search_query table, the longer the query takes to complete. The resulting high CPU usage causes performance problems and calls for an immediate fix, especially since the issue has also been reproduced on the latest 2.4-develop branch.

The Culprit Unveiled: The Distinct Count Query and Its Impact on CPU Load

The problem seems to lie with the DISTINCT keyword in the query. DISTINCT is useful in many contexts, but here it is overkill: the search_query table has a unique constraint on query_text and store_id, and the query returns a single count in any case. The operator changes nothing in the result and only hurts query performance.

Reports on the issue also note that the num_results > 0 condition accounts for much of the execution time. Removing that condition shortens the query significantly, but it changes what is being counted, and there are concerns about losing visibility into user search term behavior. Before turning to that, it helps to understand why the DISTINCT operator in this query contributes to the high CPU usage seen in the Magento 2 admin panel.

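The DISTINCT operator can be a resource hog. Note where it sits in this query: in front of COUNT(*), not inside it. Written this way, DISTINCT applies to the rows of the result set, and an ungrouped COUNT(*) returns exactly one row, so there is nothing to deduplicate. (COUNT(DISTINCT column), which counts unique values, is a different construct.) Producing the count still means examining the matching rows of the search_query table, a massive operation when the table contains millions of records, and the reports indicate that the extra DISTINCT step makes it slower still. Because of the unique constraint on query_text and store_id, the table cannot hold duplicate terms for a store either. The results are always the same, with or without this operator.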
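To check how DISTINCT behaves in this position, here is a minimal sketch that runs the article's query with and without the keyword. It assumes an in-memory SQLite database as a stand-in for Magento's MySQL database and models only the columns named above; the sample rows and the fetch_all helper are made up for illustration.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL, and the table carries only the
# columns the article mentions, not Magento's full search_query schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE search_query (
        query_id    INTEGER PRIMARY KEY,
        query_text  TEXT    NOT NULL,
        store_id    INTEGER NOT NULL,
        num_results INTEGER NOT NULL,
        UNIQUE (query_text, store_id)
    )
    """
)
conn.executemany(
    "INSERT INTO search_query (query_text, store_id, num_results) VALUES (?, ?, ?)",
    [
        ("red shoes", 1, 12),
        ("blue shoes", 1, 0),  # no results, excluded by the filter
        ("red shoes", 2, 7),   # same text, different store
        ("hats", 1, 3),
    ],
)

WHERE = "WHERE (main_table.store_id = 1) AND (num_results > 0)"


def fetch_all(select_clause):
    """Run the article's query with the given SELECT clause (hypothetical helper)."""
    sql = f"{select_clause} FROM search_query AS main_table {WHERE}"
    return conn.execute(sql).fetchall()


# DISTINCT in front of COUNT(*) deduplicates the result set, which is one row.
with_distinct = fetch_all("SELECT DISTINCT COUNT(*)")
without_distinct = fetch_all("SELECT COUNT(*)")

# DISTINCT inside COUNT() is a different construct: it counts unique values.
count_distinct_inside = fetch_all("SELECT COUNT(DISTINCT main_table.store_id)")

print(with_distinct, without_distinct, count_distinct_inside)
```

Both forms of the query return the same single row here, while the COUNT(DISTINCT ...) form answers a different question. The sketch says nothing about timing; how much extra work MySQL does for the redundant DISTINCT would have to be measured on the real database.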

The Unexpected Consequence of Popular Search Term Cache in Magento 2.3

In a bid to improve performance, the 2.3.0 update introduced the Popular Search Term Cache. However, the execution of the DISTINCT COUNT query for this feature has led to an unexpected consequence. The increased CPU load, owing to this query, has become a significant issue for store owners.

The Popular Search Term Cache, which was intended as a performance-enhancing feature, has ended up causing a bottleneck. The larger the search_query table, the more prolonged the execution time of the query, leading to a severe strain on the database. On a live site with approximately 2.7 million search terms, the queries have become sluggish, affecting overall system performance.

This issue underscores the importance of rigorous testing, particularly for performance-related changes. Even well-intentioned features can lead to unintended consequences if not properly evaluated for their impact on system resources.

In short, the Popular Search Term Cache introduced in Magento 2.3 inadvertently increased CPU load because of the inefficient DISTINCT COUNT(*) query behind it. Removing the DISTINCT operator can significantly improve query performance and offers a viable fix. As with any workaround, it should be evaluated for compatibility with the specific Magento version in use.

Dismantling the Query: How Removing the DISTINCT Operator Improves Performance

The crux of the problem lies within the query SELECT DISTINCT COUNT(*) FROM search_query AS main_table WHERE (main_table.store_id = 1) AND (num_results > 0). When it is dissected, the DISTINCT operator stands out as the part that adds load on the CPU without adding anything to the result. A well-meaning addition has ended up causing performance issues.

This operator is not only unnecessary but also detrimental to the performance of Magento 2.3. The search_query table already has a unique constraint on query_text and store_id, so the same term cannot be stored twice for one store, and the query returns the same count whether or not DISTINCT is present. Removing the operator therefore cannot change the query's result, but it can significantly improve its performance.
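The effect of the unique constraint can be illustrated with a small sketch. It again assumes an in-memory SQLite table in place of Magento's MySQL table, reduced to the columns discussed here, with invented sample rows.

```python
import sqlite3

# Assumption: a reduced search_query table in SQLite, not Magento's real schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE search_query (
        query_id    INTEGER PRIMARY KEY,
        query_text  TEXT    NOT NULL,
        store_id    INTEGER NOT NULL,
        num_results INTEGER NOT NULL,
        UNIQUE (query_text, store_id)
    )
    """
)
insert_sql = (
    "INSERT INTO search_query (query_text, store_id, num_results) VALUES (?, ?, ?)"
)
conn.execute(insert_sql, ("red shoes", 1, 12))

# A second row with the same (query_text, store_id) pair violates the constraint.
try:
    conn.execute(insert_sql, ("red shoes", 1, 5))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# The same text in another store is a different pair and is accepted.
conn.execute(insert_sql, ("red shoes", 2, 7))

total_rows = conn.execute("SELECT COUNT(*) FROM search_query").fetchone()[0]
distinct_pairs = conn.execute(
    "SELECT COUNT(*) FROM (SELECT DISTINCT query_text, store_id FROM search_query)"
).fetchone()[0]

print(duplicate_rejected, total_rows, distinct_pairs)
```

Because duplicates are rejected at insert time, the number of rows and the number of distinct (query_text, store_id) pairs always coincide, so deduplication has nothing to remove.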

Consider the live site with about 2.7 million search terms: there the query takes a long time to finish and puts undue strain on the database. Removing the DISTINCT operator reduces the time the query takes to complete while still producing the same result. When it comes to performance, the operator is a pure liability.

Optimizing for the Future: Evaluating the Compatibility and Effectiveness of the Proposed Solution

The solution, removing the DISTINCT operator from queries within \Magento\Search\Model\ResourceModel\Query\Collection, is straightforward and effective. However, it is crucial to evaluate its compatibility with the specific Magento version being used.

Some may worry that this modification could cause unforeseen issues or disrupt existing functionality. This is where testing comes in. On the latest 2.4-develop branch, removing the DISTINCT operator significantly improved query performance. The issue was also still present in Magento 2.4.5-p4, which suggests the fix applies across a broad range of versions.

While this solution addresses the performance issue, it is also essential to consider the impact on search term tracking. For businesses that rely heavily on insight into user search term behavior, alternatives include recording only a fraction of the search terms or inserting search terms asynchronously in batches. These options require further exploration and testing.
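As a rough illustration of the sampling and batching ideas, the sketch below buffers search terms in memory and writes them in one transaction per batch. It is not Magento code: the SearchTermBuffer class, its parameters, and the popularity column are assumptions made for this example, and SQLite stands in for MySQL.

```python
import random
import sqlite3
from collections import Counter


class SearchTermBuffer:
    """Hypothetical helper: samples search terms and writes them in batches."""

    def __init__(self, conn, batch_size=100, sample_rate=1.0, rng=None):
        self.conn = conn
        self.batch_size = batch_size    # distinct terms to collect before writing
        self.sample_rate = sample_rate  # fraction of searches to record (0.0 to 1.0)
        self.rng = rng or random.Random()
        self.hits = Counter()           # (query_text, store_id) -> buffered searches
        self.latest_results = {}        # (query_text, store_id) -> last num_results

    def record(self, query_text, store_id, num_results):
        # Sampling: drop a share of searches before they reach the database.
        if self.rng.random() >= self.sample_rate:
            return
        key = (query_text, store_id)
        self.hits[key] += 1
        self.latest_results[key] = num_results
        if len(self.hits) >= self.batch_size:
            self.flush()

    def flush(self):
        if not self.hits:
            return
        keys = list(self.hits)
        with self.conn:  # one transaction per batch
            # Create rows for unseen terms, then update all buffered terms.
            self.conn.executemany(
                "INSERT OR IGNORE INTO search_query "
                "(query_text, store_id, num_results, popularity) VALUES (?, ?, 0, 0)",
                keys,
            )
            self.conn.executemany(
                "UPDATE search_query SET num_results = ?, popularity = popularity + ? "
                "WHERE query_text = ? AND store_id = ?",
                [(self.latest_results[k], self.hits[k], k[0], k[1]) for k in keys],
            )
        self.hits.clear()
        self.latest_results.clear()


# Assumed, reduced schema; the popularity column is part of this sketch.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE search_query ("
    "query_text TEXT NOT NULL, store_id INTEGER NOT NULL, "
    "num_results INTEGER NOT NULL, popularity INTEGER NOT NULL, "
    "UNIQUE (query_text, store_id))"
)

buffer = SearchTermBuffer(conn, batch_size=2)
buffer.record("red shoes", 1, 12)
buffer.record("red shoes", 1, 12)  # same term again: still one buffered key
rows_before_flush = conn.execute("SELECT COUNT(*) FROM search_query").fetchone()[0]
buffer.record("hats", 1, 3)        # second distinct key triggers the batch write
stored = conn.execute(
    "SELECT query_text, store_id, num_results, popularity "
    "FROM search_query ORDER BY query_text"
).fetchall()
print(rows_before_flush, stored)
```

The flush here runs synchronously for simplicity. A truly asynchronous variant would hand the batch to a queue or background worker, and a sample_rate below 1.0 trades completeness of search term statistics for fewer writes.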

The Path Forward: Addressing Performance Issues in Subsequent Magento Versions

Removing the DISTINCT operator is a quick and effective fix, but the problem also needs to be addressed in subsequent Magento versions, because its impact goes beyond 2.3. The issue has been reproduced in the community version of Magento 2.3.4 and in Magento 2.4.4-p1, which underlines the need for a more comprehensive, future-proof solution.

The Magento team must take a two-pronged approach to this issue. First, immediate action should be taken to remove the DISTINCT operator from queries within \Magento\Search\Model\ResourceModel\Query\Collection. This will alleviate the current high CPU usage issues for existing users.

Second, long-term strategies should be developed to prevent similar issues from arising in future versions. This could involve a closer review of how query operators are added, more comprehensive testing before releases, and perhaps even a reevaluation of the current approach to search term caching.

By addressing the immediate performance issues and implementing proactive strategies, Magento can ensure that its platform remains an efficient and reliable choice for e-commerce businesses worldwide. The episode also shows how important it is to keep evaluating and optimizing even the most sophisticated systems.

In conclusion, the upgrade to Magento 2.3, although intended to enhance performance, has resulted in an unforeseen increase in CPU load due to the inefficient execution of the DISTINCT COUNT(*) query in the Popular Search Term Cache. Removing the DISTINCT operator can significantly improve the query's performance and reduce the strain on the database:

  • The DISTINCT operator, a resource-intensive element of the query, is redundant due to the unique constraint on query_text and store_id in the search_query table.

  • Its removal does not impact the quality of the results, as the query's outcome remains the same with or without the operator.

  • This simple modification, tested on the latest 2.4-develop branch, shows promising results, and its broad applicability suggests it could offer immediate relief to existing users grappling with the performance issue.

Looking ahead, this issue underscores the need for Magento to develop a two-pronged approach: immediate remedial action and long-term preventative strategies. This could involve a thorough review of the addition of query operators, comprehensive testing before releases, and a possible reevaluation of the approach to search term caching. By proactively addressing these performance issues, Magento can remain a reliable and efficient platform for e-commerce businesses worldwide, exemplifying the dynamic and evolving nature of digital technology.