Decoding Magento 2.3: Navigating the Database CPU Load Dilemma and the Path to a High Performance Query Structure

The upgrade to Magento 2.3 brought a puzzling surge in database CPU load for many websites, primarily attributed to a time-consuming query issued by the Popular Search Term Cache. This article examines the issue: its roots in the search_query table, the role of the DISTINCT operator, and the search for an efficient, high-performance query structure. It also covers potential solutions and workarounds for this persistent performance problem.

Unraveling the Complexity: The Notorious Query and Its Impact on Database Load

Many website owners looked forward to the Magento 2.3 upgrade, but it brought a puzzling surge in database CPU load. The culprit was a frequent and time-consuming query: SELECT DISTINCT COUNT(*) FROM search_query AS main_table WHERE (main_table.store_id = 1) AND (num_results > 0). This query, part of the Popular Search Term Cache in the 2.3.0 update, raised growing concern because its completion time grows with the size of the search_query table. This was especially problematic on live sites with a large number of search terms (about 2.7 million in some instances), where it put significant strain on the database and slowed performance.

Dissecting the Query: The Role of 'num_results' Condition and Popular Search Term Cache

The num_results > 0 condition in this query has been identified as a main contributor to its cost. The condition restricts the count to search terms that returned results. As the search_query table grows, the condition takes increasingly long to evaluate, and the slowdown ripples into overall performance. The Popular Search Term Cache added in Magento 2.3 compounds the issue because it relies heavily on this query: the larger the search_query table, the slower the query completes. Removing the num_results condition was suggested as a way to improve performance, but the suggestion required further investigation because the condition plays a vital role in the Popular Search Term Cache.
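To make the effect of the filter concrete, here is a minimal sketch that runs the query against a toy table. It assumes SQLite as a stand-in for Magento's MySQL database and models only the columns the query touches; the sample rows are invented for illustration.

```python
import sqlite3

# Toy stand-in for Magento's search_query table (assumption: only the
# columns used by the query are modelled, and SQLite replaces MySQL).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE search_query (
        query_id    INTEGER PRIMARY KEY,
        query_text  TEXT NOT NULL,
        store_id    INTEGER NOT NULL,
        num_results INTEGER NOT NULL,
        UNIQUE (query_text, store_id)
    )
    """
)

# Invented sample data.
rows = [
    ("red shoes", 1, 12),
    ("blue hat", 1, 0),    # no results: excluded by num_results > 0
    ("green scarf", 1, 3),
    ("red shoes", 2, 7),   # other store: excluded by store_id = 1
]
conn.executemany(
    "INSERT INTO search_query (query_text, store_id, num_results) VALUES (?, ?, ?)",
    rows,
)

# The query quoted in the article, unchanged.
POPULAR_TERMS_COUNT = """
    SELECT DISTINCT COUNT(*)
    FROM search_query AS main_table
    WHERE (main_table.store_id = 1) AND (num_results > 0)
"""

count_with_results = conn.execute(POPULAR_TERMS_COUNT).fetchone()[0]
count_all_store_1 = conn.execute(
    "SELECT COUNT(*) FROM search_query WHERE store_id = 1"
).fetchone()[0]

print(count_with_results)  # terms in store 1 that returned results
print(count_all_store_1)   # every term recorded for store 1
```

Dropping the condition would change the number the cache sees, since terms without results would then be counted. That is why the suggestion could not be adopted without further investigation.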

The Workaround Solution: Overriding the Execute Function

While the issue was still under investigation, a temporary workaround was implemented to alleviate the strain on the database. It involved overriding the execute function in Magento's CatalogSearch/Controller/Result/Index. The overriding execute function uses only the getNotCacheableResult part and omits the code for the getCacheableResult part.

This workaround restored performance to roughly its level before the Magento 2.3 upgrade. However, it was not a permanent solution, because it sidestepped the problem rather than addressing it directly. Nevertheless, it provided some relief to those with large search_query tables while a more permanent solution was being explored.

The Magento community remains proactive in addressing these performance issues, as evidenced by the ongoing attempts to find a definitive solution. In the face of complexity, the ingenuity and perseverance of the open-source community continue to shine, promising hope for a more efficient and high-performance query structure in future Magento updates.

Understanding the Underlying Issues: The Persistent Performance Issue in Magento 2.4.1-p1 and Beyond

The performance issue did not resolve itself with subsequent updates; it was still reported in Magento 2.4.1-p1 and 2.4.4-p2. Despite numerous diagnostic efforts and remedial measures, it persisted, causing distress to developers and website owners. The primary culprit remained the same query, which continued to be recorded as slow. Its execution time increased significantly when the search_query table held a large number of records, putting considerable strain on the database. It was evident that the current structure of the query and the caching mechanism were not addressing the performance problems effectively.
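The relationship between table size and query time can be explored with a small timing sketch. It assumes SQLite in place of MySQL and synthetic rows, so the absolute numbers say nothing about a production Magento database; only the trend across table sizes is of interest.

```python
import sqlite3
import time

CREATE_TABLE = """
    CREATE TABLE search_query (
        query_text  TEXT NOT NULL,
        store_id    INTEGER NOT NULL,
        num_results INTEGER NOT NULL,
        UNIQUE (query_text, store_id)
    )
"""

QUERY = """
    SELECT DISTINCT COUNT(*)
    FROM search_query AS main_table
    WHERE (main_table.store_id = 1) AND (num_results > 0)
"""


def time_count(row_count):
    """Fill a fresh in-memory table with row_count synthetic terms and time the query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(CREATE_TABLE)
    # Every second synthetic term gets num_results = 0 and is filtered out.
    conn.executemany(
        "INSERT INTO search_query (query_text, store_id, num_results) VALUES (?, 1, ?)",
        ((f"term {i}", i % 2) for i in range(row_count)),
    )
    start = time.perf_counter()
    count = conn.execute(QUERY).fetchone()[0]
    elapsed = time.perf_counter() - start
    conn.close()
    return count, elapsed


results = {n: time_count(n) for n in (1_000, 10_000, 100_000)}
for n, (count, elapsed) in results.items():
    print(f"{n:>7} rows: count={count}, {elapsed * 1000:.2f} ms")
```

Without an index that covers both store_id and num_results, the engine has to examine every candidate row, so the elapsed time would be expected to rise roughly in step with the row count.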

Deciphering Potential Fixes: The Importance of the 'DISTINCT' Operator and Proposed Modifications

The search for a fix led to a key observation: the DISTINCT operator, a common feature in many database queries, was unnecessary in this particular context. The search_query table already has a unique constraint on 'query_text' and 'store_id', which makes the DISTINCT operator redundant. Removing it could therefore reduce the query's execution time while returning the same result. This modification was a promising candidate for easing the performance issue that had been plaguing Magento 2.3 and subsequent versions.
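The redundancy can be checked with a short sketch, again assuming SQLite as a stand-in for MySQL and synthetic data. It runs the query with and without DISTINCT and confirms that the unique constraint rejects a duplicate (query_text, store_id) pair. Note also that COUNT(*) without GROUP BY produces a single row, so in the query as written DISTINCT has nothing to deduplicate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE search_query (
        query_text  TEXT NOT NULL,
        store_id    INTEGER NOT NULL,
        num_results INTEGER NOT NULL,
        UNIQUE (query_text, store_id)
    )
    """
)
# Synthetic terms spread over two stores, some with zero results.
conn.executemany(
    "INSERT INTO search_query (query_text, store_id, num_results) VALUES (?, ?, ?)",
    [(f"term {i}", 1 + i % 2, i % 3) for i in range(1000)],
)

WHERE = "WHERE (main_table.store_id = 1) AND (num_results > 0)"

with_distinct = conn.execute(
    f"SELECT DISTINCT COUNT(*) FROM search_query AS main_table {WHERE}"
).fetchall()
without_distinct = conn.execute(
    f"SELECT COUNT(*) FROM search_query AS main_table {WHERE}"
).fetchall()

# The unique constraint prevents a second ('term 0', store 1) row.
try:
    conn.execute("INSERT INTO search_query VALUES ('term 0', 1, 5)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(with_distinct == without_distinct)  # both forms return the same single row
print(duplicate_rejected)
```

Whether dropping DISTINCT yields a measurable speed-up depends on how the database engine plans the query, so it is worth measuring on the real table before relying on it.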

However, the fixes were far from straightforward. If the performance issue was triggered by a high search term cardinality, further measures would be necessary. Therefore, while the DISTINCT operator's removal was a promising start, it was not an all-encompassing solution, and additional strategic interventions were required.

Exploring Alternative Strategies: Asynchronous Insertion, Partial Term Insertion, and the Trade-Off of Visibility vs. Performance

As the search for the optimal solution continued, several alternative strategies surfaced. One such strategy was the asynchronous insertion of search terms in batches, a technique capable of preventing slowdowns in search request times. Another was the insertion of only a fraction of the search terms, which could also contribute to alleviating the performance issue. However, these strategies came with their own set of trade-offs.

The option to stop tracking search terms entirely was on the table, but it would result in a loss of visibility into user search term behaviors – a vital insight for any e-commerce business. The challenge was to balance the trade-off between visibility and performance adequately.
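Batched and partial insertion can be sketched as follows. SearchTermBuffer, its parameters, and the table layout are hypothetical names chosen for illustration, not part of Magento. The sketch shows the batching and sampling logic only. In a real deployment the flush would run outside the search request, for example in a queue consumer or a scheduled job.

```python
import random
import sqlite3


class SearchTermBuffer:
    """Hypothetical buffer that collects search terms and writes them in batches."""

    def __init__(self, conn, batch_size=500, sample_rate=1.0, rng=None):
        self.conn = conn
        self.batch_size = batch_size
        self.sample_rate = sample_rate  # fraction of searches to keep (1.0 = all)
        self.rng = rng if rng is not None else random.Random()
        self.pending = []

    def record(self, query_text, store_id, num_results):
        # Partial insertion: drop a share of the searches before buffering.
        if self.rng.random() >= self.sample_rate:
            return
        self.pending.append((query_text, store_id, num_results))
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self):
        """Write all buffered terms in one transaction instead of one write per search."""
        if not self.pending:
            return
        with self.conn:  # commits on success, rolls back on error
            self.conn.executemany(
                "INSERT OR IGNORE INTO search_query "
                "(query_text, store_id, num_results) VALUES (?, ?, ?)",
                self.pending,
            )
        self.pending = []


conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE search_query (
        query_text  TEXT NOT NULL,
        store_id    INTEGER NOT NULL,
        num_results INTEGER NOT NULL,
        UNIQUE (query_text, store_id)
    )
    """
)

buffer = SearchTermBuffer(conn, batch_size=3)
for term in ["red shoes", "blue hat", "red shoes", "green scarf"]:
    buffer.record(term, 1, 5)

# The first three records triggered one batch; the repeated term was ignored.
rows_after_first_batch = conn.execute("SELECT COUNT(*) FROM search_query").fetchone()[0]

buffer.flush()  # write the remaining buffered term
rows_after_final_flush = conn.execute("SELECT COUNT(*) FROM search_query").fetchone()[0]
```

A sample_rate below 1.0 trades visibility for load: fewer rows reach the table, but rare search terms may never be recorded at all.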

In conclusion, the path to a high-performance query structure in Magento 2.3 is a complex one, laden with technical hurdles, trade-offs, and strategic decisions. The crux of the problem lies in understanding the troublesome query and its impact on database load. A clear pathway towards resolution involves:

  1. Identifying and eliminating redundant elements such as the 'DISTINCT' operator
  2. Exploring alternative strategies like asynchronous insertion and partial term insertion
  3. Balancing the trade-off between visibility and performance

While these measures have potential, they are not standalone solutions but pieces of a larger, more complex puzzle. The Magento 2.3 update, despite its performance issues, provides a valuable opportunity to learn, innovate, and refine our approach to database optimization and management. The challenge not only drives the development of more efficient query structures but also yields insights that apply beyond Magento, enhancing our broader understanding of high-performance database systems.