Apache Process Consuming All Memory

By iotk

My team and I recently encountered a difficult to debug issue that was causing one of our servers to crash.  I’m hoping that by writing about it here I might save someone the time we spent reproducing and ultimately resolving the issue, even if it was a bit obscure.

The server in question ran just fine for days until a single Apache process would start consuming all of the available memory.  Once the server’s memory was around 60-70% utilized by this Apache process, kswapd would kick in forcing all processes to swap to disk.  Eventually the machine became so slow that we would have to either restart Apache (if we could) or reboot the machine.  The interesting thing about this issue was that Apache did not crash nor did any part of the system log an issue – no core dumps, error log entries, or PHP warnings or errors.

By chance a project manager was testing on one of my developer’s local environments when his machine exhibited the exact same behavior as the production server.  This was interesting to us because it meant the issue had nothing to do with the operating system (one was RedHat Enterprise 5 the other openSuSE 10.3) and it meant that we could reproduce the issue on a machine other than one in production.

It took us 2 days to track the issue down.  We attempted to quickly resolve it by upgrading all of the software involved in our stack but nothing worked.  But this provided us with a clue – it had to be something in the code that we had written.  Indeed it was.

The following are the versions of software involved in our configuration:

  • RedHat Enterprise 5 or openSuSE 10.3 (both 64-bit)
  • Apache 2.2.3
  • PHP 5.2.4
  • OCI 1.3.2
  • Oracle 11.1.0.7.0
  • Oracle Instant Client 11.1.0.1
  • IOTK

The issue turned out to be a query with 7 cursors, something like this:

select [columns],
       cursor (select [columns]
                 from table_a) cursor_1,
       cursor (select [columns]
                 from table_b) cursor_2,
       cursor (select [columns]
                 from table_c) cursor_3,
       cursor (select [columns]
                 from table_d) cursor_4,
       cursor (select [columns]
                 from table_e) cursor_5,
       cursor (select [columns]
                 from table_f) cursor_6,
       cursor (select [columns]
                 from table_g) cursor_7
  from parent_table
 where id = :id

The parent table had a number of 1-to-many relationships and the display often called for all of the information to be displayed on the same screen. To avoid too many round trips, this is the solution the developer came to. The query was fetching what represented the core of the product, the asset the site was selling so it was used a lot. What’s interesting is that there was no rhyme or reason to why it crashed – it would run fine for some time and then just fall over.

By simply removing this query the problem was resolved. We chose to break this query into 3 and use lazy loading in the object that represented these entities. We created 1 query that contained 2 cursors only (2 cursors in a single query has never given us a problem), another query (lazy loaded) contained 4 queries joined using “union all”, and the last query (also lazy loaded) was a single query.

I’m curious to know how a single query can go on a tear like this and consume so much memory.

Leave a Reply