Just Say No to Materialized MySQL Views

I'm a big fan of MySQL, but I ran into an interesting performance issue this weekend regarding views. MySQL added view support in 5.0, and I've used it to great effect, but it's not all roses.

Edit (2009-04-22): I used the term "materialized view" below in a misleading way.  True materialized views are stored in permanent concrete tables.  Below I refer to views that are created into a temp table on demand, used for the query in question, and then deleted.

Within MySQL, there are several ways to categorize views, but the one I care about is materialized vs. rewritten/merged. Materialized views are views that, as you might guess, are materialized when they're needed. In simpler terms, the view definition is used to create a temporary table and then that temporary table is used for the actual query. Rewritten or merged views are combined with the actual query by the SQL engine and executed as a single query.

The performance problem I ran across was with the former view type. It seems that the internals of MySQL manage materializing in a very inefficient way. I can't provide any sort of hard data to support that, but what I experienced was definitely points to the conclusion.

I had a view that had some fairly intensive stuff in it (both structurally and computationally). I knew it was materialized and was fine with a little bit of a slowdown, but I was seeing times in the 15 second range for certain queries that used it on my personal server (which, admittedly, is a only 3GHz Celeron). Manually materializing the view (i.e. running the view statement to create my own table) and then running the queries in question took a few hundred milliseconds tops, counting both the generation of the temp table and the actual query time (with most of it being the former).

The only explanation I can come up with was that the materializing of the view was happening repeatedly. Very repeatedly. Recasting the "view" as a normal table that I repopulate as needed has eliminated all the performance issues, and the queries are now typically running in tens of milliseconds.

I should mention that I've never had any performance issues with rewritten/merged views. MySQL seems to do a fantastic job efficiency-wise with them, and I continue to use them.

Interestingly, I had a very similar problem with Oracle a few months ago. The main difference was that I wasn't using views, but rather subqueries in the FROM clause (i.e., inline views). In that app, creating separate views was both impossible (we didn't own the database), and unnecessary (only one query used the view), and we were able to get around it with "factored subqueries" (the WITH clause).

11 responses to “Just Say No to Materialized MySQL Views”

  1. Jordan CLark

    Isn't this form of "materializing" data from different sources into a new table a form of denormalization? How much benefit is there when you weigh the cost of building these tables and storing the data in duplicate? Compared to just pulled out fresh data with a normal rewritten/merged view? I thought a big part of using normalizing data was that it was more efficient, with less data there is less to store and index, less to scan, less to update, and more can fit in fast RAM?

    Thanks!

  2. Jordan CLark

    What part of the operation makes it CPU bound? Is the view aggregating data, or running functions to produce the output? Is it possible to denormalize just some fields manually into a new table and join, instead of duplicating everything? Or does this reduce the performance gains?

    Also just out of curiosity how much data is this working with, thousands, tens of thousands of rows, more? Does this size matter, like the larger it is, the more you can save?

    P.S. I wish your blog had a comment email subscription option.

  3. Mark Cahill

    I had a similar problem with views on our web reporting platform. The culprit that time was that the view was not using the index from the underlying table. I forced a primary key on the ID, and the query times went down 1000%. Just something to look into…

  4. Dom

    Thanks for your article! I seem to have hit the same problem with a test database on MySQL 5.1. In particular I wanted to mimic the behaviour of queries in Access (from which I am wishing to migrate the data). In Access you can 'stack' queries on top of one another and supply parameter values to the top level query which are inherited and used by lower level queries.

    MySQL does not allow parameter values (aka user variables) with views, but there is a workaround using a stored function (which can access and then return the user variable). But the performance is terrible.

    Previously I have used sub-queries (in MySQL 4.1), though not with user variables, and they have always been impressively fast (but I had to write my own VBA code to nest sub-query definitions into higher-level queries). Views should be neater but with their inability to access user variables (except inefficiently as it seems through a stored function), and the performance hit you describe, I guess I must stick to coded sub-queries for now.

  5. Vaughn

    "You keep using that word. I do not think it means what you think it means."

    The error is the statement "Materialized views are views that, as you might guess, are materialized when they're needed."

    From http://en.wikipedia.org/wiki/Materialized_view : "A materialized view takes a different approach in which the query result is cached as a concrete table that may be updated from the original base tables from time to time. This enables much more efficient access, at the cost of some data being potentially out-of-date. It is most useful in data warehousing scenarios, where frequent queries of the actual base tables can be extremely expensive."

    From http://dev.mysql.com/doc/refman/5.1/en/faqs-views.html :
    "23.6.5: Does MySQL 5.1 have materialized views?

    No. "

    (Same answer for MySQL 5.0, 5.1, and 6.0)

  6. Mark Cahill

    Wow. Way to resurrect a REALLY old post.

  7. all

    Hi all,
    As a Oracle architect, ex developper, the MV (Materialized view) can be easily simulated with the SQL used to create a dynamic view (standard view) and replace the CREATE VIEW by CREATE TABLES MV_xxxx as select….. I am convinced argument exist to go against this option. () drop table time, refresh (recreate the MV) frequency etc… What so ever it is a KISS solution, and idiot-proof. B rgds – thierry