Computing Graph Limits

If you've ever drawn a graph (manually or with a computer), you know that you have to pick upper and lower bounds for the x and y axes based on the data points you're graphing.  Usually, this is a relatively simple matter, but, at least for me, it's proven to be a difficult problem to solve in a general manner.

I've built some time-based charting components with SVG (because CFCHART sucks), and while they work very well, I simply can't find an algorithm for finding appropriate bounds that works effectively.  My first thought was to extend the range a few percent above and below the min and max, but that usually ends up with grid lines labels that are ugly (1, 1.333333, 1.6666666, 2, etc.).

I'm currently using an implementation that scales based on the number of gridlines, applied to the order of magnitude of the range boundaries.  It works pretty well, except where the order of magnitude changes close to a boundary.  I.e. if the range is from 0.01 up to 1.01, the scale will end up being from 0 to 10, which isn't desirable, because the top 89% of the chart is empty.

So with the success of my last to "fishing for feedback" posts (on BlogCFC, and tabbed documents), here's the next inline.  I'd love to hear about other people's solutions to this sort of problem, because it's quite a nasty one (or I'm an idiot and missing the obvious).

6 responses to “Computing Graph Limits”

  1. Josh

    Instead of trying to calculate bounds first, try to determine a good interval first. For instance, if your values range from 8 – 76, you might find that 10 is a good interval. Then you can base the chart bounds on that inverval. You could make the chart go from 80.

    I don't have a good function on hand to find a good interval, but it shouldn't be too hard to write one.

  2. Josh

    Make that 0 to 80.

  3. Barney

    An interesting turn of scope; I think it's still the same question, but maybe not. I'm thinking ceiling((high – low) / intervalCount) is what you're looking for. However, if high and low are 21,753 and 23,543, it's going to be arbitrary, because ceiling operates at the decimal point, and those numbers won't yield meaningful factional parts.

    This is the same problem that I had to begin with: devising a general solution for finding the limits. There's something lurking in there with scaling the ceiling call (some factor of the range), but I can't put my finger on it.

    Here's a better description of what I do currently:

    compute the range of values, create the smallest number with 1 non-zero digit that is strictly larger than the range, divide that by the number of intervals, and then find the next smallest and next largest intervals below and above the actual range. So for 21753 and 23543, the range is 1790, the 1-sig-fig number is 2000, 10 intervals yields 200 per interval, and the tightest boundaries are 21600 and 23600 (which, as you'd expect, are 2000 apart).

    That worked really well in this case, but when there's a jump right at the end, not so much. Consider 21,500 to 23,501: range is 2001, 1-sig-fig number is 3000, so 300 per interval, with boundaries of 21,300 and 23,700, which are NOT 3000 apart.

    However, what I'm now realizing is that perhaps forcing the number of intervals is what my problem is. The algorithm does yield a good range, just not one that can be broken in the right number of parts. But that number doesn't really matter. What does 10 versus eight intervals really matter?

  4. Patrick McElhaney

    It sounds like you want your interval to be 10^n, where n is a whole number.

    Gotta run, but I hope that helps you move a little bit closer.

  5. Adam Ness

    I did up some formulas for this a few years ago, using CFCHART. I was always using a minimum value of 0, so you might need to adjust it, but here's the psuedocode:

    <cfset orderOfMag = int(log10(maxValue))>
    <cfif orderOfMag gt 1 and (maxValue/10^orderOfmag) lt 5>
    <cfset totalMax = (int(maxValue/10^orderOfMag)*4+1)/4 * 10^orderOfMag>
    <cfset gridLineCount = int(totalmax/10^orderOfMag)*4+1>
    <cfelseif orderOfMag gt -1 or (maxValue/10^orderOfmag) gt 5>
    <cfset totalMax = (int(maxValue/10^orderOfMag)*2+1)/2 * 10^orderOfMag>
    <cfset gridLineCount = int(totalmax/10^orderOfMag)*2+1>
    <cfset totalMax = (int(maxValue/10^orderOfMag)*4+1)/4 * 10^orderOfMag>
    <cfset gridLineCount = int(totalmax/10^orderOfMag)*4+1>

  6. Barney

    Adam, that's good stuff. I'll have to tune my solution a little on that. I don't know why the heck I didn't think of using logarithms, but that should make things much more straightforward.