Squaring gridlines in Excel - an adventure

FINALLY! Finally finally finally. This took countless (wo?)manhours to get it working, but I FINALLY did it.

I can now automatically turn this chart

into this:

Looks simple, eh? Well, it's NOT.

I wanted the display to be squared off so that it showed the circles properly. Of course the values are right, but the user would have to actually check the values and couldn't simply glance at the chart and get the picture. In addition to that, the angle of the sets of circles is actually relevant. You can't display a proper angle on a skewed chart! I also didn't want to modify the chart's own shape or the size of the plot itself. So I went searching for techniques to fix the skewed-ness.

I started off with Jon Peltier's code provided here. It looks like it would work well; in practice it was somewhat lacking for my application. (1. It has to run in a loop and I figured there had to be a way to only run the code once and 2. under some circumstances the loop would run infinity, and while I could modify the loop constraints to fix this, see point 1) The idea was solid though, so I pursued it.

The trouble with formulas

Discovered another Excel quirk today.

Humans see the numbers in a cell, while Excel sees the formula in the cell. The values are simply results of the formula; of secondary importance. Most of the time this isn't a problem, however if a user is applying functions to values that have formulas hidden behind them they can get unexpected results. To see this for yourself, the above site suggested you enter the formula "=RAND()" into some cells and then sort them.

So I did. I used "=ROUND(RAND()*100, 0)" as my formula to give me easy to see results. The left is cells with the formula. The right is after sorting those cells:

In this particular instance, when you change the value of a cell, the _Change event fires which triggers an internal recalculation of the formulas on the sheet. This can be turned off programatically. After turning off automatic calculation, my results look like this:

Thinking it through it's fairly obvious why you get unexpected results from sorting formulas: there is no good way to approach this problem. Either approach gives (eventually) unexpected results. If you only sort the values, and leave behind the formulas in their original cell, the values all change on recalculation. You could destroy the formula and only keep the values, but then you've gone and destroyed the user's formulas.

Many formulas are addressed relatively; for example instead of the cell A1 explicitly referencing the cell "H5" it could relatively addresses the cell 8 to the right and 5 down. If you move that relative address around, the value will change on recalculation. The problem is that the user can't know what to expect without doing tests (or finding others who have done these tests) to see how their version of Excel behaves. The best solution here is to copy just the values, paste them elsewhere and sort those, leaving the original formulas intact.

A wonderful quote from the above site (emphasis mine): "Excel bugs don't exist, but Excel has a lot of features."

Coloring Excel charts the hard way

I’ve been managing chart colors in Excel VBA. I had finished most of it last year, but just today had to track down a bug in the coloring. Working with colors has been an adventure since I am using Office ’03. In ’07, Microsoft made a lot of significant changes to internal stuff that would have made this a lot nicer to work with, although a lot of idiosyncrasies are still there.

Here’s my understanding of how Excel ’03 manages its colors:

Excel has a color palette that holds 56 colors. Those are the only colors you can use in your spreadsheet. It appears that, internally, all objects’ colors are stored not as their RGB value but rather by the index of the color palette. In addition to being limited to using the palette, the palette itself is indexed in a very confusing way (default colors):

You may think that you can overcome the color restriction programmatically – I certainly thought I could. That is how I discovered a very strange idiosyncrasy with Excel’s color management.

Here we have a simple subroutine that changes the plotcolor on a chart:

Sub ColorPlot(myChart as Chart, plotColor As Long)
     myChart.PlotArea.Interior.color = plotColor
End Sub

If I pass the above chart and the color value 1390026 to ColorPlot, I get the following:

Short circuiting and a short temper

I’ve been working with Visual Basic for Applications for a few days again, picking up where I left off last summer. I just remembered one of the things I particularly hate about VB6/A: it doesn’t allow short circuit evaluation.

Short circuiting is when a code flow conditional (Conditional of an If, WHILE, etc.) is only evaluated as much as it needs to be to determine truth. The main thing to remember is that expressions can contain a function call.

Ex1 AND Ex2: if Ex1 is false, won’t evaluate Ex2
Ex1 OR Ex2: if Ex1 is true, won’t evaluate Ex2

It doesn’t sound particularly fantastic when stated so simply, but most other languages do implement it by default on their control structures, or otherwise have a method to do so. VB does not (although I believe .NET does).

My main issue is one of performance. Working with Excel through VBA means you are manipulating a lot of a data in a suboptimal way, especially converting to and from Strings. There is a lot of overhead involved here! In many cases it isn’t a problem and the user won’t notice the inefficiencies (although the “programmer” – me – will, but that’s just the OCD talking) but in some instances those small overheads accumulate – for example, when you’re crunching through several thousand rows of input. It’s mostly bothersome for validating input on the fly – an example would be this ugly IF:

If IsNumeric(latCell.value) And IsNumeric(latCell.Offset(0, 1).value) _
And Not IsEmpty(latCell) And Not IsEmpty(latCell.Offset(0, 1)) Then
     '---Calculate the value
End If

The above makes sure that two inputs aren’t strings or blank cells (blank cells in VBA are “numeric” when their value is taken, which is a whole other problem that I won’t get in to). Ignoring my failure to use variables, do you see a problem? Even if the first cell isn’t numeric (and therefore we won’t be able to do the calculation) the language still evaluates every other expression, wasting time it could use to do something actually useful.
Another point that comes up often is testing properties or objects. When VB tries to test an object that hasn’t been initialized, it throws a (fatal, by default) error. It will also error when an object of type Variant doesn’t currently have the specified property.

Here is an example that I’ve encountered: verifying that Selection contains valid data. I want to make sure that Selection is an object of type Range and also that, if it is a Range, it has only two columns.

If Not TypeName(Selection) = "Range" or Selection.Columns.Count = 2 Then
     MsgBox ("Please select a valid range of pairs")
     Exit Sub
End If

The above will throw an error if the selection isn’t a Range because it will always attempt to count columns. In a short-circuiting language this wouldn’t be an issue. The workaround in VB is to have two separate ifs, or use On Error Resume Next to ignore the error.

In other languages you can actually optimize execution by ordering the expressions in order of likelihood. In VB it doesn’t matter at all – every expression will be evaluated.

So why would you not use short circuiting? It seems pretty obvious that it is beneficial. Well, I haven’t been totally impartial in my presentation. There is a reason why a language wouldn’t implement short circuiting: consistency and (pseudo)determinism.

It is guaranteed that the code will behave exactly the same way every time regardless of the input. People will, either in ignorance or in an attempt to be clever or because of a hatred of fellow programmers*, insert function calls that have side effects into conditionals. When a language short circuits its conditionals it is trickier to track or predict what functions will be called. VB is designed to be used by people with little traditional programming experience and makes it clear its goal is to simplify programming.

In my use of languages, short circuiting is far more beneficial than not and it is frustrating to not have access to it.

*inserting side-effect containing functions into conditionals (especially short-circuited ones) is a form of code obfuscation. Its practice is frowned upon and is definitely not kosher.


Sometimes I like to repost informative posts I've made elsewhere. In response to a post in a bra fitting thread:

You went from an A (or "A.5"?) to a C? Granted, I'm a guy, so it's hard for me to imagine, but that sounds like the smaller size must have been horribly uncomfortable. I'm really curious if somebody can describe what that's like. I'm imagining stuffing my feet into shoes that are 2 sizes two small every day or something. Is it like, you've conditioned yourself to it over the years and don't even notice the discomfort until you find something that fits correctly?

A cupsize without a bandsize is literally meaningless if you are trying to imagine what the breasts look like. It is a misconception that a D is always bigger than a C is always bigger than a B, etc. A 32D is the same amount of breast as a 34C, which is the same amount of breast that a 36B is, but on different frames. Surprisingly few people know this - both men and women. Lots of women say things like "My breasts can't possibly be Ds! I fit into these A-cups!" but that's because they don't actually understand how bra sizing works.

Bras sizing works by (roughly) describing the ratio between the ribcage and the bust measurements. In a perfect world where all manufacturers sized the same, cup sizing would be something like an A cup is a 1" difference, B: 2", C: 3", etc. The band size represents the ribcage measurement (Usually not literally, though, due to stretchy fabrics and manufacturer sizing. I've found that the band size is, on average, two to four inches bigger than ribcage measurements IRL.)

In addition to the above, which you could probably figure out on your own, the same cup size on a different band size accommodates a different volume of breast tissue.
So if someone says they are a 32C and someone else says they are a 36C, the second girl will have "bigger" breasts. Think of it this way: while the difference between their ribcage and bust on both of them is 3", the second girl's ribcage is wider so the total circumference of her bust is 4" bigger than the 32C girl, which "allows" for more breast tissue.
You can easily see this by actually looking at different bras. The cup (and real breast tissue) always starts near the armpit and ends near the center, regardless of what size it is - if there is more ribcage than there will be more cup (and therefore more breast).

You can actually adjust for different ribcages when you want to compare breast size. You drop a size from the band and go up a size in cup (or vice versa), and those breasts will be roughly the same volume. A 34A and a 32B is roughly the same amount of breast tissue, just one is on a smaller frame.

It is very common for women to have too large a band and too small a cup. For most women they don't notice because their breasts are being fully and comfortably covered, but since their band is way too loose they are not getting any support. As they go through the down-band-up-cup process, their band can get a lot smaller and subsequently the cup gets a lot bigger. They were originally using a bra cup size that could accommodate the same volume (via having a larger band) but for the wrong body type. When they finally get a bra that is built for their body type, not only are they finally being supported by a properly fitting band, but the molded cup actually mimics their own breasts. That's why a lot of girls say they look AND feel so much better.
Copyright 2012 Phile not Found. See About
Powered by Blogger

"Whenever you find that you are on the side of the majority, it is time to pause and reflect."