R is 1indexed. Some people probably get unreasonable riled up about this.
1 2  arr < c("first entry", "second entry", "third entry") arr[1] # => [1] "first entry" 
But what happens if you try to use 0
as an index?
1  arr[0] # => character(0) 
This might not be what you want: in fact, I suspect that nine times out of ten this is exactly not what you want. Consider:
1 2 3 4 5 6  indices < c(1, 0, 2, 0, 3) values < arr[indices] length(values) # => 3 length(indices) # => 5 
I’ve had issues with this in the past. For example, consider the problem of trying to turn a number (or a vector of numbers) into Excelstyle columns:
1 2 3  excel.columnify(1) # => "A" excel.columnify(26) # => "Z" excel.columnify(27) # => "AA" 
If you’ve ever encountered R’s LETTERS
constant, the process is pretty straightforward for numbers between 1 and 26:
1 2 3  excel.columnify < function(n) { LETTERS[n] } 
But what if n
is greater than 26? We could do something like this^{1}:
1 2 3 4 5 6 7  excel.columnify < function(n) { ifelse( n > 26, paste0(LETTERS[(n1) %/% 26], LETTERS[(n1) %% 26 + 1]), LETTERS[n] ) } 
This seems to run OK…
1 2 3 4 5  excel.columnify(1:10) # => [1] "A" "B" "C" "D" "E" "F" "G" "H" "I" "J" excel.columnify(20:30) # => [1] "T" "U" "V" "W" "X" "Y" "Z" "AA" "AB" "AC" "AD" 
…until we hit a very specific and weird exception:
1 2 3  excel.columnify(26:53) # => [1] "Z" "AA" "AB" "AC" "AD" "AE" "AF" "AG" "AH" "AI" "AJ" "AK" "AL" "AM" "AN" "AO" "AP" "AQ" "AR" "AS" "AT" # => [22] "AU" "AV" "AW" "AX" "AY" "BZ" "AA" 
Notice how the last couple of items in this vector have the wrong first letter? What’s going on here? If you’ve worked out that we have some zeroindices at work, you’re on the right track.
It’s worth noting how ifelse()
works in R: it evaluates both value arguments (the last two arguments to the function) and splices them together based on the values of the first, boolean argument. Of note here, this means that the paste0
function runs on every item in the array:
1  paste0(LETTERS[(n1) %/% 26], LETTERS[(n1) %% 26 + 1]) 
Let’s examine those numeric vectors we pass as lookups to LETTERS
:
1 2 3  n < 26:53 (n1) %/% 26 # => [1] 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 
We’ve got a zero in this vector! Given what we know about zero lookups, we might correctly guess that this is going to cause issues:
1 2 3 4 5 6 7 8 9 10 11 12  LETTERS[(n1) %/% 26] # => [1] "A" "A" "A" "A" "A" "A" "A" "A" "A" "A" "A" "A" "A" "A" "A" "A" "A" "A" "A" "A" "A" "A" "A" "A" "A" "A" # => [27] "B" length(LETTERS[(n1) %/% 26]) # => [1] 27 length(n) # => [1] 28 length(LETTERS[(n1) %% 26 + 1]) # => [1] 28 
The paste0
function is fine with two arguments of differing lengths: it just duplicates the smallest until it’s the same length as the largest. This is normally useful, but in this case it’s broken.
How do we fix the error? It turns using NA
as an index will return NA
, maintaining the correct vector length:
1 2  LETTERS[c(1, NA, 3)] # => [1] "A" NA "C" 
Our solution is to replace every instance of 0
with NA
, and then swap them back later on:
1 2 3 4 5 6 7 8 9 10 11 12 13  excel.columnify < function(n) { first_letter_lookup < (n1) %/% 26 first_letter_lookup[first_letter_lookup == 0] < NA first_letters < LETTERS[first_letter_lookup] first_letters[is.na(first_letters)] < "" ifelse( n > 26, paste0(first_letters, LETTERS[(n1) %% 26 + 1]), LETTERS[n] ) } 
With this safeguard in place, our bug is fixed:
1 2 3  excel.columnify(26:53) # => [1] "Z" "AA" "AB" "AC" "AD" "AE" "AF" "AG" "AH" "AI" "AJ" "AK" "AL" "AM" "AN" "AO" "AP" "AQ" "AR" "AS" "AT" # => [22] "AU" "AV" "AW" "AX" "AY" "AZ" "BA" 

Trust me, the maths works here. ↩
Comments
Leave a comment
Sorry, you need to enable javascript for commenting. That’s one of the drawbacks of having a static site blog I guess.