Emulating non-rectangular arrays
Emulating non-rectangular arrays
Often times you want the performance of arrays over linked lists while having not conforming to the requirement of having rectangular arrays.
As an example consider an hexagonal grid, here shown with the 1-distance neighbors of cell (3, 3) in medium gray and the 2-distance neighbors in light gray.
Say we want an array that contains, for each cell, the indices of every 1- and 2-distance neighbor for that cell. One slight issue is that cells have a different amount of X-distance neighbors -- cells on the grid border will have fewer neighbors than cells closer to the grid center.
(We want an array of neighbor indices --- instead of a function from cell coordinates to neighbor indices --- for performance reasons.)
We can work around this problem by keeping track of how many neighbors each cell has. Say you have an array neighbors2 of size R x C x N x 2, where R is the number of grid rows, C for columns, and N is the maximum number of 2-distance neighbors for any cell in the grid.
Then, by keeping an additional array n_neighbors2 of size R x C, we can keep track of which indices in neighbors2 are populated and which are just zero padding. For example, to retrieve the the 2-distance neighbors of cell (2, 5), we simply index into the array as such:
neighbors2
R x C x N x 2
R
C
N
n_neighbors2
R x C
neighbors2
someNeigh = neighbors2[2, 5, 0..n_neighbors2[2, 5], ..]
someNeigh = neighbors2[2, 5, 0..n_neighbors2[2, 5], ..]
someNeigh will be a n_neighbors2[2, 5] x 2 array (or view) of indicies, where someNeigh[0, 0] yields the row of the first neighbor, and someNeigh[0, 1] yields the column of the first neighbor and so forth.
Note that the elements at the positions
someNeigh
n_neighbors2[2, 5] x 2
someNeigh[0, 0]
someNeigh[0, 1]
neighbors2[2, 5, n_neighbors2[2, 5]+1.., ..]
neighbors2[2, 5, n_neighbors2[2, 5]+1.., ..]
are irrelevant; this space is just padding to keep the matrix rectangular.
Provided we have a function for finding the d-distance neighbors for any cell:
import Data.Bits (shift)
rows, cols = (7, 7)
type Cell = (Int, Int)
generateNeighs :: Int -> Cell -> [Cell]
generateNeighs d cell1 = [ (row2, col2)
| row2 <- [0..rows-1]
, col2 <- [0..cols-1]
, hexDistance cell1 (row2, col2) == d]
hexDistance :: Cell -> Cell -> Int
hexDistance (r1, c1) (r2, c2) = shift (abs rd + abs (rd + cd) + abs cd) (-1)
where
rd = r1 - r2
cd = c1 - c2
How can we create the aforementioned arrays neighbors2 and n_neighbors2? Assume we know the maximum amount of 2-distance neighbors N beforehand. Then it is possible to modify generateNeighs to always return lists of the same size, as we can fill up remaining entries with (0, 0). That leaves, as I see it, two problems:
neighbors2
n_neighbors2
N
generateNeighs
neighbors2
n_neighbors2
neighbors2
A solution is welcome with either repa or accelerate arrays.
repa
accelerate
@WillNess The question is not to store the grid as an array, but for each cell, a 'list' of its neighbors. That won't be rectangular because the cells have a different amount of neighbors.
– tsorn
Sep 16 '18 at 15:35
I'm almost certain that using an array of neighbor indices will be slower than calculating them on the fly, unless the grid is small. CPU cache is a tightly limited and extremely valuable resource; using some of it to hold values that can be computed in a couple cheap instructions doesn't sound good.
– dfeuer
Sep 16 '18 at 19:12
I'd recommend you consider alternatives to the monolithic-array-with-some-dummy-entries approach. That's a very Matlab thing to do. I'd consider just lining up all the neighbours of all cells in one 1D unboxed vector, and then for each cell storing just an offset into that vector and the number of neighbours. This is much nicer to construct in a pure-functional manner, and it would also need less total memory, if the number of neighbours varies considerably.
– leftaroundabout
Sep 16 '18 at 20:37
@davidcox That leads to wasted space when storing the grid in a rectangular array. See redblobgames.com/grids/hexagons for an excellent resource on hexagonal grids.
– tsorn
Sep 17 '18 at 14:56
3 Answers
3
Here's you picture skewed 30 degrees to the right:

As you can see your array is actually perfectly rectangular.
The indices of a neighborhood's periphery are easily found as six straight pieces around the chosen center cell, e.g. (imagine n == 2 is the distance of the periphery from the center (i,j) == (3,3) in the picture):
n == 2
(i,j) == (3,3)
periphery n (i,j) =
-- 2 (3,3)
let
((i1,j1):ps1) = reverse . take (n+1) . iterate ((i,j)->(i,j+1)) $ (i-n,j)
-- ( 1, 3)
((i2,j2):ps2) = reverse . take (n+1) . iterate ((i,j)->(i+1,j)) $ (i1,j1)
-- ( 1, 5)
.....
ps6 = ....... $ (i5,j5)
in filter isValid (ps6 ++ ... ++ ps2 ++ ps1)
The whole neighborhood is simply
neighborhood n (i,j) = (i,j) : concat [ periphery k (i,j) | k <- [1..n] ]
For each cell/distance combination, simply generate the neighborhood indices on the fly and access your array in O(1) time for each index pair.
Writing out the answer from @WillNess in full, and incorporating the proposal from @leftroundabout to store indecies in a 1D vector instead, and we get this:
import qualified Data.Array.Accelerate as A
import Data.Array.Accelerate (Acc, Array, DIM1, DIM2, DIM3, Z(..), (:.)(..), (!), fromList, use)
rows = 7
cols = 7
type Cell = (Int, Int)
(neighs, nNeighs) = generateNeighs
-- Return a vector of indices of cells at distance 'd' or less from the given cell
getNeighs :: Int -> Cell -> Acc (Array DIM1 Cell)
getNeighs d (r,c) = A.take n $ A.drop start neighs
where
start = nNeighs ! A.constant (Z :. r :. c :. 0)
n = nNeighs ! A.constant (Z :. r :. c :. d)
generateNeighs :: (Acc (Array DIM1 Cell), Acc (Array DIM3 Int))
generateNeighs = (neighsArr, nNeighsArr)
where
idxs = concat [[(r, c) | c <- [0..cols-1]] | r <- [0..rows-1]]
(neighsLi, nNeighsLi, n) = foldl inner (, , 0) idxs
neighsArr = use $ fromList (Z :. n) neighsLi
nNeighsArr = use $ fromList (Z :. rows :. cols :. 5) nNeighsLi
inner (neighs', nNeighs', n') idx = (neighs' ++ cellNeighs, nNeighs'', n'')
where
(cellNeighs, cellNNeighs) = neighborhood idx
n'' = n' + length cellNeighs
nNeighs'' = nNeighs' ++ n' : cellNNeighs
neighborhood :: Cell -> ([Cell], [Int])
neighborhood (r,c) = (neighs, nNeighs)
where
neighsO = [ periphery d (r,c) | d <- [1..4] ]
neighs = (r,c) : concat neighsO
nNeighs = tail $ scanl (+) 1 $ map length neighsO
periphery d (r,c) =
-- The set of d-distance neighbors form a hexagon shape. Traverse each of
-- the sides of this hexagon and gather up the cell indices.
let
ps1 = take d . iterate ((r,c)->(r,c+1)) $ (r-d,c)
ps2 = take d . iterate ((r,c)->(r+1,c)) $ (r-d,c+d)
ps3 = take d . iterate ((r,c)->(r+1,c-1)) $ (r,c+d)
ps4 = take d . iterate ((r,c)->(r,c-1)) $ (r+d,c)
ps5 = take d . iterate ((r,c)->(r-1,c)) $ (r+d,c-d)
ps6 = take d . iterate ((r,c)->(r-1,c+1)) $ (r,c-d)
in filter isValid (ps6 ++ ps5 ++ ps4 ++ ps3 ++ ps2 ++ ps1)
isValid :: Cell -> Bool
isValid (r, c)
| r < 0 || r >= rows = False
| c < 0 || c >= cols = False
| otherwise = True
BTW the reason I used
reverse was that I didn't want to call last to find the next starting point which is the next vertex of the fence polygon. But if we calculate each of the six vertices of the polygon explicitly (which isn't hard; I was just being lazy with that), we don't need all that, and can eliminate the reverses, and also convert the take...iterate to simple enumerations in the style of zip [i,i+1..i+d-1] (repeat j). :) Also seems the d==1 case should be special cased as well (there's no enumerations, just the vertices).– Will Ness
Sep 17 '18 at 17:03
reverse
last
take...iterate
zip [i,i+1..i+d-1] (repeat j)
@WillNess I incorporated your first suggestion in the post. I also benchmarked that code (with suggestion #1) versus also using
zips and repeats. It turns out the latter approach is about 50 % slower according to Criterion (testing code: github.com/tsoernes/haskelldca/blob/master/test/…)– tsorn
Sep 18 '18 at 11:42
zip
repeat
you're right, I later thought about this too (that there's no real need to special case d==1) but wasn't on-line to comment. so
iterate compiles into a faster code than enumerations, interesting. Ah, indeed, zip is not a "good producer" (i.e. it doesn't fuse) but iterate is; so this makes sense. Did you make sure to compile with the -O2 switch BTW?– Will Ness
Sep 18 '18 at 13:59
iterate
zip
iterate
@WillNess I used
-O2 -threaded -rtsopts -with-rtsopts=-N.– tsorn
Sep 19 '18 at 9:58
-O2 -threaded -rtsopts -with-rtsopts=-N
This can be by using the permute function to fill the neighbors for 1 cell at a time.
import Data.Bits (shift)
import Data.Array.Accelerate as A
import qualified Prelude as P
import Prelude hiding ((++), (==))
rows = 7
cols = 7
channels = 70
type Cell = (Int, Int)
(neighs, nNeighs) = fillNeighs
getNeighs :: Cell -> Acc (Array DIM1 Cell)
getNeighs (r, c) = A.take (nNeighs ! sh1) $ slice neighs sh2
where
sh1 = constant (Z :. r :. c)
sh2 = constant (Z :. r :. c :. All)
fillNeighs :: (Acc (Array DIM3 Cell), Acc (Array DIM2 Int))
fillNeighs = (neighs2, nNeighs2)
where
sh = constant (Z :. rows :. cols :. 18) :: Exp DIM3
neighZeros = fill sh (lift (0 :: Int, 0 :: Int)) :: Acc (Array DIM3 Cell)
-- nNeighZeros = fill (constant (Z :. rows :. cols)) 0 :: Acc (Array DIM2 Int)
(neighs2, nNeighs2li) = foldr inner (neighZeros, ) indices
nNeighs2 = use $ fromList (Z :. rows :. cols) nNeighs2li
-- Generate indices by varying column fastest. This assures that fromList, which fills
-- the array in row-major order, gets nNeighs in the correct order.
indices = foldr (r acc -> foldr (c acc2 -> (r, c):acc2 ) acc [0..cols-1]) [0..rows-1]
inner :: Cell
-> (Acc (Array DIM3 Cell), [Int])
-> (Acc (Array DIM3 Cell), [Int])
inner cell (neighs, nNeighs) = (newNeighs, n : nNeighs)
where
(newNeighs, n) = fillCell cell neighs
-- Given an cell and a 3D array to contain cell neighbors,
-- fill in the neighbors for the given cell
-- and return the number of neighbors filled in
fillCell :: Cell -> Acc (Array DIM3 Cell) -> (Acc (Array DIM3 Cell), Int)
fillCell (r, c) arr = (permute const arr indcomb neighs2arr, nNeighs)
where
(ra, ca) = (lift r, lift c) :: (Exp Int, Exp Int)
neighs2li = generateNeighs 2 (r, c)
nNeighs = P.length neighs2li
neighs2arr = use $ fromList (Z :. nNeighs) neighs2li
-- Traverse the 3rd dimension of the given cell
indcomb :: Exp DIM1 -> Exp DIM3
indcomb nsh = index3 ra ca (unindex1 nsh)
generateNeighs :: Int -> Cell -> [Cell]
generateNeighs d cell1 = [ (row2, col2)
| row2 <- [0..rows]
, col2 <- [0..cols]
, hexDistance cell1 (row2, col2) P.== d]
-- Manhattan distance between two cells in an hexagonal grid with an axial coordinate system
hexDistance :: Cell -> Cell -> Int
hexDistance (r1, c1) (r2, c2) = shift (abs rd + abs (rd + cd) + abs cd) (-1)
where
rd = r1 - r2
cd = c1 - c2
Thanks for contributing an answer to Stack Overflow!
But avoid …
To learn more, see our tips on writing great answers.
Required, but never shown
Required, but never shown
By clicking "Post Your Answer", you agree to our terms of service, privacy policy and cookie policy
I didn't read through your question, sorry, but just looking at your image the arrays are perfectly rectangular, you just have a bit special neighbors relationship. looking at your image skewed 30 degrees to the right it is easier to see how that relationship could be defined following the 6 line segments around the chosen center tile. Apologies if you already knew all this and wrote about it in the question (that I skipped over).
– Will Ness
Sep 16 '18 at 14:34