How efficient is it to order by distance (entire table) in geodjango

How efficient is it to order by distance (entire table) in geodjango



Assume that I have the following data model


Person(models.Model):
id = models.BigAutoField(primary_key=True)
name = models.CharField(max_length=50)
location = models.PointField(srid=4326)



Assume also that I have an app that makes queries to this django backend, and the only purpose of this app is to return a (paginated) list of registered users from closest to farthest.



Currently I have this query in mind:


# here we are obtaining all users in ordered form
current_location = me.location
people = Person.objects.distance(current_location).order_by('distance')

# here we are obtaining the first X through pagination
start_index = a
end_index = b

people = people[a:b]



Although this works, it is not as fast as I would like.



I have some concerns over the speed of this query. If the table were large (1 million+) then wouldn't the database (Postgres SQL w/ PostGIS) have to measure the distance between current_location and every location in the database before performing an order_by on that subsequently 1 million rows?


current_location


location


order_by



Can somebody suggest on how to properly return nearby users ordered by distance in an efficient manner?




1 Answer
1



If you want to sort every entry on that table by distance then it will be slow as expected and there is nothing that can be done (that I am aware of at this point of time and my knowledge.)!



You can make your calculation more efficient by following this steps and making some assumptions:



Enable spatial indexing on your tables. To do that in GeoDjango, follow the doc instructions and fit them to your model:



Note



In PostGIS, ST_Distance_Sphere does not limit the geometry types geographic distance queries are performed with. [4] However, these queries may take a long time, as great-circle distances must be calculated on the fly for every row in the query. This is because the spatial index on traditional geometry fields cannot be used.



For much better performance on WGS84 distance queries, consider using geography columns in your database instead because they are able to use their spatial index in distance queries. You can tell GeoDjango to use a geography column by setting geography=True in your field definition.


geography=True



Now you can narrow down your query with some logical constrains:



Ex: My user will not look for people more than 50km from his current position.



Narrow down the search using dwithin spatial lookup which utilizes the above mentioned spatial indexing, therefore it is pretty fast.


dwithin



Finally apply the distance order by on the remaining rows.


distance



The final query can look like this:


current_location = me.location
people = People.objects.filter(
location__dwithin=(current_location, D(km=50))
).annotate(
distance=Distance('location', current_location)
).order_by('distance')



P.S: Rather than creating a custom pagination attempt, it is more efficient to utilize the pagination methods provided for the django views:



Or you can use Django Rest Framework and use it's pagination:





I wanted to use OGRGeometry as it has a constructor that can take a location string like 'SRID=4326;POINT (-122 37)' but you can't use this as an arg to dwithin. Make sure to use a django.contrib.gis.geos.Point for current_location.
– Harry Moreno
Aug 25 at 3:32


OGRGeometry


'SRID=4326;POINT (-122 37)'


dwithin


django.contrib.gis.geos.Point


current_location






By clicking "Post Your Answer", you acknowledge that you have read our updated terms of service, privacy policy and cookie policy, and that your continued use of the website is subject to these policies.

Popular posts from this blog

𛂒𛀶,𛀽𛀑𛂀𛃧𛂓𛀙𛃆𛃑𛃷𛂟𛁡𛀢𛀟𛁤𛂽𛁕𛁪𛂟𛂯,𛁞𛂧𛀴𛁄𛁠𛁼𛂿𛀤 𛂘,𛁺𛂾𛃭𛃭𛃵𛀺,𛂣𛃍𛂖𛃶 𛀸𛃀𛂖𛁶𛁏𛁚 𛂢𛂞 𛁰𛂆𛀔,𛁸𛀽𛁓𛃋𛂇𛃧𛀧𛃣𛂐𛃇,𛂂𛃻𛃲𛁬𛃞𛀧𛃃𛀅 𛂭𛁠𛁡𛃇𛀷𛃓𛁥,𛁙𛁘𛁞𛃸𛁸𛃣𛁜,𛂛,𛃿,𛁯𛂘𛂌𛃛𛁱𛃌𛂈𛂇 𛁊𛃲,𛀕𛃴𛀜 𛀶𛂆𛀶𛃟𛂉𛀣,𛂐𛁞𛁾 𛁷𛂑𛁳𛂯𛀬𛃅,𛃶𛁼

Edmonton

Crossroads (UK TV series)