How do I get the k nearest neighbors for geodjango?
Assume that I have the following model:
from django.contrib.gis.db import models

class Person(models.Model):
    id = models.BigAutoField(primary_key=True)
    name = models.CharField(max_length=150)
    location = models.PointField()
How would I go about obtaining the k nearest neighbors (KNN) by location using geodjango?
Would I have to write custom SQL for that?
I am using PostgreSQL with PostGIS.
1 Answer
You can use a raw() SQL query to utilize the PostGIS distance operators in the ORDER BY clause:
<-> which gets the nearest neighbor using the centers of the bounding boxes to calculate the inter-object distances. (On PostgreSQL 9.5+ with PostGIS 2.2+ it orders by the true distance between geometries; for points the result is the same either way.)
<#> which gets the nearest neighbor using the bounding boxes themselves to calculate the inter-object distances.
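To make the difference between the two distance notions concrete, here is a small plain-Python sketch. It is not PostGIS code: the (min_x, min_y, max_x, max_y) box tuples and both function names are assumptions made up for this illustration.

```python
from math import hypot


def center_distance(a, b):
    """Distance between the centers of two bounding boxes (the '<->' notion)."""
    ax, ay = (a[0] + a[2]) / 2, (a[1] + a[3]) / 2
    bx, by = (b[0] + b[2]) / 2, (b[1] + b[3]) / 2
    return hypot(ax - bx, ay - by)


def box_distance(a, b):
    """Shortest distance between the boxes themselves (the '<#>' notion).

    Returns 0 when the boxes overlap or touch.
    """
    # Gap along each axis; a negative gap means the boxes overlap on that axis.
    dx = max(a[0] - b[2], b[0] - a[2], 0)
    dy = max(a[1] - b[3], b[1] - a[3], 0)
    return hypot(dx, dy)


# Boxes are (min_x, min_y, max_x, max_y); illustrative values only.
box_a = (0, 0, 2, 2)
box_b = (5, 0, 7, 2)
d_center = center_distance(box_a, box_b)
d_box = box_distance(box_a, box_b)
```

For point geometries the bounding box collapses to the point itself, so both notions give the same ordering.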
In your case the one you want seems to be the <-> operator, thus the raw query:
knn = Person.objects.raw(
    # Triple quotes are needed because the SQL string spans two lines.
    '''SELECT * FROM myapp_person
       ORDER BY location <-> ST_SetSRID(ST_MakePoint(%s, %s), 4326)''',
    [location.x, location.y]
)[:k]
EDIT due to own derpiness: You can omit the [:k] and add a LIMIT clause to the raw SQL query instead, e.g. LIMIT 1 to return only the first nearest neighbor. (Don't use both, as I did!)
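Django's documentation notes that indexing and slicing a raw queryset are not performed at the database level, so putting the limit into the SQL avoids fetching every row first. A minimal sketch of that variant follows; build_knn_query is a hypothetical helper, and the table name myapp_person and SRID 4326 are taken from the raw query above.

```python
def build_knn_query(x, y, k):
    """Return (sql, params) for a k-nearest-neighbor query limited in SQL.

    Hypothetical helper: the table name and SRID follow the raw query above.
    """
    sql = (
        "SELECT * FROM myapp_person "
        "ORDER BY location <-> ST_SetSRID(ST_MakePoint(%s, %s), 4326) "
        "LIMIT %s"
    )
    # All values are passed as parameters rather than formatted into the string.
    return sql, [x, y, int(k)]


sql, params = build_knn_query(23.7, 37.9, 5)
# With Django available, the result would be passed on unchanged:
# knn = Person.objects.raw(sql, params)
```

With the LIMIT in the SQL, no Python-side [:k] slice is needed.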
While answering your other question, "How efficient is it to order by distance (entire table) in geodjango", I found that another solution may be possible:
By enabling spatial indexing and narrowing down your query through logical constraints (as explained in my answer to the above-linked question), you can achieve a pretty fast KNN query as follows:
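from django.contrib.gis.db.models.functions import Distance
from django.contrib.gis.measure import D

current_location = me.location
# Keep only people within 50 km, then order by distance and take the k nearest.
people = Person.objects.filter(
    location__dwithin=(current_location, D(km=50))
).annotate(
    distance=Distance('location', current_location)
).order_by('distance')[:k]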
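The logic of this query (discard everything outside a radius, sort what is left by distance, keep the first k) can be sketched without a database. This is only an illustration under simplifying assumptions: points are plain (x, y) tuples, distance is planar Euclidean rather than geodesic, and k_nearest_within is a made-up name.

```python
from heapq import nsmallest
from math import dist


def k_nearest_within(points, origin, radius, k):
    """Return up to k points within `radius` of `origin`, nearest first."""
    # Step 1: the radius filter, playing the role of dwithin.
    candidates = [p for p in points if dist(p, origin) <= radius]
    # Step 2: order by distance and keep the first k, like order_by(...)[:k].
    return nsmallest(k, candidates, key=lambda p: dist(p, origin))


# Illustrative data only.
points = [(0, 1), (5, 5), (2, 0), (40, 40), (1, 1)]
nearest = k_nearest_within(points, origin=(0, 0), radius=10, k=3)
```

Note the trade-off: if fewer than k rows fall inside the radius, fewer than k neighbors come back, so the radius has to be chosen with the data density in mind.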
You can use a geography column or a geometry one. What matters for speeding up your query is to use spatial_index. For further reading on the subject, have a look here: boundlessgeo.com/2011/09/… Good luck @Alan :) – John Moutafis
Aug 2 '17 at 8:24
Hello, looking back at your answer, I am confused by the purpose of LIMIT 1 in knn = Person.objects.raw('SELECT * FROM myapp_person..., why do we need that? – AlanSTACK
Aug 31 '17 at 5:36
@Alan That will return the first nearest neighbor. You can change it at will! I will edit this in my answer as well.
– John Moutafis
Aug 31 '17 at 8:06
I thought [:k] took care of that? I am confused since LIMIT K and [:k] seem to be serving the same purpose here – AlanSTACK
Aug 31 '17 at 15:13
For this scenario (obtaining knn), would it still be helpful to use a geography column? Or would it be pointless, since I am assuming the calculations involving <-> would be different? – AlanSTACK
Aug 2 '17 at 8:20