Customizing Solr's Distance Units

in #solr, 6 years ago

Solr provides built-in support for distance-related sorting and searching. The Solr Wiki has a nice and detailed page on the Spatial Search topic.

Now imagine that you have a task where you want to treat distance differently: you want to group certain ranges of distance into a single "clustered unit".

A real-life scenario? Well, assume that you are in the city center, looking for a restaurant. As it is rush hour, some restaurants that appear to be closer to you could actually be harder to reach.

Here is a lovely map:

[Image: Screen Shot 2018-05-05 at 20.21.05.png, a map showing our position and the surrounding restaurants]

In the map above, we are the black dot, and the colorful dots are the restaurants. Below, you can see the same map with layers based on the "reaching time":

[Image: Screen Shot 2018-05-05 at 20.21.47.png, the same map with circular layers based on reaching time]

So here, the outermost circle is the 5 km limit, while the innermost is 1 km. Based on our rush hour scenario:

Red restaurants are within 5 mins of us (0 to 1 km).

Yellow restaurants are within 10 mins of us (1 km to 4 km).

Green restaurants are within 20 mins of us (4 km to 5 km).

Gray restaurants are more than 30 mins from us (5+ km).

Solr's built-in distance sorting (the geodist function, which takes the same parameters as geofilt) orders the yellow restaurants by their exact distances. We need a function that instead treats all yellow restaurants as identical in distance.
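To make the idea concrete, here is a minimal sketch of the bucketing step in plain Java. It is not the plugin's code: the class and method names are hypothetical, and treating each outer boundary as inclusive is my own assumption, since the ranges above do not say which strip a restaurant at exactly 1 km belongs to.

```java
// Sketch only: StripBuckets and stripIndex are hypothetical names,
// not taken from the plugin or from Solr.
final class StripBuckets {

    /**
     * Returns the index of the first strip whose outer boundary is
     * greater than or equal to distanceKm. Distances beyond the last
     * boundary get the index boundariesKm.length.
     * boundariesKm must be sorted ascending, e.g. {1, 4, 5}.
     * Assumption: the outer boundary of a strip is inclusive.
     */
    static int stripIndex(double distanceKm, double[] boundariesKm) {
        for (int i = 0; i < boundariesKm.length; i++) {
            if (distanceKm <= boundariesKm[i]) {
                return i;
            }
        }
        return boundariesKm.length;
    }

    public static void main(String[] args) {
        double[] rushHour = {1, 4, 5}; // red, yellow and green limits in km
        for (double d : new double[] {0.4, 1.5, 3.9, 4.2, 7.0}) {
            System.out.println(d + " km -> strip " + stripIndex(d, rushHour));
        }
    }
}
```

With these boundaries, restaurants at 1.5 km and 3.9 km both land in strip 1, so a sort on the strip index sees them as equally distant.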

To write a customized distance function, it is quite useful to take a look at the GeoDistValueSourceParser.java and HaversineConstFunction.java classes in the Solr source. You can use grepcode if you feel too lazy to download the whole source code.
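As the name HaversineConstFunction suggests, the underlying distance is a haversine (great-circle) distance from a fixed point. For reference, here is a standalone sketch of that formula in plain Java. It does not reproduce the Solr class: the names are hypothetical and the Earth radius of 6371 km is a rounded mean value that I am assuming for illustration.

```java
// Sketch only: a standalone haversine distance, not Solr's implementation.
final class HaversineSketch {

    // Assumption: rounded mean Earth radius in kilometres.
    static final double EARTH_RADIUS_KM = 6371.0;

    /** Great-circle distance in km between two points given in degrees. */
    static double distanceKm(double lat1, double lon1, double lat2, double lon2) {
        double phi1 = Math.toRadians(lat1);
        double phi2 = Math.toRadians(lat2);
        double dPhi = Math.toRadians(lat2 - lat1);
        double dLambda = Math.toRadians(lon2 - lon1);

        double sinHalfPhi = Math.sin(dPhi / 2);
        double sinHalfLambda = Math.sin(dLambda / 2);

        // Haversine of the central angle between the two points.
        double a = sinHalfPhi * sinHalfPhi
                + Math.cos(phi1) * Math.cos(phi2) * sinHalfLambda * sinHalfLambda;

        double centralAngle = 2 * Math.atan2(Math.sqrt(a), Math.sqrt(1 - a));
        return EARTH_RADIUS_KM * centralAngle;
    }

    public static void main(String[] args) {
        // One degree of longitude along the equator.
        System.out.println(distanceKm(0, 0, 0, 1) + " km");
    }
}
```

A custom distance function only needs to post-process a value like this one, which is why those two classes are a good starting point.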

You can find the code for the plugin described above on my GitHub.

So how do you use it? Well, here is the configuration for the solrconfig.xml file. Simply add the line below to the config file:

<valueSourceParser name="strips" class="com.hcetavaj.solr.Strip" />

After this, don't forget to add the jar file to the classpath or to the shared lib folder of the collection.

Once the configuration is done, you can use the newly registered value source parser alongside geofilt. Just add

_strips_:strips("1,4,5") 

to your fl while keeping the rest of the geofilt params the same, for the case mentioned above. You can also use it for sorting, which gives results grouped like the second image above.
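Putting the request together, here is a rough sketch in plain Java that only assembles the query URL. The host, the collection name restaurants, the field name location and the coordinates are all hypothetical placeholders, and the sort clause is my guess at how the function would be used for sorting; only the fl value comes from the example above.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Sketch only: builds a select URL as a string, it does not talk to Solr.
final class StripsQuerySketch {

    /** Joins the parameters into a URL-encoded query string. */
    static String buildQuery(String baseUrl, Map<String, String> params) {
        StringJoiner joiner = new StringJoiner("&", baseUrl + "?", "");
        for (Map.Entry<String, String> e : params.entrySet()) {
            joiner.add(e.getKey() + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
        }
        return joiner.toString();
    }

    /** Example parameters; every value except fl is a hypothetical placeholder. */
    static Map<String, String> stripsParams() {
        Map<String, String> p = new LinkedHashMap<>();
        p.put("q", "*:*");
        p.put("sfield", "location");                  // hypothetical spatial field
        p.put("pt", "41.01,28.97");                   // hypothetical "where we are"
        p.put("fl", "id,_strips_:strips(\"1,4,5\")"); // strip index as a pseudo-field
        p.put("sort", "strips(\"1,4,5\") asc");       // assumed sort usage
        return p;
    }

    public static void main(String[] args) {
        // Hypothetical host and collection name.
        String base = "http://localhost:8983/solr/restaurants/select";
        System.out.println(buildQuery(base, stripsParams()));
    }
}
```

The quotes and commas inside the function call must be URL-encoded when the request is sent, which is why the sketch runs every value through URLEncoder.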

If you check the code carefully, you will see that it also supports strips of identical width. To use it that way, simply pass two integers: the first is the width and the second is the number of strips. For example, strips(2,5) gives 5 strips, each 2 km wide.
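For equal-width strips the index can be computed directly, without a list of boundaries. The sketch below is my own illustration in plain Java, not the plugin's code: the names are hypothetical, and I am assuming that each strip includes its outer boundary and that anything beyond the last strip shares one overflow index.

```java
// Sketch only: UniformStrips and its methods are hypothetical names.
final class UniformStrips {

    /** Outer boundaries in km, e.g. width 2 and count 5 give 2, 4, 6, 8, 10. */
    static double[] boundaries(int widthKm, int count) {
        double[] result = new double[count];
        for (int i = 0; i < count; i++) {
            result[i] = (double) widthKm * (i + 1);
        }
        return result;
    }

    /**
     * Index of the strip containing distanceKm, counting from 0.
     * Assumptions: the outer boundary is inclusive, and every distance
     * beyond the last strip maps to the overflow index "count".
     */
    static int stripIndex(double distanceKm, int widthKm, int count) {
        if (distanceKm <= 0) {
            return 0;
        }
        int index = (int) Math.ceil(distanceKm / widthKm) - 1;
        return Math.min(index, count);
    }

    public static void main(String[] args) {
        for (double d : new double[] {1.0, 2.1, 9.5, 25.0}) {
            System.out.println(d + " km -> strip " + stripIndex(d, 2, 5));
        }
    }
}
```

With a width of 2 km and 5 strips, everything up to 10 km falls into strips 0 to 4, and anything farther away ends up in the overflow index 5.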
