Nested JSON Objects with Solr

We use Solr to store different types of structured data. Solr works well and feels intuitive as long as every property of an entity has a basic type such as string, number, or date. But the moment we want to index an entity with relations (which is quite common), the intuitiveness of the response has to be compromised. Different teams have different strategies for dealing with this. We tried several approaches and settled on a custom response writer combined with a naming convention in the schema. Note that the following will not help those who have to work with a dynamic schema or in schemaless mode.

To explain this better, I will take a small and well-known entity as an example: Employee 🙂. Let's say the following is the structure of our entity:

Employee

  • Name
  • Age
  • Date of Joining
  • Address
    • Street
    • City
    • Country
  • Projects[]
    • Name
    • Status

Address and Projects are nested objects. Compare the following two JSON structures: the traditional flat way of storing the entity versus a more intuitive JSON (Solr response) for consumers.

Solr default flat structure response
{
    "name" : "Ganesh Gembali",
    "age" : 30,
    "dateOfJoining" : "01-01-2013",
    "addressStreet" : "Adyar",
    "addressCity" : "Chennai",
    "addressCountry" : "India",
    "projectName" : ["MI2", "Learning Solr"],
    "projectStatus" : ["Finished", "In progress"]
}

Intuitive nested JSON response
{
    "name" : "Ganesh Gembali",
    "age" : 30,
    "dateOfJoining" : "01-01-2013",
    "address" : {
        "street" : "Adyar",
        "city" : "Chennai",
        "country" : "India"
    },
    "projects" : [
        {"name" : "MI2", "status" : "Finished"},
        {"name" : "Learning Solr", "status" : "In progress"}
    ]
}

So the goal is to get a response that looks like the second one.

Initially we used a custom response writer that relies on Velocity to render the response in the nested form shown above. But this is not intuitive when querying.

Ex: if we want to find all people in the city of Chennai:
/solr/people/select?q=addressCity:Chennai

The query uses the flat field name, whereas in the response the city appears as a property of an address object. This gets more confusing when there are more levels of nesting.

Final Solution:

The solution is a combination of a naming convention and a custom response writer.
The naming convention is the typical dot notation for accessing properties: every field name for a sub-field/property is prefixed with the name of its parent and a dot. For example, the street of the Address object becomes address.street. With this convention, the field names for our example look like:

  • name
  • age
  • dateOfJoining
  • address.street
  • address.city
  • address.country
  • projects.name
  • projects.status
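
The mapping from the nested entity to these dotted field names can be sketched in a few lines of Python. This is only an illustration of the naming convention, not code from our setup: the helper name flatten is hypothetical, and it assumes every object in a list carries the same properties so that the multi-valued fields stay aligned by position.

```python
def flatten(entity, prefix=""):
    """Flatten a nested dict into dotted field names (hypothetical helper)."""
    flat = {}
    for key, value in entity.items():
        name = prefix + key
        if isinstance(value, dict):
            # Nested object: recurse and prefix children with "parent."
            flat.update(flatten(value, prefix=name + "."))
        elif isinstance(value, list) and value and all(isinstance(v, dict) for v in value):
            # List of objects: one multi-valued field per property.
            # Assumes every item has the same properties, so positions line up.
            for item in value:
                for sub_name, sub_value in flatten(item, prefix=name + ".").items():
                    flat.setdefault(sub_name, []).append(sub_value)
        else:
            # Basic type (string, number, date, list of basic values)
            flat[name] = value
    return flat


employee = {
    "name": "Ganesh Gembali",
    "age": 30,
    "dateOfJoining": "01-01-2013",
    "address": {"street": "Adyar", "city": "Chennai", "country": "India"},
    "projects": [
        {"name": "MI2", "status": "Finished"},
        {"name": "Learning Solr", "status": "In progress"},
    ],
}

flat_employee = flatten(employee)
```

Here flat_employee contains exactly the eight field names listed above, with projects.name and projects.status as parallel lists.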

The custom response writer converts the flat structure into a hierarchical one based on the field names. It behaves much like the standard JSON writer, except that it groups fields and creates a JSON object whenever a field name contains a “.”. As dot notation is familiar to most developers, queries also look intuitive.
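
The grouping step can be sketched as follows. A real writer would be a Java class plugged into Solr; this Python version only shows the idea on a plain dict. The function names nest and _regroup are hypothetical. Turning a group of equal-length lists (such as projects.name and projects.status) into a list of objects is an assumption about how multi-valued sub-fields are handled, and the sketch also assumes that no field is both a leaf and a prefix of another field.

```python
def _regroup(node):
    """Turn a group whose values are equal-length lists into a list of objects."""
    if not isinstance(node, dict):
        return node
    node = {key: _regroup(value) for key, value in node.items()}
    values = list(node.values())
    same_length = len({len(v) for v in values if isinstance(v, list)}) == 1
    if values and all(isinstance(v, list) for v in values) and same_length:
        # Parallel multi-valued fields: build one object per position
        return [dict(zip(node.keys(), row)) for row in zip(*values)]
    return node


def nest(flat_doc):
    """Group dotted field names into nested objects (hypothetical helper)."""
    nested = {}
    for field, value in flat_doc.items():
        parts = field.split(".")
        node = nested
        for part in parts[:-1]:
            # Create the intermediate object on first use
            node = node.setdefault(part, {})
        node[parts[-1]] = value
    # Regroup only the sub-objects, never the document itself
    return {key: _regroup(value) for key, value in nested.items()}


flat_doc = {
    "name": "Ganesh Gembali",
    "age": 30,
    "address.street": "Adyar",
    "address.city": "Chennai",
    "address.country": "India",
    "projects.name": ["MI2", "Learning Solr"],
    "projects.status": ["Finished", "In progress"],
}

nested_doc = nest(flat_doc)
```

With this input, nested_doc has address as a single object and projects as a list of two objects, matching the nested response shown earlier.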

Ex: again, to find people from the same city, now with the dotted field name:
/solr/people/select?q=address.city:Chennai
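
When such a query is built from code, the dotted field name is used as-is. Below is a minimal sketch assuming the standard /select handler path and the people core from the example; the helper select_url is hypothetical and does not escape Solr query syntax, so values containing spaces or special characters would need quoting first.

```python
from urllib.parse import urlencode


def select_url(core, field, value):
    """Build a relative Solr select URL for a simple field:value query."""
    # urlencode percent-encodes the ":" between field and value
    return "/solr/" + core + "/select?" + urlencode({"q": field + ":" + value})


url = select_url("people", "address.city", "Chennai")
print(url)
```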

There could be many other ways to achieve the same result, but we found this one more intuitive and effective to use. If you have any suggestions or other ideas, please share.
