Saturday, 14 March 2015

Aggregating by day-of-week or month-of-year in Elastic

Aggregations are a useful feature of Elastic (previously named Elasticsearch) for summarizing data and computing statistics over it. With aggregations, getting the average price of some products or the number of posts made each week is easy.

Still, no built-in aggregation exists for grouping documents by day-of-week or month-of-year in the current version of Elastic. I've seen and answered many questions on Stack Overflow relating to this issue.

A scripted aggregation can, however, solve this with the following code:

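// doc['created_time'].value is the date as epoch milliseconds
Date date = new Date(doc['created_time'].value);
java.text.SimpleDateFormat format = new java.text.SimpleDateFormat('EEE');
// the last expression is the value the script returns for each document
format.format(date)

The 'EEE' pattern returns the abbreviated day of the week (for example, Sat in an English locale). Use 'HH' for the hour of day (00 to 23) or 'MMM' for the abbreviated month name.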
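
To see what each pattern yields before putting it into a query, the same formatting can be tried in plain Java. This is only a minimal sketch: the class name DatePatternDemo, the helper format, and the sample timestamp are made up for illustration. It pins the locale to English and the time zone to UTC so the output is predictable; the script above sets neither, so SimpleDateFormat falls back to the defaults of the JVM it runs in.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical demo class, not part of Elastic.
class DatePatternDemo {

    // Formats an epoch-millisecond timestamp with a SimpleDateFormat pattern,
    // using a fixed locale and time zone (an assumption made for repeatable output).
    static String format(long epochMillis, String pattern) {
        SimpleDateFormat fmt = new SimpleDateFormat(pattern, Locale.ENGLISH);
        fmt.setTimeZone(TimeZone.getTimeZone("UTC"));
        return fmt.format(new Date(epochMillis));
    }

    public static void main(String[] args) {
        long ts = 1426323600000L; // 2015-03-14T09:00:00Z, a Saturday

        System.out.println(format(ts, "EEE")); // Sat
        System.out.println(format(ts, "HH"));  // 09
        System.out.println(format(ts, "MMM")); // Mar
    }
}
```

If the buckets look shifted by a few hours, the time zone is the first thing to check.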

Then put the script into your aggregation's JSON:
{
    "aggs": {
        "perWeekDay": {
            "terms": {
                "script": "Date date = new Date(doc['created_time'].value); java.text.SimpleDateFormat format = new java.text.SimpleDateFormat('EEE'); format.format(date)"
            }
        }
    }
}
And it's done.
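
Since only the field name and the pattern change between the day-of-week, hour-of-day and month-of-year variants, the request body above can be generated rather than copied. The following is a rough sketch, not an Elastic API: DateTermsAggBuilder, script and aggregation are hypothetical names, and the string concatenation assumes that the aggregation name, field and pattern contain no quotes or backslashes.

```java
// Hypothetical helper that assembles the aggregation JSON shown above.
class DateTermsAggBuilder {

    // Builds the one-line script for a given date field and SimpleDateFormat pattern.
    static String script(String field, String pattern) {
        return "Date date = new Date(doc['" + field + "'].value); "
             + "java.text.SimpleDateFormat format = new java.text.SimpleDateFormat('" + pattern + "'); "
             + "format.format(date)";
    }

    // Wraps the script in a terms aggregation with the given name.
    // Assumption: inputs need no JSON escaping (no double quotes or backslashes).
    static String aggregation(String name, String field, String pattern) {
        return "{\"aggs\":{\"" + name + "\":{\"terms\":{\"script\":\""
             + script(field, pattern) + "\"}}}}";
    }

    public static void main(String[] args) {
        // Same aggregation as above, then a month-of-year variant.
        System.out.println(aggregation("perWeekDay", "created_time", "EEE"));
        System.out.println(aggregation("perMonth", "created_time", "MMM"));
    }
}
```

Keep in mind that a terms aggregation sorts its buckets by document count by default, so the days or months will not necessarily come back in calendar order.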