Saturday, January 30, 2016

Aliases of aliases: myth or reality?

The dangers of aliases of aliases in Elasticsearch

This post introduces Elasticsearch aliases and explains the danger of misusing them.

The aliases

In Elasticsearch, aliases provide a way to give alternative names to an index or a group of indexes. This makes it easy to search over multiple indexes as if they were a single index. Note that indexing a document through an alias only works when the alias points to exactly one index.

Aliases can also be used in your application code in place of the index name, so that you can switch to another index without modifying the source and reindex with zero downtime. With the addition of filters, aliases can give the impression that different small communities each have their own index, by filtering the retrieved documents on some user identifier, while in fact they all share a single index.
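
As a rough sketch of the zero-downtime switch, a single _aliases request can move an alias from an old index to a new one. The names users, users_v1 and users_v2 are hypothetical and only serve the illustration:

 # Hypothetical names: the application only ever talks to "users"
 POST _aliases
 {
     "actions": [
        { "remove": { "index": "users_v1", "alias": "users" } },
        { "add":    { "index": "users_v2", "alias": "users" } }
     ]
 }

Both actions are part of the same request, so the application should never see a moment where the alias points nowhere.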

Aliases of aliases

Going further, you could imagine many use cases where aliases of aliases would provide a way to group several aliases and add additional filters to them. For example, in the shared index case, you might want an alias of aliases that refers to three specific communities and adds a filter to match only the documents after a specific date.

Does it work? Let's try it! Launch a local Elasticsearch instance and run the following requests in Sense:

 DELETE index
 POST index
 {
     "mappings": {
         "user": {
             "properties": {
                 "name": {
                     "type": "string"
                 }
             }
         }
     }
 }
 
 POST _aliases
 {
     "actions": [
        {
           "add": {
              "index": "index",
              "alias": "alias1",
              "filter" : { "term" : { "name" : "helain" } }
           }
        },
        {
           "add": {
              "index": "index",
              "alias": "alias2",
              "filter" : { "term" : { "name" : "nialeh" } }
           }
        }
     ]
 }

 POST _aliases
 {
     "actions": [
        {
           "add": {
              "index": ["alias1","alias2"],
              "alias": "meta-alias"
           }
        }
     ]
 }


With these steps, you have created the index and two aliases of it, alias1 and alias2, each with its own filter. The last operation added an alias, meta-alias, covering alias1 and alias2.

Let's now see our aliases with a GET _aliases:

{
   "index": {
      "aliases": {
         "meta-alias": {},
         "alias2": {
            "filter": {
               "term": {
                  "name": "nialeh"
               }
            }
         },
         "alias1": {
            "filter": {
               "term": {
                  "name": "helain"
               }
            }
         }
      }
   }
}

From the result, we can see that our meta-alias doesn't refer to both our aliases, but only to our index. If you index some documents and try to search on meta-alias, you will see that the filters of alias1 and alias2 aren't applied to your queries.
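
To check this yourself, something along these lines should do. It assumes the index and aliases created above; the three documents, and the name "someone" in particular, are made up for the test:

 # Hypothetical test documents; "someone" matches neither alias filter
 POST index/user
 { "name": "helain" }

 POST index/user
 { "name": "nialeh" }

 POST index/user
 { "name": "someone" }

 # Make the new documents visible to search
 POST index/_refresh

 # Search through a filtered alias
 GET alias1/_search

 # Search through the alias of aliases
 GET meta-alias/_search

If meta-alias inherited the filters of alias1 and alias2, the "someone" document could not appear in the last search. Given the alias listing above, expect to find it among the hits.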

This happens because aliases aren't real indexes: a request on an alias is forwarded to the index it refers to, with the alias filter added if there is one. When we created meta-alias, the alias creation request was sent to alias1 and alias2, which forwarded it to index. Consequently, meta-alias forwards its requests to index only, and the filters of alias1 and alias2 are never applied. If you don't expect this behavior, it can result in the exposure of information you wanted filtered out.

Conclusion

Creating an alias of an alias has the same result as creating an alias directly over the indexes it refers to.
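
If what you actually want is an alias restricted to the documents of both users, one possible approach (a sketch, not the only option) is to define meta-alias on index directly and repeat the filtering yourself, here with a terms filter on the two names used in this post:

 # Sketch: meta-alias carries its own filter instead of relying on alias1/alias2
 POST _aliases
 {
     "actions": [
        {
           "add": {
              "index": "index",
              "alias": "meta-alias",
              "filter" : { "terms" : { "name" : ["helain", "nialeh"] } }
           }
        }
     ]
 }

The drawback is that the filter now lives in two places, so a change to alias1 or alias2 has to be mirrored in meta-alias by hand.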
