How to Implement Elasticsearch When Developing a Rails Web App

Storing and searching information are two of the most crucial aspects of any web application. They affect the overall success of your project, and a Rails web app is no exception.

A web product can contain huge amounts of data, which makes storing and searching it challenging. That is why it pays to rely on convenient and powerful search algorithms. This is where Elasticsearch comes in handy.

In this article, we will walk you through the whole process of developing a test Ruby on Rails app with Elasticsearch integration.

Defining the terms

Before jumping into the Ruby on Rails web application development process and implementing search algorithms, let’s discuss the key terms and install the tools and services needed.

Elasticsearch is an extremely fast, open-source, JSON-based search service. It allows storing, scanning, and analyzing the required data within milliseconds. The service is built to handle complex search requests and requirements.

That is why Elasticsearch is used by well-known organizations such as NASA, Microsoft, eBay, Uber, GitHub, Facebook, Warner Brothers, and others.

Now let’s take a closer look at the main terms of Elasticsearch.

Mapping. A process of defining the way both a document and its fields are stored and indexed.

Indexing. The act of storing data in Elasticsearch. An Elasticsearch cluster can contain multiple indices, which in turn contain mapping types (indices created in Elasticsearch 6.x are limited to a single type each).

Analysis process. A process of converting text into tokens or terms that are added to the inverted index for searching. The analysis is performed by an analyzer, which can be either a built-in analyzer or a custom analyzer defined per index.

Analyzer. A package of three building blocks, each of which modifies the input stream. An analyzer includes character filters, a tokenizer, and token filters.

Elasticsearch Analysis  process | Codica

The flow of indexing a document can be presented the following way:

1) Character filters. First, the text passes through one or several character filters. A character filter receives the original text of a field and transforms it by adding, deleting, or modifying characters. For example, it can remove HTML markup from the text. The full list of character filters can be found here.

2) Tokenizer. After that, the tokenizer splits the text into tokens, which are usually words. For example, a ten-word text is divided into an array of 10 tokens. An analyzer has exactly one tokenizer. The standard tokenizer is applied by default. It splits text on word boundaries and removes most punctuation, such as periods, commas, and semicolons. You can find the list of all available tokenizers here.

3) Token filters. Token filters are similar to character filters. The main difference is that token filters work with the token stream, while character filters work with the character stream. There are various token filters; the Lowercase Token Filter is the simplest one. Find the full list of all available token filters here.

Analysis process in Elasticsearch: document indexing | Codica
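To make the three stages more tangible, here is a minimal plain-Ruby sketch that imitates them. It is not Elasticsearch code: the ToyAnalyzer module and its constants are hypothetical names, and the regular expressions are a rough approximation of what the built-in components do.

```ruby
# A toy, plain-Ruby imitation of the three analysis stages.
# ToyAnalyzer and its constants are hypothetical names, not Elasticsearch APIs.
module ToyAnalyzer
  # 1) Character filter: replace HTML tags with spaces.
  STRIP_HTML = ->(text) { text.gsub(/<[^>]*>/, ' ') }

  # 2) Tokenizer: keep runs of letters and digits, dropping punctuation.
  WORD_TOKENIZER = ->(text) { text.scan(/[[:alnum:]]+/) }

  # 3) Token filter: lowercase every token.
  LOWERCASE = ->(tokens) { tokens.map(&:downcase) }

  def self.analyze(text,
                   char_filters: [STRIP_HTML],
                   tokenizer: WORD_TOKENIZER,
                   token_filters: [LOWERCASE])
    # Character filters work on the raw character stream.
    filtered = char_filters.reduce(text) { |acc, filter| filter.call(acc) }
    # Exactly one tokenizer turns the text into a token stream.
    tokens = tokenizer.call(filtered)
    # Token filters work on the token stream.
    token_filters.reduce(tokens) { |acc, filter| filter.call(acc) }
  end
end

p ToyAnalyzer.analyze('<p>I am a Ruby programmer.</p>')
# => ["i", "am", "a", "ruby", "programmer"]
```

The order matters: character filters see the raw string, the single tokenizer produces the tokens, and only then do token filters run.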

Inverted index. The results of the analysis are stored in an inverted index. The purpose of an inverted index is to store text in a structure that allows for very efficient full-text searches. When performing full-text searches, we are actually querying an inverted index, not the original documents.

Each full-text field has its own inverted index.

An inverted index includes all of the unique terms that appear in any document covered by the index.

Let’s take a look at two sentences. The first one is “I am a Ruby programmer”. The second one is “This project was built in Ruby”.

In the inverted index, they will be saved as follows:

Inverted index
Term         Document #1   Document #2
I            x             -
am           x             -
a            x             -
Ruby         x             x
programmer   x             -
this         -             x
project      -             x
was          -             x
built        -             x
in           -             x

If we search for “Ruby”, we will see that both documents contain the term.

Inverted index
Term         Document #1   Document #2
I            x             -
am           x             -
a            x             -
Ruby         x             x             <- matches both documents
programmer   x             -
this         -             x
project      -             x
was          -             x
built        -             x
in           -             x
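The same idea can be sketched in a few lines of plain Ruby. This is only an illustration under simplifying assumptions: build_inverted_index and search_term are hypothetical helpers, terms are lowercased before indexing, and Elasticsearch itself relies on far more compact structures.

```ruby
# Toy inverted index: each term maps to the ids of the documents containing it.
# build_inverted_index and search_term are hypothetical helper names.
def build_inverted_index(documents)
  index = Hash.new { |hash, term| hash[term] = [] }
  documents.each do |doc_id, text|
    # Lowercase, split into words, and record each unique term once per document.
    text.downcase.scan(/[[:alnum:]]+/).uniq.each do |term|
      index[term] << doc_id
    end
  end
  index
end

# Look a term up in the index; unknown terms yield an empty list.
def search_term(index, term)
  index.fetch(term.downcase, [])
end

DOCUMENTS = {
  1 => 'I am a Ruby programmer',
  2 => 'This project was built in Ruby'
}.freeze

index = build_inverted_index(DOCUMENTS)
p search_term(index, 'Ruby')       # => [1, 2]
p search_term(index, 'programmer') # => [1]
```

A lookup is a single hash access on the term, which is why the search does not need to scan the documents themselves.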

Step #1: Installing the tools

Before we start writing actual code, we need a set of tools and services. We also use ready-made libraries called gems to speed up the development process.

Install Ruby 2.6.1

We will use RVM to manage multiple Ruby versions installed on our system. Check which Ruby versions are installed with rvm list, then install the required one with rvm install 2.6.1.

Install Rails 5.2.3

With Ruby already installed, we will need the Rails gem as well.

    
    gem install rails -v 5.2.3
  

To check the version of the installed Rails gem, run rails -v.

Install Elasticsearch 6.4.0

As we have saved Elasticsearch to the Downloads folder, we run the service with the following command:

    
    ~/Downloads/elasticsearch-6.4.0/bin/elasticsearch
  

To make sure the service has started, open http://localhost:9200/ in a browser.

At this point, we see the following on the screen:

{
  "name": "v5W3xjV",
  "cluster_name": "elasticsearch",
  "cluster_uuid": "aNCXXbKyTkSIAlNNZTDc3A",
  "version": {
    "number": "6.4.0",
    "build_flavor": "default",
    "build_type": "tar",
    "build_hash": "595516e",
    "build_date": "2018-08-17T23:18:47.308994Z",
    "build_snapshot": false,
    "lucene_version": "7.4.0",
    "minimum_wire_compatibility_version": "5.6.0",
    "minimum_index_compatibility_version": "5.0.0"
  },
  "tagline": "You Know, for Search"
}

Install Kibana 6.4.2

You can download Kibana here. We have saved Kibana to the Downloads folder. We run the service with the following command:

    
    ~/Downloads/kibana-6.4.2-linux-x86_64/bin/kibana
  

In order to be sure that Kibana is running, navigate to http://localhost:5601/.

At this step, we see the window:

Installing Kibana for web app development | Codica

All the needed tools and services are installed. Now you are ready to start with the project development and Elasticsearch integration.

Step #2: Initiating a new Rails app

We are going to use Rails in API mode together with a PostgreSQL database:

    
    rvm use 2.6.1
    rails new elasticsearch_rails --api -T -d postgresql
    cd elasticsearch_rails
    bundle install
  

The first thing to do is to configure the database. At this point, we modify config/database.yml so that it looks similar to this:

default: &default
  adapter: postgresql
  encoding: unicode
  pool: <%= ENV.fetch("RAILS_MAX_THREADS") { 5 } %>
  username: postgres

development:
  <<: *default
  database: elasticsearch_rails_development

test:
  <<: *default
  database: elasticsearch_rails_test

production:
  <<: *default
  database: elasticsearch_rails_production
  username: elasticsearch_rails
  password: <%= ENV['DB_PASSWORD'] %>

Finally, we create the database with rails db:create.

We also need to build a model which we will index and make searchable. Let’s create a Location model with two fields, name and level:

    
    rails generate model location name level
  

After we have created the table, we run the migration with the rails db:migrate command.

We have prepared all the test data needed. Copy the contents of the following file, insert it into db/seeds.rb, and run rails db:seed.

Step #3: Using Elasticsearch with Rails

To integrate the search engine into the Rails application, we need to add two gems to the Gemfile:

    
    gem 'elasticsearch-model'
    gem 'elasticsearch-rails'
  

Don’t forget to run bundle install to install these gems.

Now we are ready to add actual functionality to the location model. For this purpose, we use the so-called concerns.

We create a new app/models/concerns/searchable.rb file.

The next step is to add the following code:

module Searchable
  extend ActiveSupport::Concern

  included do
    include Elasticsearch::Model
    include Elasticsearch::Model::Callbacks
  end
end

Finally, we include the created module in the Location model:

class Location < ApplicationRecord
  include Searchable
end

Here is what these steps accomplish:

  • With Elasticsearch::Model module, we add Elasticsearch integration to the model.
  • With Elasticsearch::Model::Callbacks, we add callbacks. Why is it important? Each time an object is saved, updated or deleted, the related indexed data gets updated accordingly, too.

The last thing we need to do is to index our model.

Open the Rails console with rails c and run Location.import force: true.

The force: true option creates the index if it doesn't exist and recreates it if it does. To check whether the index has been built, open Kibana Dev Tools at http://localhost:5601/ and run GET _cat/indices?v.

As you can see, we have created the index with the name locations:

Implementing Elasticsearch: Kibana dev tools screenshot | Codica

Since the index was built automatically, the default configuration was applied to all fields.

Now it is time to develop a test query. You can find more information about Elasticsearch Query DSL here.

Navigate to http://localhost:5601 and open Kibana Dev Tools.

Afterwards, insert the following code:

GET locations/_search
{
  "query": {
    "match_all": {}
  }
}

Implementing Elasticsearch: Making query in Kibana dev tools | Codica

The hits attribute of the response’s JSON, and especially the _source attribute of each hit, are the first things to look at. As you can see, all fields in the Location model were serialized and indexed.

We can also run a test query from the Rails app. Open the rails c console and enter:

results = Location.search('san')
results.map(&:name) # => ["san francisco", "american samoa"]


Step #4: Building a custom index with autocomplete functionality

Before creating a new index, we need to delete the previous one. To do so, open rails c and run Location.__elasticsearch__.delete_index!. This removes the previous index.

The next step is to edit the app/models/concerns/searchable.rb file so that it looks like this:

module Searchable
  extend ActiveSupport::Concern

  included do
    include Elasticsearch::Model
    include Elasticsearch::Model::Callbacks

    def as_indexed_json(_options = {})
      as_json(only: %i[name level])
    end

    settings settings_attributes do
      mappings dynamic: false do
        indexes :name, type: :text, analyzer: :autocomplete
        indexes :level, type: :keyword
      end
    end

    def self.search(query, filters)
      set_filters = lambda do |context_type, filter|
        @search_definition[:query][:bool][context_type] |= [filter]
      end

      @search_definition = {
        size: 5,
        query: {
          bool: {
            must: [],
            should: [],
            filter: []
          }
        }
      }

      if query.blank?
        set_filters.call(:must, match_all: {})
      else
        set_filters.call(
          :must,
          match: {
            name: {
              query: query,
              fuzziness: 1
            }
          }
        )
      end

      if filters[:level].present?
        set_filters.call(:filter, term: { level: filters[:level] })
      end

      __elasticsearch__.search(@search_definition)
    end
  end

  class_methods do
    def settings_attributes
      {
        index: {
          analysis: {
            analyzer: {
              autocomplete: {
                type: :custom,
                tokenizer: :standard,
                filter: %i[lowercase autocomplete]
              }
            },
            filter: {
              autocomplete: {
                type: :edge_ngram,
                min_gram: 2,
                max_gram: 25
              }
            }
          }
        }
      }
    end
  end
end

In this code snippet, we serialize our model attributes to JSON with the as_indexed_json method.

We will work only with two fields, i.e. name and level:

def as_indexed_json(_options = {})
  as_json(only: %i[name level])
end

We are going to define the index configuration:

settings settings_attributes do
  mappings dynamic: false do
    # we use the custom autocomplete analyzer defined in settings_attributes below
    indexes :name, type: :text, analyzer: :autocomplete
    indexes :level, type: :keyword
  end
end

def settings_attributes
  {
    index: {
      analysis: {
        analyzer: {
          # we define a custom analyzer named autocomplete
          autocomplete: {
            # type should be custom for custom analyzers
            type: :custom,
            # we use the standard tokenizer
            tokenizer: :standard,
            # we apply two token filters
            # autocomplete is the custom token filter defined below
            filter: %i[lowercase autocomplete]
          }
        },
        filter: {
          # we define a custom token filter named autocomplete
          autocomplete: {
            type: :edge_ngram,
            min_gram: 2,
            max_gram: 25
          }
        }
      }
    }
  }
end

Here we define a custom analyzer named autocomplete with standard tokenizer and with lowercase and autocomplete filters.

The autocomplete filter is of the edge_ngram type. The edge_ngram token filter splits each token into smaller parts (grams) that all start at the beginning of the token.

For example, the word “ruby” will be split into [“ru”, “rub”, “ruby”].
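As a rough illustration of what this filter produces, the following plain-Ruby sketch generates the prefixes of a single token. The edge_ngrams helper is a hypothetical name and only imitates the behaviour; its defaults mirror the min_gram and max_gram values from our settings.

```ruby
# Hypothetical helper that imitates an edge_ngram token filter for one token.
# Defaults mirror the min_gram / max_gram values used in settings_attributes.
def edge_ngrams(token, min_gram: 2, max_gram: 25)
  # Grams can be no longer than the token itself.
  upper = [token.length, max_gram].min
  # Every gram starts at the first character and grows by one character.
  (min_gram..upper).map { |length| token[0, length] }
end

p edge_ngrams('ruby') # => ["ru", "rub", "ruby"]
```

Because every prefix of at least two characters is indexed, a partially typed word such as “ru” can already match the document.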

Edge n-grams are useful when we need to implement autocomplete functionality. However, there is another way to implement it, the so-called completion suggester approach.

We apply mappings to the name and level fields. The keyword data type is used with the level field. The text data type is applied to the name field along with our custom autocomplete analyzer.

And finally, we will explain the search method we use:

def self.search(query, filters)
  # a lambda function adds conditions to a search definition
  set_filters = lambda do |context_type, filter|
    @search_definition[:query][:bool][context_type] |= [filter]
  end

  @search_definition = {
    # we indicate that there should be no more than 5 documents to return
    size: 5,
    # we define an empty query with the ability to
    # dynamically change the definition
    # Query DSL https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl.html
    query: {
      bool: {
        must: [],
        should: [],
        filter: []
      }
    }
  }

  # match all documents
  if query.blank?
    set_filters.call(:must, match_all: {})
  else
    set_filters.call(
      :must,
      match: {
        name: {
          query: query,
          # fuzziness means you can make one typo and still match your document
          fuzziness: 1
        }
      }
    )
  end

  # the system will return only those documents that pass this filter
  if filters[:level].present?
    set_filters.call(:filter, term: { level: filters[:level] })
  end

  __elasticsearch__.search(@search_definition)
end
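To inspect the request body without a running Elasticsearch instance, the construction of the definition can be imitated in plain Ruby. The sketch below is an assumption-laden variant, not the method from our concern: build_search_definition is a hypothetical helper, it replaces blank? and present? with plain-Ruby checks, and it returns a fresh hash instead of storing the definition in an instance variable.

```ruby
require 'json'

# Hypothetical pure-Ruby helper mirroring the body built by the search method.
# It only builds the hash and never talks to Elasticsearch.
def build_search_definition(query, filters = {})
  must =
    if query.nil? || query.strip.empty?
      # No query text: match every document.
      [{ match_all: {} }]
    else
      # Full-text match on name, tolerating one edit per term.
      [{ match: { name: { query: query, fuzziness: 1 } } }]
    end

  # The term filter is added only when a level is given.
  level = filters[:level]
  filter = level.nil? || level.to_s.empty? ? [] : [{ term: { level: level } }]

  { size: 5, query: { bool: { must: must, should: [], filter: filter } } }
end

puts JSON.pretty_generate(build_search_definition('new', { level: 'state' }))
```

The printed JSON can be pasted after GET locations/_search in Kibana Dev Tools to try the query by hand.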

Now it is time to open the Rails console and run the following request to make sure the project works correctly:

rails c
results = Location.search('san francisco', {})
results.map(&:name) # => ["san francisco", "american samoa"]

It is also a good idea to run a request that contains a few typos to make sure that fuzzy matching works as expected:

results = Location.search('Asan francus', {})
results.map(&:name) # => ["san francisco"]
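Fuzziness is expressed as an edit distance between terms. To get a feeling for what a distance of 1 means, here is a minimal plain-Ruby sketch of the classic Levenshtein distance. The levenshtein helper is a hypothetical name, and Elasticsearch's own implementation differs in details (for instance, by default it counts a swap of two adjacent characters as a single edit), so treat this only as an approximation.

```ruby
# Classic Levenshtein distance computed row by row.
# levenshtein is a hypothetical helper, not part of any Elasticsearch gem.
def levenshtein(a, b)
  # Distances from the empty prefix of a to every prefix of b.
  previous = (0..b.length).to_a
  a.each_char.with_index(1) do |char_a, i|
    current = [i]
    b.each_char.with_index(1) do |char_b, j|
      cost = char_a == char_b ? 0 : 1
      current << [
        previous[j] + 1,       # deletion
        current[j - 1] + 1,    # insertion
        previous[j - 1] + cost # substitution (or match)
      ].min
    end
    previous = current
  end
  previous.last
end

p levenshtein('asan', 'san')        # => 1
p levenshtein('francus', 'francis') # => 1
```

Keep in mind that in our index the terms are also lowercased and split into edge n-grams, so the query text is compared against prefixes of the indexed words rather than against whole names.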

As you remember, we have one filter defined. It is used to filter Location records by level. There are two objects in the database with the same name, New York, but with different levels: one is a state and the other is a city:

results = Location.search('new york', { level: 'state' })
results.map { |result| { name: result.name, level: result.level } } # [{:name=>"new york", :level=>"state"}]
results = Location.search('new york', { level: 'city' })
results.map { |result| { name: result.name, level: result.level } } # [{:name=>"new york", :level=>"city"}]

Step #5: Making the search request available by API

In the final development stage, we will create a controller through which the search queries will pass:

    
    rails generate controller Home search
  

Open app/controllers/home_controller.rb and insert the following code snippet in it:

class HomeController < ApplicationController
  def search
    results = Location.search(search_params[:q], search_params)

    locations = results.map do |r|
      r.merge(r.delete('_source')).merge('id': r.delete('_id'))
    end

    render json: { locations: locations }, status: :ok
  end

  private

  def search_params
    params.permit(:q, :level)
  end
end
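The map block in the controller lifts the fields stored under _source to the top level of each hit and copies _id to id. The plain-Ruby sketch below shows the same reshaping on an ordinary Hash without mutating it. The flatten_hit helper is a hypothetical name, and the real results are objects returned by the gem rather than plain hashes, so this is an illustration only.

```ruby
# Hypothetical helper operating on a plain Hash shaped like one search hit.
def flatten_hit(hit)
  source = hit.fetch('_source', {})
  hit.reject { |key, _value| key == '_source' } # drop the nested document
     .merge(source)                             # lift its fields to the top level
     .merge('id' => hit['_id'])                 # expose the document id as id
end

hit = {
  '_index' => 'locations',
  '_id' => '41',
  '_score' => 3.676841,
  '_source' => { 'name' => 'new york', 'level' => 'state' }
}

p flatten_hit(hit)
```

The resulting hash keeps the Elasticsearch metadata keys next to name, level, and id, which matches the shape of the JSON returned by the endpoint.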

Let's see the project in action.

Run the Rails server by typing rails s and then go to http://localhost:3000/home/search?q=new&level=state.

With this request, we ask for all documents whose name contains “new” and whose level is equal to state.

This is what the response looks like:

{
  "locations": [
    {
      "_index": "locations",
      "_type": "_doc",
      "_id": "41",
      "_score": 3.676841,
      "name": "new york",
      "level": "state",
      "id": "41"
    },
    {
      "_index": "locations",
      "_type": "_doc",
      "_id": "17",
      "_score": 3.5186555,
      "name": "new jersey",
      "level": "state",
      "id": "17"
    },
    {
      "_index": "locations",
      "_type": "_doc",
      "_id": "10",
      "_score": 2.7157228,
      "name": "new hampshire",
      "level": "state",
      "id": "10"
    }
  ]
}

Congratulations! Your test Rails web app is ready, with the basic functionality of the searching service integrated.

Summary

We hope that our guide was helpful, and we highly recommend exploring the full range of Elasticsearch capabilities to improve your development skills.

Elasticsearch is a perfect tool for integrating a fast full-text search with powerful features:

  • Speed plays an important role in providing a customer with a positive user experience.
  • Flexibility lets you tune the search behavior for various datasets and use cases.
  • If a user makes a typo in a search, Elasticsearch still returns relevant results for what the customer is looking for.
  • The service makes it possible to search both for specific keywords and other matching data stored in your database.

Elasticsearch has powerful features and works smoothly with Ruby on Rails app development. What could be better?
