Implementing Full-Text Search in Node.js with Elasticsearch
Introduction
Full-text search is a critical feature for modern applications, enabling users to quickly find relevant content across large datasets. While traditional database queries work for exact matches, they fall short when dealing with fuzzy searches, relevance ranking, and complex search requirements.
Elasticsearch is a distributed, RESTful search and analytics engine built on Apache Lucene. It provides powerful full-text search capabilities, real-time indexing, and excellent scalability, making it the go-to solution for search functionality in production applications.
In this comprehensive guide, we'll build a complete full-text search system for a Node.js application that includes:
- Elasticsearch setup and configuration
- Node.js client integration
- Document indexing and management
- Basic and advanced search queries
- Relevance scoring and ranking
- Search result highlighting
- Performance optimization
- Production deployment considerations
By the end of this tutorial, you'll have a production-ready search implementation that can handle complex search requirements efficiently.
Understanding Elasticsearch
What is Elasticsearch?
Elasticsearch is an open-source, distributed search and analytics engine designed for horizontal scalability, reliability, and real-time search. It's built on top of Apache Lucene and provides a simple RESTful API for indexing and searching data.
Key Features:
- Full-text search: Advanced text analysis and search capabilities
- Distributed: Automatically distributes data and queries across nodes
- Real-time: Near real-time indexing and search
- RESTful API: Simple HTTP interface for all operations
- Schema-free: JSON documents with dynamic mapping
- Scalable: Handles petabytes of data across clusters
Core Concepts
Index: Similar to a database in traditional systems, an index is a collection of documents.
Document: A JSON object that represents a single entity (like a blog post, product, or user).
Type: A logical partition of an index. Mapping types were deprecated in 6.x and removed entirely in 8.x; you will only encounter them in older examples.
Mapping: Defines the schema for documents in an index, including field types and analyzers.
Shard: A subset of an index's data. Elasticsearch splits indices into shards for distribution.
Replica: A copy of a shard for redundancy and improved query performance.
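To make shards concrete: Elasticsearch routes each document to a primary shard by hashing its routing key (the document id by default) modulo the shard count. A toy version of that routing in plain JavaScript (real Elasticsearch uses a murmur3 hash; the simple hash here is for illustration only):

```javascript
// Toy shard routing. Elasticsearch computes murmur3(routing) % number_of_primary_shards;
// a simple string hash stands in for murmur3 here.
function simpleHash(str) {
  let h = 0;
  for (const ch of str) {
    h = (h * 31 + ch.charCodeAt(0)) >>> 0; // keep it an unsigned 32-bit integer
  }
  return h;
}

function routeToShard(docId, numberOfShards) {
  return simpleHash(docId) % numberOfShards;
}

// The same id always lands on the same shard, which is why
// number_of_shards cannot be changed after index creation.
console.log(routeToShard('doc-1', 3));
```

Because routing depends on the shard count, resharding means reindexing; choose `number_of_shards` with growth in mind.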
Why Use Elasticsearch for Full-Text Search?
Advantages:
- Fast: Optimized for search performance with inverted indices
- Flexible: Supports complex queries, filters, and aggregations
- Scalable: Handles large datasets and high query volumes
- Relevant: Advanced relevance scoring algorithms
- Analytics: Built-in aggregation capabilities for analytics
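The speed claim rests on the inverted index: instead of scanning every document for a term, the engine keeps a map from each term to the documents containing it. A toy version in plain JavaScript (Lucene adds positions, term frequencies, and compression on top of this idea):

```javascript
// Toy inverted index: maps each term to the set of document ids containing it.
// A search becomes a map lookup instead of a full scan over every document.
function buildInvertedIndex(docs) {
  const index = new Map();
  for (const { id, text } of docs) {
    for (const term of text.toLowerCase().split(/\W+/).filter(Boolean)) {
      if (!index.has(term)) index.set(term, new Set());
      index.get(term).add(id);
    }
  }
  return index;
}

const index = buildInvertedIndex([
  { id: 1, text: 'Elasticsearch is a search engine' },
  { id: 2, text: 'Node.js is a JavaScript runtime' },
]);

console.log([...index.get('search')]); // → [1]
console.log([...index.get('is')]);     // → [1, 2]
```

The lowercasing and tokenizing step above is a crude stand-in for the analyzers configured later in this guide.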
Use Cases:
- E-commerce product search
- Content management systems
- Log analysis and monitoring
- Application search
- Real-time analytics
- Autocomplete and suggestions
Project Setup
Let's start by setting up our Node.js project with Elasticsearch integration.
Initialize the Project
mkdir elasticsearch-nodejs
cd elasticsearch-nodejs
npm init -y
Install Dependencies
# Production dependencies
npm install @elastic/elasticsearch express cors dotenv
# Development dependencies
npm install -D nodemon @types/node
Dependencies explained:
- @elastic/elasticsearch: Official Elasticsearch client for Node.js
- express: Web framework for building the API
- cors: Cross-origin resource sharing middleware (used in app.js)
- dotenv: Environment variables management
Project Structure
elasticsearch-nodejs/
├── src/
│ ├── config/
│ │ └── elasticsearch.js
│ ├── services/
│ │ ├── searchService.js
│ │ └── indexService.js
│ ├── controllers/
│ │ └── searchController.js
│ ├── models/
│ │ └── documentModel.js
│ ├── routes/
│ │ └── search.js
│ └── app.js
├── .env
├── .gitignore
├── package.json
└── server.js
Elasticsearch Setup
Local Installation
Using Docker (Recommended)
# Run Elasticsearch in Docker
docker run -d \
--name elasticsearch \
-p 9200:9200 \
-p 9300:9300 \
-e "discovery.type=single-node" \
-e "xpack.security.enabled=false" \
docker.elastic.co/elasticsearch/elasticsearch:8.11.0
Using Homebrew (macOS)
brew install elasticsearch
brew services start elasticsearch
Verify Installation
curl http://localhost:9200
You should see a response like:
{
"name": "node-1",
"cluster_name": "elasticsearch",
"cluster_uuid": "...",
"version": {
"number": "8.11.0"
}
}
Cloud Options
For production, consider:
- Elastic Cloud: Managed Elasticsearch service
- Amazon OpenSearch Service: AWS's managed offering (an Elasticsearch fork)
- Azure Cognitive Search: Azure's search service
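As a sketch of what the client setup looks like against a managed deployment (the endpoint and credentials below are placeholders; the full local configuration follows in the next section):

```javascript
// Sketch: connecting to a hosted cluster with the official client.
// The endpoint and credentials are placeholders -- substitute your own.
const { Client } = require('@elastic/elasticsearch');

const client = new Client({
  node: 'https://your-cluster.es.us-east-1.aws.cloud.es.io:9243', // placeholder endpoint
  auth: {
    username: process.env.ELASTICSEARCH_USERNAME || 'elastic',
    password: process.env.ELASTICSEARCH_PASSWORD || '',
  },
  // On Elastic Cloud you can instead pass cloud: { id: '<your cloud id>' }
  // together with auth: { apiKey: '<your api key>' }.
});
```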
Elasticsearch Client Configuration
Create the Elasticsearch client configuration:
// src/config/elasticsearch.js
const { Client } = require('@elastic/elasticsearch');
require('dotenv').config();
class ElasticsearchClient {
constructor() {
this.client = new Client({
node: process.env.ELASTICSEARCH_URL || 'http://localhost:9200',
auth: process.env.ELASTICSEARCH_AUTH
? {
username: process.env.ELASTICSEARCH_USERNAME,
password: process.env.ELASTICSEARCH_PASSWORD,
}
: undefined,
tls: process.env.ELASTICSEARCH_SSL === 'true' // note: the v8 client option is 'tls' ('ssl' in v7)
? {
rejectUnauthorized: false, // for self-signed certificates only; keep verification on in production
}
: undefined,
requestTimeout: 60000,
pingTimeout: 3000,
sniffOnStart: false,
sniffInterval: false,
});
}
// Check if Elasticsearch is available
async ping() {
try {
await this.client.ping();
return true;
} catch (error) {
console.error('Elasticsearch connection failed:', error.message);
return false;
}
}
// Get client instance
getClient() {
return this.client;
}
// Health check
async healthCheck() {
try {
const response = await this.client.cluster.health();
return {
status: response.status,
numberOfNodes: response.number_of_nodes,
activeShards: response.active_shards,
};
} catch (error) {
throw new Error(`Health check failed: ${error.message}`);
}
}
}
module.exports = new ElasticsearchClient();
Index Management Service
Create a service to manage Elasticsearch indices:
// src/services/indexService.js
const elasticsearchClient = require('../config/elasticsearch');
class IndexService {
constructor() {
this.client = elasticsearchClient.getClient();
}
// Create an index with mapping
async createIndex(indexName, mapping = {}) {
try {
const indexExists = await this.client.indices.exists({
index: indexName,
});
if (indexExists) {
console.log(`Index ${indexName} already exists`);
return { exists: true, index: indexName };
}
const response = await this.client.indices.create({
index: indexName,
body: {
settings: {
number_of_shards: 1,
number_of_replicas: 1,
analysis: {
analyzer: {
custom_analyzer: {
type: 'custom',
tokenizer: 'standard',
filter: ['lowercase', 'stop', 'snowball'],
},
},
},
},
mappings: {
properties: mapping,
},
},
});
console.log(`Index ${indexName} created successfully`);
return { created: true, index: indexName, response };
} catch (error) {
console.error(`Error creating index ${indexName}:`, error.message);
throw error;
}
}
// Delete an index
async deleteIndex(indexName) {
try {
const indexExists = await this.client.indices.exists({
index: indexName,
});
if (!indexExists) {
return { exists: false, index: indexName };
}
const response = await this.client.indices.delete({
index: indexName,
});
console.log(`Index ${indexName} deleted successfully`);
return { deleted: true, index: indexName, response };
} catch (error) {
console.error(`Error deleting index ${indexName}:`, error.message);
throw error;
}
}
// Update index mapping
async updateMapping(indexName, mapping) {
try {
const response = await this.client.indices.putMapping({
index: indexName,
body: {
properties: mapping,
},
});
console.log(`Mapping updated for index ${indexName}`);
return { updated: true, index: indexName, response };
} catch (error) {
console.error(`Error updating mapping for ${indexName}:`, error.message);
throw error;
}
}
// Get index information
async getIndexInfo(indexName) {
try {
const response = await this.client.indices.get({
index: indexName,
});
return response[indexName];
} catch (error) {
console.error(`Error getting index info for ${indexName}:`, error.message);
throw error;
}
}
// Refresh index (make indexed documents searchable)
async refreshIndex(indexName) {
try {
await this.client.indices.refresh({ index: indexName });
return { refreshed: true, index: indexName };
} catch (error) {
console.error(`Error refreshing index ${indexName}:`, error.message);
throw error;
}
}
}
module.exports = new IndexService();
Document Model
Create a model for documents we'll be indexing:
// src/models/documentModel.js
class Document {
constructor(data) {
this.id = data.id;
this.title = data.title;
this.content = data.content;
this.author = data.author;
this.tags = data.tags || [];
this.category = data.category;
this.createdAt = data.createdAt || new Date().toISOString();
this.updatedAt = data.updatedAt || new Date().toISOString();
this.published = data.published !== undefined ? data.published : true;
}
toJSON() {
return {
id: this.id,
title: this.title,
content: this.content,
author: this.author,
tags: this.tags,
category: this.category,
createdAt: this.createdAt,
updatedAt: this.updatedAt,
published: this.published,
};
}
// Validate document
validate() {
const errors = [];
if (!this.title || this.title.trim().length === 0) {
errors.push('Title is required');
}
if (!this.content || this.content.trim().length === 0) {
errors.push('Content is required');
}
if (!this.author || this.author.trim().length === 0) {
errors.push('Author is required');
}
return {
isValid: errors.length === 0,
errors,
};
}
}
module.exports = Document;
Search Service
Create the main search service with indexing and querying capabilities:
// src/services/searchService.js
const elasticsearchClient = require('../config/elasticsearch');
const indexService = require('./indexService');
class SearchService {
constructor() {
this.client = elasticsearchClient.getClient();
this.indexName = process.env.ELASTICSEARCH_INDEX || 'documents';
}
// Initialize index with proper mapping
async initializeIndex() {
const mapping = {
title: {
type: 'text',
analyzer: 'custom_analyzer',
fields: {
keyword: {
type: 'keyword',
},
suggest: {
type: 'completion', // backs the completion suggester used in autocomplete()
},
},
},
content: {
type: 'text',
analyzer: 'custom_analyzer',
},
author: {
type: 'text',
fields: {
keyword: {
type: 'keyword',
},
},
},
tags: {
type: 'keyword',
},
category: {
type: 'keyword',
},
createdAt: {
type: 'date',
},
updatedAt: {
type: 'date',
},
published: {
type: 'boolean',
},
};
return await indexService.createIndex(this.indexName, mapping);
}
// Index a single document
async indexDocument(document) {
try {
const response = await this.client.index({
index: this.indexName,
id: document.id,
body: {
title: document.title,
content: document.content,
author: document.author,
tags: document.tags,
category: document.category,
createdAt: document.createdAt,
updatedAt: document.updatedAt,
published: document.published,
},
refresh: 'wait_for', // Wait for the document to be searchable
});
return {
success: true,
id: response._id,
result: response.result,
};
} catch (error) {
console.error('Error indexing document:', error.message);
throw error;
}
}
// Bulk index documents
async bulkIndexDocuments(documents) {
try {
const body = documents.flatMap((doc) => [
{
index: {
_index: this.indexName,
_id: doc.id,
},
},
{
title: doc.title,
content: doc.content,
author: doc.author,
tags: doc.tags,
category: doc.category,
createdAt: doc.createdAt,
updatedAt: doc.updatedAt,
published: doc.published,
},
]);
const response = await this.client.bulk({
refresh: true,
body,
});
if (response.errors) {
const erroredDocuments = [];
response.items.forEach((action, i) => {
const operation = Object.keys(action)[0];
if (action[operation].error) {
erroredDocuments.push({
status: action[operation].status,
error: action[operation].error,
document: documents[i],
});
}
});
return {
success: false,
errors: erroredDocuments,
indexed: response.items.length - erroredDocuments.length,
};
}
return {
success: true,
indexed: response.items.length,
};
} catch (error) {
console.error('Error bulk indexing documents:', error.message);
throw error;
}
}
// Update a document
async updateDocument(documentId, updates) {
try {
const response = await this.client.update({
index: this.indexName,
id: documentId,
body: {
doc: updates,
doc_as_upsert: true,
},
refresh: 'wait_for',
});
return {
success: true,
id: response._id,
result: response.result,
};
} catch (error) {
console.error('Error updating document:', error.message);
throw error;
}
}
// Delete a document
async deleteDocument(documentId) {
try {
const response = await this.client.delete({
index: this.indexName,
id: documentId,
refresh: 'wait_for',
});
return {
success: true,
id: response._id,
result: response.result,
};
} catch (error) {
if (error.meta?.statusCode === 404) {
return {
success: false,
message: 'Document not found',
};
}
console.error('Error deleting document:', error.message);
throw error;
}
}
// Basic search
async search(query, options = {}) {
try {
const {
page = 1,
limit = 10,
sort = [],
filters = {},
highlight = true,
} = options;
const from = (page - 1) * limit;
const searchBody = {
query: {
bool: {
must: [
{
multi_match: {
query: query,
fields: ['title^3', 'content^2', 'author'], // Boost title and content
type: 'best_fields',
fuzziness: 'AUTO',
},
},
],
filter: [],
},
},
from,
size: limit,
sort: sort.length > 0 ? sort : [{ _score: { order: 'desc' } }],
};
// Add filters
if (filters.published !== undefined) {
searchBody.query.bool.filter.push({
term: { published: filters.published },
});
}
if (filters.category) {
searchBody.query.bool.filter.push({
term: { category: filters.category },
});
}
if (filters.tags && filters.tags.length > 0) {
searchBody.query.bool.filter.push({
terms: { tags: filters.tags },
});
}
if (filters.author) {
searchBody.query.bool.filter.push({
term: { 'author.keyword': filters.author },
});
}
// Add date range filter
if (filters.dateFrom || filters.dateTo) {
const dateFilter = {};
if (filters.dateFrom) {
dateFilter.gte = filters.dateFrom;
}
if (filters.dateTo) {
dateFilter.lte = filters.dateTo;
}
searchBody.query.bool.filter.push({
range: { createdAt: dateFilter },
});
}
// Add highlighting
if (highlight) {
searchBody.highlight = {
fields: {
title: {},
content: {
fragment_size: 150,
number_of_fragments: 3,
},
},
pre_tags: ['<mark>'],
post_tags: ['</mark>'],
};
}
const response = await this.client.search({
index: this.indexName,
body: searchBody,
});
return {
total: response.hits.total.value,
hits: response.hits.hits.map((hit) => ({
id: hit._id,
score: hit._score,
source: hit._source,
highlight: hit.highlight,
})),
page,
limit,
totalPages: Math.ceil(response.hits.total.value / limit),
};
} catch (error) {
console.error('Search error:', error.message);
throw error;
}
}
// Advanced search with aggregations
async advancedSearch(query, options = {}) {
try {
const {
page = 1,
limit = 10,
aggregations = {},
filters = {},
} = options;
const searchBody = {
query: {
bool: {
must: [
{
multi_match: {
query: query,
fields: ['title^3', 'content^2', 'author'],
type: 'best_fields',
fuzziness: 'AUTO',
},
},
],
filter: [],
},
},
from: (page - 1) * limit,
size: limit,
aggs: {},
};
// Add aggregations
if (aggregations.categories) {
searchBody.aggs.categories = {
terms: {
field: 'category',
size: 10,
},
};
}
if (aggregations.tags) {
searchBody.aggs.tags = {
terms: {
field: 'tags',
size: 20,
},
};
}
if (aggregations.authors) {
searchBody.aggs.authors = {
terms: {
field: 'author.keyword',
size: 10,
},
};
}
if (aggregations.dateHistogram) {
searchBody.aggs.dates = {
date_histogram: {
field: 'createdAt',
calendar_interval: aggregations.dateHistogram.interval || 'month',
},
};
}
// Add filters (same as basic search)
if (filters.published !== undefined) {
searchBody.query.bool.filter.push({
term: { published: filters.published },
});
}
if (filters.category) {
searchBody.query.bool.filter.push({
term: { category: filters.category },
});
}
const response = await this.client.search({
index: this.indexName,
body: searchBody,
});
return {
total: response.hits.total.value,
hits: response.hits.hits.map((hit) => ({
id: hit._id,
score: hit._score,
source: hit._source,
highlight: hit.highlight,
})),
aggregations: response.aggregations,
page,
limit,
totalPages: Math.ceil(response.hits.total.value / limit),
};
} catch (error) {
console.error('Advanced search error:', error.message);
throw error;
}
}
// Autocomplete/suggestions
async autocomplete(query, field = 'title') {
try {
const response = await this.client.search({
index: this.indexName,
body: {
suggest: {
text: query,
title_suggest: {
completion: {
field: `${field}.suggest`,
size: 5,
},
},
},
},
});
return response.suggest.title_suggest[0].options.map((option) => ({
text: option.text,
score: option._score,
source: option._source,
}));
} catch (error) {
console.error('Autocomplete error:', error.message);
throw error;
}
}
// Get document by ID
async getDocumentById(documentId) {
try {
const response = await this.client.get({
index: this.indexName,
id: documentId,
});
return {
id: response._id,
source: response._source,
};
} catch (error) {
if (error.meta?.statusCode === 404) {
return null;
}
throw error;
}
}
// Get similar documents
async getSimilarDocuments(documentId, limit = 5) {
try {
const doc = await this.getDocumentById(documentId);
if (!doc) {
return [];
}
const response = await this.client.search({
index: this.indexName,
body: {
query: {
more_like_this: {
fields: ['title', 'content', 'tags'],
like: [
{
_index: this.indexName,
_id: documentId,
},
],
min_term_freq: 1,
min_doc_freq: 1,
},
},
size: limit,
},
});
return response.hits.hits.map((hit) => ({
id: hit._id,
score: hit._score,
source: hit._source,
}));
} catch (error) {
console.error('Error getting similar documents:', error.message);
throw error;
}
}
}
module.exports = new SearchService();
Search Controller
Create the controller to handle HTTP requests:
// src/controllers/searchController.js
const searchService = require('../services/searchService');
const indexService = require('../services/indexService');
const Document = require('../models/documentModel');
class SearchController {
// Initialize index
async initializeIndex(req, res) {
try {
const result = await searchService.initializeIndex();
res.json({
success: true,
message: 'Index initialized successfully',
data: result,
});
} catch (error) {
res.status(500).json({
success: false,
message: 'Failed to initialize index',
error: error.message,
});
}
}
// Index a document
async indexDocument(req, res) {
try {
const document = new Document(req.body);
const validation = document.validate();
if (!validation.isValid) {
return res.status(400).json({
success: false,
message: 'Validation failed',
errors: validation.errors,
});
}
const result = await searchService.indexDocument(document);
res.status(201).json({
success: true,
message: 'Document indexed successfully',
data: result,
});
} catch (error) {
res.status(500).json({
success: false,
message: 'Failed to index document',
error: error.message,
});
}
}
// Bulk index documents
async bulkIndexDocuments(req, res) {
try {
const { documents } = req.body;
if (!Array.isArray(documents) || documents.length === 0) {
return res.status(400).json({
success: false,
message: 'Documents array is required',
});
}
const validatedDocuments = documents.map((doc) => {
const document = new Document(doc);
const validation = document.validate();
if (!validation.isValid) {
throw new Error(`Invalid document: ${validation.errors.join(', ')}`);
}
return document;
});
const result = await searchService.bulkIndexDocuments(validatedDocuments);
res.status(201).json({
success: true,
message: 'Documents indexed successfully',
data: result,
});
} catch (error) {
res.status(500).json({
success: false,
message: 'Failed to bulk index documents',
error: error.message,
});
}
}
// Update a document
async updateDocument(req, res) {
try {
const { id } = req.params;
const updates = req.body;
const result = await searchService.updateDocument(id, updates);
res.json({
success: true,
message: 'Document updated successfully',
data: result,
});
} catch (error) {
res.status(500).json({
success: false,
message: 'Failed to update document',
error: error.message,
});
}
}
// Delete a document
async deleteDocument(req, res) {
try {
const { id } = req.params;
const result = await searchService.deleteDocument(id);
if (!result.success) {
return res.status(404).json({
success: false,
message: result.message,
});
}
res.json({
success: true,
message: 'Document deleted successfully',
data: result,
});
} catch (error) {
res.status(500).json({
success: false,
message: 'Failed to delete document',
error: error.message,
});
}
}
// Search documents
async search(req, res) {
try {
const { q, page, limit, category, tags, author, published, dateFrom, dateTo } = req.query;
if (!q) {
return res.status(400).json({
success: false,
message: 'Search query is required',
});
}
const options = {
page: parseInt(page) || 1,
limit: parseInt(limit) || 10,
filters: {
...(category && { category }),
...(tags && { tags: tags.split(',') }),
...(author && { author }),
...(published !== undefined && { published: published === 'true' }),
...(dateFrom && { dateFrom }),
...(dateTo && { dateTo }),
},
highlight: true,
};
const result = await searchService.search(q, options);
res.json({
success: true,
data: result,
});
} catch (error) {
res.status(500).json({
success: false,
message: 'Search failed',
error: error.message,
});
}
}
// Advanced search with aggregations
async advancedSearch(req, res) {
try {
const {
q,
page,
limit,
category,
tags,
author,
published,
aggregations,
} = req.body;
if (!q) {
return res.status(400).json({
success: false,
message: 'Search query is required',
});
}
const options = {
page: parseInt(page) || 1,
limit: parseInt(limit) || 10,
filters: {
...(category && { category }),
...(tags && { tags: Array.isArray(tags) ? tags : [tags] }),
...(author && { author }),
...(published !== undefined && { published }),
},
aggregations: aggregations || {
categories: true,
tags: true,
authors: true,
},
};
const result = await searchService.advancedSearch(q, options);
res.json({
success: true,
data: result,
});
} catch (error) {
res.status(500).json({
success: false,
message: 'Advanced search failed',
error: error.message,
});
}
}
// Autocomplete
async autocomplete(req, res) {
try {
const { q, field } = req.query;
if (!q) {
return res.status(400).json({
success: false,
message: 'Query is required',
});
}
const suggestions = await searchService.autocomplete(q, field);
res.json({
success: true,
data: suggestions,
});
} catch (error) {
res.status(500).json({
success: false,
message: 'Autocomplete failed',
error: error.message,
});
}
}
// Get document by ID
async getDocument(req, res) {
try {
const { id } = req.params;
const document = await searchService.getDocumentById(id);
if (!document) {
return res.status(404).json({
success: false,
message: 'Document not found',
});
}
res.json({
success: true,
data: document,
});
} catch (error) {
res.status(500).json({
success: false,
message: 'Failed to get document',
error: error.message,
});
}
}
// Get similar documents
async getSimilarDocuments(req, res) {
try {
const { id } = req.params;
const { limit } = req.query;
const documents = await searchService.getSimilarDocuments(
id,
parseInt(limit) || 5
);
res.json({
success: true,
data: documents,
});
} catch (error) {
res.status(500).json({
success: false,
message: 'Failed to get similar documents',
error: error.message,
});
}
}
}
module.exports = new SearchController();
Routes Setup
Create the routes:
// src/routes/search.js
const express = require('express');
const searchController = require('../controllers/searchController');
const router = express.Router();
// Index management
router.post('/index/initialize', searchController.initializeIndex);
// Document operations
router.post('/documents', searchController.indexDocument);
router.post('/documents/bulk', searchController.bulkIndexDocuments);
router.put('/documents/:id', searchController.updateDocument);
router.delete('/documents/:id', searchController.deleteDocument);
router.get('/documents/:id', searchController.getDocument);
// Search operations
router.get('/search', searchController.search);
router.post('/search/advanced', searchController.advancedSearch);
router.get('/autocomplete', searchController.autocomplete);
router.get('/documents/:id/similar', searchController.getSimilarDocuments);
module.exports = router;
Application Setup
Create the main application file:
// src/app.js
const express = require('express');
const cors = require('cors');
require('dotenv').config();
const searchRoutes = require('./routes/search');
const elasticsearchClient = require('./config/elasticsearch');
const app = express();
// Middleware
app.use(cors());
app.use(express.json({ limit: '10mb' }));
app.use(express.urlencoded({ extended: true }));
// Health check
app.get('/health', async (req, res) => {
try {
const isConnected = await elasticsearchClient.ping();
const health = await elasticsearchClient.healthCheck();
res.json({
success: true,
elasticsearch: {
connected: isConnected,
...health,
},
timestamp: new Date().toISOString(),
});
} catch (error) {
res.status(503).json({
success: false,
message: 'Service unavailable',
error: error.message,
});
}
});
// Routes
app.use('/api', searchRoutes);
// 404 handler
app.use('*', (req, res) => {
res.status(404).json({
success: false,
message: 'Route not found',
});
});
// Error handler
app.use((error, req, res, next) => {
console.error('Unhandled error:', error);
res.status(error.status || 500).json({
success: false,
message: error.message || 'Internal server error',
...(process.env.NODE_ENV === 'development' && { stack: error.stack }),
});
});
module.exports = app;
Server entry point:
// server.js
const app = require('./src/app');
const PORT = process.env.PORT || 3000;
app.listen(PORT, () => {
console.log(`Server running on port ${PORT}`);
console.log(`Environment: ${process.env.NODE_ENV || 'development'}`);
});
Environment Configuration
Create your environment configuration:
# .env
# Server Configuration
PORT=3000
NODE_ENV=development
# Elasticsearch Configuration
ELASTICSEARCH_URL=http://localhost:9200
ELASTICSEARCH_INDEX=documents
ELASTICSEARCH_SSL=false
# Optional: For secured Elasticsearch
# ELASTICSEARCH_USERNAME=elastic
# ELASTICSEARCH_PASSWORD=your-password
Testing the API
Package.json Scripts
Add these scripts to your package.json:
{
"scripts": {
"start": "node server.js",
"dev": "nodemon server.js",
"test": "echo \"Error: no test specified\" && exit 1"
}
}
Testing with cURL
Start your server:
npm run dev
Initialize the index:
curl -X POST http://localhost:3000/api/index/initialize
Index a document:
curl -X POST http://localhost:3000/api/documents \
-H "Content-Type: application/json" \
-d '{
"id": "1",
"title": "Getting Started with Elasticsearch",
"content": "Elasticsearch is a powerful search engine built on Apache Lucene...",
"author": "John Doe",
"tags": ["elasticsearch", "search", "tutorial"],
"category": "technology",
"published": true
}'
Search documents:
curl "http://localhost:3000/api/search?q=elasticsearch&page=1&limit=10"
Advanced search:
curl -X POST http://localhost:3000/api/search/advanced \
-H "Content-Type: application/json" \
-d '{
"q": "elasticsearch",
"page": 1,
"limit": 10,
"category": "technology",
"published": true,
"aggregations": {
"categories": true,
"tags": true
}
}'
Bulk index documents:
curl -X POST http://localhost:3000/api/documents/bulk \
-H "Content-Type: application/json" \
-d '{
"documents": [
{
"id": "2",
"title": "Node.js Best Practices",
"content": "Node.js is a JavaScript runtime...",
"author": "Jane Smith",
"tags": ["nodejs", "javascript"],
"category": "programming",
"published": true
},
{
"id": "3",
"title": "Introduction to Full-Text Search",
"content": "Full-text search allows users to search...",
"author": "John Doe",
"tags": ["search", "tutorial"],
"category": "technology",
"published": true
}
]
}'
Advanced Features
Custom Analyzers
Create custom analyzers for better text processing:
// Custom analyzer configuration
const customAnalyzer = {
settings: {
analysis: {
analyzer: {
custom_analyzer: {
type: 'custom',
tokenizer: 'standard',
filter: [
'lowercase',
'stop',
'snowball',
'asciifolding', // Remove accents
],
},
ngram_analyzer: {
type: 'custom',
tokenizer: 'standard',
filter: ['lowercase', 'ngram_filter'],
},
},
filter: {
ngram_filter: {
type: 'ngram',
min_gram: 2,
max_gram: 15,
},
},
},
},
};
Fuzzy Search
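Fuzziness is measured in Levenshtein edit distance: `AUTO` allows 0 edits for terms of 1-2 characters, 1 edit for 3-5, and 2 edits beyond that. A toy edit-distance function in plain JavaScript shows what is being counted (Elasticsearch additionally treats adjacent transpositions as a single edit by default):

```javascript
// Levenshtein distance: the minimum number of single-character insertions,
// deletions, or substitutions needed to turn string a into string b.
function levenshtein(a, b) {
  const dp = Array.from({ length: a.length + 1 }, (_, i) =>
    Array.from({ length: b.length + 1 }, (_, j) => (i === 0 ? j : j === 0 ? i : 0))
  );
  for (let i = 1; i <= a.length; i++) {
    for (let j = 1; j <= b.length; j++) {
      dp[i][j] = Math.min(
        dp[i - 1][j] + 1,                                   // deletion
        dp[i][j - 1] + 1,                                   // insertion
        dp[i - 1][j - 1] + (a[i - 1] === b[j - 1] ? 0 : 1)  // substitution
      );
    }
  }
  return dp[a.length][b.length];
}

// Two substitutions here (plain Levenshtein); Elasticsearch would count
// this adjacent transposition as a single edit by default.
console.log(levenshtein('elasticsearch', 'elasticsaerch')); // → 2
```

So with `fuzziness: 'AUTO'`, a typo like the one above still matches because the term is long enough to allow two edits.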
Implement fuzzy search for typo tolerance:
// Fuzzy search example
async fuzzySearch(query, options = {}) {
const searchBody = {
query: {
multi_match: {
query: query,
fields: ['title^3', 'content^2'],
fuzziness: 'AUTO', // or '1', '2', etc.
prefix_length: 2, // Minimum prefix length for fuzzy matching
},
},
};
const response = await this.client.search({
index: this.indexName,
body: searchBody,
});
return response.hits.hits;
}
Phrase Matching
Search for exact phrases:
// Phrase search example
async phraseSearch(query, options = {}) {
const searchBody = {
query: {
match_phrase: {
content: {
query: query,
slop: 2, // Allow words to be out of order by 2 positions
},
},
},
};
const response = await this.client.search({
index: this.indexName,
body: searchBody,
});
return response.hits.hits;
}
Faceted Search
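Conceptually, a terms aggregation is a group-by count over a field of the matching documents. A toy version in plain JavaScript, returning the same bucket shape Elasticsearch does:

```javascript
// Toy 'terms' aggregation: count matching documents per field value,
// sorted by count descending -- the same { key, doc_count } bucket shape
// Elasticsearch returns in response.aggregations.
function termsAgg(docs, field) {
  const counts = new Map();
  for (const doc of docs) {
    const values = Array.isArray(doc[field]) ? doc[field] : [doc[field]];
    for (const v of values) counts.set(v, (counts.get(v) || 0) + 1);
  }
  return [...counts.entries()]
    .map(([key, doc_count]) => ({ key, doc_count }))
    .sort((a, b) => b.doc_count - a.doc_count);
}

const docs = [
  { category: 'technology', tags: ['search', 'tutorial'] },
  { category: 'technology', tags: ['nodejs'] },
  { category: 'programming', tags: ['nodejs', 'tutorial'] },
];

console.log(termsAgg(docs, 'category'));
// → [ { key: 'technology', doc_count: 2 }, { key: 'programming', doc_count: 1 } ]
```

The difference in Elasticsearch is that this counting happens per shard over the inverted index, so facets come back with the search results in a single round trip.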
Implement faceted search with aggregations:
// Faceted search example
async facetedSearch(query, facets = {}) {
const searchBody = {
query: {
multi_match: {
query: query,
fields: ['title^3', 'content^2'],
},
},
aggs: {
categories: {
terms: { field: 'category' },
},
tags: {
terms: { field: 'tags', size: 20 },
},
price_ranges: {
range: {
field: 'price',
ranges: [
{ to: 50 },
{ from: 50, to: 100 },
{ from: 100 },
],
},
},
},
};
const response = await this.client.search({
index: this.indexName,
body: searchBody,
});
return {
results: response.hits.hits,
facets: response.aggregations,
};
}
Performance Optimization
1. Index Settings
Optimize index settings for your use case:
const optimizedSettings = {
settings: {
number_of_shards: 3, // Distribute across shards
number_of_replicas: 1, // For redundancy
refresh_interval: '30s', // Reduce refresh frequency for better indexing performance
index: {
max_result_window: 50000, // Increase max result window if needed
},
},
};
2. Bulk Operations
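The bulk API body that bulkIndexDocuments built earlier is an alternating sequence of action-metadata and document-source entries, read pairwise by Elasticsearch. A minimal builder makes that shape explicit (fields trimmed for brevity):

```javascript
// Build a bulk request body: for each document, one action/metadata object
// followed immediately by the document source. Elasticsearch pairs them up.
function buildBulkBody(indexName, documents) {
  return documents.flatMap((doc) => [
    { index: { _index: indexName, _id: doc.id } }, // action + metadata
    { title: doc.title, content: doc.content },    // source
  ]);
}

const body = buildBulkBody('documents', [
  { id: '1', title: 'A', content: '...' },
  { id: '2', title: 'B', content: '...' },
]);

console.log(body.length); // → 4 (two documents -> four entries)
```

This pairing is why a bulk request of N documents is one HTTP round trip instead of N, which dominates the performance difference shown below.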
Always use bulk operations for multiple documents:
// Good: Bulk indexing
await searchService.bulkIndexDocuments(documents);
// Bad: Individual indexing
for (const doc of documents) {
await searchService.indexDocument(doc);
}
3. Connection Pooling
Configure connection pooling:
const client = new Client({
node: process.env.ELASTICSEARCH_URL,
maxRetries: 3,
requestTimeout: 60000,
sniffOnStart: true,
sniffInterval: 60000,
sniffOnConnectionFault: true,
});
4. Caching
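One caveat when caching by query: JSON.stringify produces different strings for `{ q, page }` and `{ page, q }`, so logically identical searches can miss the cache. A key helper that sorts object keys (plain JavaScript sketch) avoids that:

```javascript
// Stable stringify: recursively sort object keys so logically equal
// query objects always produce the same cache key.
function stableStringify(value) {
  if (Array.isArray(value)) {
    return `[${value.map(stableStringify).join(',')}]`;
  }
  if (value !== null && typeof value === 'object') {
    const entries = Object.keys(value)
      .sort()
      .map((k) => `${JSON.stringify(k)}:${stableStringify(value[k])}`);
    return `{${entries.join(',')}}`;
  }
  return JSON.stringify(value);
}

console.log(
  stableStringify({ q: 'node', page: 1 }) === stableStringify({ page: 1, q: 'node' })
); // → true
```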
Implement caching for frequent queries:
const NodeCache = require('node-cache');
const cache = new NodeCache({ stdTTL: 600 }); // 10 minutes
async searchWithCache(query, options) {
const cacheKey = JSON.stringify({ query, options });
const cached = cache.get(cacheKey);
if (cached) {
return cached;
}
const result = await this.search(query, options);
cache.set(cacheKey, result);
return result;
}
5. Pagination
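from/size forces Elasticsearch to collect and discard `from + size` documents per shard, which is why deep paging is capped at 10,000 results by default. search_after instead resumes from the sort values of the last hit, like a cursor. A toy version over an in-memory sorted array shows the semantics:

```javascript
// Toy search_after: hits are sorted by (score desc, id asc); each page
// resumes strictly after the sort values of the previous page's last hit.
function pageAfter(sortedHits, cursor, size) {
  const start = cursor
    ? sortedHits.findIndex(
        (h) => h.score < cursor[0] || (h.score === cursor[0] && h.id > cursor[1])
      )
    : 0;
  const hits = start === -1 ? [] : sortedHits.slice(start, start + size);
  const last = hits[hits.length - 1];
  return { hits, nextCursor: last ? [last.score, last.id] : null };
}

const sorted = [
  { id: 'a', score: 3 },
  { id: 'b', score: 2 },
  { id: 'c', score: 2 },
  { id: 'd', score: 1 },
];

const page1 = pageAfter(sorted, null, 2);             // a, b
const page2 = pageAfter(sorted, page1.nextCursor, 2); // c, d
console.log(page2.hits.map((h) => h.id)); // → [ 'c', 'd' ]
```

Note the second sort key: without a unique tiebreaker, documents with equal scores could be skipped or repeated across pages.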
Use search_after for deep pagination instead of from/size:
// For deep pagination (beyond 10,000 results)
async searchWithSearchAfter(query, searchAfter = null) {
const searchBody = {
query: {
multi_match: {
query: query,
fields: ['title^3', 'content^2'],
},
},
size: 100,
sort: [{ _score: 'desc' }, { createdAt: 'asc' }], // tiebreaker field; sorting on _id is disallowed in 8.x
};
if (searchAfter) {
searchBody.search_after = searchAfter;
}
const response = await this.client.search({
index: this.indexName,
body: searchBody,
});
const lastHit = response.hits.hits[response.hits.hits.length - 1];
// Reuse the sort values Elasticsearch returns with each hit as the cursor
const nextSearchAfter = lastHit ? lastHit.sort : null;
return {
hits: response.hits.hits,
nextSearchAfter,
};
}
Production Deployment
Environment Variables
# Production .env
NODE_ENV=production
PORT=3000
# Elasticsearch Cloud
ELASTICSEARCH_URL=https://your-cluster.es.region.cloud.es.io:9243
ELASTICSEARCH_USERNAME=elastic
ELASTICSEARCH_PASSWORD=your-secure-password
ELASTICSEARCH_SSL=true
ELASTICSEARCH_INDEX=documents-prod
Error Handling
Implement comprehensive error handling:
class ElasticsearchError extends Error {
constructor(message, statusCode, originalError) {
super(message);
this.name = 'ElasticsearchError';
this.statusCode = statusCode;
this.originalError = originalError;
}
}
async handleElasticsearchError(error) {
if (error.meta?.statusCode === 404) {
throw new ElasticsearchError('Resource not found', 404, error);
}
if (error.meta?.statusCode === 429) {
throw new ElasticsearchError('Rate limit exceeded', 429, error);
}
if (error.meta?.statusCode >= 500) {
throw new ElasticsearchError('Elasticsearch server error', 500, error);
}
throw new ElasticsearchError('Elasticsearch error', 500, error);
}
Monitoring
Add monitoring and logging:
// Add request logging
app.use((req, res, next) => {
const start = Date.now();
res.on('finish', () => {
const duration = Date.now() - start;
console.log({
method: req.method,
url: req.url,
status: res.statusCode,
duration: `${duration}ms`,
});
});
next();
});
// Add Elasticsearch query logging
async searchWithLogging(query, options) {
const start = Date.now();
try {
const result = await this.search(query, options);
const duration = Date.now() - start;
console.log({
query,
duration: `${duration}ms`,
results: result.total,
});
return result;
} catch (error) {
const duration = Date.now() - start;
console.error({
query,
duration: `${duration}ms`,
error: error.message,
});
throw error;
}
}
Health Checks
Implement comprehensive health checks:
app.get('/health/detailed', async (req, res) => {
try {
const [ping, health, clusterStats] = await Promise.all([
elasticsearchClient.ping(),
elasticsearchClient.healthCheck(),
elasticsearchClient.getClient().cluster.stats(),
]);
res.json({
status: ping ? 'healthy' : 'unhealthy',
elasticsearch: {
connected: ping,
cluster: health,
stats: clusterStats,
},
timestamp: new Date().toISOString(),
});
} catch (error) {
res.status(503).json({
status: 'unhealthy',
error: error.message,
});
}
});
Best Practices
1. Index Naming
Use consistent naming conventions:
// Good: Environment-specific indices
const indexName = `documents-${process.env.NODE_ENV}`;
// Good: Date-based indices for time-series data
const indexName = `logs-${new Date().toISOString().split('T')[0]}`;
2. Document Structure
Keep documents focused and avoid nested objects when possible:
// Good: Flat structure
{
"title": "Article Title",
"content": "Content...",
"author": "John Doe",
"tags": ["tag1", "tag2"]
}
// Avoid: Deeply nested structures
{
"article": {
"metadata": {
"author": {
"name": "John Doe"
}
}
}
}
3. Field Mapping
Define explicit mappings for better performance:
// Always define mappings explicitly
const mapping = {
title: {
type: 'text',
analyzer: 'custom_analyzer',
fields: {
keyword: { type: 'keyword' }, // For exact matches
},
},
createdAt: {
type: 'date',
format: 'strict_date_optional_time||epoch_millis',
},
};
4. Query Optimization
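A small helper makes the query-vs-filter split shown below easy to apply consistently: scoring clauses go in `must`, exact-match conditions in `filter`, where Elasticsearch can cache them. A sketch — the function and field names are illustrative:

```javascript
// Sketch: build a bool query that keeps scoring clauses in `must`
// and cacheable exact-match conditions in `filter`.
function buildSearchQuery(text, filters = {}) {
  const query = { bool: { must: [], filter: [] } };
  if (text) {
    // Scoring clause: contributes to relevance ranking
    query.bool.must.push({
      multi_match: { query: text, fields: ['title^3', 'content^2'] },
    });
  }
  for (const [field, value] of Object.entries(filters)) {
    // Filter clause: exact match, cached by Elasticsearch, no scoring
    query.bool.filter.push(
      Array.isArray(value)
        ? { terms: { [field]: value } }
        : { term: { [field]: value } }
    );
  }
  return query;
}
```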
Optimize queries for performance:
// Use filters instead of queries when possible
// Filters are cached and faster
{
query: {
bool: {
must: [
{ match: { title: 'search term' } } // Query (for scoring)
],
filter: [
{ term: { published: true } } // Filter (cached, no scoring)
]
}
}
}
5. Index Aliases
Use aliases for zero-downtime reindexing:
// Create alias
await client.indices.putAlias({
index: 'documents-v1',
name: 'documents',
});
// Switch alias to new index
await client.indices.updateAliases({
body: {
actions: [
{ remove: { index: 'documents-v1', alias: 'documents' } },
{ add: { index: 'documents-v2', alias: 'documents' } },
],
},
});
Conclusion
You now have a comprehensive full-text search system for your Node.js application that includes:
✅ Elasticsearch integration
✅ Document indexing and management
✅ Basic and advanced search queries
✅ Relevance scoring and ranking
✅ Search result highlighting
✅ Autocomplete and suggestions
✅ Similar document recommendations
✅ Performance optimization techniques
✅ Production-ready deployment practices
Key Takeaways
- Index Design: Proper index mapping and settings are crucial for performance
- Query Optimization: Use filters for exact matches, queries for relevance scoring
- Bulk Operations: Always use bulk operations for multiple documents
- Monitoring: Implement comprehensive logging and health checks
- Scalability: Design for horizontal scaling from the start
Next Steps
- Implement search analytics and tracking
- Add search result personalization
- Implement A/B testing for search relevance
- Set up automated index optimization
- Add support for multi-language search
- Implement search result caching strategies
- Consider implementing Elasticsearch's machine learning features for better relevance
This search implementation provides a solid foundation for building powerful search capabilities in your Node.js applications. Remember to monitor performance, optimize queries, and scale your Elasticsearch cluster as your data grows.