Implementing Full-Text Search in Node.js with Elasticsearch
Introduction
Full-text search is a critical feature for modern applications, enabling users to quickly find relevant content across large datasets. While traditional database queries work for exact matches, they fall short when dealing with fuzzy searches, relevance ranking, and complex search requirements.
Elasticsearch is a distributed, RESTful search and analytics engine built on Apache Lucene. It provides powerful full-text search capabilities, real-time indexing, and excellent scalability, making it the go-to solution for search functionality in production applications.
In this comprehensive guide, we'll build a complete full-text search system for a Node.js application that includes:
- Elasticsearch setup and configuration
- Node.js client integration
- Document indexing and management
- Basic and advanced search queries
- Relevance scoring and ranking
- Search result highlighting
- Performance optimization
- Production deployment considerations
By the end of this tutorial, you'll have a production-ready search implementation that can handle complex search requirements efficiently.
Understanding Elasticsearch
What is Elasticsearch?
Elasticsearch is an open-source, distributed search and analytics engine designed for horizontal scalability, reliability, and real-time search. It's built on top of Apache Lucene and provides a simple RESTful API for indexing and searching data.
Key Features:
- Full-text search: Advanced text analysis and search capabilities
- Distributed: Automatically distributes data and queries across nodes
- Real-time: Near real-time indexing and search
- RESTful API: Simple HTTP interface for all operations
- Schema-free: JSON documents with dynamic mapping
- Scalable: Handles petabytes of data across clusters
Core Concepts
Index: Similar to a database in traditional systems, an index is a collection of documents.
Document: A JSON object that represents a single entity (like a blog post, product, or user).
Type: A logical partition of an index. Mapping types were deprecated in 6.x and removed entirely in 8.x; you will only encounter them in older examples.
Mapping: Defines the schema for documents in an index, including field types and analyzers.
Shard: A subset of an index's data. Elasticsearch splits indices into shards for distribution.
Replica: A copy of a shard for redundancy and improved query performance.
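To make shards concrete: Elasticsearch routes each document to a primary shard by hashing its routing key (the document id by default) modulo the shard count. A toy version of that routing in plain JavaScript (real Elasticsearch uses a murmur3 hash; the simple hash here is for illustration only):

```javascript
// Toy shard routing. Elasticsearch computes murmur3(routing) % number_of_primary_shards;
// a simple string hash stands in for murmur3 here.
function simpleHash(str) {
  let h = 0;
  for (const ch of str) {
    h = (h * 31 + ch.charCodeAt(0)) >>> 0; // keep it an unsigned 32-bit integer
  }
  return h;
}

function routeToShard(docId, numberOfShards) {
  return simpleHash(docId) % numberOfShards;
}

// The same id always lands on the same shard, which is why
// number_of_shards cannot be changed after index creation.
console.log(routeToShard('doc-1', 3));
```

Because routing depends on the shard count, resharding means reindexing; choose `number_of_shards` with growth in mind.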
Why Use Elasticsearch for Full-Text Search?
Advantages:
- Fast: Optimized for search performance with inverted indices
- Flexible: Supports complex queries, filters, and aggregations
- Scalable: Handles large datasets and high query volumes
- Relevant: Advanced relevance scoring algorithms
- Analytics: Built-in aggregation capabilities for analytics
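The speed claim rests on the inverted index: instead of scanning every document for a term, the engine keeps a map from each term to the documents containing it. A toy version in plain JavaScript (Lucene adds positions, term frequencies, and compression on top of this idea):

```javascript
// Toy inverted index: maps each term to the set of document ids containing it.
// A search becomes a map lookup instead of a full scan over every document.
function buildInvertedIndex(docs) {
  const index = new Map();
  for (const { id, text } of docs) {
    for (const term of text.toLowerCase().split(/\W+/).filter(Boolean)) {
      if (!index.has(term)) index.set(term, new Set());
      index.get(term).add(id);
    }
  }
  return index;
}

const index = buildInvertedIndex([
  { id: 1, text: 'Elasticsearch is a search engine' },
  { id: 2, text: 'Node.js is a JavaScript runtime' },
]);

console.log([...index.get('search')]); // → [1]
console.log([...index.get('is')]);     // → [1, 2]
```

The lowercasing and tokenizing step above is a crude stand-in for the analyzers configured later in this guide.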
Use Cases:
- E-commerce product search
- Content management systems
- Log analysis and monitoring
- Application search
- Real-time analytics
- Autocomplete and suggestions
Project Setup
Let's start by setting up our Node.js project with Elasticsearch integration.
Initialize the Project
mkdir elasticsearch-nodejs
cd elasticsearch-nodejs
npm init -y
Install Dependencies
# Production dependencies
npm install @elastic/elasticsearch express cors dotenv
# Development dependencies
npm install -D nodemon @types/node
Dependencies explained:
- @elastic/elasticsearch: Official Elasticsearch client for Node.js
- express: Web framework for building the API
- cors: Cross-origin resource sharing middleware (used in app.js)
- dotenv: Environment variables management
Project Structure
elasticsearch-nodejs/
├── src/
│ ├── config/
│ │ └── elasticsearch.js
│ ├── services/
│ │ ├── searchService.js
│ │ └── indexService.js
│ ├── controllers/
│ │ └── searchController.js
│ ├── models/
│ │ └── documentModel.js
│ ├── routes/
│ │ └── search.js
│ └── app.js
├── .env
├── .gitignore
├── package.json
└── server.js
Elasticsearch Setup
Local Installation
Using Docker (Recommended)
# Run Elasticsearch in Docker
docker run -d \
--name elasticsearch \
-p 9200:9200 \
-p 9300:9300 \
-e "discovery.type=single-node" \
-e "xpack.security.enabled=false" \
docker.elastic.co/elasticsearch/elasticsearch:8.11.0
Using Homebrew (macOS)
brew install elasticsearch
brew services start elasticsearch
Verify Installation
curl http://localhost:9200
You should see a response like:
{
"name": "node-1",
"cluster_name": "elasticsearch",
"cluster_uuid": "...",
"version": {
"number": "8.11.0"
}
}
Cloud Options
For production, consider:
- Elastic Cloud: Managed Elasticsearch service
- Amazon OpenSearch Service: AWS's managed offering (an Elasticsearch fork)
- Azure Cognitive Search: Azure's search service
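As a sketch of what the client setup looks like against a managed deployment (the endpoint and credentials below are placeholders; the full local configuration follows in the next section):

```javascript
// Sketch: connecting to a hosted cluster with the official client.
// The endpoint and credentials are placeholders -- substitute your own.
const { Client } = require('@elastic/elasticsearch');

const client = new Client({
  node: 'https://your-cluster.es.us-east-1.aws.cloud.es.io:9243', // placeholder endpoint
  auth: {
    username: process.env.ELASTICSEARCH_USERNAME || 'elastic',
    password: process.env.ELASTICSEARCH_PASSWORD || '',
  },
  // On Elastic Cloud you can instead pass cloud: { id: '<your cloud id>' }
  // together with auth: { apiKey: '<your api key>' }.
});
```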
Elasticsearch Client Configuration
Create the Elasticsearch client configuration:
// src/config/elasticsearch.js
const { Client } = require('@elastic/elasticsearch');
require('dotenv').config();
class ElasticsearchClient {
constructor() {
this.client = new Client({
node: process.env.ELASTICSEARCH_URL || 'http://localhost:9200',
auth: process.env.ELASTICSEARCH_AUTH
? {
username: process.env.ELASTICSEARCH_USERNAME,
password: process.env.ELASTICSEARCH_PASSWORD,
}
: undefined,
tls: process.env.ELASTICSEARCH_SSL === 'true' // note: the v8 client option is 'tls' ('ssl' in v7)
? {
rejectUnauthorized: false, // for self-signed certificates only; keep verification on in production
}
: undefined,
requestTimeout: 60000,
pingTimeout: 3000,
sniffOnStart: false,
sniffInterval: false,
});
}
// Check if Elasticsearch is available
async ping() {
try {
await this.client.ping();
return true;
} catch (error) {
console.error('Elasticsearch connection failed:', error.message);
return false;
}
}
// Get client instance
getClient() {
return this.client;
}
// Health check
async healthCheck() {
try {
const response = await this.client.cluster.health();
return {
status: response.status,
numberOfNodes: response.number_of_nodes,
activeShards: response.active_shards,
};
} catch (error) {
throw new Error(`Health check failed: ${error.message}`);
}
}
}
module.exports = new ElasticsearchClient();
Index Management Service
Create a service to manage Elasticsearch indices:
// src/services/indexService.js
const elasticsearchClient = require('../config/elasticsearch');
class IndexService {
constructor() {
this.client = elasticsearchClient.getClient();
}
// Create an index with mapping
async createIndex(indexName, mapping = {}) {
try {
const indexExists = await this.client.indices.exists({
index: indexName,
});
if (indexExists) {
console.log(`Index ${indexName} already exists`);
return { exists: true, index: indexName };
}
const response = await this.client.indices.create({
index: indexName,
body: {
settings: {
number_of_shards: 1,
number_of_replicas: 1,
analysis: {
analyzer: {
custom_analyzer: {
type: 'custom',
tokenizer: 'standard',
filter: ['lowercase', 'stop', 'snowball'],
},
},
},
},
mappings: {
properties: mapping,
},
},
});
console.log(`Index ${indexName} created successfully`);
return { created: true, index: indexName, response };
} catch (error) {
console.error(`Error creating index ${indexName}:`, error.message);
throw error;
}
}
// Delete an index
async deleteIndex(indexName) {
try {
const indexExists = await this.client.indices.exists({
index: indexName,
});
if (!indexExists) {
return { exists: false, index: indexName };
}
const response = await this.client.indices.delete({
index: indexName,
});
console.log(`Index ${indexName} deleted successfully`);
return { deleted: true, index: indexName, response };
} catch (error) {
console.error(`Error deleting index ${indexName}:`, error.message);
throw error;
}
}
// Update index mapping
async updateMapping(indexName, mapping) {
try {
const response = await this.client.indices.putMapping({
index: indexName,
body: {
properties: mapping,
},
});
console.log(`Mapping updated for index ${indexName}`);
return { updated: true, index: indexName, response };
} catch (error) {
console.error(`Error updating mapping for ${indexName}:`, error.message);
throw error;
}
}
// Get index information
async getIndexInfo(indexName) {
try {
const response = await this.client.indices.get({
index: indexName,
});
return response[indexName];
} catch (error) {
console.error(`Error getting index info for ${indexName}:`, error.message);
throw error;
}
}
// Refresh index (make indexed documents searchable)
async refreshIndex(indexName) {
try {
await this.client.indices.refresh({ index: indexName });
return { refreshed: true, index: indexName };
} catch (error) {
console.error(`Error refreshing index ${indexName}:`, error.message);
throw error;
}
}
}
module.exports = new IndexService();
Document Model
Create a model for documents we'll be indexing:
// src/models/documentModel.js
class Document {
constructor(data) {
this.id = data.id;
this.title = data.title;
this.content = data.content;
this.author = data.author;
this.tags = data.tags || [];
this.category = data.category;
this.createdAt = data.createdAt || new Date().toISOString();
this.updatedAt = data.updatedAt || new Date().toISOString();
this.published = data.published !== undefined ? data.published : true;
}
toJSON() {
return {
id: this.id,
title: this.title,
content: this.content,
author: this.author,
tags: this.tags,
category: this.category,
createdAt: this.createdAt,
updatedAt: this.updatedAt,
published: this.published,
};
}
// Validate document
validate() {
const errors = [];
if (!this.title || this.title.trim().length === 0) {
errors.push('Title is required');
}
if (!this.content || this.content.trim().length === 0) {
errors.push('Content is required');
}
if (!this.author || this.author.trim().length === 0) {
errors.push('Author is required');
}
return {
isValid: errors.length === 0,
errors,
};
}
}
module.exports = Document;
Search Service
Create the main search service with indexing and querying capabilities:
// src/services/searchService.js
const elasticsearchClient = require('../config/elasticsearch');
const indexService = require('./indexService');
class SearchService {
constructor() {
this.client = elasticsearchClient.getClient();
this.indexName = process.env.ELASTICSEARCH_INDEX || 'documents';
}
// Initialize index with proper mapping
async initializeIndex() {
const mapping = {
title: {
type: 'text',
analyzer: 'custom_analyzer',
fields: {
keyword: {
type: 'keyword',
},
suggest: {
type: 'completion', // backs the completion suggester used in autocomplete()
},
},
},
content: {
type: 'text',
analyzer: 'custom_analyzer',
},
author: {
type: 'text',
fields: {
keyword: {
type: 'keyword',
},
},
},
tags: {
type: 'keyword',
},
category: {
type: 'keyword',
},
createdAt: {
type: 'date',
},
updatedAt: {
type: 'date',
},
published: {
type: 'boolean',
},
};
return await indexService.createIndex(this.indexName, mapping);
}
// Index a single document
async indexDocument(document) {
try {
const response = await this.client.index({
index: this.indexName,
id: document.id,
body: {
title: document.title,
content: document.content,
author: document.author,
tags: document.tags,
category: document.category,
createdAt: document.createdAt,
updatedAt: document.updatedAt,
published: document.published,
},
refresh: 'wait_for', // Wait for the document to be searchable
});
return {
success: true,
id: response._id,
result: response.result,
};
} catch (error) {
console.error('Error indexing document:', error.message);
throw error;
}
}
// Bulk index documents
async bulkIndexDocuments(documents) {
try {
const body = documents.flatMap((doc) => [
{
index: {
_index: this.indexName,
_id: doc.id,
},
},
{
title: doc.title,
content: doc.content,
author: doc.author,
tags: doc.tags,
category: doc.category,
createdAt: doc.createdAt,
updatedAt: doc.updatedAt,
published: doc.published,
},
]);
const response = await this.client.bulk({
refresh: true,
body,
});
if (response.errors) {
const erroredDocuments = [];
response.items.forEach((action, i) => {
const operation = Object.keys(action)[0];
if (action[operation].error) {
erroredDocuments.push({
status: action[operation].status,
error: action[operation].error,
document: documents[i],
});
}
});
return {
success: false,
errors: erroredDocuments,
indexed: response.items.length - erroredDocuments.length,
};
}
return {
success: true,
indexed: response.items.length,
};
} catch (error) {
console.error('Error bulk indexing documents:', error.message);
throw error;
}
}
// Update a document
async updateDocument(documentId, updates) {
try {
const response = await this.client.update({
index: this.indexName,
id: documentId,
body: {
doc: updates,
doc_as_upsert: true,
},
refresh: 'wait_for',
});
return {
success: true,
id: response._id,
result: response.result,
};
} catch (error) {
console.error('Error updating document:', error.message);
throw error;
}
}
// Delete a document
async deleteDocument(documentId) {
try {
const response = await this.client.delete({
index: this.indexName,
id: documentId,
refresh: 'wait_for',
});
return {
success: true,
id: response._id,
result: response.result,
};
} catch (error) {
if (error.meta?.statusCode === 404) {
return {
success: false,
message: 'Document not found',
};
}
console.error('Error deleting document:', error.message);
throw error;
}
}
// Basic search
async search(query, options = {}) {
try {
const {
page = 1,
limit = 10,
sort = [],
filters = {},
highlight = true,
} = options;
const from = (page - 1) * limit;
const searchBody = {
query: {
bool: {
must: [
{
multi_match: {
query: query,
fields: ['title^3', 'content^2', 'author'], // Boost title and content
type: 'best_fields',
fuzziness: 'AUTO',
},
},
],
filter: [],
},
},
from,
size: limit,
sort: sort.length > 0 ? sort : [{ _score: { order: 'desc' } }],
};
// Add filters
if (filters.published !== undefined) {
searchBody.query.bool.filter.push({
term: { published: filters.published },
});
}
if (filters.category) {
searchBody.query.bool.filter.push({
term: { category: filters.category },
});
}
if (filters.tags && filters.tags.length > 0) {
searchBody.query.bool.filter.push({
terms: { tags: filters.tags },
});
}
if (filters.author) {
searchBody.query.bool.filter.push({
term: { 'author.keyword': filters.author },
});
}
// Add date range filter
if (filters.dateFrom || filters.dateTo) {
const dateFilter = {};
if (filters.dateFrom) {
dateFilter.gte = filters.dateFrom;
}
if (filters.dateTo) {
dateFilter.lte = filters.dateTo;
}
searchBody.query.bool.filter.push({
range: { createdAt: dateFilter },
});
}
// Add highlighting
if (highlight) {
searchBody.highlight = {
fields: {
title: {},
content: {
fragment_size: 150,
number_of_fragments: 3,
},
},
pre_tags: ['<mark>'],
post_tags: ['</mark>'],
};
}
const response = await this.client.search({
index: this.indexName,
body: searchBody,
});
return {
total: response.hits.total.value,
hits: response.hits.hits.map((hit) => ({
id: hit._id,
score: hit._score,
source: hit._source,
highlight: hit.highlight,
})),
page,
limit,
totalPages: Math.ceil(response.hits.total.value / limit),
};
} catch (error) {
console.error('Search error:', error.message);
throw error;
}
}
// Advanced search with aggregations
async advancedSearch(query, options = {}) {
try {
const {
page = 1,
limit = 10,
aggregations = {},
filters = {},
} = options;
const searchBody = {
query: {
bool: {
must: [
{
multi_match: {
query: query,
fields: ['title^3', 'content^2', 'author'],
type: 'best_fields',
fuzziness: 'AUTO',
},
},
],
filter: [],
},
},
from: (page - 1) * limit,
size: limit,
aggs: {},
};
// Add aggregations
if (aggregations.categories) {
searchBody.aggs.categories = {
terms: {
field: 'category',
size: 10,
},
};
}
if (aggregations.tags) {
searchBody.aggs.tags = {
terms: {
field: 'tags',
size: 20,
},
};
}
if (aggregations.authors) {
searchBody.aggs.authors = {
terms: {
field: 'author.keyword',
size: 10,
},
};
}
if (aggregations.dateHistogram) {
searchBody.aggs.dates = {
date_histogram: {
field: 'createdAt',
calendar_interval: aggregations.dateHistogram.interval || 'month',
},
};
}
// Add filters (same as basic search)
if (filters.published !== undefined) {
searchBody.query.bool.filter.push({
term: { published: filters.published },
});
}
if (filters.category) {
searchBody.query.bool.filter.push({
term: { category: filters.category },
});
}
const response = await this.client.search({
index: this.indexName,
body: searchBody,
});
return {
total: response.hits.total.value,
hits: response.hits.hits.map((hit) => ({
id: hit._id,
score: hit._score,
source: hit._source,
highlight: hit.highlight,
})),
aggregations: response.aggregations,
page,
limit,
totalPages: Math.ceil(response.hits.total.value / limit),
};
} catch (error) {
console.error('Advanced search error:', error.message);
throw error;
}
}
// Autocomplete/suggestions
async autocomplete(query, field = 'title') {
try {
const response = await this.client.search({
index: this.indexName,
body: {
suggest: {
text: query,
title_suggest: {
completion: {
field: `${field}.suggest`,
size: 5,
},
},
},
},
});
return response.suggest.title_suggest[0].options.map((option) => ({
text: option.text,
score: option._score,
source: option._source,
}));
} catch (error) {
console.error('Autocomplete error:', error.message);
throw error;
}
}
// Get document by ID
async getDocumentById(documentId) {
try {
const response = await this.client.get({
index: this.indexName,
id: documentId,
});
return {
id: response._id,
source: response._source,
};
} catch (error) {
if (error.meta?.statusCode === 404) {
return null;
}
throw error;
}
}
// Get similar documents
async getSimilarDocuments(documentId, limit = 5) {
try {
const doc = await this.getDocumentById(documentId);
if (!doc) {
return [];
}
const response = await this.client.search({
index: this.indexName,
body: {
query: {
more_like_this: {
fields: ['title', 'content', 'tags'],
like: [
{
_index: this.indexName,
_id: documentId,
},
],
min_term_freq: 1,
min_doc_freq: 1,
},
},
size: limit,
},
});
return response.hits.hits.map((hit) => ({
id: hit._id,
score: hit._score,
source: hit._source,
}));
} catch (error) {
console.error('Error getting similar documents:', error.message);
throw error;
}
}
}
module.exports = new SearchService();
Search Controller
Create the controller to handle HTTP requests:
// src/controllers/searchController.js
const searchService = require('../services/searchService');
const indexService = require('../services/indexService');
const Document = require('../models/documentModel');
class SearchController {
// Initialize index
async initializeIndex(req, res) {
try {
const result = await searchService.initializeIndex();
res.json({
success: true,
message: 'Index initialized successfully',
data: result,
});
} catch (error) {
res.status(500).json({
success: false,
message: 'Failed to initialize index',
error: error.message,
});
}
}
// Index a document
async indexDocument(req, res) {
try {
const document = new Document(req.body);
const validation = document.validate();
if (!validation.isValid) {
return res.status(400).json({
success: false,
message: 'Validation failed',
errors: validation.errors,
});
}
const result = await searchService.indexDocument(document);
res.status(201).json({
success: true,
message: 'Document indexed successfully',
data: result,
});
} catch (error) {
res.status(500).json({
success: false,
message: 'Failed to index document',
error: error.message,
});
}
}
// Bulk index documents
async bulkIndexDocuments(req, res) {
try {
const { documents } = req.body;
if (!Array.isArray(documents) || documents.length === 0) {
return res.status(400).json({
success: false,
message: 'Documents array is required',
});
}
const validatedDocuments = documents.map((doc) => {
const document = new Document(doc);
const validation = document.validate();
if (!validation.isValid) {
throw new Error(`Invalid document: ${validation.errors.join(', ')}`);
}
return document;
});
const result = await searchService.bulkIndexDocuments(validatedDocuments);
res.status(201).json({
success: true,
message: 'Documents indexed successfully',
data: result,
});
} catch (error) {
res.status(500).json({
success: false,
message: 'Failed to bulk index documents',
error: error.message,
});
}
}
// Update a document
async updateDocument(req, res) {
try {
const { id } = req.params;
const updates = req.body;
const result = await searchService.updateDocument(id, updates);
res.json({
success: true,
message: 'Document updated successfully',
data: result,
});
} catch (error) {
res.status(500).json({
success: false,
message: 'Failed to update document',
error: error.message,
});
}
}
// Delete a document
async deleteDocument(req, res) {
try {
const { id } = req.params;
const result = await searchService.deleteDocument(id);
if (!result.success) {
return res.status(404).json({
success: false,
message: result.message,
});
}
res.json({
success: true,
message: 'Document deleted successfully',
data: result,
});
} catch (error) {
res.status(500).json({
success: false,
message: 'Failed to delete document',
error: error.message,
});
}
}
// Search documents
async search(req, res) {
try {
const { q, page, limit, category, tags, author, published, dateFrom, dateTo } = req.query;
if (!q) {
return res.status(400).json({
success: false,
message: 'Search query is required',
});
}
const options = {
page: parseInt(page) || 1,
limit: parseInt(limit) || 10,
filters: {
...(category && { category }),
...(tags && { tags: tags.split(',') }),
...(author && { author }),
...(published !== undefined && { published: published === 'true' }),
...(dateFrom && { dateFrom }),
...(dateTo && { dateTo }),
},
highlight: true,
};
const result = await searchService.search(q, options);
res.json({
success: true,
data: result,
});
} catch (error) {
res.status(500).json({
success: false,
message: 'Search failed',
error: error.message,
});
}
}
// Advanced search with aggregations
async advancedSearch(req, res) {
try {
const {
q,
page,
limit,
category,
tags,
author,
published,
aggregations,
} = req.body;
if (!q) {
return res.status(400).json({
success: false,
message: 'Search query is required',
});
}
const options = {
page: parseInt(page) || 1,
limit: parseInt(limit) || 10,
filters: {
...(category && { category }),
...(tags && { tags: Array.isArray(tags) ? tags : [tags] }),
...(author && { author }),
...(published !== undefined && { published }),
},
aggregations: aggregations || {
categories: true,
tags: true,
authors: true,
},
};
const result = await searchService.advancedSearch(q, options);
res.json({
success: true,
data: result,
});
} catch (error) {
res.status(500).json({
success: false,
message: 'Advanced search failed',
error: error.message,
});
}
}
// Autocomplete
async autocomplete(req, res) {
try {
const { q, field } = req.query;
if (!q) {
return res.status(400).json({
success: false,
message: 'Query is required',
});
}
const suggestions = await searchService.autocomplete(q, field);
res.json({
success: true,
data: suggestions,
});
} catch (error) {
res.status(500).json({
success: false,
message: 'Autocomplete failed',
error: error.message,
});
}
}
// Get document by ID
async getDocument(req, res) {
try {
const { id } = req.params;
const document = await searchService.getDocumentById(id);
if (!document) {
return res.status(404).json({
success: false,
message: 'Document not found',
});
}
res.json({
success: true,
data: document,
});
} catch (error) {
res.status(500).json({
success: false,
message: 'Failed to get document',
error: error.message,
});
}
}
// Get similar documents
async getSimilarDocuments(req, res) {
try {
const { id } = req.params;
const { limit } = req.query;
const documents = await searchService.getSimilarDocuments(
id,
parseInt(limit) || 5
);
res.json({
success: true,
data: documents,
});
} catch (error) {
res.status(500).json({
success: false,
message: 'Failed to get similar documents',
error: error.message,
});
}
}
}
module.exports = new SearchController();
Routes Setup
Create the routes:
// src/routes/search.js
const express = require('express');
const searchController = require('../controllers/searchController');
const router = express.Router();
// Index management
router.post('/index/initialize', searchController.initializeIndex);
// Document operations
router.post('/documents', searchController.indexDocument);
router.post('/documents/bulk', searchController.bulkIndexDocuments);
router.put('/documents/:id', searchController.updateDocument);
router.delete('/documents/:id', searchController.deleteDocument);
router.get('/documents/:id', searchController.getDocument);
// Search operations
router.get('/search', searchController.search);
router.post('/search/advanced', searchController.advancedSearch);
router.get('/autocomplete', searchController.autocomplete);
router.get('/documents/:id/similar', searchController.getSimilarDocuments);
module.exports = router;
Application Setup
Create the main application file:
// src/app.js
const express = require('express');
const cors = require('cors');
require('dotenv').config();
const searchRoutes = require('./routes/search');
const elasticsearchClient = require('./config/elasticsearch');
const app = express();
// Middleware
app.use(cors());
app.use(express.json({ limit: '10mb' }));
app.use(express.urlencoded({ extended: true }));
// Health check
app.get('/health', async (req, res) => {
try {
const isConnected = await elasticsearchClient.ping();
const health = await elasticsearchClient.healthCheck();
res.json({
success: true,
elasticsearch: {
connected: isConnected,
...health,
},
timestamp: new Date().toISOString(),
});
} catch (error) {
res.status(503).json({
success: false,
message: 'Service unavailable',
error: error.message,
});
}
});
// Routes
app.use('/api', searchRoutes);
// 404 handler
app.use('*', (req, res) => {
res.status(404).json({
success: false,
message: 'Route not found',
});
});
// Error handler
app.use((error, req, res, next) => {
console.error('Unhandled error:', error);
res.status(error.status || 500).json({
success: false,
message: error.message || 'Internal server error',
...(process.env.NODE_ENV === 'development' && { stack: error.stack }),
});
});
module.exports = app;
Server entry point:
// server.js
const app = require('./src/app');
const PORT = process.env.PORT || 3000;
app.listen(PORT, () => {
console.log(`Server running on port ${PORT}`);
console.log(`Environment: ${process.env.NODE_ENV || 'development'}`);
});
Environment Configuration
Create your environment configuration:
# .env
# Server Configuration
PORT=3000
NODE_ENV=development
# Elasticsearch Configuration
ELASTICSEARCH_URL=http://localhost:9200
ELASTICSEARCH_INDEX=documents
ELASTICSEARCH_SSL=false
# Optional: For secured Elasticsearch
# ELASTICSEARCH_USERNAME=elastic
# ELASTICSEARCH_PASSWORD=your-password
Testing the API
Package.json Scripts
Add these scripts to your package.json:
{
"scripts": {
"start": "node server.js",
"dev": "nodemon server.js",
"test": "echo \"Error: no test specified\" && exit 1"
}
}
Testing with cURL
Start your server:
npm run dev
Initialize the index:
curl -X POST http://localhost:3000/api/index/initialize
Index a document:
curl -X POST http://localhost:3000/api/documents \
-H "Content-Type: application/json" \
-d '{
"id": "1",
"title": "Getting Started with Elasticsearch",
"content": "Elasticsearch is a powerful search engine built on Apache Lucene...",
"author": "John Doe",
"tags": ["elasticsearch", "search", "tutorial"],
"category": "technology",
"published": true
}'
Search documents:
curl "http://localhost:3000/api/search?q=elasticsearch&page=1&limit=10"
Advanced search:
curl -X POST http://localhost:3000/api/search/advanced \
-H "Content-Type: application/json" \
-d '{
"q": "elasticsearch",
"page": 1,
"limit": 10,
"category": "technology",
"published": true,
"aggregations": {
"categories": true,
"tags": true
}
}'
Bulk index documents:
curl -X POST http://localhost:3000/api/documents/bulk \
-H "Content-Type: application/json" \
-d '{
"documents": [
{
"id": "2",
"title": "Node.js Best Practices",
"content": "Node.js is a JavaScript runtime...",
"author": "Jane Smith",
"tags": ["nodejs", "javascript"],
"category": "programming",
"published": true
},
{
"id": "3",
"title": "Introduction to Full-Text Search",
"content": "Full-text search allows users to search...",
"author": "John Doe",
"tags": ["search", "tutorial"],
"category": "technology",
"published": true
}
]
}'
Advanced Features
Custom Analyzers
Create custom analyzers for better text processing:
// Custom analyzer configuration
const customAnalyzer = {
settings: {
analysis: {
analyzer: {
custom_analyzer: {
type: 'custom',
tokenizer: 'standard',
filter: [
'lowercase',
'stop',
'snowball',
'asciifolding', // Remove accents
],
},
ngram_analyzer: {
type: 'custom',
tokenizer: 'standard',
filter: ['lowercase', 'ngram_filter'],
},
},
filter: {
ngram_filter: {
type: 'ngram',
min_gram: 2,
max_gram: 15,
},
},
},
},
};
Fuzzy Search
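Fuzziness is measured in Levenshtein edit distance: `AUTO` allows 0 edits for terms of 1-2 characters, 1 edit for 3-5, and 2 edits beyond that. A toy edit-distance function in plain JavaScript shows what is being counted (Elasticsearch additionally treats adjacent transpositions as a single edit by default):

```javascript
// Levenshtein distance: the minimum number of single-character insertions,
// deletions, or substitutions needed to turn string a into string b.
function levenshtein(a, b) {
  const dp = Array.from({ length: a.length + 1 }, (_, i) =>
    Array.from({ length: b.length + 1 }, (_, j) => (i === 0 ? j : j === 0 ? i : 0))
  );
  for (let i = 1; i <= a.length; i++) {
    for (let j = 1; j <= b.length; j++) {
      dp[i][j] = Math.min(
        dp[i - 1][j] + 1,                                   // deletion
        dp[i][j - 1] + 1,                                   // insertion
        dp[i - 1][j - 1] + (a[i - 1] === b[j - 1] ? 0 : 1)  // substitution
      );
    }
  }
  return dp[a.length][b.length];
}

// Two substitutions here (plain Levenshtein); Elasticsearch would count
// this adjacent transposition as a single edit by default.
console.log(levenshtein('elasticsearch', 'elasticsaerch')); // → 2
```

So with `fuzziness: 'AUTO'`, a typo like the one above still matches because the term is long enough to allow two edits.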
Implement fuzzy search for typo tolerance:
// Fuzzy search example
async fuzzySearch(query, options = {}) {
const searchBody = {
query: {
multi_match: {
query: query,
fields: ['title^3', 'content^2'],
fuzziness: 'AUTO', // or '1', '2', etc.
prefix_length: 2, // Minimum prefix length for fuzzy matching
},
},
};
const response = await this.client.search({
index: this.indexName,
body: searchBody,
});
return response.hits.hits;
}
Phrase Matching
Search for exact phrases:
// Phrase search example
async phraseSearch(query, options = {}) {
const searchBody = {
query: {
match_phrase: {
content: {
query: query,
slop: 2, // Allow words to be out of order by 2 positions
},
},
},
};
const response = await this.client.search({
index: this.indexName,
body: searchBody,
});
return response.hits.hits;
}
Faceted Search
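Conceptually, a terms aggregation is a group-by count over a field of the matching documents. A toy version in plain JavaScript, returning the same bucket shape Elasticsearch does:

```javascript
// Toy 'terms' aggregation: count matching documents per field value,
// sorted by count descending -- the same { key, doc_count } bucket shape
// Elasticsearch returns in response.aggregations.
function termsAgg(docs, field) {
  const counts = new Map();
  for (const doc of docs) {
    const values = Array.isArray(doc[field]) ? doc[field] : [doc[field]];
    for (const v of values) counts.set(v, (counts.get(v) || 0) + 1);
  }
  return [...counts.entries()]
    .map(([key, doc_count]) => ({ key, doc_count }))
    .sort((a, b) => b.doc_count - a.doc_count);
}

const docs = [
  { category: 'technology', tags: ['search', 'tutorial'] },
  { category: 'technology', tags: ['nodejs'] },
  { category: 'programming', tags: ['nodejs', 'tutorial'] },
];

console.log(termsAgg(docs, 'category'));
// → [ { key: 'technology', doc_count: 2 }, { key: 'programming', doc_count: 1 } ]
```

The difference in Elasticsearch is that this counting happens per shard over the inverted index, so facets come back with the search results in a single round trip.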
Implement faceted search with aggregations:
// Faceted search example
async facetedSearch(query, facets = {}) {
const searchBody = {
query: {
multi_match: {
query: query,
fields: ['title^3', 'content^2'],
},
},
aggs: {
categories: {
terms: { field: 'category' },
},
tags: {
terms: { field: 'tags', size: 20 },
},
price_ranges: {
range: {
field: 'price',
ranges: [
{ to: 50 },
{ from: 50, to: 100 },
{ from: 100 },
],
},
},
},
};
const response = await this.client.search({
index: this.indexName,
body: searchBody,
});
return {
results: response.hits.hits,
facets: response.aggregations,
};
}
Performance Optimization
1. Index Settings
Optimize index settings for your use case:
const optimizedSettings = {
settings: {
number_of_shards: 3, // Distribute across shards
number_of_replicas: 1, // For redundancy
refresh_interval: '30s', // Reduce refresh frequency for better indexing performance
index: {
max_result_window: 50000, // Increase max result window if needed
},
},
};
2. Bulk Operations
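The bulk API body that bulkIndexDocuments built earlier is an alternating sequence of action-metadata and document-source entries, read pairwise by Elasticsearch. A minimal builder makes that shape explicit (fields trimmed for brevity):

```javascript
// Build a bulk request body: for each document, one action/metadata object
// followed immediately by the document source. Elasticsearch pairs them up.
function buildBulkBody(indexName, documents) {
  return documents.flatMap((doc) => [
    { index: { _index: indexName, _id: doc.id } }, // action + metadata
    { title: doc.title, content: doc.content },    // source
  ]);
}

const body = buildBulkBody('documents', [
  { id: '1', title: 'A', content: '...' },
  { id: '2', title: 'B', content: '...' },
]);

console.log(body.length); // → 4 (two documents -> four entries)
```

This pairing is why a bulk request of N documents is one HTTP round trip instead of N, which dominates the performance difference shown below.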
Always use bulk operations for multiple documents:
// Good: Bulk indexing
await searchService.bulkIndexDocuments(documents);
// Bad: Individual indexing
for (const doc of documents) {
await searchService.indexDocument(doc);
}
3. Connection Pooling
Configure connection pooling:
const client = new Client({
node: process.env.ELASTICSEARCH_URL,
maxRetries: 3,
requestTimeout: 60000,
sniffOnStart: true,
sniffInterval: 60000,
sniffOnConnectionFault: true,
});
4. Caching
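One caveat when caching by query: JSON.stringify produces different strings for `{ q, page }` and `{ page, q }`, so logically identical searches can miss the cache. A key helper that sorts object keys (plain JavaScript sketch) avoids that:

```javascript
// Stable stringify: recursively sort object keys so logically equal
// query objects always produce the same cache key.
function stableStringify(value) {
  if (Array.isArray(value)) {
    return `[${value.map(stableStringify).join(',')}]`;
  }
  if (value !== null && typeof value === 'object') {
    const entries = Object.keys(value)
      .sort()
      .map((k) => `${JSON.stringify(k)}:${stableStringify(value[k])}`);
    return `{${entries.join(',')}}`;
  }
  return JSON.stringify(value);
}

console.log(
  stableStringify({ q: 'node', page: 1 }) === stableStringify({ page: 1, q: 'node' })
); // → true
```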
Implement caching for frequent queries:
const NodeCache = require('node-cache');
const cache = new NodeCache({ stdTTL: 600 }); // 10 minutes
async searchWithCache(query, options) {
const cacheKey = JSON.stringify({ query, options });
const cached = cache.get(cacheKey);
if (cached) {
return cached;
}
const result = await this.search(query, options);
cache.set(cacheKey, result);
return result;
}
5. Pagination
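from/size forces Elasticsearch to collect and discard `from + size` documents per shard, which is why deep paging is capped at 10,000 results by default. search_after instead resumes from the sort values of the last hit, like a cursor. A toy version over an in-memory sorted array shows the semantics:

```javascript
// Toy search_after: hits are sorted by (score desc, id asc); each page
// resumes strictly after the sort values of the previous page's last hit.
function pageAfter(sortedHits, cursor, size) {
  const start = cursor
    ? sortedHits.findIndex(
        (h) => h.score < cursor[0] || (h.score === cursor[0] && h.id > cursor[1])
      )
    : 0;
  const hits = start === -1 ? [] : sortedHits.slice(start, start + size);
  const last = hits[hits.length - 1];
  return { hits, nextCursor: last ? [last.score, last.id] : null };
}

const sorted = [
  { id: 'a', score: 3 },
  { id: 'b', score: 2 },
  { id: 'c', score: 2 },
  { id: 'd', score: 1 },
];

const page1 = pageAfter(sorted, null, 2);             // a, b
const page2 = pageAfter(sorted, page1.nextCursor, 2); // c, d
console.log(page2.hits.map((h) => h.id)); // → [ 'c', 'd' ]
```

Note the second sort key: without a unique tiebreaker, documents with equal scores could be skipped or repeated across pages.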
Use search_after for deep pagination instead of from/size:
// For deep pagination (beyond 10,000 results)
async searchWithSearchAfter(query, searchAfter = null) {
const searchBody = {
query: {
multi_match: {
query: query,
fields: ['title^3', 'content^2'],
},
},
size: 100,
sort: [{ _score: 'desc' }, { createdAt: 'asc' }], // tiebreaker field; sorting on _id is disallowed in 8.x
};
if (searchAfter) {
searchBody.search_after = searchAfter;
}
const response = await this.client.search({
index: this.indexName,
body: searchBody,
});
const lastHit = response.hits.hits[response.hits.hits.length - 1];
// Reuse the sort values Elasticsearch returns with each hit as the cursor
const nextSearchAfter = lastHit ? lastHit.sort : null;
return {
hits: response.hits.hits,
nextSearchAfter,
};
}
Production Deployment
Environment Variables
# Production .env
NODE_ENV=production
PORT=3000
# Elasticsearch Cloud
ELASTICSEARCH_URL=https://your-cluster.es.region.cloud.es.io:9243
ELASTICSEARCH_USERNAME=elastic
ELASTICSEARCH_PASSWORD=your-secure-password
ELASTICSEARCH_SSL=true
ELASTICSEARCH_INDEX=documents-prod
Error Handling
Implement comprehensive error handling:
class ElasticsearchError extends Error {
constructor(message, statusCode, originalError) {
super(message);
this.name = 'ElasticsearchError';
this.statusCode = statusCode;
this.originalError = originalError;
}
}
async handleElasticsearchError(error) {
if (error.meta?.statusCode === 404) {
throw new ElasticsearchError('Resource not found', 404, error);
}
if (error.meta?.statusCode === 429) {
throw new ElasticsearchError('Rate limit exceeded', 429, error);
}
if (error.meta?.statusCode >= 500) {
throw new ElasticsearchError('Elasticsearch server error', 500, error);
}
throw new ElasticsearchError('Elasticsearch error', 500, error);
}
Monitoring
Add monitoring and logging:
// Add request logging
app.use((req, res, next) => {
const start = Date.now();
res.on('finish', () => {
const duration = Date.now() - start;
console.log({
method: req.method,
url: req.url,
status: res.statusCode,
duration: `${duration}ms`,
});
});
next();
});
// Add Elasticsearch query logging
async searchWithLogging(query, options) {
const start = Date.now();
try {
const result = await this.search(query, options);
const duration = Date.now() - start;
console.log({
query,
duration: `${duration}ms`,
results: result.total,
});
return result;
} catch (error) {
const duration = Date.now() - start;
console.error({
query,
duration: `${duration}ms`,
error: error.message,
});
throw error;
}
}
Health Checks
Implement comprehensive health checks:
app.get('/health/detailed', async (req, res) => {
try {
const [ping, health, clusterStats] = await Promise.all([
elasticsearchClient.ping(),
elasticsearchClient.healthCheck(),
elasticsearchClient.getClient().cluster.stats(),
]);
res.json({
status: ping ? 'healthy' : 'unhealthy',
elasticsearch: {
connected: ping,
cluster: health,
stats: clusterStats,
},
timestamp: new Date().toISOString(),
});
} catch (error) {
res.status(503).json({
status: 'unhealthy',
error: error.message,
});
}
});
Best Practices
1. Index Naming
Use consistent naming conventions:
// Good: Environment-specific indices
const indexName = `documents-${process.env.NODE_ENV}`;
// Good: Date-based indices for time-series data
const indexName = `logs-${new Date().toISOString().split('T')[0]}`;
2. Document Structure
Keep documents focused and avoid nested objects when possible:
// Good: Flat structure
{
"title": "Article Title",
"content": "Content...",
"author": "John Doe",
"tags": ["tag1", "tag2"]
}
// Avoid: Deeply nested structures
{
"article": {
"metadata": {
"author": {
"name": "John Doe"
}
}
}
}
3. Field Mapping
Define explicit mappings for better performance:
// Always define mappings explicitly
const mapping = {
title: {
type: 'text',
analyzer: 'custom_analyzer',
fields: {
keyword: { type: 'keyword' }, // For exact matches
},
},
createdAt: {
type: 'date',
format: 'strict_date_optional_time||epoch_millis',
},
};
4. Query Optimization
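A small helper makes the query-vs-filter split shown below easy to apply consistently: scoring clauses go in `must`, exact-match conditions in `filter`, where Elasticsearch can cache them. A sketch — the function and field names are illustrative:

```javascript
// Sketch: build a bool query that keeps scoring clauses in `must`
// and cacheable exact-match conditions in `filter`.
function buildSearchQuery(text, filters = {}) {
  const query = { bool: { must: [], filter: [] } };
  if (text) {
    // Scoring clause: contributes to relevance ranking
    query.bool.must.push({
      multi_match: { query: text, fields: ['title^3', 'content^2'] },
    });
  }
  for (const [field, value] of Object.entries(filters)) {
    // Filter clause: exact match, cached by Elasticsearch, no scoring
    query.bool.filter.push(
      Array.isArray(value)
        ? { terms: { [field]: value } }
        : { term: { [field]: value } }
    );
  }
  return query;
}
```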
Optimize queries for performance:
// Use filters instead of queries when possible
// Filters are cached and faster
{
query: {
bool: {
must: [
{ match: { title: 'search term' } } // Query (for scoring)
],
filter: [
{ term: { published: true } } // Filter (cached, no scoring)
]
}
}
}
5. Index Aliases
Use aliases for zero-downtime reindexing:
// Create alias
await client.indices.putAlias({
index: 'documents-v1',
name: 'documents',
});
// Switch alias to new index
await client.indices.updateAliases({
body: {
actions: [
{ remove: { index: 'documents-v1', alias: 'documents' } },
{ add: { index: 'documents-v2', alias: 'documents' } },
],
},
});
Conclusion
You now have a comprehensive full-text search system for your Node.js application that includes:
✅ Elasticsearch integration
✅ Document indexing and management
✅ Basic and advanced search queries
✅ Relevance scoring and ranking
✅ Search result highlighting
✅ Autocomplete and suggestions
✅ Similar document recommendations
✅ Performance optimization techniques
✅ Production-ready deployment practices
Key Takeaways
- Index Design: Proper index mapping and settings are crucial for performance
- Query Optimization: Use filters for exact matches, queries for relevance scoring
- Bulk Operations: Always use bulk operations for multiple documents
- Monitoring: Implement comprehensive logging and health checks
- Scalability: Design for horizontal scaling from the start
Next Steps
- Implement search analytics and tracking
- Add search result personalization
- Implement A/B testing for search relevance
- Set up automated index optimization
- Add support for multi-language search
- Implement search result caching strategies
- Consider implementing Elasticsearch's machine learning features for better relevance
This search implementation provides a solid foundation for building powerful search capabilities in your Node.js applications. Remember to monitor performance, optimize queries, and scale your Elasticsearch cluster as your data grows.