# Introduction to OpenSearch

OpenSearch is a search engine that scales to billions of documents. Its indexes can be composed of multiple fields, each one indexing a different part of the documents. Each field has its own data type, e.g., text, keyword, number, or knn_vector. Text fields use a specific analyzer and a retrieval model.

A server is available on the cluster for this course. You can also set up your own server on your local machine. Docker is a convenient solution: https://opensearch.org/docs/latest/opensearch/install/docker/

## OpenSearch connection settings

The cell below holds the connection settings for the course server. Fill in your user name and password; if you run your own server instead, change the host and port accordingly.


In [1]:
import pprint as pp
import requests

host = '10.10.255.202'
port = 8200
user = '' # Add your user name here.
password = '' # Add your user password here. For testing only. Don't store credentials in code. 
index_name = user


## OpenSearch Python API 

OpenSearch exposes a REST API, which can be accessed with curl or with any HTTP client, such as the Python requests library. For your convenience we will use the Python client. A short introduction is available here:
https://opensearch.org/docs/latest/clients/python/

## Opening and closing a connection
The example below establishes a connection with the server and, if your index is already created, it displays the index settings, mappings and the number of indexed documents.

In [2]:
import pprint as pp
from opensearchpy import OpenSearch
from opensearchpy import helpers

# Optional client certificates if you don't want to use HTTP basic authentication.
# client_cert_path = '/full/path/to/client.pem'
# client_key_path = '/full/path/to/client-key.pem'

# Create the client with SSL/TLS enabled, but hostname verification disabled.
client = OpenSearch(
    hosts = [{'host': host, 'port': port}],
    http_compress = True, # enables gzip compression for request bodies
    http_auth = (user, password),
    # client_cert = client_cert_path,
    # client_key = client_key_path,
    use_ssl = True,
    verify_certs = False,
    ssl_assert_hostname = False,
    ssl_show_warn = False
    #, ca_certs = ca_certs_path
)

if client.indices.exists(index=index_name):

    resp = client.indices.open(index = index_name)
    print(resp)

    print('\n----------------------------------------------------------------------------------- INDEX SETTINGS')
    settings = client.indices.get_settings(index = index_name)
    pp.pprint(settings)

    print('\n----------------------------------------------------------------------------------- INDEX MAPPINGS')
    mappings = client.indices.get_mapping(index = index_name)
    pp.pprint(mappings)

    print('\n----------------------------------------------------------------------------------- INDEX #DOCs')
    print(client.count(index = index_name))
    

{'acknowledged': True, 'shards_acknowledged': True}

----------------------------------------------------------------------------------- INDEX SETTINGS
{'user219': {'settings': {'index': {'creation_date': '1648455050590',
                                    'knn': 'true',
                                    'number_of_replicas': '0',
                                    'number_of_shards': '4',
                                    'provided_name': 'user219',
                                    'refresh_interval': '1s',
                                    'uuid': 'gtX44npLTmiiAFtC7MjeEw',
                                    'version': {'created': '135238227'}}}}}

----------------------------------------------------------------------------------- INDEX MAPPINGS
{'user219': {'mappings': {'properties': {'contents': {'analyzer': 'standard',
                                                      'similarity': 'BM25',
                                                      'type': 'text'},
      

To release resources in the OpenSearch server you should always close the index handle.

In [5]:
resp = client.indices.close(index = index_name, timeout="600s")
print(resp)

{'acknowledged': True, 'shards_acknowledged': True, 'indices': {'user219': {'closed': True}}}


# Index creation and configuration
In this section we will see how to create an index, inspect the configuration and delete an index if needed.

## Create an index with your own settings

Let's first create an index distributed across 4 shards, with no replicas, and with support for knn_vector data types. In terms of indexed data, we define three data properties: doc_id, tags and contents, with data types keyword, keyword and text respectively.

Property type | Description
-----|-----
text|A string sequence of characters that represent full-text values.
keyword|A string sequence of structured characters, such as an email or ZIP code.
boolean|OpenSearch accepts true and false as boolean values. An empty string is equal to false. 
integer|A signed 32-bit number. 
float|A single-precision 32-bit floating point number. 
double|A double-precision 64-bit floating point number.
date|If new string fields match a date's format, then the string is processed as a date field. For example, date: "2012/03/11" is processed as a date.
objects|Objects are standard JSON objects, which can have fields and mappings of their own. For example, a movies object can have additional properties such as title, year, and director.


In [3]:

index_body = {
   "settings":{
      "index":{
         "number_of_replicas":0,
         "number_of_shards":4,
         "refresh_interval":"-1",
         "knn":"true"
      }
   },
   "mappings":{
       "dynamic":      "strict",
       "properties":{
         "doc_id":{
            "type":"keyword"
         },
         "tags":{
            "type":"keyword"
         },
         "contents":{
            "type":"text",
            "analyzer":"standard",
            "similarity":"BM25"
         }
      }
   }
}

if client.indices.exists(index=index_name):
    print("Index already existed. Nothing to be done.")
else:        
    response = client.indices.create(index=index_name, body=index_body)
    print('\nCreating index:')
    print(response)


Index already existed. Nothing to be done.


## Check the indexes, settings and mappings
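Once you create an index, you should verify that it was created according to your requirements. The cell below also sets the refresh interval back to 1s (the index was created with refresh disabled, "-1"), so that newly indexed documents become searchable.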

In [20]:
print('\n----------------------------------------------------------------------------------- INDEX SETTINGS')
index_settings = {
    "settings":{
      "index":{
         "refresh_interval" : "1s"
      }
   }
}
client.indices.put_settings(index = index_name, body = index_settings)
settings = client.indices.get_settings(index = index_name)
pp.pprint(settings)

print('\n----------------------------------------------------------------------------------- INDEX MAPPINGS')
mappings = client.indices.get_mapping(index = index_name)
pp.pprint(mappings)

print('\n----------------------------------------------------------------------------------- INDEX #DOCs')
print(client.count(index = index_name))



----------------------------------------------------------------------------------- INDEX SETTINGS
{'user219': {'settings': {'index': {'creation_date': '1648455050590',
                                    'knn': 'true',
                                    'number_of_replicas': '0',
                                    'number_of_shards': '4',
                                    'provided_name': 'user219',
                                    'refresh_interval': '1s',
                                    'uuid': 'gtX44npLTmiiAFtC7MjeEw',
                                    'version': {'created': '135238227'}}}}}

----------------------------------------------------------------------------------- INDEX MAPPINGS
{'user219': {'mappings': {'properties': {'contents': {'analyzer': 'standard',
                                                      'similarity': 'BM25',
                                                      'type': 'text'},
                                         'doc_id': {'type'

## Index deletion
Be absolutely sure that you want to delete the index. There is no way of recovering it!

In [5]:
# Be absolutely sure that you want to run this cell and actually delete the index!

if client.indices.exists(index=index_name):
    # Delete the index.
    response = client.indices.delete(
        index = index_name,
        timeout = "600s"
    )
    print('\nDeleting index:')
    print(response)


Deleting index:
{'acknowledged': True}


# Document processing and indexing

## Built-in document tokenizers and analyzers

The built-in tokenizers and analyzers include: standard, simple, whitespace, stop, keyword, pattern, [language], fingerprint.


In [9]:
anls = {
  "analyzer": "whitespace",
  "text": "the quick brown fox"
}
client.indices.analyze(body=anls, index=index_name)

{'tokens': [{'token': 'the',
   'start_offset': 0,
   'end_offset': 3,
   'type': 'word',
   'position': 0},
  {'token': 'quick',
   'start_offset': 4,
   'end_offset': 9,
   'type': 'word',
   'position': 1},
  {'token': 'brown',
   'start_offset': 10,
   'end_offset': 15,
   'type': 'word',
   'position': 2},
  {'token': 'fox',
   'start_offset': 16,
   'end_offset': 19,
   'type': 'word',
   'position': 3}]}

In [10]:
anls = {
  "analyzer": "standard",
  "text": "the quick brown fox"
}
client.indices.analyze(body=anls, index=index_name)


{'tokens': [{'token': 'the',
   'start_offset': 0,
   'end_offset': 3,
   'type': '<ALPHANUM>',
   'position': 0},
  {'token': 'quick',
   'start_offset': 4,
   'end_offset': 9,
   'type': '<ALPHANUM>',
   'position': 1},
  {'token': 'brown',
   'start_offset': 10,
   'end_offset': 15,
   'type': '<ALPHANUM>',
   'position': 2},
  {'token': 'fox',
   'start_offset': 16,
   'end_offset': 19,
   'type': '<ALPHANUM>',
   'position': 3}]}
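
The two analyzers return the same tokens for this lowercase input, so the difference between them is easy to miss. The sketch below is a rough pure-Python approximation, not the real analyzers: it assumes that `whitespace` only splits on whitespace, while `standard` also lowercases and drops punctuation (the real `standard` analyzer relies on Unicode text segmentation, which the regular expression here only imitates for simple English text). The function names are hypothetical.

```python
import re

def whitespace_sketch(text):
    # Split on runs of whitespace only; case and punctuation are kept.
    return text.split()

def standard_sketch(text):
    # Approximation: lowercase, then keep runs of letters, digits and underscores.
    return re.findall(r"\w+", text.lower())

sample = "The QUICK brown-fox!"
print(whitespace_sketch(sample))
print(standard_sketch(sample))
```

Sending the same sample text through `client.indices.analyze` with each analyzer name shows the tokens the server really produces.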


## Simple document indexing

You can index documents in OpenSearch by adding one document at a time. This is done with JSON data as follows. Note that the id parameter in the API call is the unique identifier, which works as a key.

If you index a document with an id that already exists in the index, the stored document is replaced and the result is reported as 'updated'. To change only some of the fields and leave all the others unchanged, use the update API (client.update) instead.


In [10]:
docs = ["Around 9 Million people live in London", "London is known for its financial district"]

doc = {
    'doc_id': 'documentA',
    'tags': ['red', 'blue'],
    'contents': docs[0]
}
resp = client.index(index=index_name, id=1, body=doc)
print(resp['result'])

doc = {
    'doc_id': 'documentB',
    'tags': ['red'],
    'contents': docs[1]
}
resp = client.index(index=index_name, id=2, body=doc)
print(resp['result'])


created
created
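
A partial update goes through the update API rather than `client.index`, and the fields to change are wrapped in a `doc` envelope. The sketch below only builds that request body; `partial_update_body` is a hypothetical helper, and the commented call assumes the `client` and `index_name` objects from the cells above and an existing document with id 1.

```python
def partial_update_body(changed_fields):
    """Wrap the fields to change in the 'doc' envelope used by the update API."""
    return {'doc': dict(changed_fields)}

# Change only the tags of a document; doc_id and contents stay as they are.
body = partial_update_body({'tags': ['red', 'green']})
print(body)

# With a live connection:
# resp = client.update(index=index_name, id=1, body=body)
# print(resp['result'])
```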


## Deleting a single document
Similarly, you can delete one document at a time as shown below.

In [None]:
# Delete the document with the given id (here, the first document indexed above).
doc_to_delete = 1

response = client.delete(
    index = index_name,
    id = doc_to_delete
)

print('\nDeleting document:')
print(response)


# Search methods

OpenSearch supports many different search methods and has its own query DSL (domain-specific language) covering a wide range of search options:

https://opensearch.org/docs/latest/opensearch/query-dsl/index/

## Text-based search

OpenSearch is well suited to full-text search. The text-based search documentation is available here:

https://opensearch.org/docs/latest/opensearch/query-dsl/full-text/

In the example below, the 'query' parameter holds the search query, the 'size' parameter sets the number of documents to be returned, the '_source' parameter selects which fields are returned in the search results, and the 'fields' parameter lists the fields to be searched. A name that matches no field in the mapping, such as '_tags' in this example (the field is called 'tags'), yields an empty '_source' in each hit.


In [28]:
qtxt = "How many people live in London?"

query_bm25 = {
  'size': 5,
  '_source': ['_tags'],
#  '_source': ['doc_id'],
#  '_source': '',
  'query': {
    'multi_match': {
      'query': qtxt,
      'fields': ['contents']
    }
  }
}

response = client.search(
    body = query_bm25,
    index = index_name
)

print('\nSearch results:')
pp.pprint(response)



Search results:
{'_shards': {'failed': 0, 'skipped': 0, 'successful': 4, 'total': 4},
 'hits': {'hits': [{'_id': '1',
                    '_index': 'user219',
                    '_score': 2.2617629,
                    '_source': {},
                    '_type': '_doc'},
                   {'_id': '2',
                    '_index': 'user219',
                    '_score': 0.18232156,
                    '_source': {},
                    '_type': '_doc'}],
          'max_score': 2.2617629,
          'total': {'relation': 'eq', 'value': 2}},
 'timed_out': False,
 'took': 3}
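
The full response is verbose, and in practice you usually only need the ranked ids and scores. A minimal sketch of pulling them out is shown below; `summarize_hits` is a hypothetical helper, and `sample_response` is a trimmed copy of the output above so that the snippet runs without a server.

```python
def summarize_hits(response):
    """Return (id, score) pairs in ranking order from a search response."""
    return [(hit['_id'], hit['_score']) for hit in response['hits']['hits']]

# Trimmed copy of the response printed above.
sample_response = {
    'hits': {'hits': [{'_id': '1', '_score': 2.2617629, '_source': {}},
                      {'_id': '2', '_score': 0.18232156, '_source': {}}],
             'max_score': 2.2617629,
             'total': {'relation': 'eq', 'value': 2}},
}

for doc_id, score in summarize_hits(sample_response):
    print(doc_id, round(score, 3))
```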



## Term queries

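Term queries match exact values without analyzing the query text, which suits keyword fields such as 'tags':

https://opensearch.org/docs/latest/opensearch/query-dsl/term/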


In [39]:
qtxt = "How many people live in London?"

query_bm25 = {
  'size': 5,
  '_source': ['contents'],
  'query': {
        "term": {
            "tags" : 'red'
        }
   }
}

response = client.search(
    body = query_bm25,
    index = index_name
)

print('\nSearch results:')
pp.pprint(response)



Search results:
{'_shards': {'failed': 0, 'skipped': 0, 'successful': 4, 'total': 4},
 'hits': {'hits': [{'_id': '2',
                    '_index': 'user219',
                    '_score': 0.21110919,
                    '_source': {'contents': 'London is known for its financial '
                                            'district'},
                    '_type': '_doc'},
                   {'_id': '1',
                    '_index': 'user219',
                    '_score': 0.160443,
                    '_source': {'contents': 'Around 9 Million people live in '
                                            'London'},
                    '_type': '_doc'}],
          'max_score': 0.21110919,
          'total': {'relation': 'eq', 'value': 2}},
 'timed_out': False,
 'took': 3}



## Boolean queries

https://opensearch.org/docs/latest/opensearch/query-dsl/bool/


In [38]:
qtxt = "How many people live in London?"

query_bm25 = {
  'size': 5,
  '_source': ['contents'],
#  '_source': ['doc_id'],
#  '_source': '',
  'query': {
      'bool':{
          "must":{
            "term": {
                "tags" : 'red'
            }
          },
          "should": 
        {
            'multi_match': {
              'query': qtxt,
              'fields': ['contents']
            }
        }
      }
  }
}

response = client.search(
    body = query_bm25,
    index = index_name
)

print('\nSearch results:')
pp.pprint(response)



Search results:
{'_shards': {'failed': 0, 'skipped': 0, 'successful': 4, 'total': 4},
 'hits': {'hits': [],
          'max_score': None,
          'total': {'relation': 'eq', 'value': 0}},
 'timed_out': False,
 'took': 2}
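
In a bool query, every `must` clause has to match, while `should` clauses are optional once a `must` clause is present and only raise the score of documents that also match them. The empty result in the recorded output suggests that the index held no document tagged 'red' when the cell was run. The sketch below builds the same query shape from parameters; `tagged_text_query` is a hypothetical helper, not part of the client.

```python
def tagged_text_query(tag, text, size=5):
    """Require an exact tag match and rank the matches by full-text relevance."""
    return {
        'size': size,
        '_source': ['contents'],
        'query': {
            'bool': {
                # Must match: documents without this tag are excluded.
                'must': [{'term': {'tags': tag}}],
                # Optional: matching documents get a higher score.
                'should': [{'multi_match': {'query': text,
                                            'fields': ['contents']}}],
            }
        },
    }

query = tagged_text_query('red', 'How many people live in London?')
print(query['query']['bool']['must'])

# With a live connection (assumes `client` and `index_name` from above):
# response = client.search(body=query, index=index_name)
```

If the tag should restrict the results without influencing the score, the term clause can be placed under `filter` instead of `must`.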


# Approximate KNN search in semantic embeddings

## Create an index with dense vectors

To create the dense vector field you can use the configuration provided below. The 'dimension' property indicates the dimensionality of the indexed vectors, the 'space_type' indicates the similarity function, and the 'parameters' are specific to the indexing method, here 'hnsw' (Hierarchical Navigable Small World).

For details see https://opensearch.org/docs/latest/search-plugins/knn/approximate-knn/
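
The configuration in this section uses 'innerproduct' as the space type. Assuming the vectors are L2-normalized before indexing (as the `encode` function later in this notebook does), the inner product of two vectors equals their cosine similarity, so both would rank documents in the same order. The small check below illustrates this in plain Python with made-up two-dimensional vectors; the helper names are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def l2_normalize(v):
    # Scale the vector to unit length.
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]

def cosine(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

a = [3.0, 4.0]
b = [1.0, 2.0]

# Inner product of the normalized vectors vs. cosine of the raw vectors.
inner = dot(l2_normalize(a), l2_normalize(b))
print(inner, cosine(a, b))
```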

In [None]:
index_body = {
   "settings":{
      "index":{
         "number_of_replicas":0,
         "number_of_shards":4,
         "refresh_interval":"-1",
         "knn":"true"
      }
   },    
   "mappings":{
      "dynamic":      "strict",
      "properties":{
         "doc_id":{
            "type":"keyword"
         },
         "contents":{
            "type":"text",
            "analyzer": "standard",
#            "analyzer":"my_analyzer",
            "similarity":"BM25"
         },
         "sentence_embedding":{
            "type":"knn_vector",
            "dimension": 768,
            "method":{
               "name":"hnsw",
               "space_type":"innerproduct",
               "engine":"faiss",
               "parameters":{
                  "ef_construction":256,
                  "m":48
               }
            }
         }
      }
   }
}

if client.indices.exists(index=index_name):
    print("Index already existed. You may force the new mappings.")
else:        
    response = client.indices.create(index=index_name, body=index_body)
    print('\nCreating index:')
    print(response)


## Dual-Encoders

To compute the embedding vectors of each document, we can use Transformer encoders trained on the MS MARCO dataset. There are many other models available in the HuggingFace repository.

In [15]:
from transformers import AutoTokenizer, AutoModel
import torch
import torch.nn.functional as F

#Mean Pooling - Take average of all tokens
def mean_pooling(model_output, attention_mask):
    token_embeddings = model_output.last_hidden_state #First element of model_output contains all token embeddings
    input_mask_expanded = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
    return torch.sum(token_embeddings * input_mask_expanded, 1) / torch.clamp(input_mask_expanded.sum(1), min=1e-9)


#Encode text
def encode(texts):
    # Tokenize sentences
    encoded_input = tokenizer(texts, padding=True, truncation=True, return_tensors='pt')

    # Compute token embeddings
    with torch.no_grad():
        model_output = model(**encoded_input, return_dict=True)

    # Perform pooling
    embeddings = mean_pooling(model_output, encoded_input['attention_mask'])

    # Normalize embeddings
    embeddings = F.normalize(embeddings, p=2, dim=1)
    
    return embeddings


# Load model from HuggingFace Hub
tokenizer = AutoTokenizer.from_pretrained("sentence-transformers/msmarco-distilbert-base-v2")
model = AutoModel.from_pretrained("sentence-transformers/msmarco-distilbert-base-v2")

# Sentences we want sentence embeddings for
docs = ["Around 9 Million people live in London", "London is known for its financial district"]
doc_emb = encode(docs)
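
The `mean_pooling` function above averages token embeddings while ignoring padding positions, which matters because `padding=True` pads shorter texts in a batch. The sketch below reproduces the same idea for a single sequence with plain Python lists and made-up numbers; `masked_mean` is a hypothetical stand-in, not the tensor code used above.

```python
def masked_mean(token_vectors, attention_mask):
    """Average the token vectors whose mask value is 1; skip padding (mask 0)."""
    dim = len(token_vectors[0])
    total = [0.0] * dim
    count = 0
    for vec, keep in zip(token_vectors, attention_mask):
        if keep:
            total = [t + x for t, x in zip(total, vec)]
            count += 1
    # Guard against division by zero for an all-padding sequence.
    return [t / max(count, 1e-9) for t in total]

tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]  # last row is padding
mask = [1, 1, 0]
print(masked_mean(tokens, mask))
```

Without the mask, the padding row would dominate the average and distort the embedding.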


## Indexing document embedding vectors

In the previous step we saw how to compute the embedding representation of a document. You can index document embeddings in OpenSearch by adding a new field to your JSON document, as shown below.


In [16]:
doc = {
    'doc_id': 'documentA',
    'contents': docs[0],
    'sentence_embedding': doc_emb[0].numpy()
}
resp = client.index(index=index_name, id=1, body=doc)
print(resp['result'])

doc = {
    'doc_id': 'documentB',
    'contents': docs[1],
    'sentence_embedding': doc_emb[1].numpy()
}
resp = client.index(index=index_name, id=2, body=doc)
print(resp['result'])


created
created


## Embedding spaces search

Similarly, you need to compute the embedding representation of the query and submit it in the search query as shown next.

In [18]:
# Compute the query embedding
query = "How many people live in London?"
query_emb = encode(query)

query_denc = {
  'size': 5,
#  '_source': ['doc_id', 'contents', 'sentence_embedding'],
#  '_source': ['doc_id', 'contents'],
  '_source': ['doc_id'],
   "query": {
        "knn": {
          "sentence_embedding": {
            "vector": query_emb[0].numpy(),
            "k": 2
          }
        }
      }
}

response = client.search(
    body = query_denc,
    index = index_name
)

print('\nSearch results:')
pp.pprint(response)



Search results:
{'_shards': {'failed': 0, 'skipped': 0, 'successful': 4, 'total': 4},
 'hits': {'hits': [{'_id': '1',
                    '_index': 'user219',
                    '_score': 1.9538686,
                    '_source': {'doc_id': 'documentA'},
                    '_type': '_doc'},
                   {'_id': '2',
                    '_index': 'user219',
                    '_score': 1.434623,
                    '_source': {'doc_id': 'documentB'},
                    '_type': '_doc'}],
          'max_score': 1.9538686,
          'total': {'relation': 'eq', 'value': 2}},
 'timed_out': False,
 'took': 27}


## Training dual-encoders

You can fine-tune dual-encoders on your own domain data, which may improve the retrieval quality of your search architecture. See:

https://www.sbert.net/docs/training/overview.html
    

In [None]:
from sentence_transformers import SentenceTransformer, InputExample, losses
from torch.utils.data import DataLoader

#Define the model. Either from scratch or by loading a pre-trained model
model = SentenceTransformer('distilbert-base-nli-mean-tokens')

#Define your train examples. You need more than just two examples...
train_examples = [InputExample(texts=['My first sentence', 'My second sentence'], label=0.8),
    InputExample(texts=['Another pair', 'Unrelated sentence'], label=0.3)]

#Define your train dataset, the dataloader and the train loss
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=16)
train_loss = losses.CosineSimilarityLoss(model)

#Tune the model
model.fit(train_objectives=[(train_dataloader, train_loss)], epochs=1, warmup_steps=100)

# Specialized document analyzers
Text data can be represented in different ways. In this section we will see different ways of transforming natural language documents into computational representations.

## Create an index with specific analyzers
The built-in analyzers offer a wide range of text processing methods. Each field can have its own analyzer and users can also define their own analyzers.

In [7]:

index_body = {
   "settings":{
      "index":{
         "number_of_replicas":0,
         "number_of_shards":4,
         "refresh_interval":"-1",
         "knn":"true"
      },
      "analysis":{
         "filter":{
            "edge_ngram_filter":{
               "type":"edge_ngram",
               "min_gram":1,
               "max_gram":20
            }
         },
         "analyzer":{
            "my_analyzer":{
               "type":"custom",
               "tokenizer":"standard",
               "filter":[
                  "lowercase",
                  "edge_ngram_filter"
               ]
            }
         }
      }
   },
   "mappings":{
      "properties":{
         "doc_id":{
            "type":"keyword"
         },
         "contents":{
            "type":"text",
            "analyzer":"my_analyzer",
            "similarity":"BM25"
         }
      }
   }
}

if client.indices.exists(index=index_name):
    print("Index already existed. Nothing to be done.")
else:        
    response = client.indices.create(index=index_name, body=index_body)
    print('\nCreating index:')
    print(response)



Creating index:
{'acknowledged': True, 'shards_acknowledged': True, 'index': 'user220'}
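
To get a feeling for what `my_analyzer` stores in the index, the sketch below imitates its pipeline in plain Python: split the text, lowercase it, then emit the edge n-grams (prefixes) of each token, with the same `min_gram` and `max_gram` as in the settings above. It is an approximation under stated assumptions: `str.split` stands in for the standard tokenizer, and the function names are hypothetical.

```python
def edge_ngrams(token, min_gram=1, max_gram=20):
    """Return the prefixes of a token with lengths from min_gram to max_gram."""
    upper = min(max_gram, len(token))
    return [token[:n] for n in range(min_gram, upper + 1)]

def my_analyzer_sketch(text):
    # Approximation of: standard tokenizer -> lowercase -> edge_ngram filter.
    tokens = text.lower().split()
    return [gram for tok in tokens for gram in edge_ngrams(tok)]

print(edge_ngrams('fox'))
print(my_analyzer_sketch('Quick fox'))
```

Indexing prefixes in this way lets a partial query term such as "qui" match "quick", at the cost of a larger index.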


## spaCy: Tokens, lemmas and POS
If the built-in text processing methods are not sufficient for your problem, you can use external libraries like spaCy to extract other representations of text, such as part-of-speech (POS) tags and lemmas, or to use other stemmers.

In [13]:
import spacy
from spacy import displacy
from pathlib import Path

nlp = spacy.load("en_core_web_sm")
doc = nlp("Apple is looking at buying U.K. startup for $1 billion")

save_figures = False

print("token".ljust(10), "lemma".ljust(10), "pos".ljust(6), "tag".ljust(6), "dep".ljust(10),
            "shape".ljust(10), "alpha", "stop")
print("------------------------------------------------------------------------------")
for token in doc:
    print(token.text.ljust(10), token.lemma_.ljust(10), token.pos_.ljust(6), token.tag_.ljust(6), token.dep_.ljust(10),
            token.shape_.ljust(10), token.is_alpha, token.is_stop)


token      lemma      pos    tag    dep        shape      alpha stop
------------------------------------------------------------------------------
Apple      Apple      PROPN  NNP    nsubj      Xxxxx      True False
is         be         AUX    VBZ    aux        xx         True True
looking    look       VERB   VBG    ROOT       xxxx       True False
at         at         ADP    IN     prep       xx         True True
buying     buy        VERB   VBG    pcomp      xxxx       True False
U.K.       U.K.       PROPN  NNP    compound   X.X.       False False
startup    startup    NOUN   NN     dobj       xxxx       True False
for        for        ADP    IN     prep       xxx        True True
$          $          SYM    $      quantmod   $          False False
1          1          NUM    CD     compound   d          False False
billion    billion    NUM    CD     pobj       xxxx       True False


## Named entity recognition
spaCy also identifies the mentions of relevant named entities. This can have a number of applications, such as selecting the documents that mention a given named entity.

In [14]:
import spacy
from spacy import displacy
from pathlib import Path

nlp = spacy.load("en_core_web_sm")
doc = nlp("Apple is looking at buying U.K. startup for $1 billion")

for ent in doc.ents:
    print(ent.text.ljust(12), ent.label_.ljust(10), ent.start_char, ent.end_char)

html_ent = displacy.render(doc, style="ent", jupyter=True)


Apple        ORG        0 5
U.K.         GPE        27 31
$1 billion   MONEY      44 54
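
One possible way to select documents by named entity is to store the entity strings in a keyword field and filter on it with a term query. The sketch below assumes a hypothetical 'entities' field of type keyword, which is not part of the mappings defined earlier and would have to be added to them; the helper names are also hypothetical. The entity tuples are copied from the output above, so the snippet runs without spaCy or a server.

```python
def entity_doc(doc_id, text, entities):
    """Build an indexable document with the distinct entity strings as keywords."""
    return {
        'doc_id': doc_id,
        'contents': text,
        # Hypothetical keyword field; it must exist in the index mappings.
        'entities': sorted({ent_text for ent_text, _label in entities}),
    }

def entity_query(entity, size=5):
    """Term query selecting the documents that mention the given entity."""
    return {'size': size, 'query': {'term': {'entities': entity}}}

# (text, label) pairs as printed by the spaCy cell above.
ents = [('Apple', 'ORG'), ('U.K.', 'GPE'), ('$1 billion', 'MONEY')]
doc = entity_doc('documentC',
                 'Apple is looking at buying U.K. startup for $1 billion',
                 ents)
print(doc['entities'])
print(entity_query('Apple'))
```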


# Status admin requests

You need to have admin permissions to run the commands below. This is useful if you use your own installation of OpenSearch.

In [5]:
import pprint as pp
from opensearchpy import OpenSearch
from opensearchpy import helpers

host = '10.10.255.202'
port = 8200
user = ''
password = ''
auth = (user, password) # For testing only. Don't store credentials in code.

# Optional client certificates if you don't want to use HTTP basic authentication.
# client_cert_path = '/full/path/to/client.pem'
# client_key_path = '/full/path/to/client-key.pem'

# Create the client with SSL/TLS enabled, but hostname verification disabled.
client = OpenSearch(
    hosts = [{'host': host, 'port': port}],
    http_compress = True, # enables gzip compression for request bodies
    http_auth = auth,
    # client_cert = client_cert_path,
    # client_key = client_key_path,
    use_ssl = True,
    verify_certs = False,
    ssl_assert_hostname = False,
    ssl_show_warn = False
    #, ca_certs = ca_certs_path
)


## Cluster information

In [57]:
print('----------------------------------------------------------------------------------- SERVER INFO')
response = client.info()
pp.pprint(response)

print('----------------------------------------------------------------------------------- CLUSTER HEALTH')
response = client.cluster.health()
pp.pprint(response)

print('\n----------------------------------------------------------------------------------- CLUSTER INDICES')
response = client.cat.indices()
print(response)

----------------------------------------------------------------------------------- SERVER INFO
{'cluster_name': 'opensearch-cluster',
 'cluster_uuid': '6OAGZGbfQgOEPtFM5qLB6Q',
 'name': 'opensearch-node1',
 'tagline': 'The OpenSearch Project: https://opensearch.org/',
 'version': {'build_date': '2022-01-14T03:38:06.881862Z',
             'build_hash': 'e505b10357c03ae8d26d675172402f2f2144ef0f',
             'build_snapshot': False,
             'build_type': 'tar',
             'distribution': 'opensearch',
             'lucene_version': '8.10.1',
             'minimum_index_compatibility_version': '6.0.0-beta1',
             'minimum_wire_compatibility_version': '6.8.0',
             'number': '1.2.4'}}
----------------------------------------------------------------------------------- CLUSTER HEALTH
{'active_primary_shards': 94,
 'active_shards': 94,
 'active_shards_percent_as_number': 83.92857142857143,
 'cluster_name': 'opensearch-cluster',
 'delayed_unassigned_shards': 0,
 'disco

## REST Connection to server

The REST API should be used for requests that are not available in the Python API. You need to read the OpenSearch/Elasticsearch documentation carefully before using this.

In [None]:
import requests

s = requests.Session()
s.auth = auth  # (user, password) tuple defined above. For testing only. Don't store credentials in code.

ca_certs_path = '/full/path/to/root-ca.pem' # Provide a CA bundle if you use intermediate CAs with your root CA.
server_uri = 'https://' + host + ':' + str(port)

# Helper for REST requests that are not covered by the Python client.
def opensearch_REST(uri='/', body=None, verb='get'):
    # Always pass a JSON Content-Type header to avoid
    # Content-Type errors when the request has a body.
    url = server_uri + uri
    print(url)
    headers = {
        'Content-Type': 'application/json',
    }

    # Make the HTTP verb case-insensitive; accept 'del' as an alias of 'delete'.
    verb = verb.lower()
    if verb == 'del':
        verb = 'delete'
    if verb not in ('get', 'post', 'put', 'delete', 'head'):
        raise ValueError('Unsupported HTTP verb: ' + verb)

    try:
        # verify=False skips certificate validation, like verify_certs=False above.
        resp = s.request(verb, url, json=body, headers=headers, verify=False)
    except requests.RequestException as error:
        # Report connection problems and return the error object.
        print('\nopensearch_REST() error:', error)
        return error

    # Return the parsed JSON body when possible, otherwise the raw text.
    try:
        return resp.json()
    except ValueError:
        return resp.text
