induction-monitoring

View the Project on GitHub infra-helpers/induction-monitoring

Induction to Monitoring - ElasticSearch (ES) for Open Travel Data (OPTD) QA

Person on top of the cliff, by Will van Wingerden on Unsplash

Overview

This repository provides end-to-end examples showing how to collect, store and query metering events produced by different sensors, both locally and in the cloud.

Although the software stacks are very similar to those used for logging, the purpose is different. See the GitHub repository dedicated to logging for further details.

These tutorials use Elasticsearch (ES) stacks (e.g., ELK, EFK). A full end-to-end example is explained step by step; it is the one actually used for the Quality Assurance (QA) of the Open Travel Data (OPTD) project.

The full details on how to set up an ES cluster on Proxmox LXC containers are given in the dedicated elasticsearch/ sub-folder. That ES cluster is the publishing target of the Quality Assurance (QA) events of the Open Travel Data (OPTD) project, which are produced by the OPTD Travis CI/CD process.

For convenience, most of the ES examples are demonstrated both on a local single-node installation (e.g., on a laptop) and on the above-mentioned cluster.

This project also features, in a dedicated python/ sub-folder, a datamonitor Python module, supporting:

Travis manages the CI pipeline.

Endpoints

References

Ingest processors

Use cases

Open Travel Data (OPTD)

Configuration

Interact with the ES server through the command-line (CLI)

Kibana

Grok processor


# Build a CSV to JSON pipeline

## School data
* Create the `school` index:
```bash
$ curl -XPUT "http://localhost:9200/school"
{
  "acknowledged": true,
  "shards_acknowledged": true,
  "index": "school"
}
```

## New York


* Ingest the data:
```bash
$ while IFS= read -r f1; do curl -XPOST "http://localhost:9200/subway_info_v1/_doc?pipeline=parse_nyc_csv" -H "Content-Type: application/json" -d "{ \"station\": \"$f1\" }"; done < elasticsearch/data/NYC_Transit_Subway_Entrance_And_Exit_Data.csv | jq
{
  "_index": "subway_info_v1",
  "_type": "_doc",
  "_id": "iISt6nABC24yNP3yxmvI",
  "_version": 1,
  "result": "created",
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  },
  "_seq_no": 11,
  "_primary_term": 1
}
...
{
  "_index": "subway_info_v1",
  "_type": "_doc",
  "_id": "04Sv6nABC24yNP3yO3Jd",
  "_version": 1,
  "result": "created",
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  },
  "_seq_no": 1878,
  "_primary_term": 1
}
```
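
The ingest loop above pastes each raw CSV line directly into a JSON string, so a line containing a double quote or a backslash would produce an invalid request body. The following is a minimal sketch of one way to guard against that, assuming a POSIX shell with `sed` and `tr`. The helper names `json_escape` and `station_payload` are hypothetical and are not part of the repository. Only backslashes, double quotes and carriage returns are handled; other control characters are assumed not to occur in the CSV file.

```shell
#!/bin/sh

# Hypothetical helper: escape a raw CSV line for use inside a JSON string.
# Carriage returns are dropped, then backslashes and double quotes are escaped.
json_escape() {
  printf '%s' "$1" | tr -d '\r' | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Hypothetical helper: wrap one CSV line into the { "station": ... } document
# that the ingest loop sends to Elasticsearch.
station_payload() {
  printf '{ "station": "%s" }' "$(json_escape "$1")"
}

# Usage sketch (needs a running ES node and the parse_nyc_csv pipeline):
#   while IFS= read -r line; do
#     station_payload "$line" | curl -s -XPOST \
#       "http://localhost:9200/subway_info_v1/_doc?pipeline=parse_nyc_csv" \
#       -H "Content-Type: application/json" --data-binary @-
#   done < elasticsearch/data/NYC_Transit_Subway_Entrance_And_Exit_Data.csv | jq
```

Since `jq` is already used to pretty-print the responses, building the payload with `jq -n --arg station "$line" '{station: $station}'` would be an alternative that also covers the remaining control characters.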