tableschema-elasticsearch-py

Generate and load ElasticSearch indexes based on JSON Table Schema descriptors.

Getting Started

Installation

pip install tableschema-elasticsearch

Storage

This package implements the Tabular Storage interface.

The elasticsearch client is used as the database wrapper. A storage instance can be created like this:

from elasticsearch import Elasticsearch
from tableschema_elasticsearch import Storage

engine = Elasticsearch()
storage = Storage(engine)

We can then interact with the storage ('buckets' are ElasticSearch indexes in this context):

storage.buckets  # iterator over bucket names

# reindex: copy existing documents from an existing index with the same name (not implemented yet)
# mapping_generator_cls: allows customization of the generated mapping
storage.create('bucket', [(doc_type, descriptor)],
               reindex=False, mapping_generator_cls=None)

storage.delete('bucket')
storage.describe('bucket')  # return descriptor (not implemented yet)
storage.iter('bucket', doc_type=doc_type)  # doc_type is optional; yields rows
storage.read('bucket', doc_type=doc_type)  # doc_type is optional; returns rows

# primary_key: a list of field names which will be used to generate document ids
storage.write('bucket', doc_type, rows, primary_key,
              as_generator=False)
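To illustrate why primary_key is a list of field names, here is a minimal sketch of how a document id could be derived from several fields. The helper make_document_id and the "/" separator are assumptions made for this example; the scheme the package actually uses is not described here and may differ.

```python
def make_document_id(row, primary_key):
    """Hypothetical helper: join the primary-key field values into one id string."""
    missing = [name for name in primary_key if name not in row]
    if missing:
        raise KeyError("row is missing primary key fields: {}".format(missing))
    # The "/" separator is an arbitrary choice for this sketch.
    return "/".join(str(row[name]) for name in primary_key)


rows = [
    {"id": 1, "lang": "en", "title": "Hello"},
    {"id": 1, "lang": "fr", "title": "Bonjour"},
]

# Neither field alone identifies a row, but the pair ("id", "lang") does.
doc_ids = [make_document_id(row, ["id", "lang"]) for row in rows]
print(doc_ids)
```

Writing a row whose primary-key values match an existing document would then address the same document id rather than create a new one, assuming ids are derived deterministically as in this sketch.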

When creating indexes, we always create an index with a semi-random name and a matching alias that points to it. This allows us to decide whether to re-index documents whenever we're re-creating an index, or to discard the existing records.
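The index-plus-alias idea can be sketched without a running cluster. The code below is an in-memory illustration only: AliasRegistry, new_index_name and the name format are hypothetical and are not part of the package, and the reindex branch shows the intended behaviour rather than an existing feature.

```python
import uuid


def new_index_name(alias):
    """Hypothetical naming scheme: the alias plus a random suffix."""
    return "{}_{}".format(alias, uuid.uuid4().hex)


class AliasRegistry:
    """In-memory stand-in for a cluster's indexes and aliases."""

    def __init__(self):
        self.aliases = {}  # alias -> concrete index name
        self.indexes = {}  # concrete index name -> list of documents

    def recreate(self, alias, reindex=False):
        old_name = self.aliases.get(alias)
        new_name = new_index_name(alias)
        old_docs = self.indexes.get(old_name, [])
        # Either carry the old documents over or start empty.
        self.indexes[new_name] = list(old_docs) if reindex else []
        # Repoint the alias, then drop the index it used to point to.
        self.aliases[alias] = new_name
        if old_name is not None:
            del self.indexes[old_name]
        return new_name


registry = AliasRegistry()
first = registry.recreate("bucket")
registry.indexes[first].append({"id": 1})

second = registry.recreate("bucket", reindex=True)   # documents are kept
kept = list(registry.indexes[second])

third = registry.recreate("bucket", reindex=False)   # documents are discarded
```

Because clients only ever address the alias, the concrete index behind it can be swapped without changing the bucket name they use.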

Mappings

When creating indexes, the tableschema types are converted to ES types and a mapping is generated for the index.

Some special properties in the schema provide extra information for generating the mapping:

array types must also have the es:itemType property, which specifies the data type of the array items.
object types must also have the es:schema property, which provides a tableschema for the inner document contained in that object (or have es:enabled=false to disable indexing of that field).

Example:

{
  "fields": [
    {
      "name": "my-number", 
      "type": "number"
    },
    {
      "name": "my-array-of-dates", 
      "type": "array",
      "es:itemType": "date"
    },
    {
      "name": "my-person-object", 
      "type": "object",
      "es:schema": {
        "fields": [
          {"name": "name", "type": "string"},
          {"name": "surname", "type": "string"},
          {"name": "age", "type": "integer"},
          {"name": "date-of-birth", "type": "date", "format": "%Y-%m-%d"}
        ]
      }
    },
    {
      "name": "my-library", 
      "type": "array",
      "es:itemType": "object",
      "es:schema": {
        "fields": [
          {"name": "title", "type": "string"},
          {"name": "isbn", "type": "string"},
          {"name": "num-of-pages", "type": "integer"}
        ]
      }
    },
    {
      "name": "my-user-provided-object",
      "type": "object",
      "es:enabled": false
    }
  ]
}
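As a rough illustration of how such a descriptor could be turned into a mapping, here is a minimal sketch. It is not the package's MappingGenerator: the ES_TYPES table, schema_to_mapping and field_to_mapping are assumptions made for this example, and date formats are ignored for brevity.

```python
# Assumed type table; the conversions used by the package may differ.
ES_TYPES = {
    "string": "keyword",
    "integer": "long",
    "number": "double",
    "boolean": "boolean",
    "date": "date",
}


def field_to_mapping(field):
    """Convert one tableschema field into an ES mapping fragment."""
    field_type = field["type"]
    if field_type == "array":
        # ES has no dedicated array type: any field may hold several
        # values of the same type, so the item type is mapped instead.
        item = dict(field, type=field["es:itemType"])
        item.pop("es:itemType")
        return field_to_mapping(item)
    if field_type == "object":
        if field.get("es:enabled") is False:
            # The field is stored but not parsed or indexed.
            return {"type": "object", "enabled": False}
        return schema_to_mapping(field["es:schema"])
    return {"type": ES_TYPES[field_type]}


def schema_to_mapping(schema):
    """Convert a whole tableschema into a 'properties' mapping."""
    return {
        "properties": {
            field["name"]: field_to_mapping(field)
            for field in schema["fields"]
        }
    }


schema = {
    "fields": [
        {"name": "my-number", "type": "number"},
        {"name": "my-array-of-dates", "type": "array", "es:itemType": "date"},
        {
            "name": "my-library",
            "type": "array",
            "es:itemType": "object",
            "es:schema": {"fields": [{"name": "title", "type": "string"}]},
        },
        {"name": "my-opaque-object", "type": "object", "es:enabled": False},
    ]
}

mapping = schema_to_mapping(schema)
```

The recursion mirrors the nesting of es:schema: an array of objects resolves to the mapping of its inner schema, and a disabled object stops the recursion.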

Custom mappings

By providing a custom mapping generator class (via mapping_generator_cls) that inherits from the MappingGenerator class, you should be able to customize the generated mapping.

Drivers

elasticsearch-py is used to access the ElasticSearch interface; see its documentation for details.

API Reference

Snapshot

https://github.com/frictionlessdata/tableschema-elasticsearch-py#snapshot

Detailed

Changelog

Contributing

Please read the contribution guideline:

How to Contribute

Thanks!
