Author Topic: [WorkerProposal]Data solution:data structured,flexible API and fast-built method  (Read 15098 times)

Offline R

  • Hero Member
  • *****
  • Posts: 1034
    • View Profile

Quote from: vianull
1) Develop a PostgreSQL-based plugin. We are going to build structured on-chain data storage using PostgreSQL.

I like the idea of developing new plugins for exporting the data from the bitshares full node, the more plugins the merrier in my opinion. If you could split the development of the plugin from the other two goals to reduce the cost of the worker proposal then I'd support this WP.

None of your images are loading for me, could you upload them to an alternative image server please?

I'd be very interested in a plugin for Google's BigQuery, streaming directly into a table for running REST API based queries on Google's servers.
« Last Edit: February 22, 2019, 01:30:31 pm by Customminer »

Offline oxarbitrage

Please don't get me wrong, I do think it is a good idea to have ways to visualize the data with good graphics, API calls and so on.

However, the worker does not take into account that there is already a framework available to do this. That may be down to communication problems, or to the fact that the last changes that make it all possible are relatively new.

Another issue I have with this worker is the price: even if you had to build it all from scratch, I think the worker price is at least 3 times above common sense.

I will be happy to support a worker that considers these 2 points.

Offline Digital Lucifer

  • Sr. Member
  • ****
  • Posts: 369
  • BitShares Maximalist & Venture Architect
    • View Profile
    • BitShares
  • BitShares: dls.cipher
  • GitHub: dls-cipher
Quote from: vianull
The important thing is NOT PostgreSQL or ES or Mongo or anything else. What we care most about is how to implement the requirements.

Such as:
1) How to draw charts from specified data over a certain period, e.g. the average price of a market pair, the issued/burned amount of an asset, and the feed price of a smart asset.
All of this is unstructured in ES currently, which causes problems for our implementation.
2) BitShares is an exchange, so there may be a more professional approach to storing and analyzing the transaction data from an exchange viewpoint.

As mentioned in this worker, we will open source under MIT. If this worker is validated, then all future data APIs on bts.ai will be open source, which is part of this worker.

2) BitShares is a blockchain with a much brighter and wider future than just an exchange.
Milos (DL) Preocanin
Owner and manager of bitshares.org
Move Institute, Non-profit organization
RN: 2098555000
Murska Sobota, Slovenia.

Offline vianull

  • Full Member
  • ***
  • Posts: 91
    • View Profile
    • bts.ai
  • BitShares: vianull
Thank you for all your suggestions. I think we need to reassess our proposal and think it through.
We are bts.ai  team
Witness:  witness.hiblockchain
Standby Committee: btsai

Offline Digital Lucifer

The ES/Kibana plugin/UI has BEEN DEVELOPED and TESTED, and has very GOOD and ON-GOING development/maintenance atm. 100% agree.

My personal and professional opinion on this would be: someone already stepped up earlier, and we don't need another proposal that is "re-inventing the wheel" without proper diligence towards the achievements/implementations already done in our ecosystem.

Many thanks.

Chee®s
« Last Edit: November 07, 2018, 05:19:34 am by Digital Lucifer »

Offline Fox

- 1.1 Gather data from external APIs of CEX
  - Support publishing open source tools/process to collect data from CEX
  - Support collection of public data to be published as an open source dataset (may not meet EULA/TOS of provider(s))
  - See [Bitshares-Core/1350](https://github.com/bitshares/bitshares-core/issues/1350): NVT Data Collection and Visualization
- 1.2 Data visualization
  - Support the concept.
  - Provided already by [Kibana](https://www.elastic.co/products/kibana) today (visualization tool for Elasticsearch)
- 2 Multi-dimensional query and data export from DEX
  - Support.
  - Provided already by Elasticsearch
  - See [Bitshares-Core/1350](https://github.com/bitshares/bitshares-core/issues/1350): NVT Data Collection and Visualization
- 3 Data backup service
  - Support the concept. Have open questions about the implementation:
    - How does this differ from a seed node?
    - If not a seed node, how can the service provider attest data authenticity (trust required)?
    - Opinion: Best if block producers snapshot, attest and post to bitshares.org (caveat: lose history data)
- 4 Address identification and classification
  - Concerned this may lead to censorship
- 5 Analysis of Witnesses (feeds, nodes, etc.)
  - Support the concept

Suggestions for Other:
  - Replace the focus on a PostgreSQL plugin with a ZeroMQ implementation
  - Focus on building out APIs and wrappers (using Elasticsearch) that the UI Team can build from
  - DevOps tools for node operators (seed, api, block producers)

I look forward to iterating on a proposal that advances development and adoption of the BitShares platform.

Best,
Fox
Witness: fox

Offline oxarbitrage

If you want to get how much of an asset was issued in a period of time, you can filter the Elasticsearch operations by operation_type 14 (ASSET ISSUE) and by operation_history.op_object.asset_to_issue.asset_id within that period. Please check this link:
http://148.251.96.48:5601/app/kibana#/discover?_g=()&_a=(columns:!(_source),index:'357b9f60-d5f8-11e8-bb51-9583fd938437',interval:auto,query:(language:lucene,query:'operation_type:%2014'),sort:!(block_data.block_time,desc))

If you want to go after the feed prices of a smart asset you can filter by operation type 19 and smart asset id: http://148.251.96.48:5601/app/kibana#/discover?_g=()&_a=(columns:!(_source),index:'357b9f60-d5f8-11e8-bb51-9583fd938437',interval:auto,query:(language:lucene,query:'operation_type:%2019%20AND%20%20operation_history.op_object.asset_id:%201.3.113'),sort:!(block_data.block_time,desc))

In both kibana links you can change the timeframe in the upper right.

If you want to get market price changes, you can go after the fill order operations, and so on.

Also, when operations are not enough, there is the es_objects plugin, which allows exporting certain (currently predefined) objects. However, I am making some changes to try to make it work with any blockchain object, so for example you could get the internal ticker objects for a pair and read the current and past market prices from there.

I strongly believe that those 2 plugins go in the same direction you want to go.
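
The same filters can also be written as an Elasticsearch query body instead of a Kibana URL. The sketch below is only an illustration: the field names (operation_type, block_data.block_time, operation_history.op_object.asset_id) are the ones used in the links above, while the function name and the assumption that those fields are indexed for exact-match `term` filters are mine. It only builds the JSON body and does not contact any server.

```python
import json
from datetime import datetime


def build_ops_query(operation_type, start, end, extra_terms=None):
    """Build a query body for operations of one type inside a time window.

    extra_terms maps a field path to the exact value it must have
    (hypothetical helper argument, not part of any plugin API).
    """
    filters = [
        {"term": {"operation_type": operation_type}},
        {"range": {"block_data.block_time": {
            "gte": start.isoformat(),
            "lte": end.isoformat(),
        }}},
    ]
    for field, value in (extra_terms or {}).items():
        filters.append({"term": {field: value}})
    return {
        "query": {"bool": {"filter": filters}},
        # Newest operations first, like the sort used in the Kibana links.
        "sort": [{"block_data.block_time": {"order": "desc"}}],
    }


if __name__ == "__main__":
    # Feed prices (operation type 19) published for asset 1.3.113.
    body = build_ops_query(
        19,
        datetime(2018, 10, 1),
        datetime(2018, 11, 1),
        {"operation_history.op_object.asset_id": "1.3.113"},
    )
    print(json.dumps(body, indent=2))
```

Changing the operation type to 14 and the extra field to operation_history.op_object.asset_to_issue.asset_id would give the asset-issue variant described in the first link.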

Offline vianull

The important thing is NOT PostgreSQL or ES or Mongo or anything else. What we care most about is how to implement the requirements.

Such as:
1) How to draw charts from specified data over a certain period, e.g. the average price of a market pair, the issued/burned amount of an asset, and the feed price of a smart asset.
All of this is unstructured in ES currently, which causes problems for our implementation.
2) BitShares is an exchange, so there may be a more professional approach to storing and analyzing the transaction data from an exchange viewpoint.

As mentioned in this worker, we will open source under MIT. If this worker is validated, then all future data APIs on bts.ai will be open source, which is part of this worker.

Offline oxarbitrage

In general I don't agree with this worker. Here are some reasons:

- Cost. It costs almost 300k USD (288,000 USD to be exact).
- Limited. It is database specific; a new worker may come in saying MongoDB or whatever is better for their specific needs.
- Closed to participation. No one except the bts.ai team can participate in the development.
- Closed source. It doesn't say anywhere that all the work will be open source under the MIT license. Will bts.ai be open source and inside the bitshares organization?
- Reinventing the wheel. The Elasticsearch plugin is working great and has all the data needed; the synchronization time is 20 hours according to the latest report, and all the data inside operations is structured and available. That costs BitShares nothing, as it is already done. Doing the same from scratch using another database, by a new team, is IMHO a waste of resources. The core team, with its accumulated experience, can do a Postgres plugin in much less time if that is what the community needs. Also, the core team can pay a team or individual to do the plugin, as plugins are core work and will need review and approval from core team members.
- Benefit. Besides better visualizations of some data, which I think is important, I don't see any other real benefit in the proposal.

In my opinion, I would like to see some day a general worker that does this and other things in a bounty style. Most of the API links mentioned are not being developed because there is no funding to get developers on board. There is already a Ruby project for BitShares at https://github.com/MatzFan/bitshares-ruby that is not being improved because of lack of funding, among many other dead projects.

I think that BitShares needs a worker proposal similar to the core worker, where teams and individuals can participate in the development of different tools of the BitShares ecosystem.

It honestly looks to me like reinventing the wheel: after all the work that has been done in this particular regard, it discards everything and starts from scratch instead of building on top of previous tools to save time and resources and to advance.

Just my personal opinion, I don't have any voting power or influence to decide what is accepted or not.

Offline vianull

On Alibaba Cloud in mainland China, 2 cores, 8G RAM and an 80G SSD cost 79 USD per month. The hardware you mentioned may cost over 400 USD.

I agree that the technical plan has nothing to do with price. The reason why we use PostgreSQL is mentioned above.

Offline sschiessl

  • Administrator
  • Hero Member
  • *****
  • Posts: 662
    • View Profile
  • BitShares: sschiessl
The indices of ES are all stored on the hard drive, not in RAM. The 32G memory requirement is probably just a best guess that was not thought through; the RAM supports general performance and some caching, but does not hold the actual chain data. Servers with 64G RAM, a 500G hard drive and 1 GBit connectivity are still below 70 USD in Europe. I don't think anyone thought much about the different pricing in other regions.

Offline vianull

Thank you very much for your reply!

1) I apologize that we haven't synced an ES node with the full BitShares data; we only built one on our own private chain. I saw on the wiki (https://github.com/bitshares/bitshares-core/wiki/ElasticSearch-Plugin) that ES recommends at least 32G of memory, based on the data amount three months ago. In the past three months, the number of BTS operations exploded. In contrast, https://bts.ai uses PostgreSQL to store hundreds of millions of records, and even including the ROR web server and cache, memory use is no higher than 4G. ES is a great solution, but it's really too heavy for us. We are going to run an ES full node to test the minimum requirements.

2) As far as I can guess, the reason why ES occupies a lot of memory is that full-text indexing is used by default, which is very useful for full-text content search. But for BitShares it is of very little use.

3) I am very happy to hear that "Snapshoting and new version of ES plugin solves those issues". We are the developers of bts.ai. Initially we didn't want to submit a worker; we just wanted to redesign bts.ai. In the process of collecting requirements, we found that everyone was very interested in the requirements mentioned above, and then we came up with the idea of developing them and open sourcing the result. The ES plugin is very good and gives us a lot of inspiration, but the current version does not meet our requirements. Work on the BitShares core software is challenging and there is a lot still to be developed. So we hope to complete our worker in a few months and contribute to the community instead of waiting.

4) In addition, I think an important feature of BitShares data is that it is a time series. We can do a lot of optimizations for this, such as using BRIN indexes, which can improve performance considerably. PostgreSQL performs very well and has good features in this respect. This is also an important reason why we use PostgreSQL.
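
To make the BRIN point concrete, here is a minimal sketch of the kind of index statement this implies. The table name `operations` and column name `block_time` are hypothetical, chosen only for illustration; the `CREATE INDEX ... USING brin` syntax and the `pages_per_range` storage parameter are standard PostgreSQL. BRIN suits this data because rows are appended in block order, so the timestamp correlates with the physical position of the row and a small summary per page range is enough to skip most of the table.

```python
def brin_index_ddl(table, time_column, pages_per_range=128):
    """Return a CREATE INDEX statement for a BRIN index on a time column."""
    for name in (table, time_column):
        # Reject anything that is not a plain identifier before formatting SQL.
        if not name.isidentifier():
            raise ValueError(f"unsafe identifier: {name!r}")
    return (
        f"CREATE INDEX {table}_{time_column}_brin "
        f"ON {table} USING brin ({time_column}) "
        f"WITH (pages_per_range = {int(pages_per_range)});"
    )


if __name__ == "__main__":
    # Hypothetical table and column names.
    print(brin_index_ddl("operations", "block_time"))
```

A range query such as `WHERE block_time >= ... AND block_time < ...` is the kind of filter that could then use this index.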

In the end, I think the "define an abstract RESTful API" idea you mentioned is a very good one. It can unify the specification and greatly reduce the difficulty of BitShares application development. We would be very happy to see this happen and will try it in our work.
« Last Edit: November 05, 2018, 03:03:43 pm by vianull »

Offline vianull

There was discussion to do a more general approach like ZeroMQ to have an interface for any middleware to receive blocks, operations and ideally even virtual operations. Then, adding a traditional database would be really easy - much easier than integrating SQL into the backend, IMHO


Nice comment. It was one of our initial plans, but we finally gave it up because of an underlying performance problem.
Following your comment, it's time to reconsider and test ZeroMQ or another message queue. Performance may not be a problem with multi-threaded writing to the db. Thank you for your suggestion.
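
As a rough illustration of the "queue plus multi-threaded writers" idea, the sketch below uses Python's standard `queue.Queue` as a stand-in for ZeroMQ or any other message queue, and a plain callback as a stand-in for the database write. All names here are hypothetical and nothing is taken from an actual plugin.

```python
import queue
import threading

_STOP = object()  # sentinel telling a writer thread to exit


def writer(q, sink, batch_size):
    """Take operations off the queue and hand them to the sink in batches."""
    batch = []
    while True:
        item = q.get()
        if item is _STOP:
            break
        batch.append(item)
        if len(batch) >= batch_size:
            sink(batch)
            batch = []
    if batch:
        sink(batch)  # flush the final partial batch


def run_pipeline(operations, sink, workers=4, batch_size=100):
    """Feed operations to several writer threads through a bounded queue."""
    q = queue.Queue(maxsize=1000)  # bounded, so slow writers slow the producer
    threads = [
        threading.Thread(target=writer, args=(q, sink, batch_size))
        for _ in range(workers)
    ]
    for t in threads:
        t.start()
    for op in operations:
        q.put(op)
    for _ in threads:
        q.put(_STOP)  # one sentinel per writer
    for t in threads:
        t.join()


if __name__ == "__main__":
    written = []
    lock = threading.Lock()

    def fake_db_write(batch):
        # Stand-in for a batched INSERT; must be safe to call from many threads.
        with lock:
            written.extend(batch)

    run_pipeline(range(1050), fake_db_write)
    print(len(written))
```

One design consequence worth testing: with several writers, batches reach the database out of block order, so anything that depends on ordering has to carry its own block number or sequence id.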

Offline sschiessl

Sophisticated API calls for visualization would indeed be great!

Quote
a. ElasticSearch Plugin
https://github.com/bitshares/bitshares-core/wiki/ElasticSearch-Plugin
ES has been a good plugin. It provides very comprehensive unstructured storage and indexing of historical transactions and objects on BTS. By the way, ES gives us a lot of inspiration.
But there are some shortcomings:

1) Cumbersome. ES requires an extremely high-performance server;
2) The synchronization time is too long, and there is no fast-build solution. Currently, as far as we know, the usage of the ES plugin for data query or data analysis is limited.
3) Some content (such as related information of the transaction) is not structured. Indeed, ES stores it directly, which cannot satisfy some customized queries.

What makes you think that PostgreSQL will be better in that regard?
* ES scales easily and was designed for cluster installations, Postgres wasn't.
* Synchronization time for PostgreSQL will most likely be even longer than for ES.
* Operation details will be structured in the ES plugin in an upcoming release.


1. ES is easy to scale, but the amount of BTS data is unlikely to exceed a billion records in the short term. In addition, after optimization, PostgreSQL still performs well at the billion-record level. To make data backup and quick data recovery easier, we hope to adopt a light database solution, so we use pg.

2. Our plan is to develop a BitShares C++ plugin that synchronizes data to PostgreSQL and submits it in batches. Its speed should be similar to the current ES plugin.
However, based on the PostgreSQL solution, we can provide a faster recovery service. Users can download a compressed data package and import it directly, without synchronizing from scratch. This is how it could reduce the synchronization time.

3. In our worker proposal, in addition to structuring, we prefer to optimize and customize APIs for specific analysis requests.

For solving query problems, it's not bad to build APIs in different ways and let them provide various data, is it?

1. Both Postgres and Elasticsearch are suitable for storing that amount of data. The main benefits of Elasticsearch are the built-in clustering, advanced text search capabilities (Lucene) and optimization for simultaneous queries. Postgres is an old horse that offers many synergies and pure SQL queries, which many devs are used to. I'd argue that to build a plugin that offers a new API you could use either.

2. ES allows snapshotting as well: https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-snapshots.html

3. Customized APIs for specific analysis requests can also query the ES database.
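
For point 2, the snapshot workflow boils down to a few REST calls against the cluster. The sketch below only builds those calls as (method, path, body) tuples rather than sending them. The repository name, snapshot name and filesystem location are made-up examples; the `_snapshot` endpoints and the `fs` repository type are part of Elasticsearch, and a filesystem location has to be allowed through `path.repo` in the node configuration.

```python
def snapshot_requests(repo, location, snapshot):
    """Return the REST calls that register a filesystem repository and snapshot into it."""
    return [
        # 1. Register a shared-filesystem snapshot repository.
        ("PUT", f"/_snapshot/{repo}", {"type": "fs", "settings": {"location": location}}),
        # 2. Take a snapshot and wait until it has finished.
        ("PUT", f"/_snapshot/{repo}/{snapshot}?wait_for_completion=true", None),
    ]


def restore_request(repo, snapshot):
    """Return the REST call that restores a snapshot on another cluster."""
    return ("POST", f"/_snapshot/{repo}/{snapshot}/_restore", None)


if __name__ == "__main__":
    # Hypothetical names and path, for illustration only.
    for method, path, body in snapshot_requests("bts_backup", "/mnt/es-backups", "snap_1"):
        print(method, path, body)
    print(*restore_request("bts_backup", "snap_1"))
```

A node operator could publish the snapshot directory as a download, which is roughly the "compressed data package" idea from the proposal applied to ES.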

Quote
We have analyzed the existing technical solutions in BTS community:

a. ElasticSearch Plugin
https://github.com/bitshares/bitshares-core/wiki/ElasticSearch-Plugin
ES has been a good plugin. It provides very comprehensive unstructured storage and indexing of historical transactions and objects on BTS. By the way, ES gives us a lot of inspiration.
But there are some shortcomings:

1) Cumbersome. ES requires an extremely high-performance server;
2) The synchronization time is too long, and there is no fast-build solution. Currently, as far as we know, the usage of the ES plugin for data query or data analysis is limited.
3) Some content (such as related information of the transaction) is not structured. Indeed, ES stores it directly, which cannot satisfy some customized queries.

b. python wrapper
https://github.com/oxarbitrage/bitshares-explorer-api
The Python wrapper provides a good API; its backend relies on Elasticsearch and some self-built data. However, the data is imported on a schedule, which means it is not real-time (it is updated every day). As the data grows, each import will take more time. There have been some very mature projects in the community that provide us with valuable experience, but there are still a lot of problems to be solved in the current programs.

My 2 cents to the above:

a.
  • 1) What are the minimum requirements you found to run ES?
  • 2) and 3) Snapshoting and new version of ES plugin solves those issues
b.
  • The link you have provided is the open-explorer API, which uses Postgres as well in its backend. Here, data is periodically imported. This is planned to be switched to real-time once the es_objects plugin is finalized
  • The python wrapper you mention directly queries ES, which is built for real-time data. A deployed example can be found here

In general, another solution to data storage can be interesting to explore. I think a lot of synergies can be created if an abstract RESTful API is defined (for example using Swagger). It could then also be included in the python wrapper, which would instantly create compatibility. What are your thoughts on that?
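
To give an idea of what such an abstract definition might look like, here is a minimal sketch of a Swagger/OpenAPI 3.0 description built as a Python dict. The endpoint path, parameter names and title are invented for illustration and are not an agreed or existing BitShares API. The point is only that the description is independent of whether ES or PostgreSQL answers the request.

```python
import json


def _time_param(name):
    """Describe a required date-time query parameter."""
    return {
        "name": name,
        "in": "query",
        "required": True,
        "schema": {"type": "string", "format": "date-time"},
    }


def build_api_spec():
    """Return a minimal OpenAPI 3.0 document with one illustrative endpoint."""
    return {
        "openapi": "3.0.0",
        "info": {"title": "BitShares data API (sketch)", "version": "0.1.0"},
        "paths": {
            # Hypothetical endpoint: amount of an asset issued in a time window.
            "/assets/{asset_id}/issued": {
                "get": {
                    "summary": "Amount of an asset issued in a time window",
                    "parameters": [
                        {
                            "name": "asset_id",
                            "in": "path",
                            "required": True,
                            "schema": {"type": "string"},
                        },
                        _time_param("from"),
                        _time_param("to"),
                    ],
                    "responses": {
                        "200": {"description": "Issued amount for the window"}
                    },
                }
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_api_spec(), indent=2))
```

Any backend that implements the same description could be swapped in behind the python wrapper without clients noticing.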