Have you ever wondered how a company that offers Artificial Intelligence solutions for different sectors and contexts operates? That’s the case with Hop AI. In this article, we will discuss our technology stack and talk a little about how these solutions are developed.
The importance of data quality and the technologies adopted
There is broad consensus in the world of Artificial Intelligence that the quality of a solution depends heavily on the quality of the data being processed. With this in mind, it is important to define a set of tools for handling data that can meet different workloads and needs.
Each client brings a different type of processing. Some process continuous streams of real-time data, while others perform batch processing at predefined intervals. Some generate many lightweight events per minute; others produce few events, but with massive volumes of data.
Currently, at Hop AI, we adopt five different technologies to deal with data: MongoDB, Postgres, Redis, Kafka, and object storage services (IBM Cloud Object Storage or Amazon S3). MongoDB is where we store much of the information from our solutions, thanks to its schema flexibility and its ability to handle mixed read/write workloads at high operation rates.
Postgres is always useful for storing business information with a relational nature. Redis serves as a cache for operations that occur frequently or have strict response-latency requirements. Kafka is our preferred tool for processing continuous data streams; it is ideal for solutions that need to deliver real-time or near-real-time predictions, thanks to its ability to handle high volumes at very low latency. Finally, object storage is where we keep our most voluminous data: data that is accessed less frequently but occupies a lot of space. It is also where we store raw data acquired directly from the source.
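To illustrate how Redis fits in alongside a primary store, here is a minimal sketch of the cache-aside pattern. `FakeRedis` is an in-memory stand-in for a real `redis.Redis` client, and the class names, TTL, and keys are illustrative assumptions, not Hop AI's actual code.

```python
import json
import time

class CacheAside:
    """Cache-aside lookup: try the cache first, fall back to the
    primary store, then populate the cache with a TTL."""

    def __init__(self, cache, store, ttl_seconds=300):
        self.cache = cache   # Redis-like client exposing get/set
        self.store = store   # slower primary store (e.g. a database query)
        self.ttl = ttl_seconds

    def get(self, key):
        cached = self.cache.get(key)
        if cached is not None:
            return json.loads(cached)          # cache hit
        value = self.store[key]                # cache miss: hit the database
        self.cache.set(key, json.dumps(value), ex=self.ttl)
        return value

class FakeRedis:
    """In-memory stand-in for redis.Redis, for illustration only."""
    def __init__(self):
        self._data = {}
    def get(self, key):
        entry = self._data.get(key)
        if entry is None or entry[1] < time.time():
            return None                        # missing or expired
        return entry[0]
    def set(self, key, value, ex):
        self._data[key] = (value, time.time() + ex)
```

With a real Redis client, `FakeRedis()` would simply be replaced by `redis.Redis(...)`, since both expose the same `get`/`set(..., ex=...)` interface used here.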
Hop AI’s technology stack
Python: The programming language chosen to work with AI at Hop AI
With the data organized and curated, it's time to talk about the tools we use to turn it into powerful predictions. In this context, Python is the language we chose as the basis for all our work, for reasons such as popularity, an active community, library diversity, and performance. Going a little further, each Hop AI client brings a different problem for which we first need to define a modeling strategy, so our stack must be flexible enough to handle many different types of problems.
Process optimization problems, such as finding the best way to set prices or deciding when to stop a machine for maintenance, are commonly implemented with the Gymnasium reinforcement learning library (the successor to OpenAI Gym), whose elegant and extensible design makes it easy to characterize client scenarios in a clear and objective way.
For time series, we have achieved excellent results with Facebook's Prophet library, which we use on problems like "how much will we sell in 15 days?" or "how much will our client buy in August?" Of course, classic classification and regression problems are also part of our day-to-day. For them, we have a larger universe of solutions, and the ideal choice depends heavily on the problem's context and input data, ranging from basic scikit-learn implementations to more powerful tools such as TensorFlow and PyTorch.
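On the classification side, a minimal scikit-learn sketch might look like the following. The synthetic dataset and the choice of a random forest are illustrative assumptions, not a real client problem.

```python
# Minimal classification sketch: synthetic data stands in for real
# client features and labels.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Generate a toy binary-classification dataset
X, y = make_classification(n_samples=500, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)
accuracy = accuracy_score(y_test, model.predict(X_test))
```

The same fit/predict pattern carries over when swapping in other estimators, which is part of what makes scikit-learn a convenient baseline before reaching for TensorFlow or PyTorch.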
Integrating AI with other solution components: simulations, monitoring, and metrics
Of course, no AI solution exists in a vacuum. An analogy we repeat frequently is that AI is the engine of the solution: as with cars, we rarely see the engine itself and instead interact with many other parts. In a solution, those other parts range from a simulation interface, where the user can feed exogenous parameters into the models and see a projection of the consequences of an action, to the monitoring of what is being processed and its metrics.
To achieve this, we use a range of different solutions, from the model-monitoring services built into cloud providers to systems constructed entirely from scratch. As mentioned earlier, our models, once in production, are always producing metrics that need to be monitored. Our favorite tool in this context is Metabase, a simple and user-friendly solution we use to surface the health of the numerous metrics that complex systems produce.
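One common way to feed a dashboard tool like Metabase is to write model metrics into a SQL table that the dashboard queries. The sketch below uses sqlite3 as a stand-in for a production database such as Postgres; the table schema, model name, and metric names are assumptions for illustration.

```python
import sqlite3
from datetime import datetime, timezone

# sqlite3 stands in for the production database; in practice this table
# would live in Postgres and a Metabase dashboard would query it.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE IF NOT EXISTS model_metrics (
        recorded_at TEXT NOT NULL,
        model_name  TEXT NOT NULL,
        metric_name TEXT NOT NULL,
        value       REAL NOT NULL
    )
""")

def record_metric(model_name, metric_name, value):
    """Append one timestamped metric observation; the dashboard
    aggregates these rows over time."""
    conn.execute(
        "INSERT INTO model_metrics VALUES (?, ?, ?, ?)",
        (datetime.now(timezone.utc).isoformat(), model_name, metric_name, value),
    )
    conn.commit()

# Hypothetical metrics emitted by a forecasting model in production
record_metric("demand_forecast", "mape", 0.12)
record_metric("demand_forecast", "prediction_count", 1450)
```

Keeping metrics in a plain SQL table means any BI tool can read them, which fits the provider-independence goal described below.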
By this point, it should be clear that Hop AI also follows a multicloud deployment policy. Since the provider is another variable that changes completely from client to client, all our solutions are fully provider-independent and avoid any kind of vendor lock-in.
I'll save the discussion of our work methodology and the engineering tools behind all of this for next time.
AUTHOR Tiago Moura CTO, Hop AI
WANT TO KNOW MORE ABOUT HOP AI?
Read more about who Hop AI is, our history, and achievements. Visit this page and get in touch.