The AI Journey: Why You Should Pack OpenShift and OpenVINO™

May 31, 2022

Christina Cardoza

OpenShift, OpenVINO

AI can be an intimidating field to get into, and there is a lot that goes into deploying an AI application. But if you don’t choose the right tools, it can be even more difficult than it needs to be. Luckily, the work that Intel® and Red Hat are doing is easing the burden for businesses and developers.

We’ll talk about some of the right ways to deploy AI apps with experts Audrey Reznik, Senior Principal Software Engineer for the enterprise open-source software solution provider Red Hat, and Ryan Loney, Product Manager for OpenVINO™ Developer Tools at Intel®. They’ll discuss machine learning and natural language processing; using the OpenVINO AI toolkit with Red Hat OpenShift; and the life cycle of an intelligent AI application.

Why are AI and machine learning becoming vital tools for businesses?

Ryan Loney: Everything today has some intelligence embedded into it. So AI is being integrated into every industry—industrial, healthcare, agriculture, retail. They’re all starting to leverage the software and the algorithms for improving efficiency. And we’re only at the beginning of this era of using automation and intelligence in applications.

We’re also seeing a lot of companies—Intel partners—who are starting to leverage these tools to assist humans in doing their jobs. For example, a technician analyzing an X-ray scan or an ultrasound. And, in factories, using cameras to detect if there’s something wrong, then flagging it and having somebody review it.

And we’ve started to even optimize workloads for speech synthesis, for natural language processing, which is a new area for OpenVINO. If you go to an ATM machine and have it read your bank balance back to you out loud, that’s something that’s starting to leverage AI. It’s really embedded in everything we do.

How have AI and ML started to be more broadly adopted across industries?

Audrey Reznik: When we look at how AI and ML can be deployed across the industry, we have to look at two scenarios.

Sometimes there’s a lot of data gravity involved in an environment and data cannot be moved off-prem into the cloud, such as with defense systems or government—they prefer to have their data on-prem. So we see a lot of AI/ML deployed that way. Typically, people are looking to a platform that will have MLOps capability, and they’re looking for something that’s going to help them with data engineering, with model development, training/testing the deployment, and then monitoring the model.

If there aren’t particular data security issues, they tend to move a lot of their MLOps creation and delivery/deployment to the cloud. In that case they’re going to look for a cloud service platform that has MLOps available so that they can look at, again, curating their data, creating models, training and testing them, deploying them, and monitoring and retraining those models.

In both instances what people are really looking for is something easy to use—a platform that’s easy for data scientists, data engineers, and application developers to use so that they can collaborate. And the collaboration then drives some of the innovation.

Increasingly, we’re seeing people use both scenarios, so we have what we call a hybrid cloud situation, or a hybrid platform.

What are some of the biggest challenges with deploying AI apps?

Ryan Loney: One of the biggest challenges is access to data. When you’re thinking about creating or training a model for an intelligent application you need a lot of data. And you have to factor in having a secure enclave where you can get that data and train that data. You can’t necessarily send it to a public cloud—or if you do, you need to do it in a way that’s secure.

One of the things I’m really impressed with from Red Hat and from OpenShift is their approach to the hybrid cloud. You can have on-prem managed OpenShift or you can run it in a public cloud, and still give the customer the ability to keep their data where they want to keep it in order to address security and privacy concerns.

Another challenge for many businesses is that when they’re trying to scale, they have to have an infrastructure that can increase exponentially when it needs to. That’s really where I think Red Hat comes in—offering this managed service so that they can focus on getting the developers and data scientists access to the tools that they would use on their own outside of the enterprise environment, and making it just as easy to use inside the enterprise environment.

Let’s talk about the changes that went into the OpenVINO™ 2022.1 release.

Ryan Loney: This was the most substantial change of features since we started in 2018, and it was really driven by customer needs. One key change is that we added hardware plugins, or device plugins. We’ve also recently launched discrete graphics. So GPUs can be used for deep-learning inference. Customers need them for things like automatic batching, and they can just let OpenVINO automatically determine the batch size for them.

We’ve also started to expand to natural language processing, as I mentioned before. So if you ask a chatbot a question: “What is my bank balance?” And then you ask it a second question: “How do I open an account?” both of those questions have different sizes—the number of letters and number of words in the sentence. OpenVINO can handle that under the hood and automatically adjust the input.
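The two chatbot questions above tokenize to different lengths, which is the case dynamic input shapes are meant to handle. The following is a minimal sketch in plain Python, not OpenVINO code: the whitespace tokenizer, the `MAX_LEN` value, and the `[PAD]` token are all assumptions made for illustration. It contrasts padding every input to one fixed length with letting each input keep its natural length.

```python
# Illustrative only: a whitespace "tokenizer" stands in for a real NLP tokenizer.
MAX_LEN = 8  # hypothetical fixed input length for a static-shape model


def tokenize(text):
    """Split a question into lowercase word tokens."""
    return text.lower().rstrip("?").split()


def pad_to_fixed(tokens, length=MAX_LEN, pad="[PAD]"):
    """Static-shape approach: truncate or pad every input to one fixed length."""
    return (tokens + [pad] * length)[:length]


questions = ["What is my bank balance?", "How do I open an account?"]

# Dynamic-shape approach: each input keeps its natural length.
dynamic_lengths = [len(tokenize(q)) for q in questions]

# Static-shape approach: every input is padded to MAX_LEN.
static_lengths = [len(pad_to_fixed(tokenize(q))) for q in questions]

print(dynamic_lengths)
print(static_lengths)
```

With a fixed shape, the padding positions are still processed on every request; with a dynamic shape the runtime only sees the tokens that are actually there.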

What has been Red Hat’s experience using OpenVINO™?

Audrey Reznik: Before OpenVINO came along, a lot of processing would have been done on hardware, which can be expensive. The advent of OpenVINO changed the paradigm in terms of optimizing a model, and in terms of quantization.

I’ll speak to optimization first. Why use a GPU if you can say, “You know what? I don’t need all the different frames in this video in order to get an idea of what my model may be looking at.” Maybe my model is looking at a pipe in the field and we’re just checking to make sure that nothing is wrong with it. Why not just reduce some of those frames without impacting the ability of your model to perform? With OpenVINO, you can add just a couple of little snippets of code to get this benefit, and not use the hardware.
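The frame-reduction idea can be sketched in a few lines. This is plain Python rather than an OpenVINO API, and the function name, the 30-frames-per-second figure, and the sampling step are assumptions chosen for illustration; in a real pipeline the frames would come from a video decoder.

```python
def sample_frames(frames, step):
    """Keep every `step`-th frame, starting with the first one."""
    if step < 1:
        raise ValueError("step must be at least 1")
    return frames[::step]


# Stand-in for one second of video at 30 frames per second (frame indices only).
frames = list(range(30))

# Run inference on every fifth frame instead of all of them.
kept = sample_frames(frames, step=5)

print(len(frames), "frames in,", len(kept), "frames sent to the model")
```

Whether a given step is safe depends on how quickly the scene changes; a slowly changing scene such as the pipe example tolerates a much larger step than fast-moving footage.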

The other thing is quantization. With machine learning models there may be a lot of numerics in the calculations. I’m going to take the most famous number that most people know about—pi. It’s not really 3.14; it’s 3.14 and many digits beyond that. Well, what if you don’t need all that precision? What if you can be just as happy with the one value that most people equate with pi—that 3.14?

You can gain a lot of benefit for your model, because you’re still getting the same results, but you don’t have to worry about cranking out all those digit points as you go along.
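To make the pi analogy concrete, here is a minimal sketch of symmetric 8-bit quantization in plain Python. It is not how OpenVINO implements quantization; the function names and sample weights are hypothetical. It only shows the core trade: each float is stored as a small integer plus one shared scale, and the value recovered from it is close to, but not exactly, the original.

```python
import math


def quantize_int8(values):
    """Map floats to integer codes in [-127, 127] using one shared scale."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127 if max_abs else 1.0
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate floats from the integer codes."""
    return [c * scale for c in codes]


weights = [math.pi, -1.5, 0.25, 2.0]  # hypothetical model weights
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)

# The worst-case rounding error is half of one quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))

print(codes)
print(max_error <= scale / 2 + 1e-12)
```

Each weight now fits in one byte instead of four, and the rounding error is bounded by half a quantization step, which many models tolerate with little loss in accuracy.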

For customers, this is huge because, again, we’re just adding a couple of lines of code with OpenVINO. And if they don’t have to get a GPU, it’s a nice, easy way to save on that hardware expense but get the same benefits.

What does an AI journey really entail from start to finish?

Audrey Reznik: There are a couple of very important steps. First we want to gather and prepare the data. Then develop the model or models, and integrate the models in application development. Next, model monitoring and management. Finally, retraining the models.

On top of the basic infrastructure, we have our Red Hat managed cloud services, which are going to help take any machine learning model all the way from gathering and preparing data—where you could use our streaming services for time-series data—to developing a model—where we have the OpenShift data service application or platform—and then to deploying that model using source-to-image. And then model monitoring and management with Red Hat OpenShift API management.

We also added in some customer-managed software, and this is where OpenVINO comes in. Again, we can develop our model, but this time we may use Intel’s oneAPI AI analytics toolkit. And if we wanted to integrate the models in app development, we could use something like OpenVINO.

And at Red Hat, we want to be able to use services and applications that other companies have already created—we don’t want to reinvent everything. For each part of the model life cycle we’ve invited various independent service vendors to come in and join this platform—a lot of open source companies have created really fantastic applications and pieces of software that will fit each step of the cycle.

The idea is that we invite all these open-source products into our platform so that people have choice—they can pick whichever solution works better for them in order to solve the particular problem they’re working on.

Ryan, how does OpenVINO™ work with Red Hat OpenShift?

Ryan Loney: OpenShift provides this great operator framework for us to just directly integrate OpenVINO and make it accessible through this graphical interface. Once I have an OpenVINO operator installed, I can create what’s called a model server. It takes the model or models that my data scientists have trained and optimized with OpenVINO, and gives out an API endpoint that you can connect to from your applications in OpenShift.
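As a rough illustration of what connecting to that endpoint might look like, the sketch below builds a prediction request without sending it. It assumes the model server exposes a TensorFlow Serving-style REST route (`/v1/models/<name>:predict`); the host name, port, model name, and input values are all hypothetical and would come from your own deployment.

```python
import json
import urllib.request


def build_predict_request(host, port, model_name, instances):
    """Build (but do not send) a POST request for a TFS-style predict route."""
    url = f"http://{host}:{port}/v1/models/{model_name}:predict"
    body = json.dumps({"instances": instances}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical service address and model name inside the cluster.
request = build_predict_request(
    host="ovms.example.svc",
    port=8080,
    model_name="pipe-inspector",
    instances=[[0.1, 0.2, 0.3]],
)

print(request.get_method(), request.full_url)
# To actually call the endpoint: urllib.request.urlopen(request)
```

Because the application only depends on a URL and a JSON payload, the model behind the endpoint can be retrained or replaced without changing the calling code.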

The way the deployment works is that we use what’s called a model repository. Once the data scientists and the developer have the model ready to deploy, they can just drop it into a storage bucket and create this repository. And then every time an instance or a pod is created, it can quickly pull the model down so you can scale up.
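A model repository is essentially a directory convention. The sketch below assumes a `<model name>/<version>/` layout holding the model's `.xml` and `.bin` files, and builds it in a temporary local directory with empty placeholder files; the model name is hypothetical, and uploading the directory to a storage bucket is not shown.

```python
import tempfile
from pathlib import Path


def create_model_repository(root, model_name, version):
    """Create <root>/<model_name>/<version>/ with placeholder model files."""
    version_dir = Path(root) / model_name / str(version)
    version_dir.mkdir(parents=True, exist_ok=True)
    for filename in ("model.xml", "model.bin"):
        # Empty stand-ins; a real repository holds the exported model files.
        (version_dir / filename).touch()
    return version_dir


with tempfile.TemporaryDirectory() as root:
    create_model_repository(root, "pipe-inspector", 1)
    layout = sorted(
        p.relative_to(root).as_posix()
        for p in Path(root).rglob("*")
        if p.is_file()
    )

print(layout)
```

Keeping each version in its own numbered folder means a new model can be added next to the old one, and every new pod pulls the same files from the same place.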

Even if you don’t perform the quantization that Audrey mentioned earlier, OpenVINO does some things under the hood, like operation fusion and convolution fusion, that give you a performance boost, reduce latency, and increase throughput without impacting accuracy. These are some of the reasons why our customers are using OpenVINO: to squeeze out a little bit more performance, and also reduce the resource consumption compared to just deploying with a deep learning framework.
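To see why fusing operations need not change the result, consider a toy scalar case: a linear layer followed by a batch-normalization step can be folded into a single linear layer with adjusted weight and bias. This is a sketch of the general idea only, not OpenVINO's actual graph transformations, and all parameter values are made up.

```python
import math


def linear(x, weight, bias):
    """A one-dimensional linear layer."""
    return weight * x + bias


def batch_norm(y, mean, var, gamma, beta, eps=1e-5):
    """Inference-time batch normalization with fixed statistics."""
    return gamma * (y - mean) / math.sqrt(var + eps) + beta


def fuse(weight, bias, mean, var, gamma, beta, eps=1e-5):
    """Fold the batch-norm parameters into the linear layer's weight and bias."""
    factor = gamma / math.sqrt(var + eps)
    return weight * factor, (bias - mean) * factor + beta


params = dict(mean=0.4, var=2.0, gamma=1.5, beta=-0.2)  # hypothetical values
weight, bias = 0.8, 0.1
fused_weight, fused_bias = fuse(weight, bias, **params)

x = 3.0
two_steps = batch_norm(linear(x, weight, bias), **params)  # two operations
one_step = linear(x, fused_weight, fused_bias)             # one fused operation

print(math.isclose(two_steps, one_step, rel_tol=1e-9))
```

The fused layer does the same arithmetic in one pass instead of two, which is where the latency and throughput gains come from; the outputs agree up to floating-point rounding.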

What’s the best way to get started on a successful AI journey?

Audrey Reznik: One of my colleagues wrote an article that said the best data science environment to work on isn’t your laptop. He was alluding to the fact that when they first start out, usually what data scientists will do is put everything on their laptops. It’s very easy to access; they can load whatever they want to; they know that their environment isn’t going to change.

But they’re not looking towards the future: How do you scale something on a laptop? How do you share something that lives on a laptop? How do you upgrade?

But when you have a base environment, something everybody is using, it’s very easy to upgrade that environment, to increase the memory, increase the CPU resources being used, add another managed service. You also have something that’s reproducible. And that’s all key, because you want to be able to take whatever you’ve created and then be able to deploy it successfully.

So if you’re going to start your AI journey, please try to find a platform. Something that will allow you to explore your data, to develop, train, deploy, and retrain your model. Something that will allow you to work with your application engineers. You want to be able to do all those steps very easily—without using chewing gum and duct tape in order to get to production.