Podcast: Accelerating Developers’ AI Success with OpenVINO™

June 1, 2023

Christina Cardoza

Yury Gorbachev, Raymond Lo

From diagnosing diseases to detecting defects on the production line to providing deeper insights into customer behavior, AI increasingly transforms the way businesses across all industries operate today. But it is developers, creating AI models and solutions capable of providing business value, who truly make these deployments successful. When AI developers are equipped with the right tools, technology, and knowledge, they have the power to make all kinds of exciting and innovative use cases possible.

In this podcast episode, we discuss how AI is used to improve efficiency, make better decisions, enhance customer experience, and provide a competitive advantage. We also explore tools and technologies that allow developers to successfully build and deploy these AI models and solutions as well as touch on some of the latest capabilities in the OpenVINO™ 2023.0 release.

Our Guest: Intel

Yury Gorbachev, OpenVINO architect at Intel, and Raymond Lo, AI Software Evangelist at Intel.

Yury has held various engineering roles in his seven years at Intel. As an OpenVINO architect, he works with the developer community to learn about their AI development pain points and come up with a technical solution to solve them.

Raymond has been the global lead on OpenVINO for the past three years, working with engineering, planning, enabling, and marketing teams to drive developer satisfaction. Prior to joining the company, he was CTO and Cofounder of Meta Co., a YC-backed company where he worked to build software and drive hardware.

Podcast Topics

Yury and Raymond answer our questions about:

(3:47) The evolution of artificial intelligence in recent years
(6:36) How developers benefit from AI advancements
(9:46) Best practices for successful AI deployments
(14:58) The five-year anniversary of the AI toolkit OpenVINO
(20:47) New tools making AI more accessible to business users
(24:10) The future of AI and the role of OpenVINO
(29:17) What developers can expect in OpenVINO 2023.0

Transcript

Christina Cardoza: Hello and welcome to the IoT Chat, where we explore the latest developments in the Internet of Things. I’m your host, Christina Cardoza, Editorial Director of insight.tech, and today we’re going to have an exciting conversation where we’ll be talking about AI trends and challenges, as well as get a look at the latest OpenVINO™ 2023.0 release with Raymond Lo and Yury Gorbachev from Intel. And, Yury, I see that you’re an OpenVINO Architect at Intel. Can you explain to me a little bit more about what that means and what else you do at Intel?

Yury Gorbachev: Yeah, so, I’m acting as a leader for the architecture team. Basically it’s a team of very capable individuals, so they are very technically strong in different areas. Optimization, model optimization, code optimization for the certain platforms like CPU, GPU, and then model support—all of those things. And we work on a customer request, we work on new features that we envision ourselves, we work on potential evolution areas that we see as important. And we try to combine them. We try to combine them with what we have. We try to come up with the technical solution and solve the problem in the most efficient manner.

So, pretty much I would say for the past years we’ve been working on so many features on so many interesting aspects of AI that we have built a very strong team with very strong competence. And for me, personally, I just, every day I just enjoy looking at something new and how to add this to our product. That’s pretty much what I do every day.

Christina Cardoza: Well, I can’t wait to see how some of that gets added into OpenVINO. I know we’ll be talking about OpenVINO 2023.0, but before we get into it, Raymond, I would love to learn a little bit more about what an AI Software Evangelist does and how that brought you to Intel.

Raymond Lo: So, I’ve been in the industry about 10 years already. And before Intel I was a developer, right? I was even the CTO of a company that came out of Y Combinator. And that’s how I got started, because I always build software, and software, I think, is what drives the hardware. And at that time I had developers gather around, building workshops with me. At Intel, that’s what I’m doing: I’m building hackathons and workshops and giving talks so people can learn about what they can do, right? As a developer, the number one rule is: It looks great, but how does it work, right? And that’s what I do at Intel today, and that’s what inspires me, because all of you have been bringing me extremely hard questions every day.

Christina Cardoza: Yeah, absolutely. And I’m really excited to get into this conversation, because at insight.tech we write a lot about different trends happening in different industries, and AI has been a part of a lot of these new use cases that we’re seeing over the last couple of years. Defect detection in manufacturing, customer-behavior analysis in retail, even traffic detection in smart cities. So it’s really been exciting to see all of these changes with AI powering it.

And you mentioned you work on OpenVINO. I know OpenVINO is behind a lot of this stuff, and the developers that Raymond’s working with to make some of these capabilities possible and really translate them into these business values. So I’m excited to hear from you guys, since you have been on the inside, on more the technical side as engineers. Yury, I would love to see how you’ve been seeing AI progress over the last couple of years. What are the trends that you’ve been seeing? Or what are you most excited about that’s happening in this space?

Yury Gorbachev: Yeah, first of all, I would say you are right. I mean, most of the AI use cases that you mentioned are already in production. This is the mainstream now. Quite a lot of use cases are being solved through AI. Customer monitoring, roads monitoring, security, checking of the health of patients—all of those things are already in the main line. But I think what we are seeing now is, for the past year, is a dramatic change in the AI and how it is perceived and what capabilities it is possible to—it is capable of solving. So, I’m talking about generative AI, and I’m talking about this splash of the popularity that we are seeing now with this ChatGPT, Stable Diffusion, and all those models.

Most of the hype is obviously coming from ChatGPT, but there is quite a lot of use cases that are being—that are exploding right now, thanks to generative AI. We are seeing image generation, we are seeing video generation, we are seeing video enhancements, we are seeing text generation—like there are use cases where a model can write a poem for you, a model can continue your text, can continue the paper that you were writing. And all of those things, they are evolving very rapidly right now.

I think, if we look back in like 10 years or so, when there was an explosion in adoption of deep learning, I think it was combined with availability of the data and the availability of the GPUs and the ability to train the models. There was an explosion in the use cases. There was explosion in the models, there was explosion in architectures. So now, the same thing is happening with the generative AI.

Christina Cardoza: I absolutely agree. You know, we are just seeing every day new advancements, new different technologies and models that developers can use. It can be quite overwhelming, because now developers, they want to jump on the latest and greatest advancements. They don’t know exactly how to do it or if they should. And sometimes it’s better not to jump right away and to maybe wait and see how it plays out and then how you can add it to your solution.

So, Raymond, I know in your position you work with a lot of developers, and you’re trying to exactly help them do this and teach them how to build AI in a safe and smart way. So, what can you tell us about the advancements and how you work with developers?

Raymond Lo: Well, to work with developers, I have to be a developer myself. So maybe it’s worth sharing how I started too, because it was quite long ago, maybe 10, 12 years ago. I built my first neural network on my computer at that time with my team, right? I was in the lab trying to figure out how I could track this fingertip and make different poses, just making sure that my camera could understand what I was doing in front of it. It took us three months just to understand how to train the first model. Today, if I give it to Yury, it’s like, “Point me right there.” Two days later maybe it’s all done, right? But to me at that time, even having published the paper, building just a very simple neural network took me forever.

Of course it worked at the end. I learned how it works. But through these many years of evolution, the frameworks are available. TensorFlow, PyTorch is so much easier to use. Back then I was computing on my own C++ program. Pretty hard-core, right? Not even Python—C++ coding, right? And then today they have OpenVINO, because back then, I was trying to deploy on my wearable computer. Oh, dear God, I have to look at instruction sets. I was trying to look at, okay, how do I parallelize this work, making sure that it runs faster.

So, today when I talk to my developers or developers in the community, it’s like, here, it’s OpenML, I have GPT, everything is in there. You don’t have to worry about that as much, because when you made a mistake in unrolling, guess what happened? Ba boom. It will not run anymore, or it’ll give you wrong results. So those are the things that I find very valuable today: I have a set of tools and resources that people can ask me about, and I can give them a quick and validated answer.

Back in the old days, I had my research project that maybe three people ever read, including myself at that time. And then I finished my PhD. People think this is pretty good, but a lot of the work I see today is from undergraduates, right? They have no programming experience. How would this be good enough, right? So that is the sort of thing that Intel is doing today: we are giving people this validated tool with this kind of back history, right?

Christina Cardoza: That was great. I’m sure it is frustrating, but also exciting to see how much faster you can build these AI solutions than you have in the past. I was talking to one of your colleagues, Paula Ramos, and she was working on something that took her months to do—train an AI model, like you were saying. But then with the tools Intel has available, it took her minutes. So it’s amazing to see all of these evolutions and advancements.

I mentioned some use cases in the beginning: defect detection in manufacturing, smart cities, and retail. A lot of these applications that AI is now being built into can be very mission-critical applications. There are a lot of different things you have to be aware of when building out these solutions. So I’m curious, Raymond, what advice would you give developers when building these types of solutions? What should they watch out for, or what should they be aware of?

Raymond Lo: This is actually a very good question, because as I speak more with young developers, some of the customers, I listen, right? Like, what do you need to make something run the way that you need it, right? So, let’s say, hypothetically speaking, if the person is trying to put it in a shopping mall, I don’t think they need, like, FDA approval, or any sort of approval to get a camera set up. They need to think about privacy, they need to think about heat maybe, because they want to hide it. They don’t want to have a camera with the rig sticking out. Like, it will happen all the time because the device could be very hot, right? If you’re running it on a very power-hungry device. The more I talk to the people listening, I find out there’s no perfect answer.

But we think about portfolio, and that’s what Intel has. When I first joined, I was like, “Hmm.” My boss was saying, “We have a lot of hardware that runs OpenVINO.” And I was like, “How many?” He was like, “Look at all the x86.” “Like, what do you mean, x86?” “Every x86, yeah, today, right?” I was like, “Yeah, I use it.” Well, I was 18 or younger, right? So that gave me that insight into oh—I don’t need ultra-expensive supercomputer to do inference.

So, as I listen more, take some use cases, like detecting diamonds. It’s real; it’s actually a real hackathon. To figure out if the diamond has a defect in it, they don’t need a supercomputer. There is a need for a computer that reads it very well with a very good algorithm. And then I think everyone loves diamonds; who doesn’t, let me know. But if they look at the diamonds, right? It’s so shiny and pretty, but they can use that to find the defect inside, and then what they require is a very unique system, right? They want it to be in a factory; they want it to be on the edge. They don’t want to upload this data. Nope, not happening. They have a factory there; they want to make sure everything happens on site.

So that’s how I felt, like the more we work with our customer, I think we are trying to collect these kinds of use cases together and create these kinds of packages of solution for them. And that’s what I like about my work because it’s—anyone play puzzle games? Every day is a puzzle. It’s good and bad, okay. And you may have something to add to it.

Yury Gorbachev: Yeah, I think you’re totally right. So I think it’s like, the most undervalued platform, I would say, is something that you have on your desk, right? So quite, if not most, developers actually use laptops, use desktops that are powered by Intel, and OpenVINO is capable of running on them and actually capable of delivering quite good, if not the best AI performance for the scenarios that we are talking about. You can do live video processing, you can do quite a lot of background processing for documents, audio processing—all of those things are actually possible on the laptop. So you don’t need to have a data center.

So that’s something we’ve been trying to show for years, and that’s something that Raymond is helping to show to our customers, to developers. We are making resources to showcase that. We are making resources to showcase how you can generate new images on this. How you can process your video to perform style transfer, to detect vehicles, to detect people, and maybe do a few more things, right? So I think this is spot on.

So, from the business standpoint the exact same platform runs in the cameras and the video-processing devices and things like that. But it all starts with the very basic laptops that each and every developer has. And the OpenVINO started there; OpenVINO started to run on those, on those devices. And we continue to do it. We continue to showcase power and performance that you can reach by running on those devices.

Christina Cardoza: That’s a great point. Intel has followed along with this evolution. You know, like Raymond said, you don’t need such advanced hardware or all of these computers anymore to do some of these things. Intel keeps making advancements year over year to make sure that this is easier to build, this is more accessible to developers or to the business side. Like I mentioned with Paula, being able to train an AI model in minutes, that was with a new software tool that Intel released, Intel® Geti™, that AI platform, just within the last year or so. So it’s really exciting to see the advances that Intel is actually making.

And you’ve mentioned OpenVINO a couple times, Yury. I know you work very closely with that AI toolkit, and that Intel is celebrating the fifth year of OpenVINO this year. It seems like it’s been around for a lot longer, with all of the advancements and use cases that have come out, but, wow—five years. And I know along with this release we have the 2023.0 release coming. So, Yury, I’d love to hear a little bit more about how you’ve seen OpenVINO advance over the last couple of years, over the last five years, and what we can expect from this new release coming out.

Yury Gorbachev: Yeah, so, like Raymond mentioned in the very beginning that he was starting with OpenCV, so I have to say originally, most of the team that we have actually started by working on OpenCV, by developing OpenCV, and then eventually we started to develop this open-source toolkit to deploy AI models as well. So we borrowed a lot from OpenCV paradigms, and we borrowed a lot from OpenCV philosophy. And we started to develop this tool. And I have to say, since we were working on OpenCV we were dealing a lot with computer vision. So that’s why initially we were dealing with computer-vision use cases with OpenVINO.

Then, as years passed, and we have seen also simultaneously tools that were evolving, like TensorFlow, PyTorch. We even started—initially, when we started, Caffe was the most widespread framework. Nobody even remembers that now, but that was the most widespread framework, that people were attending webinars, attending conferences, just to develop models in Caffe. Nobody remembers that. But we started with this.

We’ve seen the growth of TensorFlow, we’ve seen the explosiveness of PyTorch, all of that. So we had to follow this trend. We’ve seen the evolution of the scenarios like computer vision initially, close image classification. Then, oh man, object detection became possible, segmentation, all of those things. So new models appeared, more efficient models, methodologies for optimizations of the models. So we had to follow all of those.

We initially made just runtime, then we started working on the optimization tools and optimized models for community. Then eventually we added training-time optimization tools. We added more capabilities for training models and things like that. So we had to adapt our framework. And I have to say, initially we started with computer vision, but then huge explosiveness happened in the NLP space, text-processing space. So we had to change quite a lot in how we processed, how we processed the inferences in our APIs. So we changed that; we changed a lot in our ecosystem to support those use cases.

So, about a year ago we released OpenVINO 2022.0 (2022.1, actually), at that time with the new API. And that was because we wanted to unlock the NLP and audio space and all of those scenarios that we were supporting not very efficiently. So now we are seeing the evolution of, as I mentioned, generative AI, image generation, video generation. So we adapt to those as well.

And I would say in parallel, so this is the functional track, right? But then in parallel, as you mentioned, Intel evolves a lot. Intel produces new generation after generation. As we introduced discrete GPUs, we were evolving our integrated GPU family, and the client space, the data center space, all of those families were evolving. So we had to keep up with the performance that those platforms can provide. So all those technologies like VNNI; now there is Sapphire Rapids with AMX, discrete GPUs with systolic arrays. All of those things, we had to enable them through our software.

So we worked a lot with the partners; we worked a lot across the teams to power those technologies to always stay best performing framework on Intel. So we—if you take a platform that you had in the laptop when we started, and if you look at this now, I would say probably the clients now in terms of AI could be compared somehow to the data center processors when we were starting. So, huge evolution in terms of AI, huge evolution in terms of performance, in terms of optimization methods, and things like quantization—all of those things.

So we were looking how regularly we measure how we evolve generation over generation, and it’s not like 5%, it’s not like 10%; sometimes it’s twice better, three times better than generations before. So it’s very challenging to stay up to date with the latest AI achievements, the technologies that AI is capable of solving, as well as powering all these generations of the hardware that we are dealing with.

Christina Cardoza: Yeah, and one piece of this puzzle that I think is also important to talk about is you’re on the developer side; you’re making things just so much smoother, being able to build these applications, being able to have them run—these high-compute solutions run very easily for businesses. And I think at the same time—like I mentioned the Intel Geti. You have this solution that came out that now makes it easier for developers to work with the business person. Or like Raymond mentioned earlier in the conversation, people that don’t really have programming skills are now able to get into AI and build these models, and then OpenVINO can really carry it through to make it a reality.

Raymond, can you talk a little bit about those two solutions—how they work together, and how you’ve seen developers and business teams improving solutions?

Raymond Lo: So, the way I see development today is more about the—well when I say end-to-end, right? We hear that term a lot. It’s really about, you have a problem statement that you want to solve. Remember my first example? I just want to figure out what my finger’s doing in space just to track the fingertips? That requires some training, right? So, it requires having data about, okay—this is pointing up, or this is doing a camera shot. I was trying to do something fancy, right? So, same for that; we noticed that’s the gap.

So that’s what Geti fills in, where you can provide a set of data containing what you want the algorithm to recognize. Let’s say it can be a defect, or it can be a classification of an object. That process, as we said before, took me many years, right? To understand how it works. But today it’s more like, okay, we provide an interface, we provide people the tool, and the tool is not just click-and-drop; it has those fine-tuning parameters. You can really figure out, let’s say, how you want to train it. You can even put it, let’s say, with the dataset, so that every time you train it you can annotate it and also get what we call an active-learning approach.

So back then when I did the dataset, I, like, labeled every one of them by hand. But today we can have, let’s say, you start with three of them, then the AI will figure out, okay, I’ll give you 10 more that we think are most likely the ones that you want to highlight. And then after you give it enough examples, the AI will figure out the rest of it for you, right? Just think about the whole training effort, right? We take away a lot of those burdens that, seriously, don’t make a lot of sense for me to do when you can have an algorithm that does it better than me, and then we can do it more effectively.

So that’s what Geti is really about, right? Bringing that journey from an idea (now you have ways and ways to tackle this problem) to getting a model that is deployable on OpenVINO. And that to me is a very new thing that we are putting together. And again, when we look at Geti, the team had the experience building this too. So I really recommend that next time you find us, you ask them, what’s the journey? Because they spent a lot of years behind this. So we just launched; it doesn’t mean we just started last year, right? Many years ago they started doing machine learning training. And that, I think, is what Geti is about: bringing the people today who are having that difficulty getting a model running to get it running and, more importantly, bringing it to the ecosystem.
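
The active-learning idea Raymond describes, where the tool suggests which samples to annotate next, can be sketched in a few lines of Python. This is only an illustration using uncertainty sampling (picking the samples whose predicted class probabilities have the highest entropy). That strategy is an assumption chosen because it is common, and it is not a description of how Intel Geti selects samples. The sample ids, probabilities, and function names are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy of a probability distribution (higher = less certain)."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_for_labeling(predictions, k):
    """Return the ids of the k unlabeled samples the model is least sure about."""
    ranked = sorted(
        predictions,
        key=lambda sample_id: entropy(predictions[sample_id]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical model outputs: sample id -> [P(defect), P(no defect)].
predictions = {
    "img_001": [0.98, 0.02],
    "img_002": [0.55, 0.45],
    "img_003": [0.80, 0.20],
    "img_004": [0.50, 0.50],
}

# Ask the annotator about the two most ambiguous images first.
to_label = select_for_labeling(predictions, k=2)
print(to_label)
```

In a real loop, the newly labeled samples would be added to the training set, the model retrained, and the selection repeated until the predictions are confident enough.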

Christina Cardoza: We’re already seeing so many innovative use cases, but I think we’ve only really scratched the surface. You mentioned generative AI is coming out now, and there’s still so much to look forward to. So, I’m curious to hear from both of you, and Yury I’ll start with you—what do you envision for the future of AI? What’s still to come, and how will OpenVINO 2023.0 and beyond continue to evolve along with the needs of the AI community and landscape?

Yury Gorbachev: So, it’s hard to really predict what will happen in a year, what potential scenarios will be possible through the AI, through the models and so forth. But one thing I can say for sure, I think we can be fully confident that all of those scenarios, all of those use cases that we are seeing now with generative AI—this is the image generation, video, text, chatbots, personal assistants, things like that—those things will all be running on the edge at some point, mostly because there is a desire to have those on the edge.

Like, there is a desire to analyze documents locally. There is a desire to edit documents locally. There is a desire to have a conversation with your own personal assistant without sending your request to the cloud, and having a little bit of a privacy. At the same time do this fast, because doing things on the edge is usually faster than doing them on the cloud—at least more responsive. The way, like, assistant for the voice operating right now, home assistant—most of them are operating on the edge. So we will be seeing that all of those scenarios will be moving to the edge.

And this is where OpenVINO, I think, will play huge role, because we will be try—we will be trying to power them on the regular laptop. We are already doing that. We’ll be trying—we will be continuing to do it. We will be working with our customers as we did before. We’ll be working on those use cases to enable them on the laptop. So you will be able to download Chrome extension, or your favorite browser extension, and it will be running some of those scenarios in a very—with a very good performance.

And initially there might be a situation that performance on the laptops will not be enough. Obviously there will be some trade-offs in terms of what optimizations you will do versus what performance you will reach. But eventually the desire would be so high that laptops will have to adapt. You will see more powerful accelerators integrated right in the clients. And then this would be more or less the same challenge as we went through for the past years. We will need to enable all of this, and we will need to enable them in a manner that it’ll be fast, responsive on the edge. So that’s my prediction, so to say.

Raymond Lo: Yeah. And the way I predict the world, I often try to model it, although, like what Yury says, it’s very hard to model something today because of the speed. But there’s something I can always model; is always like, any time there’s a successful technology that happens in this world, it’s always the adoption curve, right? It just—it’s a simple number, like how many people use it every day. It’s obvious that it’s called a bound-to-happen trend. Bound to happen means everyone will understand what this is. We understand the value; they know how to get there, and then they are scaled to it.

And that’s—I think in this release, 2023, marks the part where I see scale, right? We hit a million downloads—thank you again, Yury, one million downloads. That is a very important number. Adoption, right? Then we hit, let’s say, this number of units sold with like all these things. It represented that the market is adopting this, rather than something that is great to have and then no one revisiting, right? The revisiting rate, the adoption rate.

I can tell you, a year from today, and I’ve got to put it in there, we will have better AI. It is almost sure; it’s bound to happen, right? We’ll not have a degraded AI; we’ll have a better AI. Some tools may just degrade because, if you look at some of the phones, it’s like, it’s a square box now, what way can you make the phone better, right? It’s a square, right? It’s physical, it’s a square, right? What can you make, right? Can you make it foldable? Yes, that’s what you can do. But for AI the possibility is, we did it from the software, the thinking. So that’s what we think is quite exciting.

Christina Cardoza: Yeah, absolutely. And this has been a great conversation, guys. Unfortunately we are running out of time, but before we go, I just want to throw it back to you guys. We’ve been talking about OpenVINO and this 2023.0 release and the five year anniversary. Yury, I’d love some final thoughts from you about what this five-year anniversary of the toolkit really means. And you touched upon it a little bit earlier, but if you can expand a little bit on what should developers be excited about in this new release, and what they have to look forward to.

Yury Gorbachev: Yeah. So, in the release that we are making, there are continuous improvements in terms of performance. As I mentioned, we are working on generative AI; we’re improving generative-AI performance on multiple platforms. But most notably we are starting to support dynamic shapes on GPU. This is huge work on the compiler that we have done. Huge work in the background. And it’ll be possible to run quite a lot of text-processing scenarios on the GPU, which includes integrated GPU and discrete GPU.

There is still some work that we need to do in terms of improving performance and things like that, but in general those things were not entirely possible before. So now it’ll be possible. We’re looking at capabilities like chats, and they will be running on even integrated GPU, I think.

The second major thing I would like to highlight is that we are streamlining our quantization and our model-optimization experience a little bit. We are making one tool that does everything, and it does this through the Python API, which is a more data science–friendly, regular environment for working with the models. So those things are really important, and obviously we are continuing to add new platforms; we’re continuing to add improvements in terms of performance, things like that.

So, and then one feature I would probably say a little bit as a preview or as experimental because we like to get some feedback is we are starting to support PyTorch models. We are starting to be able to convert PyTorch models directly. So there is still some work that we are doing on this, but we are very excited on the work that we have done already. So, it’s not probably to all degrees production-ready functionality, but the team is very excited about this.

And what I would say is that we would be very happy to hear feedback from our developers to hear what they think, to hear what we can improve. Because we did a lot of work, we did a lot of notebooks, we did a lot of coding to make those things happen. And I know Raymond is talking about those notebooks all the time. So he will probably be the best person to talk about it.
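
To make the quantization Yury mentions more concrete, here is a minimal sketch of 8-bit affine quantization in plain Python: floating-point values are mapped to integers in [0, 255] using a scale and a zero point, then mapped back to measure the error. This only illustrates the general technique; it is not OpenVINO’s implementation or API, and the function names and sample weights are hypothetical.

```python
def quantize_params(values, num_bits=8):
    """Compute the scale and zero point that map the value range onto integers."""
    qmin, qmax = 0, 2 ** num_bits - 1
    # The represented range must include 0.0 so that zero stays exact.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    scale = (hi - lo) / (qmax - qmin) or 1.0  # fall back to 1.0 if all values are 0
    zero_point = round(qmin - lo / scale)
    return scale, zero_point


def quantize(values, scale, zero_point, num_bits=8):
    """Map floats to integers, clamping to the representable range."""
    qmax = 2 ** num_bits - 1
    return [max(0, min(qmax, round(v / scale) + zero_point)) for v in values]


def dequantize(qvalues, scale, zero_point):
    """Map integers back to approximate floats."""
    return [(q - zero_point) * scale for q in qvalues]


weights = [-1.0, -0.25, 0.0, 0.5, 1.0]  # hypothetical model weights
scale, zero_point = quantize_params(weights)
q = quantize(weights, scale, zero_point)
restored = dequantize(q, scale, zero_point)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, max_error)
```

The trade-off is the one discussed in the conversation: each value now fits in one byte instead of four, at the cost of a small rounding error bounded by the scale.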

Raymond Lo: Yeah. Just, like, don’t trust me, trust the notebooks, trust the examples, right? Because there are 70-plus examples you can try, from Stable Diffusion to GPT, all of those examples that will run on your laptop. Again, high-end laptops for the high-end models, okay? So if you want to upgrade your laptop, it’s the best time now. So, those notebooks will give you the hands-on experience, and that’s where we can communicate. Again, to look for it: it’s called OpenVINO, O-P-E-N-V-I-N-O, Notebooks. Okay, notebooks. When you Google that, you’ll find it, and then you’ll find all the great examples that our engineers develop, and that’s where you can get help.

Christina Cardoza: Great. Well I just want to thank you both again for joining us today and for the insightful conversation. I know there’s so many different areas of AI that we could continue to talk about, but I look forward to seeing how else OpenVINO continues to evolve, and we’ll make sure to link in this episode access to learn more about OpenVINO 2023.0, and how you guys can make the switch. But, until then, this has been the IoT Chat.

The preceding transcript is provided to ensure accessibility and is intended to accurately capture an informal conversation. The transcript may contain improper uses of trademarked terms and as such should not be used for any other purposes. For more information, please see the Intel® trademark information.

This transcript was edited by Erin Noble, copy editor.