Secrets of Rugged AI

October 4, 2019

Kenton Williston

Johnny Chen IoT Chat

A conversation with Johnny Chen @OnLogic

We’re excited to announce IoT Chat, a new podcast created just for developers and engineers. Listen in to hear from industry experts on everything from AI to hardware design—and stick around for a few geeky jokes along the way.

In our first episode, we talk about the challenges of AI and machine learning in rugged environments with Johnny Chen, Solutions Architect at OnLogic (formerly Logic Supply). We recently interviewed Johnny for our article Fast-Track Computer Vision at the Edge and learned so much that we invited him back to dig deeper.

Listen to this episode for surprising insights on topics including:

Why popular vision hardware is inherently unreliable
How specialized hardware is actually better for portable code
The risks of hacks like liquid cooling

Transcript

Kenton Williston: Welcome to the IoT Chat, a production of insight.tech. I’m Kenton Williston, the editor in chief of insight.tech and your host for today’s podcast. Let’s get into the conversation.

In today’s podcast we’ll be talking about machine vision with Johnny Chen from Logic Supply. Johnny, I’m really excited to talk to you today. Can you tell me a little bit about yourself, what your background is and the kinds of machine vision technology you’re working with?

Johnny Chen: Currently, my role is partnerships and alliances. Previously I was a solutions architect here at Logic Supply, working with customers, designing both the hardware and the software stack they will use. Logic Supply itself is very focused on machine vision applications.

Kenton Williston: So one of the first things that I would like to know about the work you’re doing there is what kinds of hardware you’re using. I think it’s pretty obvious to everyone that machine vision is really hot right now. And that can be seen in the large number of players that are going after this market. You’re seeing CPUs, GPUs, FPGAs, purpose-built vision processors; people are even rolling their own ASICs. So what kinds of hardware are you seeing as being the most important in this marketplace?

Johnny Chen: So the industrial space is pretty unique because we have quite a few environmental challenges. So we use everything from CPUs to GPUs and VPUs. Some customer applications come from using external graphics, things like NVIDIA, but it really depends on the application. We’ve used almost every type of accelerator, including FPGAs and, recently, quite a bit of NVIDIA in different form factors and so forth.

Kenton Williston: So it sounds like from your perspective, there’s an emerging trend towards the more specialized processors coming out on top. Would you agree with that?

Johnny Chen: Oh, definitely. It used to be everyone used GPUs, but I think, with the trend of everything getting more specialized, especially at the edge where the environment is not as friendly, where you really don’t want a fan or you don’t want something high powered, that’s where the advantage of things like a VPU comes in. Something that was designed specifically for vision.

Kenton Williston: And what about from the cost perspective? How does this new Movidius-style specialized hardware stack up?

Johnny Chen: Oh, it’s definitely better than most customers think, because you’re not only saving on the hardware side. Where the real saving is, is cost of ownership: reliability and power consumption. Cost of ownership is made up of those two things. You know, reliability and power consumption, not having a fan, having lower power usage, it all adds up very quickly.

Kenton Williston: The next thing that comes to mind for me is that the processor isn’t everything. And so I’m wondering what the rest of the system looks like. And what I mean by that is, in a lot of applications, people will have some existing equipment that they want to add vision capabilities to. So are they somehow retrofitting their existing compute hardware? Are they ripping and replacing and putting in something all new?

Johnny Chen: The most common approach I see is really, as they add these applications, they’re building new compute, but they’re adding it to old machines, because these are large machines that will not be replaced because of the cost to replace them. But they’re adding in the compute function with the cameras and so forth to add the vision capability to those machines.

Kenton Williston: And is that being done by, for example, adding a new box on top of what’s already there, adding some software, or doing some kind of add-in cards?

Johnny Chen: It really is. Usually it’s providing a new box, because by adding add-in cards to an existing system, you might not be able to take advantage of some of the newer platforms that are out today, especially in terms of compute power, power savings, as well as, you know, high reliability because of the lower power usage of new systems.

Kenton Williston: That makes sense. That leads me to think about the level of customization that’s happening there. If we’re putting a new box on top of an existing system, is this something where you would want to build a highly customized solution? Is there some kind of an off-the-shelf approach that’s typically preferable, from like a design time consideration?

Johnny Chen: I think, with time-to-market considerations, what we do quite a bit of is a customized solution. It’s not a custom box per se. It’s a customized solution. So we look at the customer’s environment, we look at what their needs are and what their end goals are. We use quite a bit of off-the-shelf product plus some customization specifically for them. We do a lot of thermal design for their environment to make sure the machine can survive in that environment. We make sure there’s enough compute, and that is off-the-shelf; the compute portion’s off-the-shelf. The cooling portion is a lot of times what we customize for the customer, depending on how much compute they need and what type of accelerators they are using. We have customers that use one or two Movidius chips, and we also have some applications that call for all the way up to eight or more, really depending on the usage.

Kenton Williston: Wow. I don’t think I’ve seen a lot of examples where you would have eight vision processors in a system. What are some use cases where you would need that kind of performance?

Johnny Chen: Well, it depends on how much data you’re processing. So some customers require real time. And the nice thing about Movidius is you can actually split the workload across multiple chips and they’ll run in parallel. So a great example is automotive. Not necessarily autonomous vehicles, but, let’s say, mapping vehicles: cars driving outside, mapping the area, doing street shots. A lot of times you’ll do real-time processing to collect and analyze the data right there, and then in real time go over and do things like blur out license plates, blur out faces. So this way, they can handle all the privacy issues in real time, right away.
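To make the idea of splitting one workload across several chips concrete, here is a minimal Python sketch. It is not Movidius or OpenVINO code: `process_on_device` is a hypothetical stand-in for running a blur model on one accelerator, and the device count of four is an assumption. Each simulated device works through its own share of the frames while the devices run concurrently.

```python
from concurrent.futures import ThreadPoolExecutor

NUM_DEVICES = 4  # assumption: number of accelerator chips in the box


def process_on_device(device_id, frame):
    """Hypothetical stand-in for inference on one accelerator.

    A real system would detect and blur faces or license plates here;
    this placeholder only records which device handled the frame.
    """
    return {"frame": frame, "device": device_id}


def run_batch(device_id, batch):
    """Process one device's share of the frames, one frame at a time."""
    return [process_on_device(device_id, frame) for frame in batch]


def process_frames(frames, num_devices=NUM_DEVICES):
    """Deal frames out round-robin, then run the per-device batches concurrently."""
    batches = [frames[d::num_devices] for d in range(num_devices)]
    with ThreadPoolExecutor(max_workers=num_devices) as pool:
        # pool.map returns results in device order: index 0 is device 0's batch.
        return list(pool.map(run_batch, range(num_devices), batches))


per_device = process_frames([f"frame-{n}" for n in range(8)])
```

With eight frames and four devices, each device ends up with two frames. A production pipeline would also need to put results back into capture order before writing them out.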

Kenton Williston: There seems to be an implicit assumption in the conversation we’ve been having so far that what we’re really talking about is doing machine vision via edge processing, which seems to clearly be where everything’s heading. Are there instances where things are being done in the data center or the cloud?

Johnny Chen: Well, that’s an interesting question. What we’re seeing is this: people are running models at the edge. So when you’re at the edge, you’re running the vision, you’re processing data. But at the same time you’re collecting new data. As you’re collecting those new images and new data, you’re sending it back to the server side. And what the server side is doing is incorporating that new data into the model and creating new models, and these models get smarter and smarter as they get more data. Then it pushes the new model back out to the edge systems to run. The reason for the separation is a couple of things. One is you may have multiple edge devices all collecting data, and then you centralize the data to create the new model, then push it all back out to those edge systems. Two is that creating new models takes a lot of compute, and that is not necessarily what you want to put at the edge. At the edge is where you just want to run models.

Kenton Williston: But presumably the work you’re doing at the edge can feed back into your model development to refresh and update and make your models better.

Johnny Chen: Correct. It works like a learning loop, right? So the edge is running the model but also collecting new data and sending that new data back to central servers. The servers collect new data from all the edge machines, crunch the data, create a new model, and then push it back out to all the edge systems. So now you have basically a learning loop that just gets smarter and smarter over time. The longer you have the system, the better the model gets.
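The learning loop described here can be sketched as a toy simulation. Everything in it is hypothetical: the `EdgeDevice` and `Server` classes are illustrative names, and "training" is reduced to bumping a version number so the data flow stays visible. A real deployment would replace `retrain` with actual model training and add a transport between edge and server.

```python
from dataclasses import dataclass, field


@dataclass
class EdgeDevice:
    name: str
    model_version: int = 0
    collected: list = field(default_factory=list)

    def run_and_collect(self, samples):
        """Run the current model (omitted here) and keep the new data."""
        self.collected.extend(samples)

    def upload(self):
        """Hand the collected data to the server and clear the local buffer."""
        data, self.collected = self.collected, []
        return data


@dataclass
class Server:
    model_version: int = 0
    dataset: list = field(default_factory=list)

    def retrain(self, uploads):
        """Merge all uploads into one dataset; training is only a version bump."""
        for data in uploads:
            self.dataset.extend(data)
        self.model_version += 1
        return self.model_version


def learning_loop_round(server, devices):
    """One pass of the loop: collect centrally, retrain, push the model out."""
    uploads = [device.upload() for device in devices]
    new_version = server.retrain(uploads)
    for device in devices:
        device.model_version = new_version


server = Server()
devices = [EdgeDevice("dock-1"), EdgeDevice("dock-2")]
devices[0].run_and_collect(["img-a", "img-b"])
devices[1].run_and_collect(["img-c"])
learning_loop_round(server, devices)
```

After one round the server holds the data from both devices and every device runs the same, newer model version, which is the separation of roles described above: heavy model building in one place, lightweight model execution at the edge.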

Kenton Williston: What kind of tools and frameworks and things of that nature are people using to actually do all of this work? I know there are a lot of things out there: TensorFlow, Caffe, et cetera. Do you see any one of these approaches gaining a lead over the others?

Johnny Chen: Well, personally, and what we’ve seen with customers more and more, is there’s a gravity, a movement towards things like OpenVINO. One of the main advantages of OpenVINO is it allows you to basically write your code once. What that means is I write my code once and I can assign it to the different compute that’s available on that system. So for example, if I had an edge system where, because of the extreme temperatures there, I could only have a CPU, I might not even be able to put a VPU inside because of that extreme temperature. I could use the CPU and the built-in GPU in the Intel silicon to actually run the model. That same code could then be repurposed for a nicer industrial environment, inside a factory, where I might have multiple Movidius chips on there. It will run on that hardware too; I do not need to recode, and I have more compute resources. So it’s very adaptable. That gives you a big advantage, because there’s never going to be one machine that fits all environments. This allows you to basically deploy your code on multiple different hardware.
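One way to picture "write once, assign to whatever compute is there" is a small device-selection helper. The Python sketch below is an illustration under assumptions, not OpenVINO code: `pick_device` is a hypothetical function, and the device names ("MYRIAD", "GPU", "CPU") follow OpenVINO-style naming but depend on the OpenVINO version and installed plugins. In recent OpenVINO Python releases the device list is exposed as `Core().available_devices` and the chosen name is passed to `Core().compile_model(model, device_name)`; that wiring is left out so the sketch runs without OpenVINO installed.

```python
# Assumption: preference order of accelerator first, CPU as the fallback.
DEFAULT_PREFERENCE = ("MYRIAD", "GPU", "CPU")


def pick_device(available, preferred=DEFAULT_PREFERENCE):
    """Return the first preferred device name that the system reports."""
    for name in preferred:
        if name in available:
            return name
    raise RuntimeError("no supported inference device found")


# Hypothetical systems: a harsh-environment box with only a CPU and
# integrated GPU, and a factory box that also has a VPU.
harsh_box = pick_device(["CPU", "GPU"])
factory_box = pick_device(["CPU", "GPU", "MYRIAD"])
```

The application code that loads the model and runs inference stays the same on both boxes; only the device string handed to the runtime changes.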

Kenton Williston: So that sounds really appealing. The idea that not only could you write something once that could run on a single Movidius, multiple Movidius, but could also run on some combination of CPUs, GPUs, FPGAs. But of course, if I put myself in the developer’s shoes, I could have some understandable skepticism about how well this is really going to work in practice. The first question is, how do I know that this is really going to work at all? And closely following behind that, am I going to get really suboptimal performance, such that I kind of wish I had taken a more hardcore, highly optimized approach?

Johnny Chen: If you have a very clear vision of what the end goal is, then once you put that code through the translator, you assign it to the right compute, or the right type of compute, depending on how real-time you need it to be. It will work optimally for that compute. That’s the whole idea of OpenVINO. The way it works is you basically assign what compute you want to use. And like I said, you can use a CPU core, you could use the Movidius VPUs, the GPU. CPUs today are multi-core, up to six or eight or more cores, so a lot of times what our customers do is dedicate a core to run a particular process, and that gives you almost real time. That’s one way to do it. Like I said, the flexibility lets me use different hardware in combination with each other, and they work in parallel. That’s the best part. So I can pick things to run the right way for what I need them to do.
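Dedicating a core to a process can be done in several ways; the transcript does not say which one OnLogic’s customers use. As one possibility, the Python sketch below uses the standard library’s `os.sched_setaffinity`, which is only available on some platforms (Linux in particular). The helper name `pin_to_core` is hypothetical, and the sketch simply pins the current process to the lowest-numbered core it is allowed to use.

```python
import os


def pin_to_core(core_id):
    """Restrict the current process to one CPU core; return True on success."""
    if not hasattr(os, "sched_setaffinity"):
        return False  # not available on this platform (for example macOS or Windows)
    os.sched_setaffinity(0, {core_id})  # pid 0 means the calling process
    return True


if hasattr(os, "sched_getaffinity"):
    # Choose from the cores this process is actually allowed to run on.
    chosen_core = min(os.sched_getaffinity(0))
else:
    chosen_core = 0  # placeholder; pin_to_core will return False here

pinned = pin_to_core(chosen_core)
```

Pinning keeps the process on one core, but it does not stop other processes from being scheduled there; fully isolating a core takes additional operating-system configuration.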

Kenton Williston: It sounds like what you’re saying is, on one hand you have this portability and this write-once, run-anywhere flexibility, but you still have, from an architectural standpoint, the ability to optimize your overall system by putting the right workloads in the right places, so that you can get whatever performance you need.

Johnny Chen: Yeah, absolutely. I mean, a lot of people, for instance, use CUDA. CUDA is great because the GPU itself has a lot of little processors. The problem, with a GPU, is the power: these things run, you know, over 150 watts of power. Imagine if you were deploying this in real life. You know, here’s a warehouse, you need 10 or 20 of these systems. Using 150 watts plus doesn’t make sense; just the cost of utilities is going to be extremely high. Plus you have fans. Now we’re talking about reliability. Now imagine if you build specialized hardware using the VPU, using the CPU, using the built-in GPU, and cut that down to maybe 30 watts. You’re looking at tremendous savings across the board, not just from power consumption, but also reliability. This is why I think it’s so important that we work with our clients to make sure we understand, “What is their end goal? What is the application they’re using it for?” And then this way we can look at it and design for that.
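The power argument is easy to check with back-of-the-envelope arithmetic. The Python sketch below uses the figures from the conversation (150 watts versus 30 watts, 10 systems) and assumes, purely for illustration, that the systems run around the clock at their stated draw. It counts energy only and leaves out electricity prices and cooling overhead.

```python
HOURS_PER_YEAR = 24 * 365  # assumption: systems run continuously


def annual_kwh(watts_per_system, num_systems, hours=HOURS_PER_YEAR):
    """Energy used per year, in kilowatt-hours, by a fleet of identical systems."""
    return watts_per_system * num_systems * hours / 1000


gpu_fleet_kwh = annual_kwh(150, 10)  # 10 GPU-based systems at 150 W each
vpu_fleet_kwh = annual_kwh(30, 10)   # 10 purpose-built systems at 30 W each
saved_kwh = gpu_fleet_kwh - vpu_fleet_kwh

print(gpu_fleet_kwh, vpu_fleet_kwh, saved_kwh)  # 13140.0 2628.0 10512.0
```

Under those assumptions the purpose-built fleet uses one fifth of the energy, a difference of about 10,500 kWh per year for 10 systems.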

Kenton Williston: So one of the most popular ways that people have been approaching the machine vision problem to date has been to use graphics processors, GPUs. You know, I think there’s a lot to say about the performance that they offer, but when I talk to folks in the industry, it seems like it’s not so much the performance that is the challenge as much as it is power consumption and cost. Would you agree with that?

Johnny Chen: Yes, I would definitely agree with that. A typical setup for a GPU-type system would be an NVIDIA GPU running CUDA, and an Intel CPU or another CPU running everything else. It’s a great starting point. But in full production environments, it’s unrealistic for the edge, especially for reliability. Just look at the power requirements and the thermal requirements. A typical NVIDIA GPU takes 150 watts plus. If you were deploying this in a warehouse, you’d have 10 systems. Just from a power perspective, the amount of power you’re using is really unrealistic in a production environment. Second is reliability. These GPUs run fans, and they’re not designed for an industrial environment; these GPUs are really designed for, at the most, 30 to 40 degrees Celsius ambient temperature. You’re running much hotter than that in industrial environments. So we take those requirements, we understand the requirements, and we build a more purpose-built box using VPUs, using the built-in CPU and the built-in GPU.

Kenton Williston: So you end up with a more optimized solution, in other words. So this all sounds great, but I’m sure, as in any development project, there are always pitfalls and challenges that come up. And I’m curious what some of the biggest challenges are that you see when you talk to your customers.

Johnny Chen: A typical deployment, or typically someone that’s just starting and using CUDA, would be an NVIDIA GPU and an Intel CPU. It’s an interesting starting point. But in full production environments, it’s unrealistic, especially for edge reliability, because of two things again: power requirements and thermal requirements. When you deploy an NVIDIA-type system, a dedicated-GPU-type system, GPUs take a lot of power. You’re looking at 150 watts minimum on these GPUs. Imagine in the warehouse, you’re deploying 10 of these. Now, you know, the amount of power you’re using, first of all, is tremendously high. Second is the thermal requirements of your environment. GPUs are not really designed to be in those environments. Plus there’s a fan that could fail, especially in an industrial environment where there’s a lot of sand or a lot of dust in the air, or especially carbon dust, which is conductive.

Johnny Chen: These things are all things that will destroy the system. So from a reliability perspective and a thermal perspective, it makes a lot more sense to design something really designed for industrial, for that environment. For that same performance, we could get it down to a system that draws about 30 watts of power, using Movidius, using the CPU, using the GPU. And this is where a tool like OpenVINO really comes in handy, because I write the code once and it allows me to use all those compute components.

Kenton Williston: One of the things I’m also wondering about is the hacker approach to solving these problems. I imagine that there are shortcuts people try to take that end up getting them in trouble, right? Like maybe there’s some kind of sneaky way they can deal with a power problem by just saying, well, I just won’t cool it sufficiently and hope it’ll last. Do you see people taking shortcuts like that that you think they should avoid?

Johnny Chen: Yeah, I’ve seen some interesting shortcuts in my time here. An interesting one, particularly with the GPU, was cooling, for instance. There are people that will say, well, if it’s in that type of environment, it’s very dusty, I’ll just do liquid cooling. Okay, interesting idea. But again, you’re taking something that is a commercial approach into an industrial environment. Yes, liquid cooling does work. Liquid loops do work very well, but the costs and the maintenance of those are very high. And edge systems are typically systems that you really want to put there and forget about. They just have to work all the time, because any downtime costs you in productivity and so forth. So yes, we’ve had customers that use, you know, liquid cooling loops. It works until there’s a leak or something happens, and then they lose a whole day of productivity. In an industrial environment, especially in a production environment, these things are truly unacceptable. So we do have customers that come back to us and say, you know, we tried this and failed, and now we’re ready to look at other solutions.

Kenton Williston: But I have to say that we’ve really been talking in kind of general terms. Would you say everything we’ve said so far really applies across the board for pretty much all industrial use cases for machine vision, or are there different approaches you need to consider for different specific applications?

Johnny Chen: Well, you definitely want to consider quite a few things. An interesting example is autonomous robots. One great application that one of our customers uses it for is cleaning robots. So these are robots that basically roam through the building and clean the building. They’ll even ride the elevator and go back to charge themselves. They run 24/7 and avoid people and avoid running into things. Now when they started the project, they used a commercial off-the-shelf system. They ran a GPU; they did everything exactly the way most people thought they should do it. They were running into a lot of failures. You have to realize, if you think about cleaning robots, you would think that a commercial system would be perfectly fine because it’s just roaming the halls. Interestingly enough, though, there are quite a few bumps that it has to go over, like going into the elevator.

These robots are pretty heavy. So the shock and vibe was actually making the systems fail, everything from the fan to the GPU and so forth. So we started working with them. We moved them to something that was fanless. We moved them from using a GPU to using Movidius. And so far they have had no failures. It’s these little things that you don’t think about that, over time, wear out a machine. So once you move to a compute that is basically fanless and sealed, you don’t have to worry about the environment. It’s designed for the environment that it’s in, so you don’t have those failures. So that was an application in which we worked with them to pick the right hardware and to put together the right custom system to fit that need.

Kenton Williston: That’s a great example. So I’m wondering, as we look forward, it seems like today the transition is really from the standard graphics processor to a more specialized vision processor. And I’m wondering where you see things going next.

Johnny Chen: Well, I see things moving faster. I see more ASICs, you know, like GPUs or VPUs, in addition to the host processor. These are things that will work together. I see more integration, even of these ASICs into the host processors themselves. And I see it integrating, really, into basically everything we do. Technology really has to become invisible in order to gain larger acceptance, just in everyday things. So it has to integrate into our daily lives without us thinking about it, and that’s already starting to happen. And especially with vision, there’s no part of our daily life that won’t be touched by either machine learning or vision applications. Imagine your machines, your home appliances, your work machines: they’re all smart enough, eventually, to be able to tell you when they need to be maintained and what part needs to be maintained. Instead of running a standard maintenance schedule, you could actually run one based on your usage model, how you use the machine. It’ll know what part needs to be changed. These things will all become part of our daily lives.

Kenton Williston: I can’t wait until my toaster’s looking at me. And what about the development methodologies? We’ve talked a lot about OpenVINO and how significant that is. Are there ways you see the software stack changing going forward?

Johnny Chen: If there’s a development methodology, it’s hardly changing. I think the two biggest things are optimization and efficiency; that is the key to what I talked about, about technology becoming invisible. So we can’t work in a bubble. Hardware has to work together with software to maximize that efficiency. This way, you keep enough overhead for future development, but you use enough of what you need today without wasting. And that’s why I think it’s so interesting, as we talk about different SDKs and different types of tools you use, why I think flexibility of the tool is the most important thing, and OpenVINO gives you that. That’s one of the tools that we see coming up more and more in terms of what our customers are asking for, and that we as a company recommend to people, because it does help them. It does help them develop faster and get to market faster and have the proper hardware to match what their end needs are.

Kenton Williston: So I think it’s fair to say that all these changes are happening rather quickly. So what can developers do to stay on top of all these changes and, perhaps more importantly, what can they do to future-proof their designs so they don’t start something now and then six months from now there’s a new technology that comes along that does the job so much better?

Johnny Chen: What I would say is, keep it simple. Start with a clear vision of what the end goal has to be. What do you want to do? But at the same time, you should roadmap out the additional features that will be added at a later date, and then architect the hardware to meet those additional features. It’s going to be a balance. And this is where hardware and software have to work together to understand what each other’s needs are.

Kenton Williston: Before we go, I want to ask you the ultimate hard-to-answer question, which is, if you had to give one piece of advice to a developer or an engineer who’s just getting started on their first vision project, what would that be?

Johnny Chen: Exactly what I was saying earlier. Keep it simple, have a clear vision, and look for SDKs that have flexibility and are future-proof. What you’d hate to do is start on an SDK that is basically for specific hardware, and then that hardware goes end of life.

A great example, I would say, is Movidius itself. When it first started, it had its own SDK. It worked well, but it was specific to Movidius. I don’t think the adoption and the interest really happened until OpenVINO happened, because now it’s not an SDK specific to one piece of hardware. It’s an SDK that will work on multiple hardware. So as a beginner, or someone just getting into this or starting your first project, your development time is going to be long. You don’t want to start with an SDK that is too hardware specific. Because what might happen is, by the time you’ve developed the code, the hardware is no longer available, or the SDK has changed so much for the new hardware that you’ll have to start over.

Kenton Williston: Awesome. That totally makes sense to me. With that, we’re out of time. So I just want to say thanks one more time, Johnny, for joining us. Really appreciated getting your insights.

Johnny Chen: Happy to be here.

Kenton Williston: Thanks so much for listening. As always, if you enjoyed today’s podcast, please make sure to support us by subscribing and rating us on Apple Podcasts, and if you want to chat more about machine vision, make sure to tweet us at insight.tech. This has been the IoT Chat podcast. Join us next time for more conversations with industry leaders at the forefront of IoT design.

The preceding transcript is provided to ensure accessibility and is intended to accurately capture an informal conversation. The transcript may contain improper uses of trademarked terms and as such should not be used for any other purposes. For more information, please see the Intel® trademark information.