The challenges in making AI and machine learning work smoothly are formidable enough. Now make it all run in environments that include everything from sand and dust to unexpected elevator rides. We asked Johnny Chen from OnLogic, a leader in high-performance IoT systems, to share his tips for deploying machine vision in rugged environments. He revealed the advantages of customized solutions over fully custom hardware, the specific challenges of operating at the edge, and the ways that taking shortcuts with your system can backfire.
How to Choose the Right Hardware for Machine Vision
Kenton Williston: Can you tell me what your background is, and what your role is at Logic Supply?
Johnny Chen: Partnership and alliances is my current role. Previously, I was a solutions architect here at Logic Supply, working with customers, and designing both hardware and the software stack they would use.
Kenton Williston: What kinds of hardware are you seeing as being the most important in this marketplace?
Johnny Chen: We use everything from CPUs and GPUs to VPUs. It really depends on the application. We’ve used almost every type of accelerator, including FPGAs.
Kenton Williston: It sounds like from your perspective there’s an emerging trend towards the more specialized processors coming out on top. Would you agree with that?
Johnny Chen: Oh, definitely. It used to be that everyone used GPUs, but I think, with the trend of everything getting more specialized—especially at the edge, where the environment is not as friendly, where you really don’t want a fan or you don’t want something high powered—that’s where the advantage of things like a VPU comes in. Something that was designed specifically for vision.
Kenton Williston: For a lot of applications, people will have existing equipment that they want to add vision capabilities to. So, are they retrofitting their existing compute hardware? Or are they ripping out and replacing and putting in something all new?
Johnny Chen: The most common approach I see is adding these applications to old machines. These are large machines that won’t be replaced, because of the cost to replace them. But they’re adding in the compute function with the cameras to add vision capability.
Kenton Williston: If we’re putting a new box on top of an existing system, is this something where you would want to build a highly customized solution?
Johnny Chen: What we do quite a bit is a customized solution. It’s not a custom box per se. It’s a customized solution. We look at the customer’s environment, we look at what their needs are, what their end goals are. We use quite a bit of off-the-shelf product, plus some customization specifically for them. The cooling portion is a lot of times what we customize for the customer, depending on how much compute they need and what type of accelerators they are using.
Kenton Williston: Machine vision via edge processing clearly seems to be where everything is heading. Are there instances where things are being done in the data center or the cloud?
Johnny Chen: Well, that’s an interesting question. What we’re seeing is that people are running models at the edge. So, when you’re at the edge, you’re running the vision model and processing data. But at the same time you’re collecting new data. As you’re collecting those new images and new data, you’re sending it back to the server side. And what the server side is doing is incorporating that new data into the model and creating new models, and these models get smarter and smarter. As it gets more data, it pushes back out to the edge system to run the new model. It works like a learning loop, right? The longer you have the system, the better the model gets.
The reason for the separation is a couple of things. One, you may have multiple edge devices all collecting data, and then you centralize the data to create the new model. Then you push it all back out to those edge systems. Two, creating new models takes a lot of compute, and that is not necessarily what you want to put at the edge. At the edge you just want to run models.
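The learning loop Chen describes can be sketched in a few lines of Python. All class and function names here are illustrative stand-ins; a real deployment would use an inference runtime on the edge devices and a training pipeline on the server, not these toy objects.

```python
# Minimal sketch of the edge/server learning loop: edge devices run
# inference and collect new data; the server folds that data into a
# new model version and pushes it back out to every device.

class EdgeDevice:
    def __init__(self):
        self.model_version = 0
        self.collected = []           # new images gathered while inferring

    def run_inference(self, image):
        self.collected.append(image)  # keep the data for the next training round
        return f"prediction(v{self.model_version})"

class TrainingServer:
    def __init__(self):
        self.version = 0

    def retrain(self, batches):
        # Incorporate new data from every edge device into a new model version.
        total = sum(len(b) for b in batches)
        if total:
            self.version += 1
        return self.version

def sync(server, devices):
    # Centralize the data, build a new model, push it back out to the edge.
    new_version = server.retrain([d.collected for d in devices])
    for d in devices:
        d.model_version = new_version
        d.collected = []

devices = [EdgeDevice(), EdgeDevice()]
server = TrainingServer()
for d in devices:
    d.run_inference("frame")
sync(server, devices)
print(devices[0].model_version)  # 1
```

Each pass through `sync` is one turn of the loop: the more data the devices gather, the more retraining rounds the server runs, and the newer the model every device ends up with.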
Kenton Williston: What kind of tools and frameworks are people using to actually do all of this?
Johnny Chen: There’s a movement toward things like OpenVINO. One of the main advantages of OpenVINO is that it allows you to basically write your code once. What that means is once I write my code, I can assign it to whatever compute is available on that system. It’s very adaptable. That gives you a big advantage, because there’s never going to be one machine that fits all environments. This allows you to deploy your code on many different kinds of hardware.
Kenton Williston: If I put myself in the developer’s shoes I could have some understandable skepticism about how well this is really going to work in practice. Am I going to get really sub-optimal performance such that I’m going to wish I had taken a more hardcore, highly optimized approach?
Johnny Chen: If you have a very clear vision of what the end goal is for that code, you put it through the translator, you assign it to the right compute or the right type of compute, and you know how close to real time it needs to be, then it will work optimally for that compute. That’s the whole idea of OpenVINO: the flexibility lets me use different hardware in combination, working in parallel. That’s the best part. I can pick things to run the right way for what I need them to do.
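The “write once, assign to the available compute” idea can be illustrated with a short sketch. In OpenVINO itself the target is a device string passed when the model is compiled (e.g. "CPU", "GPU"); the fallback logic below is a hypothetical stand-in for that selection step, not the OpenVINO runtime itself.

```python
# Illustration of write-once, deploy-to-any-compute. Device names
# mirror OpenVINO-style target strings, but pick_device() and
# compile_for() are illustrative stand-ins, not a real runtime API.

def pick_device(available, preference=("VPU", "GPU", "CPU")):
    """Return the first preferred device the system actually has."""
    for device in preference:
        if device in available:
            return device
    raise RuntimeError("no usable compute device found")

def compile_for(model_path, device):
    # A real runtime would load and compile the model for the chosen
    # target; here we just record the assignment.
    return {"model": model_path, "device": device}

# The same application code runs on a fanless VPU box or a CPU-only box:
deployed = compile_for("line_inspector.xml", pick_device({"CPU", "VPU"}))
print(deployed["device"])  # VPU
```

On a CPU-only machine the same call would simply fall through to `"CPU"`, which is the portability advantage Chen describes: the application code never changes, only the target it is assigned to.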
Kenton Williston: One of the most popular ways that people have been approaching the machine vision problem to date has been to use graphic processors, GPUs. I think there’s a lot to say about the performance that they offer, but when I talk to folks in the industry, it seems like it’s not so much the performance that is the challenge as it is power consumption and costs. Would you agree with that?
Johnny Chen: Absolutely. The problem is the power of the GPU: these things run at over 150 watts. Imagine deploying this in real life, in a warehouse where you need 10 or 20 of these systems. Using 150-plus watts doesn’t make sense, as the utility cost is going to be extremely high. Plus, you have fans. We’re also talking about reliability. There’s a fan that could fail, especially in an industrial environment where there’s a lot of sand or dust in the air, especially carbon dust, which is conductive. These are all things that will destroy the system.
Now imagine if you build specialized hardware using a VPU, using the CPU, using the built-in GPU, and cut that down to maybe 30 watts. You’re looking at tremendous savings across the board, not just from a power consumption point of view, but also reliability. This is why I think it’s so important that we work with our clients to make sure we understand: “What is their end goal? What is the application they’re using it for?” Then we can look at it and design for that.
Kenton Williston: One of the things I’m also wondering about is the hacker approach to solving these problems. Like maybe there’s some kind of sneaky way they can deal with a power problem by just saying, “I won’t cool it sufficiently and I’ll hope it’ll last.” Do you see people taking shortcuts like that that you think they should avoid?
Johnny Chen: I’ve seen some interesting shortcuts in my time here. There are people who will say, well, if it’s in a very dusty environment, I’ll just do liquid cooling. Okay, interesting idea. But again, you’re taking a commercial approach into an industrial environment. Yes, liquid cooling does work. Liquid loops work very well, but the cost and maintenance are very high. It works until there’s a leak or something happens, and then they lose a whole day of productivity. And edge systems are typically systems you want to put in place and then forget about.
Kenton Williston: Everything we’ve said so far—would you say it really applies across the board for pretty much all industrial use cases for machine vision? Or are there different approaches you need to consider for different specific applications?
Johnny Chen: Well, you definitely want to consider quite a few things. An interesting example is autonomous robots. There’s one great application one of our customers uses it for: cleaning robots. These are robots that roam through the building and clean it 24/7. They’ll even ride the elevator and go back to charge themselves, and they avoid people. Now, when they started the project, they used a commercial, off-the-shelf system. They ran a GPU; they did everything exactly the way most people thought they should. But they were running into a lot of failures, because even though you would think a commercial system would be perfectly fine since the robot is just roaming the halls, interestingly enough, there are quite a few bumps it has to go over, like going into the elevator.
And these robots are pretty heavy. So the shock and vibration were actually making the systems fail, everything from the fan to the GPU and so forth. So we started working with them, and we moved them from a GPU to Movidius. So far they have had no failures. It’s these little things you don’t think about that wear out a machine over time. Once you move to a compute platform that is basically fanless and sealed, you don’t have to worry about the environment it’s in, and you don’t have those failures. So that was an application in which we worked with them to pick the right hardware, and to put together the right custom system to fit that need.
Kenton Williston: I’m wondering where you see things going next.
Johnny Chen: I see things moving faster. I see more specialized silicon, like GPUs and VPUs, working alongside the host processor. These are things that will work together. I see more integration. Technology really has to become invisible in order to gain larger acceptance in everyday things. It has to integrate into our daily lives without us thinking about it, and that’s already starting to happen. Imagine your machines, your home appliances, your work machines: they will all eventually be smart enough to tell you when they need to be maintained, and what part needs to be maintained, instead of running on a standard maintenance schedule.
Kenton Williston: Are there ways you see the software stack changing going forward?
Johnny Chen: I think the two biggest things are optimization and efficiency. That is the key to what I talked about: technology becoming invisible. Hardware has to work together with software to maximize that efficiency. This way, you keep enough overhead for future development, but you use only what you need today without wasting.
Kenton Williston: What can developers do to stay on top of all these changes? And, perhaps more importantly, what can they do to future-proof their designs?
Johnny Chen: Keep it simple. Start with a clear vision of what the end goal has to be. But at the same time, you should roadmap the additional features that will be added at a later date, and then architect the hardware to meet those additional features. It’s going to be a balance. This is where hardware and software have to work together to understand each other’s needs.
To learn more about machine vision, listen to our podcast on Secrets of Rugged AI.