Over The Edge

Generative AI at the Edge with Daniel Situnayake, Head of ML, and Jenny Plunkett, Senior Developer Relations Engineer at Edge Impulse

Episode Summary

How is AI being used at the edge, and what possibilities does this create for businesses? In this episode, host Bill Pfeifer sits down with the co-authors of the book AI at the Edge, Daniel Situnayake, Head of ML, and Jenny Plunkett, Senior Developer Relations Engineer at Edge Impulse. They discuss how to determine which problems can actually be addressed through AI at the edge, how to think about effective AI, and unexpected use cases in generative AI and synthetic data generation.

Episode Notes

How is AI being used at the edge, and what possibilities does this create for businesses? In this episode, host Bill Pfeifer sits down with the co-authors of the book AI at the Edge, Jenny Plunkett, Senior Developer Relations Engineer and Daniel Situnayake, Head of ML at Edge Impulse. They discuss how to determine which problems can actually be addressed through AI at the edge, how to think about effective AI, and unexpected use cases in generative AI and synthetic data generation. Plus, they cover how AI can support data distillation efforts and how to build teams that can successfully navigate this landscape. 

--------

Key Quotes:

“We're going to be using generative AI to help us build a synthetic data set to train other AI models to deploy to edge devices.” - Jenny

“One of the things that's really cool about synthetic data and using generative AI for that, is it potentially reduces the cost of training a model because instead of having to spend huge amounts of money labeling all this data, if you create the data yourself, you can have it implicitly be labeled.” - Dan 

--------

Show Timestamps:

(01:49) How did they get started in tech? 

(03:14) What brought them to AI?

(08:12) What brought them together to write their book?

(13:26) Determining which problems can be addressed with AI at the edge

(15:51) What possibilities does AI at the edge create for businesses? 

(20:41) Synthetic data and generative AI 

(24:15) Using AI for data distillation 

(31:00) Building a skilled and interdisciplinary team 

(39:30) AI’s transition from a career path to a tool 

(43:06) Effective AI 

(46:46) Edge / wildlife conservation case study 

(49:37) What are they excited about moving forward? 

--------

Sponsor:

Over the Edge is brought to you by Dell Technologies to unlock the potential of your infrastructure with edge solutions. From hardware and software to data and operations, across your entire multi-cloud environment, we're here to help you simplify your edge so you can generate more value. Learn more by visiting dell.com/edge, or click on the link in the show notes.

--------

Credits:

Over the Edge is hosted by Bill Pfeifer, and was created by Matt Trifiro and Ian Faison. Executive producers are Matt Trifiro, Ian Faison, Jon Libbey and Kyle Rusca. The show producer is Erin Stenhouse. The audio engineer is Brian Thomas. Additional production support from Elisabeth Plutko and Eric Platenyk.

--------

Links:

Follow Bill on LinkedIn

Connect with Jenny Plunkett on LinkedIn

Connect with Daniel Situnayake on LinkedIn and Twitter

Daniel’s Substack

Episode Transcription

Narrator 1: [00:00:00] Hello and welcome to Over the Edge. This episode features an interview between Bill Pfeifer and two colleagues from Edge Impulse, a company that supports developers in bringing AI solutions to the edge. Daniel Situnayake, Head of ML, and Jenny Plunkett, Senior Developer Relations Engineer at Edge Impulse, joined the podcast to talk about their recent book, AI at the Edge.

Narrator 1: This book provides engineers and tech leaders with an end to end framework for solving problems with edge AI. In this conversation, Bill, Dan, and Jenny dive into the possibilities that AI creates at the edge, how to think about effective AI, and generating synthetic data to train edge AI models. But before we get into it, here's a brief word from our sponsors.

Narrator 2: Over the Edge is brought to you by Dell Technologies to unlock the potential of your infrastructure with edge solutions. From hardware and software to data and operations, across your entire multi-cloud [00:01:00] environment, we're here to help you simplify your edge so that you can generate more value. Learn more by visiting dell.com/edge, or click on the link in the show notes.

Narrator 1: And now please enjoy this interview between Bill Pfeifer and the duo from Edge Impulse: Daniel Situnayake, Head of ML, and Jenny Plunkett, Senior Developer Relations Engineer.

Bill Pfeifer: Jenny, Daniel, welcome to the show. It's fantastic to have you here. I know we came across you from your O'Reilly book, AI at the Edge, and it looks like you're kind of all over that space of AI at the Edge.

Bill Pfeifer: We like to talk about edge computing here and so much of that is AI that I've been really looking forward to having this conversation.

Daniel Situnayake: Yeah, thank you so much for having us on. It's exciting to talk. Yeah, agreed. Thank you so much.

Bill Pfeifer: Yeah, looking forward to it. So we always like to start by going, can you tell me a little bit about how you got started in technology?

Daniel Situnayake: I mean, I was always one of those little kids that's just [00:02:00] fascinated with technology for some reason. Like, I remember I used to collect old broken bits of electronic circuit boards I'd find in the streets when I was a little boy, and then I started programming when I was like eight years old, from outdated BASIC programming books I found in the school library.

Daniel Situnayake: It was just something that I was always drawn to for some unknown reason, and I guess I just sort of stuck with it. And how about you, Jenny?

Jenny Plunkett: Yeah, I mean, I'm a little bit younger, not so much younger, but my first sort of experience was, like, just trying to change the templates on my MySpace page. A lot of my early experience with technology was just, like, learning how to remove the ads from the CSS and the HTML. But that sort of sparked my interest.

Jenny Plunkett: And then, uh, you know, I went to UT Austin for electrical engineering and here I am.

Bill Pfeifer: Okay. I actually came to it through civil engineering, and I often feel a little bit out of place because [00:03:00] I talked to so many people who were like, "I was too, coding video games." And I'm like, I just wasn't interested as a kid.

Bill Pfeifer: I like played with sticks and things.

Jenny Plunkett: That was definitely not me either. Like I was all about just playing with my dog.

Bill Pfeifer: Love it. So what brought you to AI? I mean, that's not a trivial topic.

Daniel Situnayake: So for me, it was a gradual journey. I'd always been kind of interested in the idea of AI, mostly from a sort of science fiction perspective.

Daniel Situnayake: But then as I, you know, went to university, I studied computer networking actually, so nothing even remotely to do with this stuff. But as part of my degree I did do some stuff around biometrics and auto-ID technologies. So essentially technologies for allowing computers to understand what's going on in the real world.

Daniel Situnayake: And that was my first exposure to something similar to what I do now. I ended up working at a little startup in the US when I first moved over here in, like, [00:04:00] 2009. And I was building internal tools, but the company was building conversational AI, basically for call centers. So you can have a little conversational chatbot, essentially, that you talk to on the phone instead of having to talk to somebody in the call center.

Daniel Situnayake: And that was my first taste of working with AI developer tools. And then over the course of the, you know, last 13, 14 years or so, I've pretty much worked everywhere you could possibly think of in the tech industry, front end, back end, doing regular software engineering. There was always this curiosity I had around, first of all, data science, which was starting to get really hot when I was early in my career.

Daniel Situnayake: And I thought, oh, that seems interesting. And I gradually kind of migrated towards data science. I'd been working on that for a while when the deep learning stuff started getting big. So I started gradually migrating towards that. And before I knew it, I'd found [00:05:00] myself working on it full time.

Daniel Situnayake: And I had a really lucky opportunity of getting to work on the TensorFlow team at Google. So I was in the right place at the right time when TensorFlow Lite for Microcontrollers launched, which is the sort of Google mechanism for running deep learning models on really low-power embedded devices. It was just, you know, really amazing to be there, because I feel like I've always had this curiosity from both sides: one around embedded, and on the other side around AI and machine learning.

Daniel Situnayake: And those two things have kind of come together at this perfect moment. And I got to be there to help launch a new product around it. So that's how I ended up where I am today, looking at Edge AI.

Bill Pfeifer: Very cool. And Jenny?

Jenny Plunkett: Yeah, so I did not go to school for artificial intelligence. I did not take any classes about AI.

Jenny Plunkett: However, my degree was obviously electrical engineering, and my focus ended up being more so on edge [00:06:00] devices, because I spent the last two years of my degree doing two internships at Arm, in the IoT division at the time. I worked as a customer support and developer experience team member for Arm Mbed OS, which was the open-source RTOS at Arm at the time (it still exists, of course).

Jenny Plunkett: So a lot of my background, from my education and from my first career experiences, was on IoT and edge devices. And I remember I was working at Arm, and I was doing support for the RTOS and getting drivers embedded and supporting new targets. And then along comes a colleague, Jan, our CTO, and he and a few other team members on the Arm IoT team were working on getting tinyML models running on Arm Cortex-M devices.

Jenny Plunkett: And at the time I was like, okay, I don't really get it, but I thought it was really cool, and it sort of just stuck in the back of my head. And of course I became really good friends with my colleague, Jan. Fast forward a few years later, I am [00:07:00] leaving my job at Arm and I messaged Jan, I'm like, so this Edge Impulse thing that you started, what's going on over there?

Jenny Plunkett: And of course we only knew each other through working on the IoT unit at Arm. That, and Zach Shelby is also from Arm; he was our CEO at Edge Impulse. And I messaged him and I was just like, what's going on? And he was like, come over and be a user success engineer for Edge Impulse. And of course I had no AI background still, but I had a strong background in IoT devices and supporting new targets on the embedded side.

Jenny Plunkett: So it was just a really interesting opportunity. And here I am three years later, with a book and three years of working in AI with Dan.

Bill Pfeifer: Very cool. So it sounds like you're both reasonably focused on embedded systems, and Jenny, you did mention RTOS, real-time operating systems, which are a fairly significant component of embedded systems.

Bill Pfeifer: I just wanted to call that out for listeners who may not have recognized that term, because it may come up again [00:08:00] as we continue to talk about embedded systems. So, moving forward, now you're working together, you wrote a book, so obviously you get along at least somewhat.

Bill Pfeifer: How did you guys come to meet one another? We talked about how Jenny came to be at Edge Impulse, but then what brought you guys together enough that you wanted to spend that much time together?

Daniel Situnayake: Yeah. I mean, I wrote a book about tinyML called TinyML (a very, very creative name) with Pete Warden, who was the guy at Google who launched TensorFlow Lite for Microcontrollers.

Daniel Situnayake: And then I'd left Google to come and work at Edge Impulse, essentially because of this book. Pete and I wrote this book that was supposed to be a kind of basic introduction. Like, here's the very least amount of information you need to know to get started in tinyML. And we wrote this book and it ends up being like several inches thick.

Bill Pfeifer: I was just thinking that sounds [00:09:00] a little bit like rocket surgery for dummies.

Daniel Situnayake: Yeah, exactly. It's just this vast amount of context and information you need about embedded engineering and then deep learning. And if you think about the breadth of skills involved, there aren't that many people who cross over like that.

Daniel Situnayake: So you've got, like, the deep learning side, where it's all writing Python, lots of scripting, data munging, and everything sort of revolving around data. And then on the C side, it's just a completely different approach to writing software. You're very concerned with efficiency and memory use and very low-level kind of software engineering.

Daniel Situnayake: So it's hard to sort of create something that cuts across all those lines. And so that's why, when I saw that Zach and Jan had founded this company, Edge Impulse, where basically the goal was to abstract a lot of that complexity away. So if you're an embedded engineer, you can focus on the embedded side, and [00:10:00] some of the ML stuff is abstracted away to the point that you don't have to go and become a Python engineer or do a machine learning degree to tackle it.

Daniel Situnayake: And the same in reverse: if you're comfortable with machine learning and you can train a model, but you haven't got a clue how you would end up turning it into a C library to run on device, it could do that part for you as well. So instead of needing this giant book, you could kind of throw it away, really, and just follow some tutorials and get started.

Daniel Situnayake: So I joined Edge Impulse as a founding engineer, right at the beginning. And over the first couple of years, we got so much experience, Jenny and I, working with customers and seeing the kinds of issues people run into and the kinds of questions people try to ask when they're developing products.

Daniel Situnayake: And I just kind of realized that, by then, there was a lot of information about how to get started, and there are loads of resources to help you dive in and just start building things and hacking. But in [00:11:00] terms of actually developing a product, and hopefully an effective and good product that uses Edge AI, there were really no resources available at all.

Daniel Situnayake: And so I thought it would be an interesting idea to pitch this, which is essentially a high-level guide to how you build a product with these technologies. So it covers everything from the algorithms that you use and how to build a dataset, all the way through to building a team and designing the product itself.

Daniel Situnayake: And Jenny was a natural choice to help with that, because she was one of the people there in the trenches, helping customers do this stuff, which almost no one in the world really has done. We kind of buddied up on this book and we're the authors, but it's filled with input from people from throughout our company and throughout the industry, because there was this brief period of time where no one had really heard of this stuff.

Daniel Situnayake: And there were a few of us out there helping people [00:12:00] build things. And hopefully now we've captured some of that hard-won insight, and it's available for other people.

Jenny Plunkett: I feel like our first two years was entirely just teaching people what Edge AI was. What is Edge AI? Like, what is the Edge? So many people had different definitions for it.

Jenny Plunkett: I think, from Dan, it was just a matter of, like, so how did we actually get started working on the book? He put one Slack message in. He was like, Jenny, do you want to write this? And I was like, okay. But it was really fun and it was a great experience, and I'm grateful to be involved. And like Dan said, it was a labor of love. Obviously the book utilizes Edge Impulse, so everyone we knew who was working at Edge Impulse at the time was involved as well.

Bill Pfeifer: Very cool. So that kind of takes me one click deeper into the use case sort of thing, right? In my world, we talk a lot about how the Edge can solve business problems, and you do that by applying AI at the Edge.

Bill Pfeifer: I'd love to differentiate my business at the Edge, so I'm just going to install some [00:13:00] AI on some Edge compute devices. That's not enough, right? You've got to have all these pieces in place, and not every problem is going to be as achievable, or really a good idea, to start with, right? You want to pick something that's startable, that's reasonably easy to attack.

Bill Pfeifer: How do customers figure out how to get started with that, right? Like, how do they identify, "This is the first problem I should attack because it's straightforward," whatever that means with AI at the edge, right? You know, a non-trivial set of challenges. How would you recommend customers get started with that first step?

Daniel Situnayake: So I can kind of jump in with a couple of ideas. There's two parts to what you said. One of them is, how do you identify a problem that could potentially be solved using Edge AI? Basically, where Edge AI would bring some benefit. And then [00:14:00] the other part is, how do you pick a problem that is appropriate to be solved using AI in the first place?

Daniel Situnayake: And those are two very distinct things. So with Edge AI, there's this really useful acronym, BLERP, B-L-E-R-P, which captures a lot of the reasons why you might want to build stuff on the edge rather than having it in the cloud, for example. So that BLERP acronym stands for Bandwidth, Latency, Economics, Reliability, and Privacy.

Daniel Situnayake: And those are the five key factors: if you can identify that doing stuff on the edge will bring benefits in one or two of those areas, then it's probably worth thinking about. And if there aren't going to be any benefits in those areas for your application, then it's probably best to avoid, because it introduces extra engineering complexity and extra work.[00:15:00]

Daniel Situnayake: So we probably don't have time on the podcast to drill into all of those terms individually, but there's a bunch of things online, blog posts and other content, that go over those aspects. But, for example, bandwidth: a really important reason to do stuff on devices is if you don't have enough bandwidth available to send data to the cloud in the first place.

Daniel Situnayake: And that's actually a really key one, because in a lot of cases, if you think of all of the devices out there, all the sensors out there collecting unbelievable amounts of data, our whole world is, you know, highly instrumented at this point. But most of that data can't actually be used for anything currently, because there's no connection to get the data from the device to some central place where it might be processed.

Daniel Situnayake: So that's obviously a point where, if you can start to do some processing on device, then you're going to be able to start building things that weren't possible otherwise.
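To put rough numbers on Daniel's bandwidth point, here's a back-of-the-envelope sketch in Python. Every figure in it is an illustrative assumption, not from the episode: a single uncompressed camera can out-produce a typical cellular uplink by orders of magnitude, which is exactly when on-device processing becomes the only option.

```python
# Rough bandwidth check: can this device stream its raw data to the cloud?
# All numbers are illustrative assumptions.
frame_bytes = 640 * 480 * 3          # one uncompressed VGA RGB frame
fps = 30                             # camera frame rate
sensor_bps = frame_bytes * fps * 8   # raw sensor data rate, bits per second

uplink_bps = 5_000_000               # assume a 5 Mbps cellular uplink

print(f"sensor: {sensor_bps / 1e6:.0f} Mbps, uplink: {uplink_bps / 1e6:.0f} Mbps")
# sensor: 221 Mbps, uplink: 5 Mbps

# An edge model that sends only events ("defect at t=...") needs a few
# hundred bytes per event instead, which fits the uplink easily.
if sensor_bps > uplink_bps:
    print("raw streaming is infeasible -> process on the device")
```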

Bill Pfeifer: So as we look then at putting AI out to the edge, [00:16:00] what additional possibilities does that create for the way businesses operate in a typical case?

Bill Pfeifer: What are you seeing customers doing?

Jenny Plunkett: One of the coolest, most exciting things I'm seeing is, in the previous years at Edge Impulse, a lot of the use cases have been specifically around trying to solve a certain problem, like trying to identify a keyword from audio data, or trying to identify an object with computer vision. But one of the cool things that I've been doing a lot with this year is working with anomalous data, and working with models where we don't have the ability to train to detect defects prior to deploying on the edge.

Jenny Plunkett: So a lot of the things that I'm seeing are where we're taking sensor data or image data, and we're creating a nominal model, and then, using new algorithms, DSP algorithms, and ML models, we can determine defects without needing the anomalous data, which is really, really cool. It's [00:17:00] something that I've been working on specifically recently with NVIDIA: using synthetic data that has the anomalous defects in it, training a model, and then deploying onto production devices before they even get anomalous data collected in the field. Which is really interesting to me. But yeah, Dan, do you have something to add there? Thanks, Dan.
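As a rough illustration of the approach Jenny describes, training only on nominal (healthy) data and flagging anything that deviates, here is a minimal sketch using scikit-learn's IsolationForest on made-up sensor features. It's a generic stand-in, not Edge Impulse's actual algorithm, and the features are placeholders for real DSP output.

```python
# Anomaly detection trained only on nominal data: no defect samples needed.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
nominal = rng.normal(loc=0.0, scale=1.0, size=(500, 8))  # healthy-machine features
detector = IsolationForest(random_state=0).fit(nominal)  # fit on nominal data only

new_samples = np.vstack([
    rng.normal(0.0, 1.0, size=(3, 8)),   # more healthy readings
    rng.normal(6.0, 1.0, size=(2, 8)),   # readings from a drifting, failing machine
])
print(detector.predict(new_samples))     # 1 = nominal, -1 = anomaly
```

In a real deployment, the eight columns would be DSP features (spectral bins, RMS energy, and so on) extracted on the device from vibration or audio sensors.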

Daniel Situnayake: Yeah, cool. And so I think, from my perspective, we have seen a lot of really cool applications in a couple of pretty broad fields. One of them is healthcare. Healthcare is actually a really amazing fit for Edge AI, because a lot of those potential benefits, like privacy or connectivity or latency, are very well suited to it if you're building products to help people with, for example, monitoring a medical condition they might have.

Daniel Situnayake: It's really important that patient data is kept private and people's healthcare data is kept private. So by doing some kind of [00:18:00] processing on device, you're able to make sure that we can both understand what's going on, but also protect potentially sensitive information. And the same thing applies in industry.

Daniel Situnayake: So there are a lot of use cases, like, "a lot" is an understatement. There are, you know, almost infinite use cases in industry where there's huge amounts of data being generated from things like manufacturing lines or inside of pieces of equipment, where, first of all, the amount of data is just not really feasible to send up into the cloud.

Daniel Situnayake: So now that we have ways of processing it on device, we can suddenly start to make sense of this data. And because it's happening on device, and there's low latency, and it's reliable, we can act upon it straight away. So, for example, if there's some condition, like Jenny mentions, that's anomalous that we detect, you can potentially shut down a production line or alert the operator so that they can [00:19:00] intervene and potentially help with safety or reducing stoppage time.

Daniel Situnayake: The other big thing there is privacy, because these types of industrial developers don't necessarily want to share their secret sauce with the rest of the world, and sharing live sensor data from the heart of their manufacturing lines with some kind of third party is very difficult for them, especially real time.

Daniel Situnayake: So it's very good that, with this type of technology, they can train models that run on the edge, and then they don't have to send data anywhere else, and they can be assured that there's not going to be any kind of leakage of data. Or, you know, this idea that if you're pooling data with a third-party provider, that third-party provider might also be helping out your competitors.

Daniel Situnayake: And it's nice if you're not giving up any of that competitive advantage.

Bill Pfeifer: For sure. So, that's what we can do today. [00:20:00] Actually, my next question, Jenny already partially answered. I was going to ask, what are some recent capabilities that have been added that people may not be aware of, that they may not be thinking about yet?

Bill Pfeifer: And Jenny was already talking about training for anomalies when you don't have the anomalies, and just making up the anomalies on your own with synthetic data. That's kind of amazing. Is there anything else in that ilk, or other things you can do with synthetic data? I mean, the spaces of AI are moving forward so fast.

Bill Pfeifer: Most people just aren't keeping up unless they live in that space. So what comes next? Like, what are we about to see?

Jenny Plunkett: Well, this is a little bit meta, and I'm talking about NVIDIA a lot, but it's just what I'm working on right now. One of the things that they're working on is using the generative AI models that they deploy in NVIDIA Omniverse for 3D synthetic dataset generation.

Jenny Plunkett: We're going to be using generative AI to help us build a synthetic [00:21:00] dataset to train other AI models to deploy to edge devices. So it's AI coming together on a bunch of fronts, and generative AI is one of those things that's really, really exciting. To be able to generate weather in an environment, and cracks and defects, and physics environments, and not have to do it in the real world at all.

Jenny Plunkett: It's going to save so much money and time. It's going to be safer for people who are deploying these models. The opportunities are endless, but we're already seeing that. So it's not even in the future; that's already happening with generative AI as it is right now. So, yeah, I can't really see much further into the future than that.

Jenny Plunkett: Yeah.

Daniel Situnayake: I think it's really, really important stuff, because, you might be aware of this, but you might not be if you don't work building stuff with machine learning: data is actually incredibly expensive and difficult to get. And it's not so much [00:22:00] just any data that's difficult to get; high-quality data that's been labeled, which means it's been annotated with metadata about what it actually represents, is really, really expensive to get.

Daniel Situnayake: If you can imagine a factory production line, it's very easy to put some cameras on there, or some vibration sensors or microphones, and capture raw data of the production line running. But it's a lot more difficult to then mark up that data with a description of exactly what was going on at any given time, so that you can correlate that with what you see in the data.

Daniel Situnayake: And that typically is the type of data that you need for training models. So one of the things that's really cool about synthetic data, and using generative AI for that, is it potentially reduces the cost of training a model, because instead of having to spend huge amounts of money labeling all this data, if you create the data yourself, [00:23:00] you can, you know, have it implicitly be labeled; you're deciding what the data represents.

Daniel Situnayake: So the label's kind of baked in. And it won't solve every problem, it can't do everything, but it's a really effective tool to have in your arsenal when you're trying to do more with less data.
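A toy illustration of that "label is baked in" idea (hypothetical, and nothing like a real Omniverse pipeline): because the generator chooses where the defect goes, every image arrives with a free bounding-box label.

```python
# Toy synthetic-data generator: we decide where the defect is painted,
# so each sample's label comes for free -- no manual annotation pass.
import numpy as np

def make_sample(rng, size=64, defect=True):
    image = rng.normal(0.5, 0.05, size=(size, size))  # clean surface texture
    label = {"class": "ok", "bbox": None}
    if defect:
        x = int(rng.integers(0, size - 8))
        y = int(rng.integers(0, size - 8))
        image[y:y + 8, x:x + 8] += 0.4                # paint a bright "crack"
        label = {"class": "defect", "bbox": (x, y, 8, 8)}
    return image, label

rng = np.random.default_rng(42)
dataset = [make_sample(rng, defect=bool(i % 2)) for i in range(10)]
print(dataset[1][1])   # {'class': 'defect', 'bbox': ...} -- the implicit label
```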

Bill Pfeifer: That's just, it kind of hurts my head, right? The idea of taking generative AI, which are these massive, massive models that make no sense to run at the edge, and using them to generate good, clean, labeled data so that you can train a smaller, simpler, task-oriented AI that can run at the edge. And they said AI was complicated. So simple. Just use AI to build AI. No problem. That's super cool.

Jenny Plunkett: Super cool. It's also insane, because you can then systematically and programmatically swap out the environment that you're generating the synthetic data in.

Jenny Plunkett: So it reduces the resources and the time and the cost and the physical environment that you need in order to, like, make [00:24:00] one model that's really specified for this manufacturing location work for this other factory that has a completely different floor, lighting system, et cetera. It's just really exciting stuff.

Daniel Situnayake: Yeah, it touches on a point that's really important in edge AI generally, which is the idea of distillation. So, essentially, when you're training a model, you're creating a representation of a data set. Like, you have this big, giant data set, and then you train a model which represents the fundamental essence of that data in a much more compact form.

Daniel Situnayake: You can think of deep learning as a form of compression, almost. And so the really interesting thing, and the way these synthetic-data, sort of generative-AI-based systems work, is they're trained on a gigantic unlabeled dataset, just huge, huge amounts, terabytes and terabytes of data. And you end up with this model that can kind of represent all of that data in the types of forms in which it might be encountered.[00:25:00]

Daniel Situnayake: So you could train a model on loads of images, and then you can use that model to generate images that feasibly could have been in the original dataset. They're, like, equivalent to things that might be in the original dataset. And then, when you're doing the synthetic data generation and training a smaller model based on that, it's called distillation because, essentially, it's like distilling spirits, like an alcoholic drink.

Daniel Situnayake: You're taking this larger thing and then boiling it down until only the bits you care about remain. So you've got this model that's been trained on, you know, pretty much everything. And then you decide which bits you care about, generating some synthetic data that represents those bits. And then you can use that to do the thing that you really want to do.
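For readers who want the mechanics behind the distilling analogy, here is a bare-bones sketch of classic knowledge distillation in PyTorch, the generic Hinton-style technique rather than any specific Edge Impulse pipeline: a small student is trained to match a large teacher's softened predictions on unlabeled (or synthetic) inputs.

```python
# Bare-bones knowledge distillation: a tiny student mimics a big teacher
# by matching its temperature-softened output distribution.
import torch
import torch.nn as nn
import torch.nn.functional as F

teacher = nn.Sequential(nn.Linear(32, 256), nn.ReLU(), nn.Linear(256, 10))  # stand-in for a huge model
student = nn.Sequential(nn.Linear(32, 16), nn.ReLU(), nn.Linear(16, 10))    # small enough for the edge

opt = torch.optim.Adam(student.parameters(), lr=1e-3)
T = 4.0  # temperature: softens the teacher's distribution

for step in range(100):
    x = torch.randn(64, 32)                      # unlabeled or synthetic inputs
    with torch.no_grad():
        soft_targets = F.softmax(teacher(x) / T, dim=-1)
    loss = F.kl_div(F.log_softmax(student(x) / T, dim=-1),
                    soft_targets, reduction="batchmean") * T * T
    opt.zero_grad()
    loss.backward()
    opt.step()

print(f"final distillation loss: {loss.item():.4f}")
```

Only the small student ships to the device; the teacher stays in the cloud or is thrown away.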

Daniel Situnayake: And it means you don't have to drag this huge model around with you everywhere. We see so much at the moment about generative AI, and one of the big challenges of generative AI [00:26:00] is that it's really expensive and kind of slow, because you've got these gigantic models that are costly to run. So the future, I think, on that front is going to look like two things. On the one hand, we've got distillation improving, so that we can take a piece of that knowledge and apply it to a specific problem and therefore reduce the overhead. And then on the other side, there's an increase in capability of edge hardware. So the devices are getting more and more powerful, and even very, very low-power devices, microcontrollers, are being built with accelerators and lots of memory, so that they're still really low power, but they can run some of these common deep learning algorithms super fast. There are whole new classes of devices appearing that are accelerators, essentially processors with some extra secret sauce in the silicon, which can really rapidly run the type of mathematical computations that are involved in deep learning.[00:27:00]

Daniel Situnayake: As these things come together, we're going to see so many things becoming possible that were just absolutely, you know, wildest dreams five or ten years ago. It's always a story of software and hardware evolving at the same time and giving you new capabilities, and that's really very much what's happening at the moment.

Bill Pfeifer: So this is really, as I have a minute to sit and think about this, the idea of using Gen AI to generate synthetic data that's been labeled. What does that do to the potential size of your data set? Because I would imagine that data management can only be done to so much of a degree, and you're not going to get perfectly clean data unless you build it so that it's perfectly clean and labeled.

Bill Pfeifer: This one is good. This one is bad. Here's why it's bad. This scenario is not what we want. This scenario is right. How much smaller of a data set can you then use to train an accurate AI? I'm thinking in terms of [00:28:00] the power cost of running AI right now, which is insane. The sustainability aspects of AI are not really making people happy.

Bill Pfeifer: But then if we can start to shrink the dataset, make the model more accurate, shrink the model, make everything more power-efficient, that's kind of amazing.

Daniel Situnayake: Yeah, I mean, absolutely. I think one of the things that's really fun for me as an ML researcher is the way we build models. It can be kind of frustrating working on model development, because imagine you're testing a new idea for how you're going to train a model.

Daniel Situnayake: You're testing a new architecture. But it takes 6 months to train and costs 10 million dollars. These are the kinds of problems that the people working with large scale generative AI models are having right now. It's very hard to iterate because it's so expensive and takes so much time. One of the wonderful things about edge AI is we're dealing with very small models.

Daniel Situnayake: So they're typically very quick to train, you can [00:29:00] iterate a lot faster, and so you can do a lot more exploration of the possible space of things that you can try. And that's, you know, very satisfying from a research perspective, and it means you can do things like come up with models that are specific to a particular hardware device, which is very, very cool.

Daniel Situnayake: So if you've got a certain accelerator chip, you can find a model architecture that works really, really fast and really, really well on that specific chip. Maybe it won't be as fast on a different one, but that doesn't matter, because it doesn't cost too much to do that exploration. So I think that's really cool.

Daniel Situnayake: And dataset-wise, you never really escape. You can train a model with synthetic data, and it's awesome: it will reduce your costs, and it will maybe somewhat reduce the complexity of dealing with large amounts of data. Although not really, because you still have to kind of generate all these images, for example, and store them somewhere and pipe them around, and that's a big pain in the ass.

Daniel Situnayake: But the thing that's really [00:30:00] important, always, with any ML project is evaluation. And even if you're training with synthetic data, you still need to evaluate with real-world data, because otherwise you have no way of knowing if this thing works in the real world. To some extent, the data question is a little bit unavoidable, I would say, but this shifts it downstream a bit, at least, which can be beneficial.
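Dan's evaluation caveat, reduced to a minimal sketch with random arrays standing in for both datasets: train on cheap synthetic data, but score against real, hand-labeled field data before trusting the model.

```python
# Train on synthetic data, evaluate on real field data.
# The arrays here are random placeholders for the two datasets.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(1)
X_synth = rng.normal(size=(1000, 8))           # generated, implicitly labeled
y_synth = rng.integers(0, 2, size=1000)
X_real = rng.normal(size=(100, 8))             # small, hand-labeled, from the field
y_real = rng.integers(0, 2, size=100)

model = LogisticRegression(max_iter=1000).fit(X_synth, y_synth)
print(f"real-world accuracy: {accuracy_score(y_real, model.predict(X_real)):.2f}")
# If this number is poor, the synthetic data isn't capturing reality.
```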

Bill Pfeifer: Okay. Good perspective. Thank you. Let's switch to the people, because I know you've done a fair bit of work there as well. When we think about AI at the edge, we've got to deploy all of this edge hardware, we've got devices generating data that we have to capture and manage, and we have to train the AI, deploy the AI, and verify that it's still on track and it hasn't drifted.

Bill Pfeifer: That's like six different super specialized career paths right there. How do you build, integrate, and operate a team that has all of those skill sets, and get them working [00:31:00] together so that they're handing the right things off at the right time and, you know, doing all of that synchronization?

Daniel Situnayake: Yeah, this is hard stuff. I mean, you can kind of see it from the team that we built at Edge Impulse: it was sort of all these people from completely unrelated disciplines all coming together to build things that no one had really built before. Our hope is that over time that need is going to decrease. There are some surprises, though.

Daniel Situnayake: So, platforms like ours: Edge Impulse is essentially an end-to-end platform. The goal is that you can go through the whole machine learning workflow, from training models through to evaluating them and then deploying to device, in a single tool. And so that kind of levels the playing field a little bit.

Daniel Situnayake: So everyone can at least build something. And our assumption at the beginning was that it was going to lead to teams where, okay, you've got some embedded engineers who know how to do [00:32:00] the embedded application development, and now they can use Edge Impulse to do the machine learning bits, and it suddenly gives them a whole set of new capabilities.

Daniel Situnayake: And that was true. And there are plenty of teams like that, that build stuff with our products and with other competing products. What we didn't anticipate so much is that people who have a really deep background in machine learning and deep learning still want to use these types of tools, because essentially the alternative, which is DIY, where you're bringing together all these different tools from all around the open-source world and from proprietary libraries from device vendors and things like that, is an absolute headache.

Daniel Situnayake: Sometimes even getting these kinds of tools to run on the same development machine is impossible, due to conflicts between dependencies and things like that. So the team composition, I think it really varies depending on the company and depending on [00:33:00] the team and depending on the stage.

Daniel Situnayake: You can at this point get something working with relatively basic skills in, you know, one category or the other, but you're going to find that there are benefits to upskilling in different areas. So if you've got a team with a deep learning engineer, they're going to have insight and knowledge that accelerate your development in that part of the workflow.

Daniel Situnayake: If you have a really strong embedded team, they're going to accelerate development on that side. If you've got really great DSP engineers, they might still use the same kind of general tools as everyone else, but they're going to have insights and wisdom that other engineers might not have. So I think it's really good to have that knowledge, and if you've got that organizationally, then you're going to benefit from it, even if you're using the sort of more modern and sophisticated tooling that we have today.

Daniel Situnayake: But in addition to the engineering, you also have to think a [00:34:00] lot about how do I know I'm building the right thing? How do I know I'm building a good product? And what can I do to make sure that I stay on course in that regard? So I think that's something we might talk about a little later around responsible and effective AI.

Daniel Situnayake: I'd say that's really important in team composition as well, so you need to have domain experts who really know your subject matter area better than anyone else. If you're building a healthcare application, you better have someone on your team who is a, you know, world leading expert in the thing that you're trying to build.

Daniel Situnayake: You can't just wing it with domain knowledge. And the same with having stakeholders involved from different parts of the context. You need to have people in the loop who are potential users of your product. If you're building an industrial monitoring system for factories, you better be talking to potential customers who own or manage factories and production lines; you need to have their feedback the whole way through in order to build an effective [00:35:00] product.

Daniel Situnayake: You'll need to have input from business stakeholders in your own organization, and it can be useful to have the input from outside parties who are experts in ethics and responsible AI as well. So really, you can go as deep and as broad as you can with the team, and the more you do, the better your product's going to be and the less risk there is.

Daniel Situnayake: But obviously that comes at a cost. So, like with any engineering product, it's about finding a balance.

Bill Pfeifer: And you had said something in there about upskilling. I was thinking about upskilling, reskilling, and how many companies are going to need to have upskilling and reskilling plans and programs internally just to fill the gaps in this broad set of skill sets.

Bill Pfeifer: How much do you see that moving forward? Are companies likely to need to do that? Or do you think the tooling is catching up, or the vendor world, right? Like, Edge Impulse has built [00:36:00] a platform that you intend to be end-to-end capable. So is that enough, with Edge Impulse growing and other vendors coming in as well, filling in those gaps, so that customers who want to build edge AI don't necessarily have to have that full range of skill sets? Whether it's professional services or built into the platform, or, you know, does that just become sort of an outsourced and automated thing, or does it have to be insourced somehow?

Bill Pfeifer: Is that part of the planning?

Jenny Plunkett: We're basically not at a point where ChatGPT can completely get rid of the need for people on our customer side to integrate the Edge Impulse library into their end product. Like, AI can't completely replace everything that is needed, so we still need embedded engineers, and we still need AI people who are down in the trenches with their data, because our customer is the best source of knowledge on how their data works and what they need in order to make their product functional.

Jenny Plunkett: Maybe in a few years, we could just ask AI a [00:37:00] question about AI. What AI do I need for this AI problem? I don't think we're fully there yet. And we still need people to copy and paste the code from ChatGPT into the IDE, so...

Bill Pfeifer: What's your job, copy-paste engineer?

Daniel Situnayake: Yeah, I would say that what we've seen, really, is that with these capabilities that tools like Edge Impulse grant, the best people to use them are the people who already have their boots on the ground in these fields.

Daniel Situnayake: So we've seen very successful projects where machine learning experts and domain experts have picked up embedded machine learning tools and been able to start working with them and build with them. We've seen very successful projects where embedded engineers have used the tools to train deep learning models, but then, you know, they bring the embedded expertise to the table, which saves them a lot of time with building their application, and integrating all of the moving parts, and understanding intuitively [00:38:00] how it needs to work in the context of an application.

Daniel Situnayake: I think it's kind of a good situation now, where all the tools and learning materials and things like that are out there, so that people can learn on the job and be able to start working with these technologies without having to go and take another degree or hire someone completely new. I think it's reached a stage of maturity that we just weren't at five years ago, for example, where you needed super deep, super specialized knowledge.

Daniel Situnayake: That knowledge does really help, so if you've got it, then fantastic. But if you don't have it, it's something that you can pick up as part of your job, or with a little bit of learning on the side, and it's pretty fun and interesting stuff. So I think it's become a lot more like other parts of engineering at this point. Whereas five years ago, with ML, it was almost like you had to dive into this entire new universe on the side, [00:39:00] which had no real touch points with day-to-day software engineering.

Bill Pfeifer: Okay, so first and foremost, you have to have the domain expertise of what you're trying to solve. AI will help you do things faster, whether you know what that is or not.

Bill Pfeifer: And it will help you be wrong faster as well. But it sounds like the platforms, the tooling, have come far enough that AI is becoming more of a tool. And I don't want to say less of a career path, because it's still a huge career path, but for using it, as opposed to building the next algorithms and such, maybe we're getting to the point that it becomes an available tool for enterprise customers.

Daniel Situnayake: Exactly. It's making the transition. I think it's made the transition at this point from a purely research discipline to an engineering discipline. And the research side is obviously still there and very, very busy and very, very productive. Huge amounts of research are going on, but there's also huge amounts going on on the [00:40:00] productionization side, with people taking the fruits of research and applying it to solve real problems.

Daniel Situnayake: If you think a few years back, if you were working in deep learning, trying to build a product, guaranteed you would have to read a bunch of scientific papers to help you with your project. You'd be sitting on arXiv reading academic papers and trying to struggle through the kind of superfluous bunch of equations that they throw in.

Daniel Situnayake: Nowadays, I think you can build the vast majority of projects without looking at a single research paper, because there are tools available, there are kind of battle-tested production systems that you can integrate. So that's a huge improvement from my point of view.

Bill Pfeifer: Cool. And you did sort of tease responsible AI earlier. I thought it was a little bit funny when Gen AI came out: it was just endless possibilities, and then people were horrified when it gave some wrong answers. Ah, people give wrong answers too. It's okay. And [00:41:00] then people figured out that it was trained on copyrighted material and it sometimes spat back little pieces that were copyrighted. Ah, people do that too. It's okay. And so clearly, training AI responsibly just has massive downstream consequences that take a while to come out, and it can be a really expensive mistake. How do we plan ahead to avoid that?

Jenny Plunkett: Actually, I'll just share a little anecdote. I was recently training a model, and it got through multiple stages: collecting the data, training the model, uploading to Edge Impulse, doing the bounding boxes.

Jenny Plunkett: And it was all of the Pepsi can: multiple Pepsi cans and multiple Pepsi can logos. And then at the very end of it, one of us goes, hey, wait, isn't the Pepsi can logo, like, copyrighted? We probably shouldn't have used that in our demo project. So even though that wasn't really an ethical issue, these things just slip through the cracks so easily.

Jenny Plunkett: And that's why it's really important to [00:42:00] have multiple people on the team who are still watching over what's happening before things go into production. Because as much as AI is super useful, and it's been awesome for the development of so many production products so far, we still need humans to be watchdogs of it and say, hey, either that's not ethical, or that's not good, or we need legal behind that first.

Jenny Plunkett: So we're not completely there yet in terms of automating everything of this responsible AI aspect.

Bill Pfeifer: Right. So kind of expanding ethical up to responsible: you know, don't use copyrighted material and stuff like that. What constitutes ethical AI, responsible AI? I mean, if done wrong, someone's going to ask, who got to decide what was ethical for that AI?

Bill Pfeifer: How do we do it so that nobody even asks that question? Because it's just good.

Daniel Situnayake: So to me, it all comes back to this additional term, effective AI. What is our goal when we're building a product? Hopefully, if everything's aligned correctly, the goal is [00:43:00] to build a product that's effective. It does the job well.

Daniel Situnayake: And if the product is not meeting the ethical expectations that society has, then that means it's not doing its job well. If the product doesn't work properly for certain people or in certain situations, then it's not doing its job properly, so it's not effective. So all of this stuff, which I think engineers can be quick to dismiss as wishy-washy, kind of lovey-dovey sort of stuff: actually, it's all about building a product that does what it says on the tin.

Daniel Situnayake: You want to build things that actually work for the thing that you're selling them for, and if they don't, your company's going to fail, you're going to lose your job, and you're going to create things that you're not proud to stand behind as an engineer. So all of this is about building things that really work.

Daniel Situnayake: And the way to do that, as Jenny mentioned, is just having people involved who can look at this and understand whether it really works. And you [00:44:00] need to have a process that allows those voices to be involved from the very beginning, along each stage of your project. So it's not just a process of building a bunch of stuff, throwing it out there at the end, and seeing if anyone realizes that it's ineffective once it's in production; you want to be able to catch these issues before you get there.

Daniel Situnayake: So you need to have stakeholders analyzing your work regularly and understanding where it's at and how that maps onto the goals. You need to have MLOps systems set up, essentially systems that can track the progress and the artifacts and all of the things you're producing as you're training these models and iterating on your work. You need to track all of that in some kind of system, preferably an automated system.

Daniel Situnayake: That means you can see whether things are working as well as they were in the past and continually evaluate. And if you do release something into the world, you [00:45:00] need to be able to go back and find which version of which model was released, so that you can understand why it may or may not be working for certain people, and you can improve it.
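As a sketch of the kind of lightweight tracking Daniel describes (a hypothetical toy; any real MLOps tool does this more robustly), the minimum is an append-only record of which model version shipped, what data it was trained on, and how it scored:

```python
# Minimal model registry: enough to answer "which model did we ship,
# trained on what data, and how did it score on real-world evaluation?"
import hashlib
import json
import time

def register_model(registry_path, model_bytes, dataset_id, metrics):
    entry = {
        "model_sha256": hashlib.sha256(model_bytes).hexdigest(),
        "dataset_id": dataset_id,      # version or hash of the training set
        "metrics": metrics,            # evaluation results on real-world data
        "released_at": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
    }
    with open(registry_path, "a") as f:   # append-only release log
        f.write(json.dumps(entry) + "\n")
    return entry

print(register_model("registry.jsonl", b"fake-model-weights",
                     dataset_id="factory-a-v3",
                     metrics={"accuracy": 0.94, "false_positive_rate": 0.02}))
```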

Daniel Situnayake: It's all about processes, and those processes are just geared towards making sure the product is effective. So essentially it's not really any different from any other engineering product, but it's one that, for some reason, people find very easy to push off to the side and think of as a job for somebody else, an ethical thinker rather than a serious engineer, or whatever.

Daniel Situnayake: Really, it's part of your job and if you're not good at it, you need to be good at it.

Bill Pfeifer: I like that idea of collapsing it all into the single metric of effective. Does it do what it needs to do? Is it as fast as it needs to be? Et cetera. But also, it's avoiding copyrights, and it's ethical, and all of that stuff. Just kind of throwing all of those challenges into one bucket makes a lot of sense, and it gives it a much more concrete feel. [00:46:00]

Daniel Situnayake: Yeah. And of course, it doesn't help if your organization is misaligned from the beginning. So if you're working for a company that has nefarious goals, if you're not trying to build a product that really helps your customers, but you're trying to build a product that rips them off, or that does something that, you know, doesn't really work but you're going to sell it as something that is effective, this process isn't going to help. That's hopefully what the courts are for, I think, at some point.

Jenny Plunkett: One of the things we haven't really talked about, and it's one of those things that we can't control: one of the chapters that we have in the book is about wildlife monitoring.

Jenny Plunkett: And one of the big conversations me and Dan had when writing this book was like, ooh, do we really want to put this in the world? Because it could immediately be used for really nefarious purposes, even though the intent of the chapter was to, you know, save endangered species and improve the world on an ethical level.

Jenny Plunkett: But you can imagine all of the nefarious ways that the end [00:47:00] user could use it. So one of the things that you're thinking about when you create AI models is, how could it potentially be misused? Even if you do have good, positive, world-building intent, one of the responsibilities, I think, that engineers and people even thinking about AI need to take into account is all the possible ways it could be used negatively, to reduce the quality of our world rather than help it.

Jenny Plunkett: And unfortunately, this may be an impossible task. Like, maybe we can't think of all the ways that it could be used. And at a certain point, you just have to trust your end user. But that's a big item.

Daniel Situnayake: Yeah, and this is why it's so important to have all these stakeholders and experts from diverse fields and backgrounds involved at the beginning and all through the process, because there are people who are going to be able to tell you, hey, wait a minute, this could be potentially misused. With the wildlife camera example, if you have somebody who's involved with wildlife protection from the [00:48:00] beginning of the project helping you out, then they might instinctively know, like, hey, this technology could be repurposed by poachers to harm wildlife. And you might not have thought about that as an engineer who doesn't have a background in that field.

Daniel Situnayake: So every single application area is going to have examples like that, where, hey, this could be misused, it could be misapplied, you know, it could be used in a way that isn't how it was intended and cause problems. Or maybe you're just not seeing the entire picture. So you need people who have that ability to see the entire picture.

Daniel Situnayake: No individual can see the entire picture. So it follows that you need a team of people.

Bill Pfeifer: Yeah, that's fair. And you want to make sure that you don't build something that is beautiful, that then turns into something ugly because it's misused. That would be disappointing.

Daniel Situnayake: Exactly. Just because something's cool doesn't mean it's going to be a good product, or that it's made the world better by existing.

Daniel Situnayake: And I think that's [00:49:00] something that people forget in the technology space. It has to be effective.

Bill Pfeifer: That's part of effective, I guess. So what are you most excited about that's in front of you for the next, say, couple of years? Do you have more collaborative projects, new technologies that you're starting to play with? What gets you going in the morning?

Daniel Situnayake: Cool. So for me, I think it's just so exciting, always, to see the types of companies that are building stuff with our tools. Like, we build developer tools; we don't build Edge AI products at Edge Impulse, we build systems that people can use to create them. And so one of the side effects of working here is I just get to see all these amazing things that are coming down the pipeline that people are building. And what's happening at the moment, because hardware is getting more capable, is we're reaching, like I was saying earlier, this kind of intersection of models getting more capable and hardware getting more capable. [00:50:00] We're starting to see things that, you know, nobody really thought would be possible on the edge. Dealing with real-time video, super low-latency stuff with video. Essentially giving hardware devices the ability to perceive and make decisions at the same rate that human beings can.

Daniel Situnayake: So I think we're about to step over this gap where suddenly all these new applications and use cases become possible that no one's really thought of yet. And what I'm most motivated by is being surprised: wow, that's crazy, I never would have thought of that, but these people have applied the technology we work on to a completely new problem and are solving some amazing thing using capabilities that have only just started to exist.

Daniel Situnayake: And how about you, Jenny?

Jenny Plunkett: I guess what I'm excited about personally is not necessarily related towards, you know, I guess it is related towards AI in the field, but one of the big things that has been really [00:51:00] firing me up this year is I've been having the opportunity to get involved in the AI for good space.

Jenny Plunkett: And one of those things has been traveling around and going to conferences, listening to people who are working on ethical AI and ethics, and working on legislation for making sure AI is used responsibly in the world. But one of the other things is just, like, getting to work with these wildlife protection companies who are trying to use ML to protect endangered species in various environments.

Jenny Plunkett: That just gets me really fired up, you know. It's not where the money is, but it's where improving the world is. And I'm really excited in the future to keep working with these companies, and getting to speak at their events, and seeing how we as Edge Impulse can help them out.

Bill Pfeifer: Very cool. Help them out with synthetic data, and watching BLERP, and making sure AI at the Edge is effective.

Daniel Situnayake: Precisely.

Bill Pfeifer: Cool. All right. This was a wonderful conversation. How can people find you online and keep up with the latest work that you're up [00:52:00] to?

Daniel Situnayake: So I'm personally weaning myself off Twitter, so that was my go-to, and I'm @dansitu on Twitter. But I actually just created a Substack, and I'm doing a, I don't know how often it is, newsletter, a couple of times a month maybe, sort of about the more out-there things in edge AI, like, what is this technology doing to our world?

Daniel Situnayake: How is the world maybe gonna look in a few years as this technology evolves and becomes more prevalent? So feel free to come and read my Substack. I have one post so far. I'll share the link with you. And then, if you're interested in having a really good view of just what's going on across Edge AI, I'd sign up and follow Edge Impulse on Twitter or LinkedIn.

Daniel Situnayake: And we have an amazing social media team who pull together all the news from the whole industry. And so we've got a really kind of super enlightening feed of all the [00:53:00] latest and greatest stuff that's happening. It's not super hyper-technical; it's stuff that, no matter what your background is, you can see what's going on and get excited about it.

Daniel Situnayake: But there is some really technical deep-dive stuff every so often as well.

Jenny Plunkett: And going off of that, we do also have a very thriving community. So if you want to talk about this, or other innovations in the space, you can go on our forum, or DM us on Twitter, or on our YouTube channel. But you can follow me on Twitter, or I guess it's X now, at Jenny M. Plunkett. I'm also on LinkedIn. I'm not super active on either, but if you shoot me a message or tweet me, I'll reply.

Bill Pfeifer: And you said you were DevRel, so you're probably pretty active with the Edge Impulse communities as well.

Jenny Plunkett: Yeah, definitely, but not under my name, of course, usually just under the Edge Impulse account.

Bill Pfeifer: Fantastic. Cool. Well, thank you so much for the time, for the perspective, and for bringing us up to speed on AI at the Edge. This was a great conversation. I appreciate it.

Daniel Situnayake: Yeah. Thank you, Bill. And thanks [00:54:00] everyone for listening.

Jenny Plunkett: Thank you.

Narrator 2: That does it for this episode of Over the Edge. If you're enjoying the show, please leave a rating and a review and tell a friend.

Narrator 2: Over the Edge is made possible through the generous sponsorship of our partners at Dell Technologies. Simplify your edge so you can generate more value. Learn more by visiting dell.com/edge.