Pushing IoT Analytics to the Edge

November 29, 2018
By: Special Guest
David Simmons, Senior Developer Evangelist, InfluxData

The IoT is generating enormous streams of real-time data that need to be efficiently collected, processed, and analyzed. But data transport costs – and latency issues – make sending all data to a central collection point increasingly untenable. Pushing data collection to the edge can reduce costs, but only if the edge solution is capable of more than just collect-and-forward.

As IoT architectures mature and evolve, the ability to both collect and analyze sensor data has become mission critical. But forwarding all data to a central repository – be it on-premises or cloud-based – becomes more and more impractical as the number of sensors and the amount of data grow. Pushing data collection and analysis further to the edge is becoming a requirement for IoT deployments.

With most data platforms, it simply isn’t practical, or even possible, to push collection AND analytics to the edge. There just aren’t enough resources on a typical edge device to support most data collection and analytics packages. You need to build a practical, efficient, and scalable data collection and analysis gateway device for IoT deployments, and you need highly efficient software to run on it.



You’ve come up with your IoT deployment plan, and you’ve run tests on a small number of sensors writing data to your back-end system, but you’re worried about what the performance will be when you hit full roll-out.  It’s one thing to begin collecting data from a handful, or even a few hundred, sensors. It’s an entirely different situation when you roll out thousands, or even tens of thousands, of sensors.


While almost everyone speaks of the IoT in terms of the number of devices that will be connected over the next 5 to 10 years, the real challenge will come in the amount of data that those devices generate. Each one of those 10 to 20 billion devices (depending on whom you believe) will begin to generate anywhere from one to dozens of data streams. And those data streams will then continue forever. Something has to be able to collect, store, analyze, visualize, and react to all of that data as it is generated.

IoT deployments are struggling to find an efficient, scalable data platform that meets all of these criteria. Below are some suggestions for what to consider, along with some basic criteria for what an IoT data platform must be able to do.

First, let’s look at some requirements for handling IoT data.

IoT data MUST be timely, accurate, and actionable:

The IoT is what we’re doing to instrument the physical world around us in order to make accurate decisions on everything from environmental regulation to transportation logistics. If your data isn’t timely – recorded and processed in near-real-time – then how can you use it to make informed, real-time business decisions?

Your data must be accurate, and the platform on which you record and store your data must be capable of handling the large volumes of incoming data. If your data isn’t accurate, your decisions certainly won’t be.

You must be able to make decisions and take actions based on your data. If you’re collecting vibrational analysis data on your machinery, but not acting on the anomalies in order to increase the efficiency of your operations – or to perform predictive maintenance tasks – then collecting the data isn’t really providing any benefit no matter how timely and accurate it is.
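As a rough illustration of acting on anomalies rather than just recording them, here is a minimal Python sketch that flags readings deviating sharply from a rolling window of recent values. The function name, window size, threshold, and sample vibration values are all assumptions made for the example; a real predictive-maintenance pipeline would likely use a more robust method.

```python
from collections import deque
from statistics import mean, stdev


def find_anomalies(readings, window=5, threshold=3.0):
    """Return indices of readings far outside the recent rolling window."""
    recent = deque(maxlen=window)
    anomalies = []
    for i, value in enumerate(readings):
        if len(recent) == window:
            mu = mean(recent)
            sigma = stdev(recent)
            # Flag values more than `threshold` standard deviations from the mean.
            if sigma > 0 and abs(value - mu) > threshold * sigma:
                anomalies.append(i)
        recent.append(value)
    return anomalies


# Hypothetical vibration velocity samples (mm/s) with one obvious spike.
vibration_mm_s = [2.0, 2.1, 1.9, 2.0, 2.1, 2.0, 9.5, 2.1, 2.0]
print(find_anomalies(vibration_mm_s))  # prints [6]
```

The point of the sketch is the last step: the index it returns is something an alerting rule or maintenance ticket can be attached to, which is what turns collected data into an action.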

In addition to having these attributes for the data in your platform, there are some physical attributes that your edge device will also need to have.

If your IoT data platform is only deployable in the cloud, or the enterprise datacenter, what happens when your back-haul network from your edge devices goes down? Do you really want to pay the network charges to transmit that much data?
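Both questions can be made concrete with a small sketch. The Python below is an illustration under stated assumptions, not a description of any particular product: `downsample` reduces raw readings to per-bucket summaries so that less data crosses the network, and `StoreAndForward` holds those summaries locally while the back-haul is down. The `send` callable is hypothetical and is assumed to raise `ConnectionError` when the uplink is unavailable. A real gateway would also persist the queue to disk and cap its size.

```python
from collections import defaultdict, deque


def downsample(points, bucket_seconds=60):
    """Summarize (unix_seconds, value) pairs into fixed-width time buckets."""
    buckets = defaultdict(list)
    for ts, value in points:
        buckets[ts - ts % bucket_seconds].append(value)
    return [
        {"time": start, "min": min(vals), "max": max(vals),
         "mean": sum(vals) / len(vals), "count": len(vals)}
        for start, vals in sorted(buckets.items())
    ]


class StoreAndForward:
    """Hold summaries locally and forward them, in order, once the uplink works."""

    def __init__(self, send):
        self._send = send  # hypothetical uplink; raises ConnectionError when down
        self._pending = deque()

    def submit(self, summary):
        self._pending.append(summary)
        return self.flush()

    def flush(self):
        while self._pending:
            try:
                self._send(self._pending[0])
            except ConnectionError:
                return False  # keep the data and retry on the next flush
            self._pending.popleft()
        return True

    @property
    def pending(self):
        return len(self._pending)


# Three raw readings become two one-minute summaries.
raw = [(0, 1.0), (30, 3.0), (60, 5.0)]
summaries = downsample(raw)
print(len(raw), "->", len(summaries))  # prints 3 -> 2
```

With one summary per minute instead of every raw reading, the upstream system still sees the minimum, maximum, and mean, while the full-resolution data can stay at the edge for local analysis.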

If your IoT Data Platform is only deployable at the edge, then how are you going to integrate it with your upstream data systems? What other tools will you need to integrate to get comprehensive data collection and analysis across your entire deployment? And how much will all that cost?

Your IoT data collection platform – at the edge – must also be fairly robust physically. Your Platform must include:

Just because it’s an edge device doesn’t mean you should have to compromise on the flexibility of having data storage, analytics, dashboards, and alerting available there.

Not everyone knows what a Time Series Database is, or why Time Series data is different from other types of data. This leads some people – including some other database vendors – to believe that any database that can store a timestamp can be a Time Series Database. Nothing could be further from the truth.

You see, in time series data, the time is not just a component of the data; it is a critical component of the data. Time series data can be categorized as “reading@time” or “measurement@time”. And those values come in a never-ending stream. Time series data is not updated, or used in transactions, as in a normal RDBMS. Time is a component of every datapoint, and of every query.
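To make the “measurement@time” idea concrete, here is a minimal Python sketch of an append-only series in which every point carries a timestamp and every query is a time range. The `Point` and `Series` names and the sample values are assumptions for illustration; this is not how any particular time series database is implemented, and real engines also handle out-of-order writes, compression, and retention.

```python
from bisect import bisect_left, bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: a recorded point is never updated in place
class Point:
    time_ns: int
    measurement: str
    value: float


class Series:
    """Append-only, time-ordered collection of points."""

    def __init__(self):
        self._times = []
        self._points = []

    def append(self, point):
        # Simplifying assumption: points arrive in time order.
        if self._times and point.time_ns < self._times[-1]:
            raise ValueError("points must arrive in time order")
        self._times.append(point.time_ns)
        self._points.append(point)

    def between(self, start_ns, end_ns):
        """Return all points with start_ns <= time_ns <= end_ns."""
        lo = bisect_left(self._times, start_ns)
        hi = bisect_right(self._times, end_ns)
        return self._points[lo:hi]


series = Series()
for t, v in [(1, 20.5), (2, 20.7), (3, 21.0), (4, 21.4)]:
    series.append(Point(time_ns=t, measurement="temperature", value=v))

print([p.value for p in series.between(2, 3)])  # prints [20.7, 21.0]
```

Because the data is only ever appended in time order, a range query reduces to two binary searches, which is one reason purpose-built storage can be more efficient for this workload than a general-purpose table.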

If you’re using a SQL or NoSQL database for storing your Time Series data, you’re simply using the wrong tool for the job. It would be like using a screwdriver to pound in a nail. Yes, you can do it, but it’s less efficient, and doesn’t produce results as easily.

That definition of Time Series Data above is also a definition of IoT data. As we use the IoT to instrument our world, we are essentially taking sensor readings at regular intervals of time – sometimes down to the milli- or nanosecond level – and storing those for analysis. IoT data is Time Series Data.




Edited by Maurice Nagle
