Machine Learning, Data Access, Development Bottlenecks & SiaSearch
Recently our CEO, Clemens Viernickel, sat down with Mark Kelly of Alldus to talk about all things data and how that has shaped SiaSearch over the past year. Here are the major takeaways from the discussion:
What is the unstructured data problem SiaSearch is focused on?
Many companies, automotive OEMs for example, collect massive amounts of sensor data. Today the main focus is still on running analytics over this data to understand what is being collected. A much bigger problem, however, is interaction and access: whenever engineers want to work with these raw data assets, they have to download a batch of data and inspect it by hand. This is because there is no abstraction layer between the engineer and the data. In many other domains such a layer is crucial for a rich user experience; Google Search, for example, uses metadata on websites to make them searchable. Nothing comparable exists for sensor data such as video and images. SiaSearch aims to create this abstraction layer, connecting engineers to raw data far more efficiently so they no longer spend all of their time dealing with the actual files.
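To make the idea of an abstraction layer concrete, here is a minimal sketch in Python. It is not SiaSearch's actual API: the names `FrameRecord` and `FrameIndex`, the attribute keys and the storage URIs are all hypothetical, chosen only for illustration. The point is that engineers query lightweight metadata and get back references to the frames they need, without downloading the raw files first.

```python
from dataclasses import dataclass, field


@dataclass
class FrameRecord:
    """Metadata for one sensor frame; the raw file itself stays in storage."""
    uri: str                 # hypothetical location of the raw frame
    drive_id: str            # hypothetical identifier of the recording session
    timestamp_s: float       # seconds since the start of the recording
    attributes: dict = field(default_factory=dict)  # searchable tags


class FrameIndex:
    """Toy in-memory metadata index (an assumption, not a real product API)."""

    def __init__(self):
        self._records = []

    def add(self, record):
        self._records.append(record)

    def query(self, **conditions):
        """Return records whose attributes match every given condition."""
        return [
            r for r in self._records
            if all(r.attributes.get(k) == v for k, v in conditions.items())
        ]


index = FrameIndex()
index.add(FrameRecord("s3://example-bucket/drive_01/000001.jpg", "drive_01", 0.0,
                      {"weather": "rain", "pedestrian": True}))
index.add(FrameRecord("s3://example-bucket/drive_01/000002.jpg", "drive_01", 0.1,
                      {"weather": "rain", "pedestrian": False}))
index.add(FrameRecord("s3://example-bucket/drive_02/000001.jpg", "drive_02", 0.0,
                      {"weather": "clear", "pedestrian": True}))

# Find rainy frames with a pedestrian by searching metadata only.
matches = index.query(weather="rain", pedestrian=True)
match_uris = [r.uri for r in matches]
print(match_uris)
```

Only the frames returned by the query would then need to be fetched from storage, which is the kind of saving the abstraction layer is meant to provide.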
Why are companies failing to productionize Machine Learning?
A big part of this failure is that many organizations have been throwing machine learning and deep learning at problems they are not suited to. Many have also underestimated what it takes to put ML into production robustly. A common assumption is that all you need to do is collect some data, train a model and run it on real-world use cases. In practice it is a lot more complicated. This is especially true for vision-based tasks, because vision data is heavy and unstructured. It requires proper data handling infrastructure, and beyond data there are a host of further challenges. This is why many startups are emerging in the MLOps space, offering solutions that help get ML models into production faster. They cover everything from data management, like SiaSearch, to training pipelines to production monitoring.
How will ML and specifically machine vision develop over the next few years?
The future of machine learning, especially machine vision, is very exciting. This is only the beginning of a diverse and impactful set of industrial use cases, ranging from robotics and autonomous driving to medical imaging. Many different applications are starting to be used on a day-to-day basis in these domains. It’s heartening to see this adoption across the board and not just at tech giants like Google and Facebook. At the same time, it is alarming to see how slow ML adoption has been. Two years ago, for example, everyone thought that autonomous vehicles would be here by now, or that doctors would no longer be looking at medical images themselves. This slow progress has a lot to do with the very high benchmarks models must meet to ensure their robustness. That is why it is increasingly important that tools like SiaSearch take away the cumbersome data handling work and let engineers focus on producing highly robust models.
How does SiaSearch fit in that future?
SiaSearch aims to become the go-to abstraction layer for engineers dealing with frame-level vision datasets, in any industry that collects vision-based sensor data. The problem of unstructured data access clearly isn’t limited to the autonomous vehicle space; it is just as prevalent in many other domains, such as general robotics, drones in agriculture and retail.
Interested in trying out the SiaSearch platform for yourself? Request a demo or get started now.