High Quality Data Essential for Training A.I. Models

– Sponsored content –

Human machine interface

(Gerd Altmann, Pixabay. https://pixabay.com/illustrations/digitization-particles-smartphone-7261158/)

The recent explosion in new uses for artificial intelligence, or A.I., has drawn more attention to the models and algorithms that power these applications. Most of these new applications use a form of A.I. called machine learning, but their success depends on the soundness of the data that feed the models.

Machine learning, as the name implies, uses algorithms that learn, adjusting their outputs as they encounter more data. As the algorithms process more data, their calculations should become more refined and precise. For example, some A.I. medical diagnostics analyze images captured from echocardiograms that display blood flow and heart mechanics in real time. For these algorithms, the wider the range of heart conditions and disorders shown in the images, the greater the scope of the algorithm and the likelihood of returning a true-positive or true-negative diagnosis.
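As a minimal sketch of what true-positive and true-negative results mean in practice, the Python below tallies a model's diagnoses against known outcomes. The function name and the sample values are hypothetical, chosen only for illustration, and are not drawn from any real diagnostic system.

```python
def confusion_counts(actual, predicted):
    """Count the four outcomes for binary diagnoses (True = condition present)."""
    counts = {"tp": 0, "tn": 0, "fp": 0, "fn": 0}
    for a, p in zip(actual, predicted):
        if a and p:
            counts["tp"] += 1   # condition present, model flagged it
        elif not a and not p:
            counts["tn"] += 1   # condition absent, model cleared it
        elif p:
            counts["fp"] += 1   # condition absent, model flagged it
        else:
            counts["fn"] += 1   # condition present, model missed it
    return counts


# Made-up example: known outcomes versus model diagnoses for six images.
actual = [True, True, False, False, True, False]
predicted = [True, False, False, True, True, False]

counts = confusion_counts(actual, predicted)
# Share of real conditions the model caught.
sensitivity = counts["tp"] / (counts["tp"] + counts["fn"])
# Share of healthy cases the model correctly cleared.
specificity = counts["tn"] / (counts["tn"] + counts["fp"])
```

A model trained on a wider range of heart conditions would be expected to raise both shares, since fewer cases fall outside what it has already seen.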

Generative A.I. produces new output based on patterns in the data used to train the algorithms, often in the form of text or images that meet user specifications. In these and many other machine learning models, raw-data inputs need context to maximize their usefulness to data scientists. Adding this context to raw data is a process called data labeling, considered a key step in training machine learning algorithms.
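To illustrate what adding context to raw data can look like, here is a minimal Python sketch that pairs each raw input with a label and rejects labels outside an agreed set. The label set, the file names, and the `label_example` helper are all assumptions made for this example, not any particular labeling tool's interface.

```python
from dataclasses import dataclass

# Hypothetical set of labels a project has agreed on.
ALLOWED_LABELS = {"normal", "abnormal"}


@dataclass(frozen=True)
class LabeledExample:
    raw: str    # the raw input, here an image file name
    label: str  # the context attached by a labeler


def label_example(raw, label):
    """Attach a label to a raw input, rejecting labels outside the agreed set."""
    if label not in ALLOWED_LABELS:
        raise ValueError(f"unknown label: {label!r}")
    return LabeledExample(raw=raw, label=label)


# Made-up raw inputs with the labels a human reviewer might assign.
dataset = [
    label_example("echo_001.png", "normal"),
    label_example("echo_002.png", "abnormal"),
]
```

Checking labels against a fixed set is one simple way to keep labeled data consistent before it reaches a training run.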

While some data labeling processes can be automated, human intervention is often needed to ensure the data that train machine learning algorithms accurately reflect real-world conditions. Data labeling is nonetheless a task that algorithm developers perform only occasionally, and thus is outsourced in many instances. As a result, choosing a data labeling service requires model designers to look into the organizational structure and management of their outsourcing partners, much as they would when hiring their own professional staff.

When reviewing data labeling services, consider factors such as the relationship of labelers to the company (direct hires or crowd-sourced contractors) and the presence of on-site management. In addition, the outsourcing partner needs to guarantee the security and integrity of the data in its hands. Visit the web site of Innovatiana for an example of a data labeling company that meets these specifications.
