
Automate your data pipeline with end-to-end API integrations

Laura Kier, VP of Growth
December 2, 2022

Scaling AI, and in particular shipping models faster and iterating on the newest data, is top of mind for all of our clients. To truly scale an AI system, you need a scalable data pipeline in which data flows seamlessly and can quickly be used to build, iterate on, and test models. That’s why we’re announcing a new end-to-end API solution that helps AI teams automate their data pipelines and integrate Centaur Labs data annotation capabilities into their AI development life cycles.

Introducing end-to-end API integration

We’re introducing new APIs that let teams import data, create annotation tasks, set gold standards, and get labeling results entirely through our API tools. All data and annotation types are supported.

Programmatically send data for annotation and get results

When AI teams begin data annotation projects, they often have data ready to annotate today and expect more data to annotate in the future. Perhaps they’re using data collected as part of a clinical trial, and clinical sites share data with them as it becomes available. Or maybe they are a medical device company, and some of their hospital system customers send them data in monthly batches. Or maybe they’re working with patient-generated data and have a steady stream available through an application. Whatever the circumstances, the team will have access to more data over time.

Before, if you had new data available on a Tuesday and more on Thursday, you would need to manually import that data into the Centaur Labs platform twice. Downloading the final labeled results required the same kind of manual steps. Now, with API support, you can programmatically send data to the Centaur Labs platform as it becomes available in your Amazon S3 bucket. Once you specify which project and task to add the data to, it is sent to the labeling network and labeled, and the results are available for download via our API at any time.
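To make the Tuesday and Thursday scenario concrete, here is a minimal Python sketch of an incremental sync. It is an illustration under stated assumptions, not the Centaur Labs API itself: the endpoint URL, the payload field names, and the helper functions are all hypothetical, and the real request shapes are in the API reference. The network call is passed in as a `send` function so the logic can be run and tested without credentials.

```python
from __future__ import annotations

import json
from typing import Callable, Iterable

# Hypothetical endpoint, used only for illustration.
IMPORT_URL = "https://api.example.com/v1/imports"


def select_new_keys(bucket_keys: Iterable[str], already_sent: set[str]) -> list[str]:
    """Return the S3 keys that have not been submitted yet, sorted for stable output."""
    return sorted(key for key in set(bucket_keys) if key not in already_sent)


def build_import_payload(project_id: str, task_id: str, bucket: str, keys: list[str]) -> dict:
    """Assemble an assumed request body that points the platform at S3 objects."""
    return {
        "project_id": project_id,
        "task_id": task_id,
        "items": [{"uri": f"s3://{bucket}/{key}"} for key in keys],
    }


def sync_new_data(
    bucket_keys: Iterable[str],
    already_sent: set[str],
    project_id: str,
    task_id: str,
    bucket: str,
    send: Callable[[str, str], None],
) -> list[str]:
    """Submit only unseen keys, then remember them so the next run skips them."""
    new_keys = select_new_keys(bucket_keys, already_sent)
    if new_keys:
        payload = build_import_payload(project_id, task_id, bucket, new_keys)
        send(IMPORT_URL, json.dumps(payload))
        already_sent.update(new_keys)
    return new_keys


# Usage: a stand-in sender that records requests instead of calling a real API.
requests_made: list[tuple[str, str]] = []
sent_keys: set[str] = set()


def record(url: str, body: str) -> None:
    requests_made.append((url, body))


tuesday_keys = ["scans/001.png", "scans/002.png"]
thursday_keys = tuesday_keys + ["scans/003.png"]

sync_new_data(tuesday_keys, sent_keys, "proj-1", "task-1", "my-bucket", record)
sync_new_data(thursday_keys, sent_keys, "proj-1", "task-1", "my-bucket", record)
```

In a real pipeline, the list of keys would come from your S3 bucket listing and `send` would be an authenticated HTTP call. Tracking what has already been sent is what lets the job run on a schedule without importing the same file twice.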

Quickly and easily set up annotation tasks 

Teams often want multiple types of annotations throughout the model development life cycle. Perhaps they’re building a model that leverages multiple data types. Or, as their model improves and their training initiatives become more targeted, they may want additional annotations on the same dataset.

Instead of opening the Centaur Labs platform and manually creating a new task, you can now use our API to create the task right from where you’re already working. You can specify the task type (classification or segmentation), write the task prompt and, if needed, a set of answer choices to share with the labeling network, upload Gold Standard examples, and assign unlabeled data to the task.
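As a rough sketch of what defining a task in code might look like, the Python example below collects the pieces named above (task type, prompt, optional answer choices, Gold Standard examples, and unlabeled data) into one object and validates it before it would be sent. The class name, field names, and payload keys are hypothetical, and the rule that classification tasks need at least two answer choices is our assumption for illustration. The API reference defines the actual request format.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# The two task types named in this post.
TASK_TYPES = {"classification", "segmentation"}


@dataclass
class TaskSpec:
    """Hypothetical container for the settings of a new annotation task."""

    task_type: str
    prompt: str
    answer_choices: list[str] = field(default_factory=list)
    gold_standard_ids: list[str] = field(default_factory=list)
    data_ids: list[str] = field(default_factory=list)

    def to_payload(self) -> dict:
        """Validate the spec and return an assumed JSON-ready request body."""
        if self.task_type not in TASK_TYPES:
            raise ValueError(f"Unsupported task type: {self.task_type!r}")
        if not self.prompt.strip():
            raise ValueError("A task prompt is required")
        # Assumption: classification tasks need at least two choices to pick from.
        if self.task_type == "classification" and len(self.answer_choices) < 2:
            raise ValueError("Classification tasks need at least two answer choices")

        payload = {
            "type": self.task_type,
            "prompt": self.prompt,
            "gold_standards": list(self.gold_standard_ids),
            "data": list(self.data_ids),
        }
        # Answer choices are optional, so only include them when provided.
        if self.answer_choices:
            payload["answer_choices"] = list(self.answer_choices)
        return payload


# Usage: build the request body for a classification task.
fracture_task = TaskSpec(
    task_type="classification",
    prompt="Is a fracture visible in this X-ray?",
    answer_choices=["Yes", "No"],
    gold_standard_ids=["gs-1", "gs-2"],
    data_ids=["img-1", "img-2", "img-3"],
)
fracture_payload = fracture_task.to_payload()
```

Validating on the client side catches mistakes such as a misspelled task type before any request leaves your environment, which is useful when tasks are created from scripts rather than by hand.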

Clear and simple documentation

An API is only as good as its documentation. From the start, we’ve built our APIs with robust documentation, allowing for:

  • Centralized documentation. It is the single source of truth for how the Centaur Labs API behaves. It also lives within our broader documentation so you can easily access other ‘how to’ guides and read about Centaur Labs concepts.
  • Developer-friendly tools. We provide a full API reference as well as how-to guides containing static code snippets.
  • Connection with the community. If you have a comment or think there may be an error, suggest an edit or post in our forum. We are building these tools for you, but also with you, and we want to hear your feedback.

Learn more

Do you want to automate your data pipeline so you can iterate on your models faster? Integrate Centaur Labs data annotation capabilities into your AI development life cycle more easily with our new APIs.

Our new APIs are generally available today to all customers globally.

Schedule a demo to learn more!
