Investing in Data-Centric AI: Kern AI's Unique Approach to NLP Models

Why we invested in Kern AI

The AI startup ecosystem has received a significant boost from recent announcements by Microsoft, OpenAI, Google and others. Amid a blend of overhype and genuine potential, the natural language processing (NLP) landscape in particular has advanced considerably and gained significant attention from investors, the media and the general public. In contrast to conventional software, which relies solely on code, AI systems are driven by a combination of code and data. Upon closer inspection, it becomes evident that the code part of this combination is progressively becoming commoditized. This shift has prompted developers looking for ways to improve their AI systems to redirect their focus toward enhancing the quality of the data used to train AI models instead of (only) iterating on the code.

So, what drives us to make an investment in this space?

At xdeck, we are on the lookout for founders who are “in love with the problem, not the solution.” We also look for unique products or services that effectively address a genuine pain point of the user groups they are designed for. That is why we first worked with Kern AI (then known as onetask.ai) as a member of our Sailors Batch in 2021 (xdeck batch #3), long before they offered open-source developer tools within the data-centric AI domain. Curious about the reason behind our investment in Johannes Hötter, Henrik Wenck, and the rest of the team? Keep reading to learn more!

What is Kern AI, and which problem are they solving?

Kern AI offers a platform for developers who follow a data-centric approach when designing NLP models. Their open-core solution supports a developer’s workflow by focusing on the underlying data, enabling data management and tracking, (semi-)automated data labeling and annotation, as well as task orchestration for collaboration, either with other developers or with (external) data and labeling providers.

A special characteristic of the Kern AI platform is its modular design.

refinery

the heart of the platform

The foundational layer of the platform is refinery, which is both a database and an application logic editor. refinery makes it possible to automate the process of data cleaning and labeling and shows where data quality can be improved. It also allows teams to easily work together with either in-house or external annotators.

bricks

the fuel of the platform

bricks is a collection of modular, open-source code snippets to enrich texts. Through its integration with refinery, developers can pick and choose enrichments like profanity detection, translations, address extraction or sentiment detection from the content library, modify them as needed, and run them as they see fit for their projects.

gates

enabling real-time processing

gates processes real-time data streams so that predictions can be made on incoming data immediately. Users can thus not only create training data, but also generate predictions to make operational decisions on the data they are processing.

workflow

the orchestrator of the platform

workflow is the orchestration tool that puts refinery into action. It can be used to chain extraction, transformation, and loading (ETL) tasks. With an initial set of integrations available, users can, for example, load textual data from spreadsheets or inboxes and automate full workflows that understand natural language.

Since their open-source launch in Q3/2022, refinery and bricks have already reached several thousand developers.

Why we invested in Kern AI

When we met the founding team back in 2021 while scouting for our accelerator batch #3, we were impressed by their vision to revolutionize AI. Having previously delivered several consultancy projects in this domain, we were strongly convinced that Johannes and Henrik were onto something when they showed us their initial version of a no-code platform to democratize access to AI. What truly set the two apart, however, was the resilience they showed when they had to pivot and basically start from scratch. Having received strong signals from customers and experts alike that their solution at the time would not work, they were quick to pivot from the initial platform they had built towards a far more specialized use case around data annotation and towards the much narrower target group of developers. Being developers themselves, Johannes and Henrik are always at the forefront of customer and user research and community building. The two of them never cut corners, making sure that they deliver superior quality to developers in every product-, business-, or user-experience-related decision.

This developer obsession is also reflected in the way they build their product. The open-core version of refinery gives single users access to the most important features, and their open-source library bricks gives access to the heuristics and enrichment code snippets they use themselves. These tools are clearly scratching an itch for developers, as the sheer engagement of the open-source community shows. In the first enterprise applications, users are already seeing significant efficiency gains, automating close to 85% of labeling while at the same time significantly increasing data quality.

Having worked with a great team on a great product in a great market, it only made sense for us to deepen our collaboration with Kern AI when Johannes and Henrik raised their seed round.

“We were fortunate enough to work with the xdeck team prior to the investment. It was clear to us that the team has a very founder-centric approach and that we would receive first-class support as a portfolio company. This is exactly what happened and we are very happy about it!” - Johannes Hötter, Kern AI

We at xdeck are proud of how far Kern AI has come and are happy to be a constant supporter of the whole team.