Tictag is a Singapore-based data collection and annotation startup, with a unique business model built on crowdsourcing and gamifying the data annotation process. The company and its product are democratising data labelling and cleaning for all businesses leveraging AI.
In this memo, we will share insight on how we gained conviction to invest in the ambitious team as part of Tictag’s US$1.3m pre-series A round. We are pleased to be investing alongside Southeast Asia early-stage investors M Venture Partners, East Ventures and Sam Gibb, partner at Resolution Ventures and Endeavor Ventures.
We are excited to announce our newest investment in Tictag, and its exceptional founding team of CEO Kevin Quah, COO Lee Jin and CTO Yihang!
You’re most likely familiar with Google’s reCAPTCHA, a version of the CAPTCHA, or as it’s more lovingly called, the “Completely Automated Public Turing test to tell Computers and Humans Apart.”
The CAPTCHA’s definition is quite literal. The tool’s express purpose is to detect malicious behaviour or fraudulent activity.
But there is another, not so obvious, intent. Have you wondered why it is always to do with traffic, and specifically on identifying which boxes have traffic lights, motorcycles, street signs, etc.?
The answer: Waymo, Google’s self-driving car. Data from the reCAPTCHA is funnelled to enrich the Waymo ML object identification algorithm. This is the risk analysis engine that drives (excuse the pun) the “brain” behind Waymo cars for obvious safety requirements.
Google is harnessing our collective knowledge, that of the billions of people who use their products, to “tag” objects and give them an identity. This is the epitome of crowdsourcing, and is an opportunity only available to a few companies, due to the necessary scale. These companies, including Google, have the reach and the resources to provide services to their massive user bases, and can extract benefits, without much friction, to clean and prepare data for internal projects.
More recently, researchers have turned to the Zooniverse for help. Here, scientists have a portal to crowdsource data cleaning for niche research projects. It is one thing for a researcher to do one image stack but another to do a thousand (as someone who used to be in a lab where segmentation was a common occurrence, this is a real bottleneck and one that crowdsourcing could help solve). This appeal is a philanthropic one: it relies on volunteers who have a personal affinity for advances in medicine, biology, materials science, agriculture, zoology, space, etc.
But what about other businesses or projects that have massive datasets but lack the capital, human resources, or philanthropic lens to crowdsource data cleaning? Enter Tictag.
Tictag is a platform that enables businesses to upload datasets for the “tagging community” to help in data tagging and identification. The purpose of Tictag is to lower the barriers to crowdsourcing, making it accessible to any Machine Learning (ML) business or project that deals with large volumes of untagged data.
We gained conviction to invest in Tictag from our excitement about the following: the market opportunity, the platform and engine itself, and the founders’ enthusiasm matched with their experience.
“No code, plug-and-play” tools are becoming increasingly common in the ML space. As such, the demand for data sets and AI models from businesses looking to optimise their processes will continue to increase. That said, not all AI models will be immediately beneficial because the accuracy of AI models is limited. Models are only as accurate as the quality of their data allows (even more so than the quantity of data). The bottom line is that the cleaner the data, the more accurate the model.
The purpose of data preparation is to get the data as clean as possible before feeding it into the model. By crowdsourcing data preparation through a community of taggers, Tictag gives businesses access to good clean data for more accurate AI models.
In this way, Tictag expands access to clean data, and by doing so expands the market of businesses able to leverage the power of AI/ML.
There are many types of data. Everyday consumers mostly interact with data at the multimedia level: text/numbers, graphics, video, audio, animation, etc. When we think of data preparation for AI models, we think of the tedious manipulation of tables and databases required for computers to parse and interpret that data.
Tictag have developed a platform that enables businesses to design and map out that preparation and cleaning journey in a step-by-step process. The problem is broken into bite-sized chunks appropriate to what each step achieves along the data’s cleaning journey.
What got us excited was that these bite-sized chunks are then given a front-end that appears to gamify data cleaning. Essentially, this turns what was a messy bottleneck into a fun and enjoyable activity for many.
Here is an example of a use case:
CompanyX is a financial services company that has bulk datasets of unstructured data taken in through its voice chatbot. The company would like to run the datasets through an ML model to optimise the chatbot’s responses to various customer requests and responses.
Through Tictag, CompanyX designs a cascading data preparation method to first improve the accuracy of audio transcription to text, and then second to clean the syntax. The data is grouped according to the type of question, e.g., a frequently asked question regarding one of its online banking services.
Taggers are presented with two activities with multiple similar tasks. By breaking down tasks into 5–7-minute activities, taggers can complete these data cleaning tasks in their own time and at their own pace. In the first activity, they are asked to check that audio transcriptions are correct word-for-word, and if not, to suggest corrections. In the second activity, the audio transcriptions are checked and corrected for the accuracy of their punctuation placement.
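To make the cascading idea concrete, here is a minimal Python sketch of how two tagging activities might be chained and the results grouped by question type. It is an illustration only, not Tictag’s actual implementation: the record fields (`id`, `question_type`, `transcript`), the function names, and the simple-majority rule for reconciling taggers’ answers are all assumptions made for this example. For brevity, the punctuation answers are passed in up front; in practice they would be collected after the first activity has finished.

```python
from collections import Counter, defaultdict


def consensus(answers):
    """Pick the answer most taggers agreed on (assumed rule: simple majority)."""
    return Counter(answers).most_common(1)[0][0]


def run_cascade(records, word_votes, punctuation_votes):
    """Apply two tagging activities in sequence, then group by question type.

    records: list of dicts with hypothetical keys "id", "question_type", "transcript".
    word_votes / punctuation_votes: dict mapping record id -> list of tagger answers.
    """
    grouped = defaultdict(list)
    for record in records:
        rid = record["id"]
        # Activity 1: word-for-word check; keep the machine transcript if no tagger answered.
        words_fixed = consensus(word_votes.get(rid) or [record["transcript"]])
        # Activity 2: punctuation check, applied to the output of activity 1.
        cleaned = consensus(punctuation_votes.get(rid) or [words_fixed])
        grouped[record["question_type"]].append(cleaned)
    return dict(grouped)


# Hypothetical sample data for a single chatbot utterance.
records = [
    {"id": 1, "question_type": "online_banking_faq", "transcript": "how do i reset my pass word"},
]
word_votes = {
    1: [
        "how do i reset my password",
        "how do i reset my password",
        "how do i reset my pass word",
    ]
}
punctuation_votes = {1: ["How do I reset my password?", "How do I reset my password?"]}

result = run_cascade(records, word_votes, punctuation_votes)
print(result)  # {'online_banking_faq': ['How do I reset my password?']}
```

Each stage only sees the output of the one before it, which is what keeps the individual tasks small enough to fit into a short activity.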
By completing the above activities, taggers earn badges and coins that can be used to redeem exclusive rewards and benefits. They are also paid a sizeable portion of the cost to the business for helping businesses clean their data.
From a social perspective, this opens a new income stream for those who are time poor, for those who are unable to work more “traditional” jobs due to disabilities, or those who just have spare time on their hands. Their social impact potential has not gone unrecognised; Tictag is an accredited Business for Good Social Enterprise Member at raiSE (Singapore Centre for Social Enterprise). They are currently partnering with Autism Association (Singapore), Singtel Future Makers and persons with disabilities through their Sustainability Team and outreach programs.
It goes without saying that the success of an early-stage business is strongly determined by the calibre of the founding team. Co-founders Kevin, Jin, and Yihang met while working at a prominent B2B AI startup in Singapore and have a great relationship and dynamic. That startup went on to raise a US$25m Series B in 2019 and now employs over 125 staff. In addition, as early employees, all the founders were considered a part of its core founding team.
This showed us three things: first, that the founders have domain experience in the AI space; second, that the founding team have worked together in a fast-moving and high-growth environment; and third, that the founders have experience creating a business that has found product-market fit.
In addition, Kevin, Jin, and Yihang complement each other’s skillsets in business development, project management and software engineering, respectively. We see them as an enthusiastic, ambitious, and coachable team.
The trifecta generated by the product, the business model, and the team grounded our decision to invest in Tictag. We are excited to partner with Kevin, Jin, and Yihang as they continue to grow Tictag and democratise data cleaning by leveraging the community at large.
Learn more about Tictag, and view open roles on their team.