OLOG: licensed datasets for machine learning

OLOG Mission

Robust, high-quality datasets are necessary for the success of artificial intelligence systems. So far, the recent wave of transformer-based models has thrived on data scraped from the internet.

Now, with legal copyright questions coming to a head and web scraping reaching its limits, it is time to find an alternative approach to acquiring and provisioning training data for artificial intelligence models. Anthology is a protocol that collects, catalogs, and licenses training data.

Through the platform, humans submit data and earn rewards paid in the $OLOG token, the native currency of the Anthology ecosystem. Datasets are licensed by burning $OLOG tokens, which creates non-fungible, time-restricted training agreements. Data is stored in a hybrid system combining fault-tolerant peer-to-peer and centralized file storage.

The Chairman: Architecture

The Founder: Technical Design

The Operator: Designer

Anthology

The name OLOG comes from the core of AnthOLOGy. Anthology is an incentives-alignment protocol designed to create high-quality datasets for the proliferation of artificial intelligence. Through a cohesive token ecosystem, Anthology incentivizes humans to generate, filter, and quality-control datasets. It also establishes a clear line of provenance for data creation, supporting sustainable licensing and future legal precedent.

Data Challenges

Data is one half of the scaling relationship: model performance improves predictably as a function of compute and data (Kaplan et al.). Despite this, large-scale human-feedback datasets remain scarce; AI firms scrape and host proprietary data and provide little to no transparency about its origin. Without vast, high-quality data, even models with substantial compute and pre-training will be insufficient for the next generation of AI systems. Fixed training datasets raise empirical challenges of their own: bias, hallucinations, and sometimes critical security flaws arise because modern pre-training datasets are larger than any human could reasonably parse. Generative models pose a further reflexive training problem: without a clear understanding of data provenance, it is impossible to tell whether data has interacted with AI systems or comes from human sources.

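The scaling relationship cited above can be made concrete with a small sketch. This uses a simplified additive form of a Kaplan-style power law; the exponents and constants below are rough magnitudes in the spirit of Kaplan et al., not the paper's exact fitted values, and the function name is invented for illustration.

```python
def scaling_loss(n_params: float, n_tokens: float,
                 n_c: float = 8.8e13, alpha_n: float = 0.076,
                 d_c: float = 5.4e13, alpha_d: float = 0.095) -> float:
    """Approximate training loss as a function of model size N (parameters)
    and dataset size D (tokens), in a simplified additive power-law form.
    Constants are illustrative magnitudes, not fitted values."""
    model_limited = (n_c / n_params) ** alpha_n  # shrinks as N grows
    data_limited = (d_c / n_tokens) ** alpha_d   # shrinks as D grows
    return model_limited + data_limited

# At fixed model size, more data lowers the data-limited loss term:
loss_small_data = scaling_loss(n_params=1e9, n_tokens=1e10)
loss_big_data = scaling_loss(n_params=1e9, n_tokens=1e12)
```

The point for Anthology is the second term: once compute and parameters are abundant, the data-limited term dominates, and only more (and better) data can push loss down further.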
03

Journal

News, announcements, and plans from the OLOG Foundation.

04

Contact

OLOG is just beginning. If you would like to join, here is how.

Phone:
