Source: Zephyr Net

4chan: 4chan and other web sewers scraped up into Google's mega-library for training ML

Problematic, racist, and pornographic web content is seemingly being used to train Google's large language models, despite efforts to filter out that stratum of toxic and harmful text. An investigation by The Washington Post and the Allen Institute for AI analyzed Google's immense public C4 dataset, released for academic research, to get a better understanding [...]

196

Est. Annual Revenue: $5.0-25M
Est. Employees: 25-100
Founder: Christopher Poole
CEO Approval Rating: 86/100

4chan is an image-based bulletin board where users can post comments and share images.