A CBC News investigation has found that at least 2,500 copyrighted books written by more than 1,200 Canadian and Québécois authors were shared online as part of a massive, now-defunct dataset used for artificial intelligence training and research purposes.
The dataset’s existence and general highlights were revealed earlier this year in The Atlantic. The revelation led to an avalanche of writers expressing shock on social media that their work had been included without their permission and sharing concerns that AI tools could use information from the dataset to generate content in their distinct artistic voices.
A CBC News analysis of the dataset,