Microsoft AI head Mustafa Suleyman says anyone can use information on the internet to train their AI models for free

Microsoft AI head Mustafa Suleyman believes that everything available on the internet is free to use for training AI models. During an interview with CNBC, Suleyman downplayed concerns about AI companies accessing intellectual property. He believes that freely copying and using publicly available content has long been the norm online.

He said, “I think that with respect to content that’s already on the open web, the social contract of that content since the ‘90s has been that it is fair use. Anyone can copy it, recreate with it, reproduce with it. That has been ‘freeware,’ if you like, that’s been the understanding.”

He added that unless a publisher or news organisation explicitly asks that its content not be scraped or crawled for anything other than indexing, AI companies can use it to train their models. He said, “There’s a separate category where a website, or a publisher, or a news organization had explicitly said ‘do not scrape or crawl me for any other reason than indexing me so that other people can find this content.’ That’s a grey area, and I think it’s going to work its way through the courts.”

In May this year, eight American newspaper publishers filed a lawsuit against Microsoft and OpenAI, accusing the companies of using their articles in artificial intelligence products without authorisation. The publishers, which operate major newspapers including the New York Daily News and the Chicago Tribune, allege that Microsoft’s Copilot assistant and OpenAI’s ChatGPT have been using their copyrighted articles without permission or payment.

In addition, the Center for Investigative Reporting (CIR), the non-profit organisation behind Mother Jones and Reveal, has filed a lawsuit against Microsoft and OpenAI, alleging unauthorised use of its copyrighted material to train AI models. This legal action follows similar lawsuits filed by The New York Times and other media organisations. CIR accuses OpenAI and Microsoft of scraping its journalistic content without permission or compensation to bolster the capabilities of their AI products.
