NVIDIA has reportedly scraped large amounts of copyrighted content for AI training, according to 404 Media. The company is alleged to have directed employees to download videos from YouTube, Netflix, and other sources to develop commercial AI projects.
These training efforts were aimed at creating models for products such as its Omniverse 3D world generator, self-driving car systems, and “digital human” initiatives. NVIDIA allegedly downloaded the content through virtual machines (VMs) with rotating IP addresses, which let it avoid detection by YouTube and evade bans.
Neither the individual video creators nor YouTube’s owner, Google, consented to this data scraping. Employees who raised ethical and legal concerns were reportedly informed by their managers that the practice had been approved at the highest levels of the company.
“This is an executive decision,” Ming-Yu Liu, a vice president of research at NVIDIA, wrote to a hesitant subordinate on one such occasion, according to Slack messages viewed by 404 Media. “We have an umbrella approval for all of the data.”
In addition to YouTube and Netflix videos, NVIDIA is said to have instructed workers to train its AI model on various other sources. These include movie trailer database MovieNet, internal video game footage libraries, and GitHub video datasets such as WebVid and InternVid-10M, which contains 10 million YouTube video IDs.
Some of the data reportedly used by NVIDIA was licensed only for academic or non-commercial use. For instance, HD-VG-130M, a library of 130 million YouTube videos, carries a license specifying that it is only for academic research. The company allegedly dismissed concerns about these academic-only terms, asserting that its data batches were appropriate for commercial AI products.
In a statement to 404 Media, NVIDIA claimed that its AI training practices fully comply with copyright law. The company likened its practices to a person’s right to “learn facts, ideas, data, or information from another source and use it to create their own expression.”
(Source: 404 Media)