This article is a collaboration between IPSA and alt+Law. alt+Law is a student organisation based in NUS Law, centred around legaltech and the growing intersection between Law and Technology. If this interests you, you can access their webpage at http://altlaw.xyz/about-us
Written By: Shaktivel ARUMUGAM of the IPSA Core Team
Edited By: Denise THIA; YANG Zhuoyan of the IPSA Core Team, YE Yang of alt+Law
Ever chanced upon an interesting thumbnail on YouTube, clicked on it, only to be greeted by a sad-looking red emoji bearing the message “This video is no longer available due to a copyright claim by…”? That is the work of YouTube’s Content ID, an algorithm created specifically to identify potentially infringing works uploaded to the platform.
WHY IS YOUTUBE INVOLVED IN THE DETECTION OF COPYRIGHT INFRINGEMENT?
As an aside, it must first be clarified that copyright infringement is a matter of law that can only be ruled on by the courts. This means that when a third-party platform such as YouTube deems content to infringe existing copyright, what the offending content is guilty of is more accurately described as plagiarism. Briefly, plagiarism is an ethical concept which, according to the Oxford Dictionary, is concerned with the “wrongful appropriation” and “stealing and publication” of another author’s “language, thoughts, ideas or expressions”. On the other hand, copyright infringement is a legal concept that affords protection to original works through a host of remedies and is subject to the available defences in legislation or common law. While plagiarised works can also infringe copyright, this is not always the case.
The copyright-policing efforts of YouTube (and other third-party platforms) can be explained by the Notice and Takedown procedure mandated by the Digital Millennium Copyright Act (‘DMCA’) in the United States. Online service providers are prima facie liable for the dissemination of copyrighted material. However, they are granted legal immunity if they satisfy the conditions of the “safe harbor” regime provided by the DMCA. One of these conditions requires online service providers to “promptly remove or block access to infringing materials after copyright holders give appropriate notice”. Third-party platforms such as YouTube therefore face potential liability to rightsholders for copyright infringement if they fail to expeditiously remove or block access to content after receiving notification from a copyright holder that the content is infringing (citation: US DMCA, s 512(c); Singapore’s Copyright Act s 193D(2)(b)(iii)).
In an era of unprecedented content generation, it is difficult to maintain human oversight over all potential claims of copyright infringement. Therefore, third-party platforms rely on algorithms to sieve out potentially infringing works before they are disseminated throughout the platform. As mentioned earlier, a prime example of this is YouTube’s Content ID.
HOW DOES CONTENT ID WORK?
YouTube states on its website that “[v]ideos uploaded to YouTube are scanned against a database of files that have been submitted to them by content owners [and] [w]hen a match is found, the video gets a Content ID claim”. This means that every video uploaded onto YouTube is automatically scrutinised by the algorithm with minimal human intervention. Once a match is detected, copyright owners have four choices: (1) mute audio that matches their music; (2) block a whole video from being viewed; (3) monetize the video by running ads against it; or (4) track the video’s viewership statistics. On the other hand, the uploader of the offending work (the ‘Offending Party’) is also given options: (1) acknowledge the claim; (2) if the claim is for a piece of music in the video, remove the song without having to edit and re-upload the video; (3) swap out the allegedly infringing song with a free-to-use song; (4) share revenue with the copyright owner; or (5) dispute the claim.
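The scan-and-claim step described above can be illustrated with a short sketch. YouTube does not publish how Content ID actually matches uploads, so everything here is an assumption made for illustration: the toy fingerprint (overlapping runs of samples), the 50% overlap threshold, and the names `Reference`, `fingerprint` and `scan_upload` are all hypothetical. Only the four owner policies come from the description above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class OwnerPolicy(Enum):
    """The four choices open to a copyright owner once a match is found."""
    MUTE = "mute"
    BLOCK = "block"
    MONETIZE = "monetize"
    TRACK = "track"


@dataclass
class Reference:
    """A reference file submitted by a content owner (hypothetical structure)."""
    owner: str
    fingerprints: frozenset
    policy: OwnerPolicy


def fingerprint(samples, window=4):
    """Toy fingerprint: the set of every run of `window` consecutive samples."""
    return frozenset(
        tuple(samples[i:i + window]) for i in range(len(samples) - window + 1)
    )


def scan_upload(samples, database, threshold=0.5) -> Optional[Reference]:
    """Return the first reference that the upload overlaps enough, else None.

    The threshold is an assumed value, not a documented one.
    """
    upload_prints = fingerprint(samples)
    for ref in database:
        if not ref.fingerprints:
            continue
        overlap = len(upload_prints & ref.fingerprints) / len(ref.fingerprints)
        if overlap >= threshold:
            return ref  # in the real system, this is where a claim would attach
    return None


# Usage: an upload that embeds a registered song is matched mechanically,
# with no regard to why the song appears in the video.
song = [1, 2, 3, 4, 5, 6, 7, 8]
database = [Reference("Label A", fingerprint(song), OwnerPolicy.MONETIZE)]
claim = scan_upload([9, 9] + song + [0, 0], database)
print(claim.owner, claim.policy.value)
```

Note that nothing in a purely mechanical comparison like this considers the purpose of the use, which is why the sketch would flag a parody and a verbatim copy alike.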
Should the Offending Party choose to dispute the Content ID claim, copyright owners can in turn respond in the following ways: (1) release the claim; (2) uphold the claim; or (3) take down the video. If the copyright owner elects to take down the video, a copyright strike is given to the Offending Party. Consequently, not only is the video removed from public access, but the Offending Party’s account becomes tarnished and they lose certain privileges, specifically the ability to monetise content. One way an Offending Party can respond if they believe that their video constitutes fair use is via the counter notification mechanism offered by YouTube.
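The dispute stage can be summarised as a small decision table. The sketch below only encodes the outcomes as this article describes them; the names `OwnerResponse`, `Outcome` and `resolve_dispute` are hypothetical, and the real process involves deadlines and further steps that are not modelled here.

```python
from dataclasses import dataclass
from enum import Enum


class OwnerResponse(Enum):
    """The three ways a copyright owner can respond to a disputed claim."""
    RELEASE = "release the claim"
    UPHOLD = "uphold the claim"
    TAKEDOWN = "take down the video"


@dataclass(frozen=True)
class Outcome:
    claim_active: bool         # does the Content ID claim still stand?
    video_removed: bool        # is the video removed from public access?
    copyright_strike: bool     # does the uploader's account receive a strike?
    can_counter_notify: bool   # is a counter notification the next recourse?


def resolve_dispute(response: OwnerResponse) -> Outcome:
    """Map the owner's response onto the consequences described above."""
    if response is OwnerResponse.RELEASE:
        return Outcome(False, False, False, False)
    if response is OwnerResponse.UPHOLD:
        # The claim simply remains in place; no strike is issued at this step.
        return Outcome(True, False, False, False)
    # TAKEDOWN: the video is removed and a copyright strike follows.
    return Outcome(True, True, True, True)


for response in OwnerResponse:
    print(response.value, "->", resolve_dispute(response))
```

Laid out this way, it is easy to see that only the takedown branch carries a penalty for the uploader's account, and that branch is chosen by the copyright owner alone.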
ADVANTAGES OF THIRD-PARTY COPYRIGHT ENFORCEMENT THROUGH ALGORITHMS
A distinct advantage of using artificial intelligence to detect copyright infringement is convenience and efficiency. Given the sheer volume of content uploaded onto the site daily, manually ensuring that each and every video does not violate copyright would be a costly endeavour requiring huge manpower. Keeping costs low is beneficial not only to the enterprise but also to users of the platform, as it ensures that the platform remains free to use.
YouTube’s Content ID is also beneficial because it facilitates content legalization through licensing. Copyright owners are presented with the option to monetise a video instead of taking it down. Accordingly, the Offending Party can continue to generate economic benefits from their content, although a share of it will go to copyright owners. This process is expedited by the automatic matching capabilities of AI (as compared to the rights holder manually identifying a video as infringing and YouTube then brokering a license). This is certainly a mutually beneficial option compared to the traditional take-down remedies offered by the courts.
DISADVANTAGES OF THIRD-PARTY COPYRIGHT ENFORCEMENT THROUGH ALGORITHMS
However, there are disadvantages that should be highlighted.
There are instances where Content ID causes inefficiency and results in content creators losing out on monetized views of their videos. Under the DMCA, a copyright holder must consider the existence of fair use before sending a takedown notification under § 512(c). One area where the technology routinely fails to sieve out what we might deem ‘fair use’ is parodies. The algorithm follows a mechanical process which indiscriminately matches uploaded videos against YouTube’s database, thereby inevitably capturing parodies and satires.
Even though such videos would clearly qualify for fair use under the current regime, they are nevertheless targeted by YouTube’s Content ID, which in turn triggers a long process that a content creator has to go through to monetise their video. This is a clear instance of overenforcement, which has a chilling effect on content creation and diminishes opportunities for meaningful conversations in the relevant creative sphere; an effect which is antithetical to copyright law’s aim of promoting creativity.
The Content ID system is also susceptible to abuse. Although YouTube contractually requires copyright owners to warrant that they own exclusive rights to the reference work before they use the Content ID system, this requirement has not been sufficiently enforced. Consequently, users can exploit this weakness by uploading as much content as possible that they might not legally own and then flagging subsequent videos as belonging to them. The alleged Offending Party will then have little choice but to share the revenue generated with such exploitative users. GoDigital Media Group is a well-known example.
There are also questions about accountability and transparency. Very little is known about how the algorithm flags videos. The criteria used by the AI remain unknown, and this is worsened by the fact that the algorithm is self-learning, which means that even its creators might not have the faintest idea of the workings of Content ID. This introduces an element of arbitrariness which is unjust because it leads to unpredictable yet extreme outcomes. This is especially so since YouTube content generation is increasingly taken up as a sole and full-time job by many individuals. Therefore, the unfair removal of their videos would negatively impact their livelihoods.
So, can we trust these algorithms to regulate copyright infringement, or should they be subject to greater scrutiny? It cannot be denied that such algorithms offer technical advantages such as increasing the ease and speed of detection. Further, by shifting more of the burden of enforcement to online service providers, copyright owners have much more accessible targets for liability and no longer have to grapple with the difficulties associated with suing individual primary infringers, such as cross-border enforcement or legal costs that outweigh the remedies that may be obtained.
In conclusion, the increasing digitisation of content leads to a greater prevalence of detection algorithms. For instance, Spotify is also developing its own anti-plagiarism tool to detect copyright violations. Much closer to our hearts, academic circles have also long accepted the use of the Turnitin anti-plagiarism software, which continues to strike fear among students. Thus, it is clear that such algorithms are here to stay for a long time to come.
 Digital Millennium Copyright Act, 17 U.S.C. § 1201 (2012).
 Accountability in Algorithmic Copyright Enforcement, https://law.stanford.edu/wp-content/uploads/2016/10/Accountability-in-Algorithmic-Copyright-Enforcement.pdf, at 511.
 Patrick McKay, YouTube Copyfraud & Abuse of the Content ID System, FAIRUSETUBE http://fairusetube.org/youtube-copyfraud [https://perma.cc/82TYLKM8].
 GODIGITAL MEDIA GROUP, http://www.godigitalmg.com [https://perma.cc/B36RX8Z7].
IPSA is a student-led interest group and the above article is a reflection of the author’s opinions, and not professional legal advice for any matter. All errors are the author’s own.