We often talk about the possibility of copyright infringement when discussing legal issues surrounding data or text mining. The method for text mining is usually conceptualized as making copies from a database and then mining the copies on a new server. While that is one way to mine, it is certainly not the only way. Another way to text mine is to mine the database itself, instead of copies of the database.
When advising our faculty and researchers on whether they can text mine a database without the explicit permission of the publisher or administrator, we must examine the Computer Fraud and Abuse Act (CFAA) as well as copyright law. If we do not conform to the CFAA’s requirements, our libraries and faculty members may face civil, if not criminal, liability.
Without the consent of the database administrator, mining may violate the CFAA, specifically the hacking section of the act, which is codified at 18 USC 1030. For context, this is one of the statutes under which Aaron Swartz was charged before his death in 2013. The CFAA is both a criminal and a civil statute. This means that the federal government can prosecute charges under it, and private individuals or companies can bring a civil lawsuit for monetary damages. Courts have broadly interpreted 18 USC 1030(a)(4), which applies to whoever:
knowingly and with intent to defraud, accesses a protected computer without authorization, or exceeds authorized access, and by means of such conduct furthers the intended fraud and obtains anything of value, unless the object of the fraud and the thing obtained consists only of the use of the computer and the value of such use is not more than $5,000 in any 1-year period;
Courts have read “exceeds authorized access” to cover text and data mining that violates a site’s Terms of Service, including text mining of a public-facing website. To date, no one has been criminally convicted for text or data mining, although a conviction is possible.
In both the 1st and 2nd Circuits, courts have found that plaintiffs could sue, and win, over text mining under the hacking provisions of 18 USC 1030. In both cases, the defendants had used robots to access public-facing websites and publicly available information. In the 1st Circuit case, EF Cultural Travel BV v. Zefer Corp, Explorica used a robot, or a “scraper,” to crawl EF Cultural Travel’s website so that Explorica could undercut EF Cultural Travel’s already low prices. The robot scraped and downloaded the prices (and only the prices) of the flights from EF Cultural Travel’s website for two years prior to litigation by reviewing the publicly accessible HTML source code. EF Cultural sued Zefer, the company that built the scraper for Explorica, for downloading from its website, and the court granted a preliminary injunction.
In the 2nd Circuit case, Register.com, Inc. v. Verio, Inc., the defendant, Verio, used a robot to submit WHOIS queries to multiple domain name registrars, including the plaintiff, to determine whether new domain names had been purchased. Verio would then solicit the purchasers for website design. Register sued, alleging in part that Verio had violated 18 USC 1030 when the robot crawled the server. The 2nd Circuit affirmed the district court’s injunction against the defendant as the case proceeded to trial.
Preliminary injunctions are not final dispositions, but they do carry a high bar. Courts should grant a preliminary injunction only when the party requesting it has demonstrated a substantial likelihood of success on the merits. As the cases above show, courts have found that civil liability is possible under the CFAA. The CFAA is also a criminal statute, and a court could find criminal liability as well, although no court has. One requirement is that the violator must act “with intent to defraud.” While this may be a bar to ultimate recovery or a defense in a criminal prosecution, the question might have to go to trial for a jury to decide.
Last year, House Bill 113-2454, otherwise known as “Aaron’s Law,” was introduced in the House of Representatives; it would have made violations of terms of service not a violation of the CFAA. However, that bill died in subcommittee during the last Congressional session. Some members of Congress, including Senator Wyden, believe that, without clarification, the CFAA could be read to treat terms of service violations as violations of the statute.
No court has decided whether faculty members, researchers, and libraries would be immune under the CFAA, and nothing in the statute guarantees that they would be. To advise faculty members to “mine away” without mentioning these issues is irresponsible and may land your library and faculty member in court, if not jail. If you have any questions about this issue or any other, please come see us in the Scholarly Communications Office!