Copyright Exceptions and Data Mining

On July 5th, 2018, we had the opportunity to interview two esteemed individuals within the field of copyright: Dr. Jane Secker and Mr. Chris Morrison. Dr. Secker is a senior lecturer in educational development at City, University of London, and Mr. Morrison is the copyright, software licensing and information services policy manager at the University of Kent. Both Dr. Secker and Mr. Morrison sit on the Universities UK / GuildHE Copyright Negotiation and Advisory Committee and are co-founders of the UK Copyright Literacy blog.

To provide some context to their responses, two key elements about European law should be noted. First, on June 1st, 2014, an exception was made to UK copyright law through the implementation of the Copyright and Rights in Performances (Research, Education, Libraries and Archives) Regulations 2014. This included the removal of barriers for Text and Data Mining (TDM) for non-commercial purposes. Second, the European Union (EU) is in the process of modernizing its copyright laws, and recently had a vote on the proposal for a Directive of the European Parliament and of the Council on Copyright in the Digital Single Market, which was rejected and will be revisited in September of 2018.

We asked Dr. Secker and Mr. Morrison the following questions:

  • What are some examples of the kinds of data mining that researchers should legitimately be able to do?
  • What legal barriers stand in the way of this? If you could, tell us about the exception for TDM that’s proposed in Europe.
  • Given the current controversies about data mining by social media companies and political consulting companies, privacy issues have risen to prominence. How would the proposed copyright exception intersect with privacy law and what types of research would not be permitted given European privacy regulation?

According to Mr. Morrison, the right to read should be the right to mine. Dr. Secker reiterated this notion and also stressed the importance of being able to legitimately mine various forms of data, whether it be full text subscription databases, abstracts, digitized collections, social media content, etc.

The legal barriers identified by Mr. Morrison include the various licensing terms, terms and conditions of websites, and differing laws around the world on data mining that are often very complicated even for researchers to fully grasp. Further, many researchers find themselves under pressure from external sources—such as those funding the research—to openly license the data set, which can be troublesome especially if the researcher is working in collaboration with a commercial organization. According to Dr. Secker, TDM has been recognized in UK law since 2014, and it is not something that is currently available as a copyright exception in other European countries. This makes it difficult when working in partnership with others who may not have similar legal restrictions on how they can interact with specific datasets. Also, the barriers are not just legal—they can be technical as well, especially when considering factors such as Digital Rights Management (DRM) protection. This poses a conundrum for copyright—on the one hand, there’s an exception that indicates your ability to engage in TDM, but there are also technical limitations such as DRM or other technical protection measures which may prevent you from obtaining access.

When it comes to research, Dr. Secker reminds us that there are pre-existing ethical codes of practice that researchers must adhere to. Any researcher working in the field of copyright or TDM would have to get ethical clearance before conducting their research. Mr. Morrison also reminds us that intellectual property laws are not implemented for privacy purposes, but to incentivize creativity and investment in information goods. Privacy concerns are a separate issue from copyright and it’s important to keep them separate when addressing them.

See below for a transcript of the interview (transcript has been edited for clarity and readability).

What are some examples of the kinds of data mining that researchers should legitimately be able to do?

Mr. Morrison: Well, I think they should be able to mine legitimately acquired sources of data, specifically subscription databases that academic institutions subscribe to. In our view, and the view of many information professionals, we have paid to get legitimate access, and we should be able to run computational analysis and algorithms on those datasets in order to understand the facts and the underlying patterns within that information source. But also, beyond that, anything that has value to pure research, whether that be science, social science, or even humanities, anything where new knowledge can be created, and new understandings can be created after the information source, that should be something that researchers should be able to do without having to go into a very complex and potentially expensive process of getting additional permissions. In summary, the right to read should be the right to mine.

Dr. Secker: The only thing I would add to this is that the law should cover data in all sorts of formats. It should cover full text subscription databases, but the researcher might be mining abstracts as well, such as the case in large scale systematic reviews, so it should cover abstracts, image data as well, where you’ve got digitized collections. In my previous role as the copyright and digital literacy adviser at the London School of Economics (LSE), we had historical sources that had been digitized and they were mainly image-based, although some of them had been converted to text, but being able to mine all sorts of different data is crucial to researchers, and there was a lot of interest in this from researchers.

What legal barriers stand in the way of this? If you could, tell us about the exception for TDM that’s proposed in Europe.

Mr. Morrison: Well, I think the legal barrier to this from the perspective of the researcher is the numerous licensing terms, terms and conditions, and different laws that for most people are very complicated and worrying. So, the area of research that Jane and I are most interested in is how copyright is perceived and how it’s experienced by those involved in research and education. In our experience, most of them are under a lot of pressure from the many different sources that have funded them to make their research available in certain ways, such as publishing on an open access basis. At the same time, there are ethical concerns that they have to abide by and therefore copyright and associated rights, such as database rights, are just another aspect of a great many things that they have to make sure they get right and it’s something they find hugely complicated. Questions such as what is commercial and what is non-commercial can also become a barrier when they’re working with other partners in what could be regarded as commercial organizations.

Dr. Secker: We’ve had TDM in UK law since 2014 [https://www.gov.uk/government/news/new-exceptions-to-copyright-reflect-digital-age], which obviously, other European countries don’t have at the moment. So, if we might want to work with a partner that is outside the UK, and the fact that this would be harmonized as something across Europe, it would help for those kinds of projects because at the moment, it is only something we’ve had for four years in the UK and there’s still been quite a lot of difficulty getting the message out there that it is something that is permitted. The barriers aren’t necessarily legal; a lot of them are technical, so they could be related to things like DRM. That has caused some problems in examples I know of where, essentially, databases or some kind of web-based source will have some sort of mechanism to stop you from downloading the amount of data that you need to perform TDM and if they use DRM, then you get into quite a difficult situation legally because you can’t circumvent the DRM because that’s illegal to do. So, what takes the precedence? You’ve got an exception that says you’re allowed to do TDM but if you’ve got a DRM on there in some form and you need to apply to have it taken off, you can’t just sort of hack into the system, which would be a way around it. But the kind of issue about Europe I think is significant that, where it’s a project that might be working across more than one country, having that exception only in the UK, I think it’s potentially meant that there haven’t been large-scale projects to look at from a sort of European level yet.

Mr. Morrison: Yes, and also to add at the European level that question about DRM or Technical Protection Measures (TPM): we’re obviously part of a process and there’s been some developments today on what’s happening with that final vote that’s going to the vote in September [https://www.bbc.com/news/technology-44712475]. But there are potential worrying provisions in there around fixing that situation with the TPM in law so that there is no way to kind of get around that at all even at a local level. Jane has had the experience of referring a potential TDM example to the UK Intellectual Property Office because we wanted to remove the TPM, and that’s possibly going to be changed at the European level which would make that impossible to do. Also, the European proposal which is to limit it to research institutions only could be problematic where we are working, as I mentioned earlier on, in partnership with other organizations, that will potentially limit what researchers can do.

Given the current controversies about data mining by social media companies and political consulting companies, privacy issues have risen to prominence. How would the proposed copyright exception intersect with privacy law and what types of research would not be permitted given European privacy regulation?

Dr. Secker: This is an interesting question. I think in terms of social media data for example, I’ve run into a number of situations about using social media in research, how to sort of harvest data out of Facebook and Twitter particularly. There’s a lot of interest from researchers in doing new types of research and I think one of the things to remember is that there are ethical codes of practice that already exist. So, the Association of Internet Researchers have a strict code of conduct if you’re doing this type of research where privacy and the use of personal data is really clearly considered. I had a number of examples where people would come, often Ph.D. students, where they might have harvested data out of blogs or from social media and a lot of this came down to informed consent and what that means when you are taking data that somebody’s put out on the web. It doesn’t mean it’s fair game to do what you want with it. Obviously, there are huge concerns at the moment with changes to data protection, that privacy should somehow trump copyright and become the kind of thing that we always have to be mindful of. But, I think for any researcher that’s working in this space, they would be getting ethical clearance and I think privacy would be a massive concern. I think if you’re doing a project that involves a very sensitive area, perhaps you’re using a hashtag exposing people’s identity and things that they say as individuals; that’s just kind of unethical from the start really.

Mr. Morrison: Yes, I think when having conversations with people about how to overcome the potential barriers that intellectual property laws provide, the conversation often turns towards privacy, and people will say well, does copyright stop me from doing this in order to protect people’s privacy? I think we’re very clear that intellectual property laws are not there for privacy purposes; they are there to incentivize creativity or the investment in information goods, and the recent General Data Protection Regulation (GDPR) does create a challenge for researchers using TDM. For example, if they decide they have lawful access to an information source which involves lots of personal data, they would be allowed to do that under copyright law or database rights and the TDM provisions certainly in the UK, but they wouldn’t necessarily have permission to use that personal data for a secondary purpose, for example, to provide their dataset to somebody else to then go and look at and draw their own conclusions, because that original data subject would only have given their permission for it to be used by the original service, the original party that had taken it. So, researchers have this issue, but in a way that’s a separate issue from copyright and it’s quite important I think to keep those separate when addressing them.

Dr. Secker: But I think it is about looking at the data while getting ethical clearance. Just because you’re not talking to individuals and interviewing them or getting the data from a questionnaire, because you might be doing some sort of large scale mining of something like Twitter, it doesn’t mean that those people’s identities are fair game to be sort of reproduced completely un-anonymized. But for people that do social research, I think if they’ve moved into this space and they haven’t done research using these types of sources before, it’s something you can cover in research training and that was certainly what we were trying to do in my previous role. We ran a couple of really successful workshops where we got them to understand what the legal issues were, but really importantly what the ethical issues were with using that type of data.

Recap of the 35th session of the IGC

Rice field in Madagascar (Photo: UN Photo/Lucien Rajaonina).

From March 19th to 23rd, 2018, the World Intellectual Property Organization’s (WIPO) IGC met for its 35th session in Geneva, Switzerland. The Draft Agenda for the session outlines the tentative topics of discussion, such as an update on the operation of the Voluntary Fund pertaining to the participation of Indigenous peoples and local communities, a summary of which can be found here. Also on the draft agenda for discussion are reports, recommendations, and proposals pertaining to genetic resources, which can also be found in the summary of documents.

In regard to the Voluntary Fund, the Information Note on Contributions and Applications for Support provides an exhaustive list of the voluntary contributions paid to the fund by nations, the amount of resources available, and the list of persons who were recommended for funding as of January 26th, 2018, as well as those who are seeking support to attend the IGC’s 36th session. Here, it is worth noting that Canada proposed in its 2018 Federal Budget Plan that it will be allocating an investment of $1 million over the span of five years to allow Indigenous peoples to attend and participate in WIPO meetings pertaining to traditional knowledge and cultural expressions as a way of promoting intellectual property rights amongst its Indigenous communities.

During the course of the meetings, new proposals were submitted, and while some nations accepted them, others resisted. For instance, a revised proposal for a potential treaty preventing the misappropriation of genetic resources received much resistance from developing countries on the grounds that the U.S. introduced new issues that week that were not mentioned in the previous version (Saez). In response, a second version was created for consideration by member states. The committee chair also put forward a proposal indicating the need to create an expert group to prevent the misappropriation of genetic resources prior to the next session of the IGC, and this proposal was positively received (Saez).

Interview with Ms. Teresa Hackett

Last week we had the pleasure of interviewing Ms. Teresa Hackett, Copyright and Libraries Programme Manager at Electronic Information for Libraries (EIFL), which works with libraries to enable access to knowledge in developing and transition economy countries in Europe, Africa, Asia Pacific, and Latin America. The Copyright and Libraries programme aims to build the capacity of librarians in copyright issues, develop useful resources, and advocate for national and international copyright law reform.

We asked the following questions:

  • What are the three biggest problems for international copyright that you hope WIPO’s work can address?
  • Is the Standing Committee on Copyright and Related Rights (SCCR) making progress in solving those problems?
  • What hurdles do you see in the SCCR’s work toward solving those problems?

The three biggest problems identified by Ms. Hackett were inequalities between nations on the right to legally access and use information for education, research, and personal development; barriers to cross-border access and use of information; and the replacement of copyright law with licenses for electronic resources. She stated that the SCCR is making progress addressing these problems, albeit quite slowly, which is often the case in international law. Currently the focus is on the important issue of agreeing on a workplan for the next biennium to set out a roadmap for the topics. As far as hurdles go, she indicated that there is some contention between developing and developed countries as to what the solution should be: developing countries want a solution that’s international and binding, such as an international treaty, while developed countries do not see a need for an international solution.

See below for a transcript of the interview.

What are the three biggest problems for international copyright that you hope WIPO’s work can address?

First, the biggest problem is inequalities between nations on the right to legally access and use information for education, research, personal development, and so on, in particular for digital information. So, it’s inequalities: a lack of equality between nations in their copyright laws. There’s a big divergence around the world in copyright laws as to what libraries are and are not allowed to do for their activities.

The second problem is that there are barriers to cross-border access and use of information. That’s due to the territorial nature of copyright. As you know, the Internet is global, and information needs don’t stop at the border. But copyright laws often prevent libraries from sharing or providing information services across borders. In fact, because that’s an international problem, only an international organization like WIPO has the scope and the mandate to properly address it.

The third problem is that copyright law is being, to a large degree, supplanted or replaced by licenses for electronic resources. These licenses often take away user rights that are set out in the copyright law. We view that as kind of undermining copyright laws, so we would like to see some way to protect the limitations and exceptions that are set out in copyright law in the licenses so that in the future copyright law still has a very strong place in how we access and use information.

Would you say that the SCCR is making progress in solving these problems? 

I would say yes overall. The Committee adopted a list of eleven topics for discussion which were debated over two years in the Committee. So, we had a list of eleven topics related to library and archive activities, such as preservation, right of reproduction, legal deposit, lending, parallel importation, cross-border uses, orphan works, TPMs, contracts, liability, and translation as well.

The resulting document was known as the ‘Chair’s chart’. Then the Chair proposed to reduce the eleven topics to nine and in fact, he also took out another two sub-topics. A suggested approach was made on seven topics, with further discussion needed on two topics (contracts and translation).

So, that phase of the work has been completed, and under the guidance of a new Committee Chair, the Committee is discussing a workplan for the next biennium, so for 2018 to 2019 when we hope we will be able to make further progress on the topics and to look at what the possible solutions might be.

So, we are making progress, but the progress is quite slow, as is the case in international law making. But I believe progress is being made.

You already said that one of the biggest hurdles would be the speed at which the changes would occur. Aside from that, what other hurdles do you see in the SCCR’s work moving forward to address these issues?

Well I think it’s fair to say that all member states support the work of libraries, understand the value of libraries, and how libraries contribute to providing access to information and knowledge. Libraries contribute, for example, by preserving the memory of the world, providing access to our cultural and linguistic heritage, and supporting learning, education and research.

The problem is finding an agreement on a solution to the problems that the library and the archive community are presenting to the Committee and the member states. You could say that there is a split between industrialized countries and developing countries. Developing countries want a solution that’s binding and effective—likely along the lines of an international treaty or other binding international instrument—whereas the industrialized countries don’t see the need for an international solution at all. They believe that all the problems can be resolved at a national level and they only want to discuss best practices and national experiences. So, we have a difference of opinion and to some extent, an impasse as to what the solution should be.

Therefore, the biggest hurdle is really lack of political support from industrialized countries even though some of those same countries are themselves going through copyright reform processes. We hope that maybe when they have completed their own copyright reforms, they might be more ready to engage in discussions on what the solutions might be at the international level, not just at their own national level or regional level, as in the case of the European Union.

Recap of the 35th session of the Standing Committee on Copyright and Related Rights (SCCR)

From November 13-17, 2017, the World Intellectual Property Organization’s (WIPO) SCCR met for its 35th session in Geneva, Switzerland. The Draft Agenda for the session outlines the various topics and objectives that were to be discussed at these meetings. The most pressing and longstanding of these topics were the protection of broadcasting organizations, the limitations and exceptions for libraries and archives, and the limitations and exceptions for educational and research institutions and for persons with other disabilities.

The Draft Action Plans on Limitations and Exceptions for the 2018-19 Biennium, which were also presented at the session, outlined the list of limitations and exceptions that were to be made for the selected actors. As stated by Teresa Hackett, the EIFL Copyright and Libraries Programme Manager, an “action plan is important to give the Committee direction on its future work, as well as helping library groups prepare for their work ahead” (Hackett). Yet, despite widespread acknowledgment by the members of the progress shown through the draft action plans presented by the secretariat, the plans were not formally adopted. Instead, they will be revised and presented at the SCCR’s 36th session in April of 2018.

There were many studies presented at the 35th session which outlined outstanding problems on copyright in the digital age. One worth noting is the study on Copyright Limitations and Exceptions for Libraries and Archives presented by Professor Kenneth Crews, an attorney, indicating that “a number of countries have revised their copyright laws and the exceptions they provide to libraries and archives … fewer countries have no exception, and fewer countries are relying on general exception” (Saez). A Proposal to Advance Discussions was prepared by the Delegations of Argentina, Brazil and Chile, outlining a number of exceptions that “should not conflict with a normal exploitation of the programme-carrying signal and not unreasonably prejudice the legitimate interests of broadcasters and cablecasters” (Saez). Lastly, a Study and Additional Analysis of Study on Copyright, Limitations and Exceptions for Educational Activities was presented by Daniel Seng, a law professor at the National University of Singapore, which examined “WIPO member states’ legislation as of August 2017 … to understand whether and how member states relied on the existing exceptions and limitations in the Berne Convention for the Protection of Literary and Artistic Works to construct their own limitations and exceptions in their national laws” (Saez).

So, what can we expect at the 36th session of the SCCR? According to the Draft Action Plans on Limitations and Exceptions for the 2018-19 Biennium, the five categories of limitations and exceptions remain intact: libraries, archives, museums, educational and research institutions, and persons with other disabilities (Balasubramaniam). The work plan suggests studies, brainstorming exercises, seminars, and conferences to take place in the upcoming year; however, no agreement has been reached between countries on whether or not WIPO will establish new international rules in the aforementioned areas.

Upcoming agenda of this week’s CDIP meeting

This week the World Intellectual Property Organization (WIPO) Committee on Development and Intellectual Property (CDIP) will be meeting in Geneva, Switzerland for its 20th session from November 27 to December 1, 2017. Among the topics expected to be discussed at this meeting are the appointment of an Acting Vice-Chair; adoption of the Draft Report of the Nineteenth Session of the CDIP; all Development Agenda Recommendations; WIPO Technical Assistance in the Area of Cooperation for Development; work programs for implementation of adopted recommendations; IP and Development; and future work of the Committee.

The Committee will discuss: Progress Reports, Measures Undertaken to Disseminate the Information Contained in the Database on Flexibilities, Contribution of the Relevant WIPO Bodies to the Implementation of the Respective Development Agenda Recommendations, Roadmap on Promoting the Usage of the Web Forum Established under the “Project on Intellectual Property and Technology Transfer: Common Challenges-Building Solutions”, Promotion of WIPO Activities and Resources Related to Technology Transfer, and Mapping of International Fora and Conferences with Initiatives and Activities on Technology Transfer.

In regard to the WIPO Technical Assistance in the Area of Cooperation for Development, the Committee will discuss the Report on the Roundtable on Technical Assistance and Capacity Building: Sharing Experiences, Tools and Methodologies, which found that 76% of the 33 participants who responded to the questionnaire were satisfied with the Roundtable and 64% found it useful.

For a complete list of the work programs for implementation of adopted recommendations please consult the Draft Agenda.