Copyright in relation to GenAI inputs and outputs is an evolving field, and the rules may vary across jurisdictions.
An important factor to be aware of is that the training materials for these tools can contain copyrighted or proprietary material for which proper copyright clearance or attribution has not been obtained, which makes using these tools difficult to reconcile with respecting creators' rights.
There are, however, some general rules that will help you to use GenAI in a compliant and responsible way:
Generally, you, the user, are responsible for obtaining the appropriate license or copyright clearance for any third-party material that you input into a GenAI tool.
There are some common forms of knowledge that university researchers often use for which copyright clearance is difficult to obtain, or is not yours to obtain, and which you should never input into any GenAI tool.
These include:
Because GenAI tools are trained (and continue to develop) with no regard for the copyright and IP of authors and artists, any output you obtain from them, whether for your grant application, figures, literature synthesis, and so on, may contain copyright-protected material for which no permission or license to use has been sought. In other words, it is impossible to know definitively whether your AI tool is giving you pirated material.
Participant and Sensitive Data
You likely already know that GenAI tools are not secure or compliant repositories for data, and therefore:
Never input confidential or sensitive data into a GenAI tool, especially data generated by human participants. Doing so would contravene your research ethics agreement, state and federal privacy acts, and the code of responsible research, to name a few.
Your Privacy
A lesser-known issue, however, is that LLM-based AIs can accurately infer your private information from your text prompts, even when you have consciously anonymised your inputs or prompted something completely unrelated to your identifying data.
LLMs are trained on geolocated census data, among other large datasets, which helps them develop these identifying capacities. (WIRED, 17 Oct 2023)
Because LLMs are not secure or compliant data repositories, the private information they infer from your prompts can be exposed in various ways.
If you are curious, researchers from the Secure, Reliable, and Intelligent Systems Lab (SRILAB) in Zurich have developed this tool to test your privacy-inference skills against current LLMs' capabilities.
Personal data is subject to data privacy legislation.
Legal requirements on privacy vary by jurisdiction. If your research involves other jurisdictions, you will need to check the relevant requirements.