AI Usage Best Practices
As the tech industry continues to invest in AI tools such as large language models, their presence on the CNU campus becomes increasingly likely for students and faculty alike. In particular, with companies as large as Microsoft building job opportunities around the use of AI, knowing how to navigate that use in a safe and ethical manner is increasingly important. With this in mind, CNU's IT department has developed a set of recommendations for the use of AI tools on campus. The first of these recommendations, for faculty and students alike, is to be aware of the potential issues surrounding the rise of AI in classrooms.
Potential Issues
What IS an AI?
Oftentimes the first and largest barrier to using AI safely and ethically in a classroom setting comes from a fundamental misunderstanding of what AI even is. This is understandable, as the term carries many complicated cultural connotations that can make users believe modern-day tools are capable of more than they actually are. Discussions around modern AI models also involve terms that are either unfamiliar to new users or used in unfamiliar contexts.
Artificial General Intelligence (AGI): A shorthand term for an artificial intelligence model that meets or exceeds human abilities on a broad range of cognitive tasks, and can perform those tasks autonomously. In essence, AGI would be considered intelligent in the same way that humans themselves are intelligent. While true AGI does not yet exist, some researchers argue that current models are a significant step towards its eventual creation.
Machine Learning: A computer system that is trained on outside data and then makes predictions extrapolated from that data. The more data the system is exposed to, the more accurate the predictions; thus, exposure to more data is how machine learning systems 'learn'. (A short illustration follows these definitions.)
Generative AI: Another name for an AI model designed to generate 'new' text, images, code, or other content based on the data used to train it. Some experts consider the term a misnomer, as the 'new' content is not wholly new, but recombined from other sources.
Large Language Models: An AI model designed to be trained on, and output, natural language text. Examples include ChatGPT and Google's Gemini.
Diffusion Models: An AI model designed to be trained on, and output, non-text data. Common diffusion models are designed to interpret and produce images based on a text prompt. Models designed to output audio or video are also considered diffusion models.
Training Data: Any outside information fed to an AI system to 'teach' it how to respond to prompts.
For other definitions, this article may be a useful starting point.
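To make the 'learning' in machine learning concrete, here is a deliberately tiny sketch in Python. It is purely illustrative and does not represent how any real product works; the data, the predict function, and the nearest-example approach are all invented for demonstration. The point it shows is that a trained system can only extrapolate from the examples it has already been given.

```python
# Minimal, invented illustration: a "model" that predicts by finding the
# closest example in its training data. It cannot produce anything that is
# not already reflected in those examples.

training_data = [
    # (hours_studied, outcome)
    (1, "fail"),
    (2, "fail"),
    (6, "pass"),
    (8, "pass"),
]

def predict(hours_studied):
    """Return the outcome of the training example closest to the input."""
    closest = min(training_data,
                  key=lambda example: abs(example[0] - hours_studied))
    return closest[1]

print(predict(7))  # "pass" -- close to cases the model has already seen
print(predict(0))  # "fail" -- again, simply echoing the nearest known case
```

Real models are vastly larger and more sophisticated, but the same limitation applies: more (and better) training data generally means better predictions, and no training data on a topic means no reliable predictions about it.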
Unintended Biases and Discrimination
A strong general rule when judging an AI model is that the model cannot create anything it has not already seen; the output of an AI will always be the result of the data used to train it. This can allow some key issues in the output of large AI models to go completely unnoticed by users who assume the model is more intelligent than it is. One such issue is that a model trained on biased data will replicate that bias in its results. For example, if a medical AI used to diagnose skin diseases is trained using only data from white patients, it may struggle to properly identify medical issues on bodies of color, and produce fewer correct diagnoses on nonwhite skin as a result. Yet users may still treat the model's output as purely objective data. Uncritical use of models trained on biased data can reinforce deeply entrenched systems of discrimination, and can further mask that discrimination by presenting it as unbiased truth from a model assumed to be free of personal bigotry.
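As a purely illustrative sketch (every value, label, and name below is invented and does not come from any real medical system), the following Python snippet shows how a model fit only to a narrow slice of data still returns a confident answer for inputs far outside that slice.

```python
# Invented illustration: a toy "diagnostic model" whose training data covers
# only a narrow range of inputs. It gives no warning when asked about an
# input unlike anything it was trained on.

training_data = [
    # (measured_value, diagnosis) -- all examples drawn from one narrow range
    (0.90, "lesion_detected"),
    (0.88, "healthy"),
    (0.95, "lesion_detected"),
    (0.86, "healthy"),
]

def diagnose(measured_value):
    """Return the diagnosis of the nearest training example."""
    closest = min(training_data,
                  key=lambda example: abs(example[0] - measured_value))
    return closest[1]

# A patient whose measurement (0.30) is far from every training example still
# receives an answer: the label of whichever example happens to be numerically
# closest, presented as if it were an objective result.
print(diagnose(0.30))  # "healthy" -- nearest example is 0.86, regardless of reality
```

The same pattern holds at scale: a large model trained on unrepresentative data does not announce the gap in its knowledge; it simply fills that gap with whatever its training data makes most likely.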
Misinformation
A term you may hear often in conversations about AI tools is 'hallucinations'. In the case of AI, a hallucination is when a model confidently outputs incorrect information, either by repeating incorrect information found in its training data or by extrapolating a 'best guess' based on data that does not apply to the situation at hand. This happens because AI models are only able to generate outputs based on the training data they are given, and are not able to differentiate between good and bad data when selecting which information to use. Users who are not aware of this possibility may be fooled into thinking the model is providing real facts, and use and spread this misinformation as a result.
Environmental Impact
As AI models grow in size and are trained with ever larger data sets, the energy costs of using AI grow in turn. Larger models using larger sets of training data consume more energy per query, which raises questions about the environmental impact of widespread AI usage. Environmentally conscious users may conclude that the benefits of using AI models to generate content do not outweigh these costs, particularly in the face of an escalating climate crisis.
Educational Shortcutting
A common concern among educators is that AI models will let students hide a lack of knowledge about a subject they were intended to learn, by generating work that appears to demonstrate mastery they do not have. For example, a student who does not understand a unit on Shakespeare, rather than writing an essay that would reveal that lack of mastery, may use an AI language model to generate an essay demonstrating knowledge the student has not retained. Such 'shortcuts', if not properly managed, carry the risk of making students dependent on tools they do not understand rather than learning the knowledge and skills they need for themselves, which can have dangerous consequences if those students enter fields where that knowledge is required to perform a job properly and safely.
Training Data, Privacy and Data Governance
The most important part of any AI model is the data set it is trained on, as that data is the source of any possible outputs it could generate. Robust AI models require a large amount of data for training, and the organizations behind them actively seek out ever more data. However, this can lead to ethical and legal issues regarding said data. Many models have been found 'scraping' proprietary work from artists who explicitly do not want their work used in this way, potentially violating intellectual property and copyright laws in the process. Other models have been found violating other, more serious laws when obtaining data; well-known examples include AI models found to have accessed and scanned confidential personal information, up to and including personal messages, private files, and even medical records.
While these issues are still being resolved, using these models creates a non-zero possibility that AI-generated outputs may contain snippets of illegally-obtained information, which creates opportunities for faculty and students alike to violate Academic Integrity policies entirely by accident.
Transparency and Oversight
The above issues can be compounded by a lack of transparency on the part of the company hosting an AI tool. Corporations may obscure or outright hide the source of their training data, as well as information such as who is responsible for the AI's development and management. This can make it difficult for users to obtain clear answers on key questions regarding safeguards against misinformation, examination of potential biases, or even legal issues such as data theft. It is the opinion of this department that transparency is necessary to ensure any AI tool is being developed ethically and responsibly, and that AI tools that operate with a lack of transparency or clear oversight should be avoided until such transparency and oversight are provided.
Beyond awareness of these issues, we have additional recommendations for both faculty and students considering the use of AI tools in the classroom.
For Faculty
Create Clear Expectations
CNU does not currently have specific rules in its Academic Integrity policy regarding the use of AI tools in the classroom. As a result, it is the responsibility of faculty to outline their expectations for their classroom in clear terms. These terms may vary depending on your curriculum, your personal teaching philosophy, or whether you intend to incorporate such tools into the classroom yourself for specific assignments. This gives you the freedom to state whether or not you consider the use of AI tools a violation of academic integrity, based on the learning expectations you have for your students. We currently recommend that faculty outline these expectations in their syllabus and clearly state them for students at the beginning of the semester. Failure to do so may create ambiguity in the face of a potential incident.
Scaffold Usage to Encourage Learning
A common worry about AI use in the classroom is that it provides shortcuts that prevent students from learning. However, this concern is not unique to AI tools; similar worries have followed almost every new introduction of technology into education. Like calculators and LMS platforms such as Blackboard, AI can create valuable learning opportunities when implemented carefully. Faculty considering the use of AI tools in the classroom should carefully consider what kinds of assignments can benefit from AI, what students are intended to learn from those assignments, and how such usage can encourage learning rather than provide shortcuts. One recent example is the use of AI tools in a writing class to generate text that students then edit and critique. Rather than bypassing the students' need to learn writing for themselves, this usage lets students practice editing and critiquing existing writing by providing a neutral sample to work on.
If you are struggling to find an appropriate way to scaffold AI tools in your curriculum, please reach out to Academic Technologies, or the Center for Effective Teaching, for assistance.
Provide Materials on Smart and Ethical Usage
Students may come into your classroom either never having used an AI tool, or already using these tools in ways that run counter to their best interests. A common example is the trend of incoming students using ChatGPT as a substitute for a search engine, unaware that the model does not prioritize the accuracy of the information it outputs. As a result, we recommend that, alongside your expectations for academic integrity, you provide educational materials on the nature and safe, ethical use of AI, especially if AI tools have been scaffolded into your curriculum. You can use the material in this document as a starting point, but you can also use recent news articles, links to current models' training data, or other resources.
Do Not Depend on Detection Software
At this time, there are no tools that can reliably detect when a piece of text was generated using an AI tool. As a result, we cannot currently recommend using tools that claim to do so to check student work. We instead recommend that faculty use more analog methods to investigate student work, such as checking the accuracy of the text and noting sudden stylistic changes. Depending on the context and submission method of the assignment, the IT department may investigate text submitted to Scholar to provide further evidence, but such an investigation may also be inconclusive.
Carefully Vet Suggested Tools
AI models are trained on large sets of outside data to develop their algorithms, and that data can come from a variety of places depending on the type of model. This can be the source of multiple kinds of unethical and even illegal behavior, including the generation of unintended biases, copyright violation, and outright theft of confidential data such as medical records. Multiple organizations are working to enforce transparency regarding the sourcing of training data used to train popular AI models such as ChatGPT; this work has included the filing of multiple lawsuits against companies accused of scraping data from unethical or illegal sources. To ensure that campus work does not unintentionally replicate these past mistakes, we recommend that all users research a prospective AI model's history with its training data, and avoid models that do not provide information about where that data is sourced from. We also recommend that faculty warn students about AI models that have either current issues or a record of unethical data sourcing.
For Students
Check Classroom Guidelines
Before any other consideration, a student should confirm with their instructor whether the use of AI tools in their classroom is approved. Different instructors may have widely varying policies depending on the type of class they teach and their personal teaching philosophy. As a result, what is considered a valuable and useful tool in one classroom may be considered a breach of academic integrity, plagiarism, or outright cheating in another. Other faculty may have a more nuanced take, encouraging AI use in some contexts and discouraging it in others. Faculty are being encouraged to detail their specific expectations in their course syllabus, which is the ideal first place to check; if those expectations are missing or unclear, err on the side of caution and check with the instructor directly before using any AI assistance.
Verify All Facts And Sources
The code behind AI tools such as large language models does not differentiate between true and false information, and these tools have often been seen generating false facts or nonexistent references simply by mimicking the structure of the texts they were trained on. In addition, the use of inaccurate, misleading, or even satirical content in a model's training data may result in the model returning similar inaccuracies. As a result, any user of AI tools should carefully review the tool's output for accuracy. Flag any false or nonsensical information produced by the model, as well as any nonexistent sources the text claims to be using, and use your own knowledge to correct them.
Use Ethically-Trained Models
As noted in the faculty guidance above, AI models are trained on large sets of outside data, and the sourcing of that data has raised ethical and legal concerns ranging from unintended biases to copyright violations and the scraping of confidential records such as medical data. To ensure that your own work does not unintentionally replicate these mistakes, research a prospective AI model's history with its training data before using it, and avoid models that do not provide information about where that data is sourced from.
Edit, Edit, Edit
Another common failing of generative AI models is their limited ability to replicate tone or style. While large language models can reliably mimic common writing styles, or the style of well-known individuals whose work may be part of their training data, there is no guarantee that the output will reliably mimic your own way of writing or speaking. In addition to providing opportunities to correct any misinformation, heavily editing the output of an AI tool allows you to customize the work so it more closely aligns with your body of work as a whole.
Practice Full Disclosure
To untrained eyes, the output of an AI tool like ChatGPT can be indistinguishable from original work produced by a human. However, this can create issues of academic integrity, particularly where the implied expectation is that any writing used in a classroom setting is human-generated. In situations where expectations are not clearly outlined, attach a disclaimer to any AI-assisted work so that anyone reading or reviewing it knows such assistance is present. This avoids potential confusion and misattribution of AI-assisted work, and can help clarify the cause of other issues that may arise as a result, such as those listed above.