#cs390#notes#ethics#ai#research#hw
Questions to consider:
- How could attackers leverage GenAI technologies? Attackers gain a general-purpose text generator from GenAI. This opens up attacks that rely on making machines indistinguishable from humans, increasing victim susceptibility. Examples include phishing mail, the spread of disinformation, and targeted manipulation of victims. Attackers also gain a tool capable of "human-like" tasks, allowing them to analyze and find vulnerabilities in large systems they were unable to attack before. Attackers can also scope out many more victims automatically. Imagine an attacker looking for an insecure website: a crawler paired with an AI that searches page source for vulnerabilities can give the attacker a list of promising targets. Or, given a trove of data about victims, a generative AI can take information such as text messages and automatically deduce personal details to use against those individuals.
- How should security measures change in response to GenAI? For text-based attacks and disinformation, companies and individuals must take great care in confirming that their interactions are with humans and not malicious actors. Safeguards that prevent AI from interacting with individuals in the first place can block such attacks; one example is email filters that use AI to detect LLM-generated text. For other vulnerabilities, penetration testing can help identify and prevent security breaches, and companies can use LLMs to detect malicious code. Alongside this, AI can strengthen preventative measures by aiding programmers in following security best practices and maintaining code clarity.
- What are some current and emerging technologies we should pay attention to for designing countermeasures?
- Find 3 other LLMs that are used for nefarious purposes.
- What do you think is the most critical malicious implication of large language models that could negatively impact society?
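The crawler-plus-AI target triage described in the first question can be sketched as a small pipeline. Everything here is hypothetical: the regex heuristics stand in for the LLM analysis step, and the URLs and pattern labels are invented for illustration.

```python
import re

# Stand-in for the LLM analysis step: flag source snippets that hint
# at common weaknesses. The pipeline described in the notes would send
# the page source to a language model here instead of fixed patterns.
RISK_PATTERNS = {
    "debug mode enabled": re.compile(r"DEBUG\s*=\s*True"),
    "hardcoded credential": re.compile(r"password\s*=\s*['\"]\w+['\"]", re.IGNORECASE),
}

def triage_source(page_source: str) -> list[str]:
    """Return the heuristic risk labels found in one page's source."""
    return [label for label, pattern in RISK_PATTERNS.items()
            if pattern.search(page_source)]

def rank_targets(pages: dict[str, str]) -> list[tuple[str, list[str]]]:
    """Rank crawled pages (url -> source) by number of flagged risks."""
    scored = [(url, triage_source(source)) for url, source in pages.items()]
    return sorted((item for item in scored if item[1]),
                  key=lambda item: len(item[1]), reverse=True)
```

Note the dual use: a defender can run the same loop over their own sites, which is why penetration testing shows up later in these notes as a defensive application of the same capability.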
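The email-filter safeguard from the second question could look like the following minimal sketch. The detector is a hypothetical keyword heuristic standing in for a trained LLM-text classifier; the phrases and threshold are made up for illustration.

```python
# Hypothetical stand-in for a trained LLM-text detector: score a
# message by how many stock, model-typical phrases it contains.
GENERIC_PHRASES = (
    "i hope this email finds you well",
    "as an ai language model",
    "please do not hesitate to",
)

def detect_llm_text(body: str) -> float:
    """Crude score in [0, 1] that the text is machine-generated."""
    lowered = body.lower()
    hits = sum(phrase in lowered for phrase in GENERIC_PHRASES)
    return min(1.0, hits / 2)

def filter_email(body: str, threshold: float = 0.5) -> str:
    """Route mail to quarantine when the detector score is high."""
    return "quarantine" if detect_llm_text(body) >= threshold else "inbox"
```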
The First Pass
- Read title, abstract, and introduction carefully.
  - Dual-Use Dilemma: the use of an invention for both good and bad.
  - Generative AI is dual use: it can create new attacks or amplify existing ones.
- Read section and sub-section headings. Ignore everything else.
  - Sections:
    - Introduction
    - Capabilities of GenAI relevant to attack and defense
    - How attackers can leverage GenAI
    - How defenders can leverage GenAI to mitigate risk of attacks
    - Short term goals
    - Long term goals for challenging issues
    - Conclusion
- Read conclusions.
  - Discusses short and long term goals
- Glance over references.
  - Ethical AI references
At the end of the first pass, you should be able to answer the 5 C's:
- Category: What type of paper is this? A measurement paper? Analysis of an existing system? A description of a research prototype?
- Study of existing AI landscape and potential threats.
- Context: What other papers is it related to? Which theoretical basis were used to analyze the problem?
- Related to ethical AI
- Correctness: Do the assumptions seem to be valid?
- Yes
- Contributions: What are the paper's main contributions?
- Identifies and brings discussion into attacks present from LLMs
- Clarity: Is the paper well written?
- Yes
The Second Pass
Read the paper with greater care, but ignore the details such as proofs. Write down the key points, make comments on the margins as you read.
- Look carefully at figures, try to notice mistakes, axes labels, etc.
- Mark relevant unread references for further reading (good for learning more about the background of the paper).
The second pass should take up to an hour. After this pass, you should be able to grasp the content of the paper and summarize it with supporting evidence.
Now you may choose to: (a) Set the paper aside, hoping you may not need to understand the material to be successful (b) Return to the paper later, perhaps after reading background material (c) Persevere and go onto the 3rd pass.
Reading the Paper
GenAI Capabilities
- Generating targeted text
- Generating realistic images and video
- Drawing on detailed technical knowledge and expertise.
- Can produce and analyze code, reproduce reasoning, and answer complex questions. Albeit not flawless, still very powerful.
- Summarizing and paraphrasing source material.
- Persisting at time-consuming and exhausting tasks (models don't tire the way humans do).
Attacks
- Spear-phishing: Phishing emails tailored to individual targets. AI can create personalized messages.
- Hallucinations: Generated output is factually incorrect or entirely fictitious, while being apparently coherent. Concerning if users lack domain knowledge.
- Dissemination of deepfakes:
- Dissemination: the action or fact of spreading something, especially information, widely.
- Malicious users can disseminate widespread misinformation and disinformation that aligns with their narratives.
- Proliferation of Cyberattacks: LLMs can generate code that can be used to design malware automatically.
- Low barrier to entry for adversaries: attacks cost less money and labor, so more individuals are now able to carry them out by using AI.
- Lack of social awareness and human sensibility: May provide inappropriate and disturbing advice to vulnerable individuals.
- Data Feedback Loops: Future models may train on data produced by older models, thus not using "real" human data. The internet degrades in quality as a data source.
- Unpredictability: LLMs are very general purpose, which makes them very susceptible to attacks not previously thought of.
Defenses
- Detecting LLM Content: Detectors that check whether text was generated by an LLM, exploiting the fact that the distribution of LLM-generated text is slightly different from that of natural text.
- Watermarking: A "statistical signal" embedded in the GenAI generation process so the output can be detected later.
- Code Analysis: LLMs can be used to de-obfuscate code and find malicious code.
- Penetration Testing: Use LLMs to automatically find vulnerabilities in a system.
- Multi-modal Analysis: Analyzing multiple modalities (e.g., on Twitter): text, code, images, speech. By understanding more modalities together, LLMs can give a fuller picture of the information.
- Personalized Skill Training: LLMs can be used to cater to specific needs of users. Example is teaching cyber security education.
- Human-AI Collaboration: Improve work efficiency of humans and quality and consistency of output.
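The distribution-difference idea behind detecting LLM content can be made concrete with a perplexity check. This is a simplified sketch: the per-token log-probabilities would come from scoring the text with some language model, and the threshold is an invented illustration, not a calibrated value.

```python
import math

def avg_nll(token_logprobs: list[float]) -> float:
    """Average negative log-likelihood of the scored tokens."""
    return -sum(token_logprobs) / len(token_logprobs)

def looks_machine_generated(token_logprobs: list[float],
                            ppl_threshold: float = 20.0) -> bool:
    """Flag text whose perplexity under the scoring model is low.

    LLM output tends to be unusually "unsurprising" to a language
    model, while human text has burstier, higher-surprise tokens.
    """
    perplexity = math.exp(avg_nll(token_logprobs))
    return perplexity < ppl_threshold
```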
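One way the "statistical signal" in watermarking can work, sketched in the style of green-list/red-list schemes (a simplification, not any specific paper's exact method): at each step the previous token pseudorandomly splits the vocabulary in half, the generator favors the "green" half, and a detector counts how far the green-token count exceeds the 50% expected of unwatermarked text.

```python
import hashlib
import math

def is_green(prev_token: str, token: str) -> bool:
    """Pseudorandomly assign `token` to the green list, seeded on the
    previous token, so the split differs at every position."""
    digest = hashlib.sha256(f"{prev_token}|{token}".encode()).digest()
    return digest[0] % 2 == 0  # half the vocabulary is green each step

def watermark_z_score(tokens: list[str]) -> float:
    """z-score of the observed green count vs. the 50% expected for
    unwatermarked text; large positive values suggest a watermark."""
    n = len(tokens) - 1  # number of (prev, next) pairs
    greens = sum(is_green(tokens[i], tokens[i + 1]) for i in range(n))
    return (greens - 0.5 * n) / math.sqrt(0.25 * n)

def greedy_green_continuation(start: str, vocab: list[str], steps: int) -> list[str]:
    """Toy watermarked 'generator': always emit the first green
    candidate (falling back to vocab[0] if none is green)."""
    out = [start]
    for _ in range(steps):
        nxt = next((w for w in vocab if is_green(out[-1], w)), vocab[0])
        out.append(nxt)
    return out
```

The detector needs only the hash scheme, not the model, which is why the notes describe the signal as something "embedded" during generation and checked later.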
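For the code-analysis defense, an LLM pass over every file is expensive, so a cheap static triage can decide what to escalate for deeper review. The patterns below are illustrative examples of constructs worth a closer look, not a real malware signature set.

```python
import re

# Cheap triage before handing code to an LLM (or analyst) for deeper
# de-obfuscation and review. Patterns are illustrative, not complete.
SUSPICIOUS = {
    "dynamic code execution": re.compile(r"\beval\s*\(|\bexec\s*\("),
    "encoded payload": re.compile(r"base64\.b64decode\s*\("),
    "shell spawn": re.compile(r"subprocess\.|os\.system\s*\("),
}

def flag_suspicious(source: str) -> list[str]:
    """Return labels of suspicious constructs found in `source`."""
    return [label for label, pattern in SUSPICIOUS.items()
            if pattern.search(source)]
```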