Project Group Formation

Group formation due <2024-10-04 Fri>

  • Groups of 7
  • Signup assignment and Google sheet will be announced early next week
  • You don’t need to have an idea yet to form a group
  • But it’s good to start thinking
  • Hopefully this lecture will help

How do I find my team members?

  • You choose your groups

  • Piazza

  • Class Discord?*

  • Talk to your neighbors

    * The class Discord is not an officially supported communication channel. I’m not on it and I’m not monitoring it; use it at your own discretion.

Project Proposal

Due <2024-10-11 Fri>

  • Assignment Description on the course website soon
  • Examples on course website soon

Length:

  • Submit a 1-page (single spaced) document in paragraph form
  • Include at least 1 system diagram

Content:

  • Describe what your group plans to do for the course project. Answer the following questions:

What type of project is this?

  • A research re-implementation?
  • Novel research?
  • A web app?
  • A profiling/performance evaluation report?
  • An ethics evaluation?
  • Something else?
  1. Ways to choose:

    • Choose a kind of problem first, then choose an algorithm (recommended)
    • Choose an algorithm, then find a problem to solve (also ok)
    • Choose an algorithm, choose a problem, then find out whether they are compatible (not recommended!)

How will this project make use of AI?

  • What kind of AI will be used in this project?
    • Supervised Machine Learning
    • Unsupervised Machine Learning
    • Pretrained generative models
    • Reinforcement Learning
    • Search
    • Retrieval
    • Knowledge Representation
    • …


What value could someone get by viewing or interacting with your project?

  • This is an introductory class, of course
  • But it helps to have a use case in mind
  • Some past student projects were quite creative
    • Creative projects tended to have a very specific goal, motivated by a problem or topic area where some group members had prior exposure

Types of Projects

Research Re-implementation

  • Re-implement one of the algorithms from an AI research paper.
  • This involves
    • reading the paper to understand what you’re implementing,
    • writing code, and
    • running (a subset of) the evaluations from your selected paper.

System Implementation

  • E.g. train a supervised machine learning model and evaluate its performance

System Evaluation

  • Evaluate an existing AI system for its functional performance.
  • This often includes an overall assessment of correctness/accuracy using a variety of metrics, as well as analysis of system failures.
  • How well does X algorithm perform on Y task? When does it fail? Why?

Performance Profiling/Evaluation/Optimization

  • Evaluate an AI system for its non-functional performance.
  • This often includes an assessment of how resource consumption (wall clock time, CPU time, memory, …) varies with input size (number of data points, number of features, …).
  • Profiling involves tracing system operation to identify the most time-consuming steps.
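
A minimal sketch of measuring how wall clock time varies with input size, using only the Python standard library. The helper names (`measure_scaling`, `make_reversed_list`) and the choice of `sorted` as the system under test are assumptions made for illustration; a real project would substitute its own AI system and inputs.

```python
import time

def measure_scaling(func, make_input, sizes, repeats=3):
    """Return (size, best_seconds) pairs for func run on inputs of each size."""
    results = []
    for n in sizes:
        data = make_input(n)
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            func(data)
            elapsed = time.perf_counter() - start
            best = min(best, elapsed)  # keep the least noisy measurement
        results.append((n, best))
    return results

def make_reversed_list(n):
    """Hypothetical input generator: a list of n integers in descending order."""
    return list(range(n, 0, -1))

# Stand-in workload: Python's built-in sort.
timings = measure_scaling(sorted, make_reversed_list, sizes=[1_000, 10_000, 100_000])
for n, seconds in timings:
    print(f"n={n:>7}  best wall-clock time: {seconds:.6f} s")
```

For finding the most time-consuming steps inside a single run, the standard library's `cProfile` module is one place to start.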

Novel Algorithm?

  • Develop a new AI technique or algorithm
  • Have seen only a few examples of this, from relatively advanced students
  • Doesn’t mean it’s impossible!
  • Highly recommend seeking advice from TAs or myself to check feasibility
  • Not recommended if this is your first “AI” class

Literature Survey

  • Synthesize trends in the way that AI is developed and used
  • Aggregate many primary sources into secondary commentary
  • Only project that does not require implementing or running an AI system
  • Most of the work is reading and synthesis
  • Cite at least 20 references
  • Include a review table
  • Search Semantic Scholar for “systematic review AI” for examples
  • Contact me if you’re interested in this track and need more help

Ethics or Socio-technical Systems Analysis

  • Analyze an existing system or technique from an ethical or STS perspective
  • Not just freeform commentary or personal opinion
  • Choose a framework/lens and justify its use
  • Must demonstrate concrete engagement with the system or technique under study
    • E.g. collect inputs and outputs that demonstrate a particular failure mode

Types of AI

  • One view on intelligence is intelligence = problem-solving
  • Different kinds of AI ↔ Different problems
  • The class is linear, so some topics will come later than others
  • Hopefully this preview helps

Symbolic AI, Logic-based systems

  • Represent problems, facts, beliefs, as a system of symbols that are related to each other

Image credit: Dabbeeru, M. (August 18, 2021)

  1. Example: Path-finding

    • Nodes in a graph symbolize locations on a map
    • Finding the shortest path in the graph means finding an efficient route from point A to point B
    • Dijkstra’s, A*
      • Next week
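
The path-finding idea above can be sketched with Dijkstra's algorithm. This is a minimal version using the standard-library `heapq` module; the `campus_map` graph and its edge costs are made up for illustration.

```python
import heapq

def dijkstra(graph, source):
    """Shortest distance from source to every reachable node.

    graph maps each node to a list of (neighbor, edge_cost) pairs;
    edge costs must be non-negative.
    """
    dist = {source: 0}
    frontier = [(0, source)]  # priority queue of (distance_so_far, node)
    while frontier:
        d, node = heapq.heappop(frontier)
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry; a shorter route was already found
        for neighbor, cost in graph.get(node, []):
            new_d = d + cost
            if new_d < dist.get(neighbor, float("inf")):
                dist[neighbor] = new_d
                heapq.heappush(frontier, (new_d, neighbor))
    return dist

# Hypothetical map: nodes are locations, edge costs are travel times.
campus_map = {
    "A": [("B", 4), ("C", 1)],
    "B": [("D", 3)],
    "C": [("B", 2), ("D", 7)],
    "D": [],
}
distances = dijkstra(campus_map, "A")
print(distances)
```

A* follows the same loop but orders the queue by distance so far plus a heuristic estimate of the remaining distance.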

  1. Example: Knowledge Graph Completion

    • From a set of facts, what other facts can be inferred?
    • I will attempt to squeeze this in around Week 4 (TBD)
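
One simple way to infer new facts from known ones is to apply a rule repeatedly until nothing changes. The sketch below assumes facts are stored as (subject, relation, object) triples and uses a single transitivity rule; the function name `infer_transitive` is hypothetical. Knowledge graph completion methods in the literature are often learned rather than rule-based, so treat this only as an illustration of the question being asked.

```python
def infer_transitive(facts, relation):
    """Apply (a R b) and (b R c) => (a R c) until no new facts appear.

    Returns only the newly inferred triples.
    """
    known = set(facts)
    changed = True
    while changed:
        changed = False
        pairs = [(s, o) for (s, r, o) in known if r == relation]
        for a, b in pairs:
            for b2, c in pairs:
                if b == b2 and (a, relation, c) not in known:
                    known.add((a, relation, c))
                    changed = True
    return known - set(facts)

facts = {
    ("Davis", "located_in", "California"),
    ("California", "located_in", "USA"),
    ("Old Town", "located_in", "Davis"),  # made-up place for illustration
}
new_facts = infer_transitive(facts, "located_in")
for fact in sorted(new_facts):
    print(fact)
```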

Game-Playing

  • Multiple agents
  • Board games like Go, Chess, Connect-4
  • Any setting with defined game rules and multiple players

Lee Sedol vs. AlphaGo, 2016

  1. Example: Min-Max

    • We’ll cover this a couple of weeks from now
  2. Example: AlphaGo/AlphaGoZero/AlphaZero/MuZero

    • A series of game-playing AIs from Google DeepMind
    • We’ll talk about this briefly in a few weeks, but you’ll have to look into implementation details yourself
    • Third-party packages available
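
As a preview of the Min-Max example above, here is a minimal sketch on a toy game that is not from the lecture: players alternate taking 1 to 3 stones from a pile, and whoever takes the last stone wins. The function names are hypothetical. The maximizing player picks the move with the highest value, assuming the opponent always picks the lowest.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def minimax(stones, maximizing):
    """Value of the position for the maximizing player: +1 win, -1 loss."""
    if stones == 0:
        # The previous player took the last stone and therefore won.
        return -1 if maximizing else 1
    values = [
        minimax(stones - take, not maximizing)
        for take in (1, 2, 3)
        if take <= stones
    ]
    return max(values) if maximizing else min(values)

def best_move(stones):
    """Number of stones the maximizing player should take from this pile."""
    moves = [take for take in (1, 2, 3) if take <= stones]
    return max(moves, key=lambda take: minimax(stones - take, False))

print(minimax(5, True), best_move(5))
```

Games like Chess and Go are far too large to search exhaustively like this, which is why practical systems add depth limits, pruning, or learned evaluation.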

Supervised Machine Learning

  • Prediction
    • Given values for a set of input variables, predict the values for a set of output variables
  1. Example: Trash Classification


Image Credit: New AI Proves to Be a Trash Sorter Extraordinaire, IEEE Spectrum

    • Given a picture of a piece of trash, classify whether it’s landfill, compost, or recyclable
    • Involves:
      • finding data with labels
      • “training” a model
      • evaluating the model’s performance
    • Former students got “Best Statistical Model” for implementing this at HackDavis ’24
    • We’ll talk about these techniques in the middle of the quarter
    • LOTS of online examples thanks to Kaggle and Medium

  2. Algorithms:

    • Decision Tree (tabular)
    • Random Forest (tabular)
    • SVM (tabular)
    • Neural Networks
      • Feed-forward (tabular data)
      • Convolutional (image data)
      • Recurrent (sequential data)
      • …
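
The supervised learning workflow (labeled data, training, evaluation) can be sketched end to end in a few lines. To keep the example dependency-free, it swaps in a nearest-centroid classifier, which is not one of the algorithms listed above; the two numeric features and all data values are made up, and a real trash classifier would work on images.

```python
from collections import defaultdict

def train_nearest_centroid(examples):
    """'Training': average the feature vectors of each label."""
    sums = {}
    counts = defaultdict(int)
    for features, label in examples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
        for i, value in enumerate(features):
            sums[label][i] += value
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }

def predict(centroids, features):
    """Predict the label whose centroid is closest to the features."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(centroids, key=lambda label: squared_distance(centroids[label]))

def accuracy(centroids, examples):
    """Fraction of examples whose predicted label matches the true label."""
    correct = sum(predict(centroids, f) == y for f, y in examples)
    return correct / len(examples)

# Made-up labeled data: (feature vector, label) pairs.
train_set = [
    ((0.9, 0.2), "recycle"), ((0.8, 0.1), "recycle"), ((0.85, 0.3), "recycle"),
    ((0.1, 0.9), "compost"), ((0.2, 0.8), "compost"), ((0.15, 0.7), "compost"),
]
# Held-out examples the model never saw during training.
test_set = [((0.7, 0.2), "recycle"), ((0.3, 0.9), "compost")]

model = train_nearest_centroid(train_set)
test_accuracy = accuracy(model, test_set)
print(f"held-out accuracy: {test_accuracy:.2f}")
```

Evaluating on held-out examples rather than the training set is the important habit here, whichever algorithm is used.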

Unsupervised Machine Learning

  • Discovering patterns in data without pre-existing labels
  • Focuses on finding structure or relationships in data
  1. Example: Clustering

    • Grouping similar data points together
    • K-means, hierarchical clustering
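
A minimal sketch of K-means on one-dimensional data, assuming fixed starting centroids so the run is reproducible. The function name `k_means_1d` and the data are made up; library implementations handle multiple dimensions and smarter initialization.

```python
def k_means_1d(points, initial_centroids, iterations=10):
    """Alternate between assigning points to the nearest centroid
    and moving each centroid to the mean of its assigned points."""
    centroids = list(initial_centroids)
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous centroid.
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters

points = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
centroids, clusters = k_means_1d(points, initial_centroids=[1.0, 3.0])
print(centroids, clusters)
```

Note that no labels appear anywhere: the grouping comes only from distances between the data points.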

  1. Example: Dimensionality Reduction

    • Reducing the number of features while preserving important information
    • Principal Component Analysis (PCA), t-SNE


Image Credit: Turing Finance

  1. Example: Anomaly Detection

    • Identifying unusual patterns that do not conform to expected behavior
    • Useful in fraud detection, system health monitoring
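
One of the simplest anomaly detectors flags values that sit far from the mean, measured in standard deviations (a z-score). The sketch below assumes a threshold of 2 and made-up readings; both are illustrative choices, and real systems often need more robust statistics.

```python
import statistics

def find_anomalies(values, threshold=2.0):
    """Return (index, value) pairs whose z-score exceeds the threshold."""
    mean = statistics.fmean(values)
    spread = statistics.pstdev(values)
    if spread == 0:
        return []  # all values identical, nothing stands out
    return [
        (i, v)
        for i, v in enumerate(values)
        if abs(v - mean) / spread > threshold
    ]

# Made-up readings, e.g. response times from a monitored system.
readings = [10, 11, 9, 10, 12, 10, 11, 50]
anomalies = find_anomalies(readings)
print(anomalies)
```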


Image credit: IBM Developer

“Generative AI”

  • AI systems that can create new content
  • Based on patterns learned from existing data
  • These days, often a deep neural network trained “self-supervised” (more on that later)
  • Data and compute-intensive to train from scratch
  • Can use pre-trained models
  • We’ll talk about this in the middle of the quarter
  1. Example: Large Language Models (LLMs)

    • Generate human-like text based on input prompts
    • GPT-4, Claude, LLaMA
    • Operate based on conditional probability
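
To make "operate based on conditional probability" concrete, here is a toy bigram model that estimates P(next word | previous word) from counts. This is only a stand-in: actual LLMs use neural networks over subword tokens and much longer contexts. The corpus and function names are made up.

```python
from collections import Counter, defaultdict

def train_bigram_model(text):
    """Count how often each word follows each other word."""
    tokens = text.lower().split()
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts

def next_word_probabilities(counts, prev):
    """Estimate P(next | prev) as relative frequencies."""
    followers = counts.get(prev, Counter())
    total = sum(followers.values())
    return {word: c / total for word, c in followers.items()}

corpus = "the cat sat on the mat the cat ate"
model = train_bigram_model(corpus)
probs = next_word_probabilities(model, "the")
print(probs)
```

Generating text then amounts to repeatedly sampling the next word from this distribution and appending it to the prompt.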


Image Credit: The Gradient

  1. Example: Image Generation

    • Create new images from text descriptions or other images
    • DALL-E, Midjourney, Stable Diffusion


Image Credit: Our World in Data

  1. Example: Music Generation

    • Compose new music in various styles
    • MuseNet, Jukebox

Recommender Systems

  • Predict user preferences and suggest relevant items
  • Used in e-commerce, streaming services, and social media
  • Not in class
  1. Example: Collaborative Filtering

    • Recommend items based on user behavior and preferences of similar users
    • Netflix movie recommendations, Amazon product suggestions
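
A minimal sketch of user-based collaborative filtering: find users whose ratings resemble the target user's, then score unseen items by a similarity-weighted average of those users' ratings. The users, items, and ratings are made up, and cosine similarity over co-rated items is just one common choice.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two users over the items both have rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = math.sqrt(sum(a[i] ** 2 for i in shared))
    norm_b = math.sqrt(sum(b[i] ** 2 for i in shared))
    return dot / (norm_a * norm_b)

def recommend(ratings, user):
    """Rank items the user has not rated by similarity-weighted average rating."""
    weighted, weights = {}, {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine_similarity(ratings[user], other_ratings)
        if sim <= 0:
            continue
        for item, rating in other_ratings.items():
            if item in ratings[user]:
                continue  # only recommend unseen items
            weighted[item] = weighted.get(item, 0.0) + sim * rating
            weights[item] = weights.get(item, 0.0) + sim
    scores = {item: weighted[item] / weights[item] for item in weighted}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Made-up ratings on a 1-5 scale.
ratings = {
    "ana": {"item_a": 5, "item_b": 1},
    "ben": {"item_a": 5, "item_b": 1, "item_c": 5},
    "cy":  {"item_a": 1, "item_b": 5, "item_d": 5},
    "dee": {"item_a": 4, "item_b": 2, "item_c": 4, "item_d": 2},
}
recommendations = recommend(ratings, "ana")
print(recommendations)
```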

Image Credit: Ashmi Banerjee

  1. Example: Content-Based Filtering

    • Recommend items similar to those a user has liked in the past
    • Spotify playlist recommendations, news article suggestions
  2. Example: Hybrid Systems

    • Combine collaborative and content-based approaches for more accurate recommendations

Reinforcement Learning

  • Learning through interaction with an environment
  • Agent learns to make decisions by receiving rewards or penalties
  • We’ll talk about this later in the quarter (Probably Week 7 or later)
  1. Example: Game AI

    • Learning to play video games or board games
    • DeepMind’s AlphaGo, OpenAI’s Dota 2 AI

  2. Example: Robotics

    • Training robots to navigate and manipulate objects in real-world environments


Image credit: Crowd-Comfort Robot Navigation Among Dynamic Environment Based on Social-Stressed Deep Reinforcement Learning

  1. Example: Resource Management

    • Optimizing systems like traffic light control or data center cooling
  2. Algorithms:

    • Q-Learning
    • Deep Q-Network (DQN)
    • Policy Gradient Methods
    • Actor-Critic Methods
  • Supervised Machine Learning and RL system implementations are most popular
    • Probably due to the many examples that are available, and student interest in developing hands-on skills
  • As mainstream ML moves towards larger and more general models, system evaluation becomes more complicated, potentially more in-demand
    • Famously, GPT-4 attempted to recruit a crowd worker to solve CAPTCHAs during METR’s early-access evaluation
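
For anyone leaning towards an RL implementation, Q-Learning (the first algorithm in the Reinforcement Learning list above) can be sketched in tabular form. The environment is a made-up five-cell corridor where the agent starts in cell 0 and gets a reward of 1 for reaching cell 4; the hyperparameters are illustrative assumptions, not recommendations.

```python
import random

N_STATES = 5        # corridor cells 0..4; cell 4 is the goal
ACTIONS = (-1, +1)  # index 0 = step left, index 1 = step right

def step(state, action_index):
    """Environment: move along the corridor, reward 1.0 on reaching the goal."""
    next_state = min(max(state + ACTIONS[action_index], 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done

def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2,
               max_steps=200, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action_index]
    for _episode in range(episodes):
        state = 0
        for _step in range(max_steps):
            if rng.random() < epsilon:
                action = rng.randrange(2)  # explore
            else:
                best = max(q[state])       # exploit, breaking ties randomly
                action = rng.choice([a for a in range(2) if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # Q-learning update: move the estimate towards
            # reward + discounted value of the best next action.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q

q_table = q_learning()
policy = ["right" if q[1] > q[0] else "left" for q in q_table[:-1]]
print(policy)
```

The agent is never told that "right" is correct; it only sees rewards, which is the key difference from supervised learning.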

Back to the project

What to look out for

Compute

  • What computing resources will be required for your project?
  • Keep in mind that much of today’s “fancy AI” is compute-intensive.
  • Several cloud compute providers offer free trials, but there is overhead involved to set this up.
  • If you need more compute than what’s available to you on your own machine, CSIF, etc., start looking early and think ahead.
  • Ask me or the TAs for advice if you’re not sure

Scaffolding

Applies to all projects.

  • What scaffolding (external resources) will be used? Examples of scaffolding:
    • Kaggle submissions
    • GitHub code for research paper
    • Medium/Towards Data Science blog post
    • GitHub search for “Intro to AI projects”
  • Your project proposal should include a plan for what scaffolding you will use, and in what way

Representation

For most projects, be sure to consider representation.

Representation: You need to consider how you will describe the world to your algorithm.

If you’re doing home price prediction, what will your algorithm know about any given home? What dimensions characterize a home?
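
As a sketch of what a representation might look like for the home price example: each home becomes a fixed-length list of numbers. The four fields below are assumptions chosen for illustration; deciding which dimensions to include is exactly the design question your proposal should address.

```python
from dataclasses import dataclass

@dataclass
class Home:
    # Hypothetical dimensions; a real dataset determines what is available.
    square_feet: float
    bedrooms: int
    year_built: int
    has_garage: bool

def to_feature_vector(home):
    """Turn a Home into the list of numbers the algorithm actually sees."""
    return [
        float(home.square_feet),
        float(home.bedrooms),
        float(home.year_built),
        1.0 if home.has_garage else 0.0,  # encode yes/no as 1/0
    ]

example = Home(square_feet=1500, bedrooms=3, year_built=1995, has_garage=True)
features = to_feature_vector(example)
print(features)
```

Anything left out of the vector (neighborhood, school district, photos) is invisible to the algorithm.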

Overall Scope

You’ll have ~8 weeks to implement this, so be mindful: don’t take on too much.

Supervision

Applies to many projects, especially system implementations, and especially ML.

If you’re doing supervised ML, you must comment on this in your proposal.

Supervision:

  • how will your algorithm learn what kinds of predictions it should make? (supervised ML)
  • how will you know if your algorithm is performing well or not? (any system implementation)
  1. Def:

    A source of supervision is any artifact that captures what humans think the “right answer” is for a given problem.

    • Most often, this is provided in the form of a dataset of paired input-output examples for the problem that you’re working on.
    • For example, if you want to predict home listing price, you need a dataset with many examples of houses and their prices.
    • For reinforcement learning projects, supervision comes in the form of a reward function rather than input-output examples.
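
A small sketch of how paired input-output examples answer "how will you know if your algorithm is performing well?": compare predictions with the recorded answers using a metric such as mean absolute error. The listings, prices, and the fixed price-per-square-foot model are all made up for illustration.

```python
def mean_absolute_error(predict, examples):
    """Average absolute gap between predictions and the recorded answers."""
    return sum(abs(predict(x) - y) for x, y in examples) / len(examples)

# Made-up supervision: (input, "right answer") pairs.
labeled_homes = [
    ({"square_feet": 1000}, 300_000),
    ({"square_feet": 1500}, 450_000),
    ({"square_feet": 2000}, 600_000),
]

def price_per_sqft_model(home):
    """Hypothetical baseline: a fixed price per square foot."""
    return 280 * home["square_feet"]

error = mean_absolute_error(price_per_sqft_model, labeled_homes)
print(f"mean absolute error: ${error:,.0f}")
```

A simple baseline like this is also useful as a point of comparison for whatever model your group ends up training.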

Questions? Let’s talk about your project ideas
