Course Details

Date

Sept 25 - Oct 22, 2023

Language

English

Modules

4

$95

How to use ChatGPT and other generative AI tools in your newsrooms

Overview

Welcome to “Generative AI for journalists: Discovering what data can do,” a new online course organized by the Knight Center for Journalism in the Americas in partnership with Hacks/Hackers. Generative AI is changing how journalists work with data. This course will equip you with the skills and knowledge you need to navigate the next generation of computing.

In this four-week course, held from November 20 to December 17, 2023, you’ll learn hands-on skills like how to make your data machine-readable, plug your data into generative AI models, and share your new AI tools with others.

Watch the video above and read on for more details.

Goals

Through hands-on tutorials over the next four weeks, we will train you on the practical applications and concepts integral to generative technologies. This course introduces you to machine learning by building specific skills week by week, leaving you prepared for the generative AI applications and workflows now entering newsrooms.

Objectives

We will build up your AI expertise so that you can participate in AI policy formulation and implementation in your organization. Finishing this course will allow you to:

  • Understand what generative AI is and is not
  • Clearly articulate when and where to deploy generative AI technologies
  • Convert your data to formats suitable for language models
  • Apply the basics of prompt engineering
  • Embed your documents in a vector database to search through them with natural language
  • Quickly develop prototype workflows to assess how useful they are

Who can enroll?

This course is for journalists who may have heard of generative AI before and would like to begin engaging with these technologies on a more practical basis, whether to be better prepared for the future of computing or to improve their data journalism practice with new capabilities made possible by machine learning.

No coding experience is necessary, nor will technical skills be assumed. Course videos will introduce all prerequisite skills at your pace, and we’ll make sure your computing environment is properly set up to download and run your own language models.

We’ll also provide you with plenty of exercises to practice the skills using your own data at your own pace. These exercises will be available asynchronously, and the instructor will be around to answer any questions you may have.

Introduction Module – Generative AI For Journalists

Welcome to the course! We’ll begin by diving into the recent history of generative AI through a study of successful AI projects instructor Sil Hamilton has observed while working with newsrooms and organizations across the industry. Next, we’ll get you set up with the required tools we’ll be using to discover AI during the course. We’ll also set aside time to go through what you’ll be learning — the exercises and discussions throughout the course will encourage you to try these techniques on your own datasets.

This module will cover:

  • Defining generative AI and understanding what makes a successful implementation
  • An overview of the course structure
  • Getting set up with our required tools and applications
  • Tips on how to make the best of this course

Module 1 – But What Are Models? (November 20 – 26, 2023)

What is called generative AI today is built on the success of machine learning models capable of understanding the world around us through text and images. We’ll develop an intuitive understanding of what is, and is not, possible with generative AI models today by looking at what makes these models tick. 

This module will cover:

  • Prediction tasks: how generative models are trained 
  • Natural language processing fundamentals
  • How ChatGPT works — and why
  • Why understanding modeling matters
  • Office Hours: Wednesday at 2 PM CST.
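
To give a feel for what a "prediction task" means, here is a minimal sketch in plain Python. It is a toy, not how ChatGPT or any production model is built: it simply counts which word follows which in a tiny made-up corpus and then predicts the most common follower. The function names and the sample sentences are assumptions invented for illustration.

```python
from collections import Counter, defaultdict


def train_bigram_model(text):
    """Count, for every word, which words follow it in the training text."""
    words = text.lower().split()
    follower_counts = defaultdict(Counter)
    for current_word, next_word in zip(words, words[1:]):
        follower_counts[current_word][next_word] += 1
    return follower_counts


def predict_next_word(model, word):
    """Return the most frequently observed follower of `word`, or None."""
    followers = model.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical training text, standing in for the huge corpora real models use.
corpus = (
    "the council approved the budget . "
    "the council approved the plan . "
    "the council rejected the proposal ."
)
model = train_bigram_model(corpus)
print(predict_next_word(model, "council"))  # prints "approved"
```

Real generative models learn far richer patterns than word pairs, but the training goal is similar in spirit: given what came before, predict what comes next.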

 

Module 2 – Discover The Data In Your Documents (November 27 – December 3, 2023)

Generative models talk to each other through text. Learn how to see your data in new ways and make it (and your newsroom) “AI ready”: convert your unstructured documents into structured formats with optical character recognition (OCR), then turn them into embeddings, the fundamental unit of meaning for generative AI models. Embed your articles, documents, sources, and more.

This module will cover:

  • What sorts of data machine learning models expect
  • Converting your non-textual data to structured formats suitable for language models
  • Ways to “embed” your data with the help of embedding models and vector stores
  • Office Hours: Tuesday and Wednesday at 2 PM CST.
  • In Conversation: John Keefe, weather data editor at the New York Times.
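
The idea of embedding documents and searching a vector store can be sketched in a few lines of plain Python. This is a deliberately simplified stand-in: the "embedding" below is just a word-count vector, whereas a real embedding model captures meaning rather than exact word overlap. The class name `ToyVectorStore`, the helper functions, and the sample headlines are all hypothetical.

```python
import math
from collections import Counter


def embed(text):
    """Toy 'embedding': a word-count vector stored as a Counter."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two word-count vectors (0.0 if either is empty)."""
    dot = sum(a[word] * b[word] for word in a if word in b)
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class ToyVectorStore:
    """Keeps (text, vector) pairs and returns the texts closest to a query."""

    def __init__(self):
        self.items = []

    def add(self, text):
        self.items.append((text, embed(text)))

    def search(self, query, k=1):
        query_vector = embed(query)
        ranked = sorted(
            self.items,
            key=lambda item: cosine_similarity(query_vector, item[1]),
            reverse=True,
        )
        return [text for text, _ in ranked[:k]]


store = ToyVectorStore()
store.add("city council votes on housing budget")
store.add("local team wins championship game")
store.add("storm warning issued for coastal towns")
print(store.search("housing budget vote"))
```

Because this toy version only matches identical words, "vote" does not match "votes"; closing that kind of gap is one reason to use a trained embedding model instead.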


Module 3 – Run And Use AI Models (December 4 – 10, 2023)

With your data cleaned and structured, it is now time to use generative models to transform your data in interesting and useful ways. Learn how to run a variety of multimodal models both in the cloud and on your local computer with LangChain, a framework for turning language models into conversational “agents” capable of many things: trawling your archives, summarizing documents, and rearranging your sources in new ways.

This module will cover:

  • Creating an agent with LangChain, a framework for developing applications with AI
  • Plugging your new agent into your vector store to create your very own research assistant
  • Giving your agent a custom personality
  • Extending your agent with new capabilities via tools and external APIs
  • Office Hours: Wednesday at 2 PM CST.
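
As a rough illustration of what "extending an agent with tools" means, the sketch below uses plain Python rather than LangChain, so none of these names come from that library. A simple keyword rule stands in for the language model that would normally decide which tool to call. The tool functions, `choose_tool`, `run_agent`, and the sample article are all assumptions made up for this example.

```python
def count_words(document):
    """Tool: report how many words a document contains."""
    return f"{len(document.split())} words"


def first_sentence(document):
    """Tool: return the text before the first period, with the period restored."""
    return document.split(".")[0].strip() + "."


# Registry mapping tool names to the functions the agent may call.
TOOLS = {"count_words": count_words, "first_sentence": first_sentence}


def choose_tool(request):
    """Stand-in for the language model: pick a tool name from keywords."""
    if "how many words" in request.lower():
        return "count_words"
    return "first_sentence"


def run_agent(request, document):
    """Pick a tool for the request, run it on the document, return the result."""
    tool_name = choose_tool(request)
    return TOOLS[tool_name](document)


article = "The mayor resigned on Monday. A special election will follow."
print(run_agent("How many words is this article?", article))  # prints "10 words"
print(run_agent("Give me the lede.", article))  # prints "The mayor resigned on Monday."
```

In a real agent framework the model itself reads the request and the tool descriptions before choosing, but the overall loop of choosing a tool, running it, and returning the result has the same shape.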

Module 4 – Putting It All Together (December 11 – 17, 2023)

Now that you’ve created your very own agent using LangChain, learn how to share it with the wider world by packaging and deploying it with the help of Hugging Face Spaces — an easy-to-use hosting platform for machine learning applications suitable for use in your newsroom.

This module will cover:

  • Giving your LangChain application a stylish interface with the help of Gradio
  • Customizing and styling the front end
  • Hosting your application online on your very own Hugging Face space
  • In Conversation: Freddy Boulton, software developer at Hugging Face.
  • Office Hours: Wednesday at 2 PM CST.

Sil Hamilton

AI researcher-in-residence at Hacks/Hackers

Sil Hamilton is AI researcher-in-residence at Hacks/Hackers, a network of journalists who rethink the future of news through talks, hackathons, and conferences.

A machine learning researcher at McGill University exploring the intersection of AI and culture, Sil has published research at NLP conferences like ACL, AAAI, and COLING. His work exploring the limits of language models has been discussed by Wired, The Financial Times, and Le Devoir.

Sil has given talks on AI and the newsroom at the Nieman Foundation for Journalism at Harvard; the Brown Institute for Media Innovation at Columbia; the Computer History Museum in Mountain View, California; and The Knight Center for Journalism in the Americas at the University of Texas at Austin.

Sil has consulted for The Associated Press on AI policies and serves as technology advisor at Health Tech Without Borders, a non-profit seeking to mitigate healthcare crises with digital tools.

Tools

Students will need a computer with an internet connection. The computer should be a laptop or desktop running an operating system like macOS, Windows, or Linux. Mobile devices like phones and tablets are not recommended, as the tools we will be using do not support mobile platforms.

We will be using the following resources:

JupyterLab Desktop, an all-in-one application for running language models in a Python environment. You can download the program for Windows and macOS, and we recommend doing so before the course starts. If your computer is older, we recommend using Google Colaboratory instead, which runs entirely online and requires a Google account.

Hugging Face, a website for accessing language and image models. We recommend making a free account to access certain features and models to be demonstrated in this course.

Certificate of Completion

A certificate of completion is available for students who meet all course requirements. The Knight Center will verify every week whether these requirements have been satisfied. Once verification is completed, participants will receive a confirmation message containing detailed instructions on downloading the certificate.

To be eligible for a certificate, you must:

  • Watch the weekly video classes and read the weekly readings
  • Achieve a minimum score of 70% on the weekly quizzes. Retaking the quizzes multiple times is permissible, and only the highest score attained will be recorded.
  • Create OR reply to at least one discussion forum each week

The certificate of completion is included in the $95 course fee. No formal course credit of any kind is associated with the certificate. 

Our certificate’s primary purpose is to acknowledge and validate a participant’s active involvement in the Knight Center for Journalism in the Americas online course.