Course 1: Data Literacy & Governance
As AI and automation become more widely adopted, financial services firms are discovering that many of their greatest risks stem not from the tools themselves but from the inputs those tools use. That includes how data is defined, interpreted, and governed before deployment.
Many guidelines for the responsible use of data and AI apply across industries and aren’t necessarily FINRA or SEC regulations. Still, the principle holds: stay compliant upstream, because AI and automation scale content downstream. Throughout the 11-lesson course, you’ll find resources covering rules, regulations, and best practices. Rather than focusing on models or technologies, Course 1 concentrates on the upstream decisions that shape accountability, supervision, and defensibility across the entire data lifecycle.
You’ll learn how regulators evaluate data use in practice, how interpretation introduces regulatory meaning, and how governance structures support consistent, explainable outcomes. The goal is to equip you to explain, govern, and defend how data is used within your organization, and to create a stable foundation for responsible analytics, automation, and AI.
-
Lesson 1.1: What Data Means in Finance
In regulated finance, data is not neutral input. This lesson establishes how regulators define data and why responsibility attaches long before analytics or AI are introduced.
-
Lesson 1.2: Data Types & Risk
Not all data carries the same regulatory risk. As data moves from common to alternative to derived forms, interpretive distance increases and governance requirements escalate. This lesson explains why.
-
Lesson 1.3: Data Lineage, Provenance, and Why Regulators Care
Regulators evaluate how data moved, changed, and was permitted to be used over time. This lesson shows why documentation of origin and transformation must exist before analytics or automation scale.
-
Lesson 1.4: Data Ownership vs. Data Use Rights
Owning or accessing data doesn’t automatically grant permission to use it in every context. In regulated finance, responsibility attaches to how data is applied, not who holds it. This lesson explains why.
-
Lesson 1.5: Supervisory Expectations Around Data Usage
Supervision applies wherever data influences decisions, not just where outputs are reviewed or published. Regulators expect oversight to follow data through analysis, interpretation, and internal use. This lesson explains how supervisory responsibility is defined, documented, and evaluated in practice.
-
Lesson 1.6: Bias, Sampling Error, and Model Risk in Financial Datasets
Bias and model risk aren’t introduced only by advanced analytics or AI. They often originate earlier, through how data is selected, sampled, and transformed. This lesson explains why regulators expect firms to recognize and document these risks even in simple, non-technical work.
-
Lesson 1.7: Data Minimization and Purpose Limitation
This lesson introduces data minimization and purpose limitation as core governance principles. It explains why firms are expected to define what data they collect, why it exists, how it may be used, and how long it is retained. It also shows why these boundaries must be established before analytics, automation, or AI are introduced.
-
Lesson 1.8: Documentation Standards: What Must Be Written Down
In regulated finance, undocumented decisions are treated as if they never occurred. Regulators rely on documentation to verify governance, supervision, and accountability across the data lifecycle. This lesson explains what data-related decisions must be documented, and why documentation must exist before analytics or automation scale.
-
Lesson 1.9: Record Retention and Audit Replay
Documentation only matters if it can be retrieved and reconstructed under examination. Regulators evaluate whether firms can replay decisions in context, using contemporaneous records that show how data was used, reviewed, and approved. This lesson explains why record retention and audit replay are essential to defensible data governance.
-
Lesson 1.10: Data Incidents: What Went Wrong in Real Firms
Most data incidents do not result from bad data or malicious intent. They arise when accurate information is reused, interpreted, or scaled without sufficient governance, documentation, or supervision. This lesson examines common patterns regulators see in real incidents and what firms are expected to learn from them.
-
Lesson 1.11: Governance Templates: Stewardship, Approvals, and Reviews
Effective data governance requires more than principles or policies. Regulators expect firms to define responsibility, control new uses, and supervise data through repeatable structures. This lesson introduces governance templates that make oversight consistent, scalable, and defensible.
-
Course Wrap-Up
This final lesson ties together the core ideas from Course 1, showing how regulators evaluate data across its full lifecycle: from collection and interpretation to governance, documentation, retention, and reuse. The key takeaway is that most regulatory risk begins long before AI or automation, with everyday data decisions that shape how meaning, accountability, and control are established.
-

