AI and Ethics – Principles, Fairness, and Regulation

Artificial Intelligence is a powerful tool – and like any powerful tool, its use comes with great responsibility. AI and machine learning systems are now being deployed in areas that directly affect people’s lives: hiring, finance, healthcare, law enforcement, education, and beyond. This amplifies the importance of ethics in AI – we must ask not only “Can we do it?” but “Should we do it, and how?”. In this chapter, we explore what it means to align AI with human values and rights. We discuss definitions of fairness, sources of bias in the AI pipeline, and emerging best practices to mitigate these issues. We also survey major international principles and regulations that have been proposed or enacted to guide the ethical development and use of AI.

Why AI Ethics Matters

Some of the stories from the previous chapter likely underscore why ethics in AI is crucial. When a recruitment algorithm discriminates by gender or an exam-grading algorithm unfairly disadvantages certain students, it becomes evident that AI decisions can have profound moral and social impact. At its core, AI ethics is about ensuring AI systems are aligned with our social and moral values, and that they operate in a manner that is beneficial and fair to individuals and society. Key considerations include:

  • Fairness and non-discrimination: AI should not treat people unfairly or unequally on the basis of characteristics like race, gender, age, etc., without justification. Unintended bias needs to be detected and corrected, so that systems do not perpetuate historical discrimination.
  • Transparency and explainability: There should be clarity about how AI systems make decisions. If you are denied a loan or a job by an algorithm, ideally you should know why, or at least have the decision be challengeable. Opaque “black boxes” can be problematic, especially in critical areas.
  • Accountability: There must be an answer to “Who is responsible if something goes wrong?” Is it the developer, the company deploying the AI, or the AI itself (which legally can’t be held accountable)? Ethical AI frameworks insist that accountability lies with the humans and organizations that design and deploy the system.
  • Privacy: AI systems often rely on large amounts of data. Respecting privacy and securing data is an ethical imperative. Misuse of personal data, or surveillance without consent, are major concerns. (For example, face recognition used pervasively in public without oversight raises serious privacy issues.)
  • Human autonomy: AI should augment human decision-making, not undermine it. For example, lethal autonomous weapons that decide whom to target without human approval raise deep ethical questions. Even in consumer tech – say, a content recommendation algorithm that nudges someone toward addiction or extreme beliefs – we must ask if the AI is respecting the user’s autonomy and well-being.
  • Safety and security: AI systems, particularly those in physical domains (self-driving cars, medical devices) or critical infrastructure, need to be safe and robust. Faulty behavior can cause physical harm. Additionally, AI systems should be secured against attacks or misuse (imagine someone hacking an AI traffic control system – the results could be catastrophic).
  • Human dignity and rights: Fundamentally, AI ethics aligns with human rights. AI should not be used to undermine human dignity. For instance, social scoring systems that rank citizens (as seen in some controversial surveillance implementations) risk treating people as data points rather than individuals with rights.

Joanna Bryson, an AI ethics researcher, famously remarked that “AI is not too different from any other technology; it’s just so powerful and amplifying that it forces us to face who we are.” In other words, AI will reflect and magnify our values – good or bad – at scale. That puts the onus on us to be very intentional about those values when we design and deploy AI.

Understanding Bias and Fairness in the AI Pipeline

To create ethical AI, a key first step is understanding where things can go wrong. A recent framework by Suresh and Guttag (2021) outlined several stages in the machine learning pipeline where bias or harm can be introduced. Let’s briefly examine these potential sources of unfairness or harm:

  • Historical Bias (in Data Collection): The world itself can be biased, and data collected from the world will reflect that. For example, historical hiring data reflected gender bias in tech (fewer women were hired), so any model trained on it would inherit that bias. Historical bias isn’t caused by the AI per se; it’s in the input data. But it becomes part of the AI’s “DNA.” Another example: crime data might show more arrests in certain neighborhoods not only because of true crime rates but because of biased policing practices. An AI predicting crime based on that data would reinforce the policing bias. Lesson: We need to scrutinize what our data represents. Is it an accurate, fair picture of what we want to model, or is it a mirror of past injustices?

  • Representation Bias (Sampling): This occurs if the data collected doesn’t represent the population that the model will serve. Suppose you build a health diagnostic AI using data mostly from male patients; it might perform poorly for female patients. Or an image recognition system trained mostly on lighter-skinned faces will do badly on darker-skinned faces – which indeed happened with early facial recognition systems. One landmark study (Gender Shades) found error rates under 1% for gender classification on light-skinned male faces, but as high as 34% on dark-skinned female faces. Ensuring diverse and representative data is crucial. Otherwise, the model will systematically disadvantage under-represented groups – in effect treating a diverse population as if everyone looked like the majority in the data.

  • Measurement Bias (Labeling and Features): Sometimes the features or labels we use are problematic proxies. For instance, using ZIP code as a feature in a credit scoring model might indirectly encode race or socioeconomic status, leading to “redlining” effects (denying loans to certain neighborhoods). Or consider a label like “creditworthiness” – if defined by past loan repayment data, it might incorporate bias if certain groups were unfairly denied loans in the past (meaning we lack data on their repayment capability, or those who got loans are a skewed sample). Measuring the wrong thing, or measuring in a skewed way, leads to biased outcomes. Another example: in hiring, using “years of experience” as a key feature might seem neutral, but if women often had career breaks or faced past exclusion, that feature could indirectly disadvantage them.

  • Aggregation Bias (Modeling): This refers to using one model for groups that really have different data patterns. For example, a healthcare diagnostic algorithm might perform better for one demographic group than another if important differences aren’t accounted for. The “one-size-fits-all” approach can fail if the population is heterogeneous in ways that affect the prediction. The solution might be to have group-specific models or at least include group attributes so the model can adjust, though doing so raises its own fairness questions (explicitly using a sensitive attribute can be controversial, even if the intent is to improve fairness).

  • Learning Bias (Objective Function & Optimization): The choice of objective function and how we train the model can introduce bias. If a model is trained just to maximize overall accuracy, it might sacrifice performance on minority groups because it can achieve higher accuracy by focusing on the majority. For instance, if 90% of training data is one class and 10% another, a classifier could be 90% accurate by always predicting the majority class – but that means it’s 0% accurate on the minority class. A naive training process might thus encode a form of bias by not prioritizing performance equity. Additionally, many algorithms assume data are IID (independent and identically distributed); if this isn’t true (e.g., if certain patterns vary by subgroup or over time), the learned model might be biased or brittle. A closely related pitfall is evaluation bias: the benchmark data or metrics used to judge the model might not capture fairness or real-world performance for all groups – a single aggregate number can hide a complete failure on a subgroup, as the sketch after this list illustrates.

  • Deployment Bias: This occurs when the model, once deployed, is used in a context or manner not originally intended, leading to harm. For example, a predictive policing model might have been intended to allocate resources, but if it’s used punitively (e.g., justifying heavier policing or surveillance in an area without broader context), it can cause a feedback loop of reinforced bias. Or using an algorithm designed for one population on a different population without recalibrating can be problematic. Deployment also covers feedback loops – how the model’s outputs can change the world and thus the future data. A classic case: if a loan algorithm disproportionately denies a certain demographic, those people never get a chance to build credit, and the model’s bias is reinforced by the subsequent data (it “learns” that those people have no credit history, justifying future denials). Similarly, in predictive policing, if an area is flagged by the algorithm, police are sent there more often and will record more incidents (not necessarily because the true crime rate was higher, but because they looked more), and then the new data “proves” the area is high-crime, prompting further enforcement – a self-fulfilling prophecy.
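
The learning-bias and evaluation-bias points above are easy to see with a few lines of code. The sketch below uses invented numbers purely for illustration: on a 90/10 imbalanced dataset, a classifier that always predicts the majority class scores 90% overall accuracy, while a per-slice breakdown shows it is useless on the minority class – the same kind of breakdown, computed per demographic group, is how evaluation bias is caught.

```python
# Illustrative sketch with invented numbers: 90% of examples are class 1,
# 10% are class 0 (e.g., "no default" vs "default").
y_true = [1] * 90 + [0] * 10

# A naive model trained to maximize overall accuracy can get away with
# always predicting the majority class.
y_pred = [1] * 100

overall = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"Overall accuracy: {overall:.2f}")   # 0.90

# Per-class (or per-group) breakdown exposes what the aggregate hides.
for label in (1, 0):
    idx = [i for i, t in enumerate(y_true) if t == label]
    acc = sum(y_true[i] == y_pred[i] for i in idx) / len(idx)
    print(f"Accuracy on class {label}: {acc:.2f}")  # 1.00, then 0.00

# Evaluation bias works the same way: if only the aggregate metric is
# reported, a model that fails completely for a demographic subgroup can
# still look excellent on paper. Always slice metrics by group.
```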

Understanding these phases helps identify where interventions are possible: improving data collection, adjusting feature choices, revising model objectives, etc. To systematically address fairness, researchers have also defined various fairness criteria. It turns out fairness can be defined in multiple ways, and not all definitions can be satisfied simultaneously – there are trade-offs (sometimes even formal impossibility results). A few common notions are listed below; a short sketch after the list computes several of them on a toy example:

  • “Blindness” (Anti-classification): The idea that the model should not use protected attributes (like race, gender) at all – essentially, treating everyone the same way. This is a common intuition (fairness as attribute-blindness). For example, a hiring algorithm that explicitly ignores gender. While this can prevent direct discrimination, it doesn’t guarantee fairness – because proxies for the sensitive attribute might still be in play, and treating everyone the same can actually preserve disparities if the underlying data were biased. In practice, simply being “blind” to a characteristic doesn’t ensure fair outcomes.

  • Group-Specific Treatment or Thresholds: One approach to improve fairness is to allow the decision rules to differ by group, to offset biases in data. For instance, if an algorithm scores loan applicants and women’s scores are on average lower due to biased historical data, one might approve women with a slightly lower score threshold than men to equalize acceptance rates or outcomes. This is akin to corrective affirmative action in the model. It’s controversial to some (as it explicitly uses protected attributes to change decisions), but it acknowledges that a one-size-fits-all threshold can perpetuate bias present in the data.

  • Demographic Parity (Statistical Parity): This criterion demands that the model’s positive prediction rate is the same across groups. For example, if 70% of men are approved for a loan, then ~70% of women should be approved as well. It doesn’t mean individuals are treated identically regardless of group, but it ensures no group is disproportionately selected (or rejected). The drawback is that parity can be achieved in ways that might seem unfair at an individual level (it doesn’t consider qualifications or true outcomes, just the rates). Also, if the actual qualification rates differ by group (due to external factors or past inequities), forcing parity might harm overall accuracy or even the qualified individuals in a group (by approving some unqualified people from that group to meet the quota).

  • Equal Opportunity: Proposed by Hardt et al. (2016), this focuses on the true positive rate being equal across groups. In a lending context: “Of those who would repay a loan (truly creditworthy), the same fraction of men and women should be approved.” This ensures that qualified people have equal chance, regardless of group – a model shouldn’t disproportionately miss true positives in one group. Equal Opportunity is a relaxation of Equalized Odds, which would require both TPR and FPR (false positive rate) to be equal across groups. Equalized Odds means the model’s error rates are identical across groups (so it doesn’t more often wrongly deny one group or wrongly grant to another). Equalized Odds is stricter; Hardt et al. argued that equal opportunity (just equal TPR) may be the more relevant guarantee in many cases (especially where the negative outcome is, say, denying a loan or treatment – we care more about not missing those who deserve positive outcomes).

  • Predictive Parity (Calibration): Another notion is that the model’s predictive value should be equal across groups. For example, if the algorithm assigns risk scores, a given score should correspond to the same likelihood of the actual outcome regardless of group (this was a point of contention in the famous COMPAS case on recidivism predictions). Predictive parity sometimes conflicts with equalized odds: the COMPAS debate showed you can’t have both calibration and equal error rates unless base rates are equal.

  • Overall Accuracy Equality: We might also ask that the accuracy (or some error rate) be equal for each group. For instance, the model is 90% accurate for both Group A and Group B. This can be a baseline check, but it’s not very nuanced: a model could have equal overall accuracy yet still be making more of one type of mistake for one group than another (e.g., more false negatives for one group and more false positives for another, which might be masked by equal aggregate accuracy).
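
To make these definitions concrete, here is a small sketch using invented confusion-matrix counts for two hypothetical groups A and B. It computes the quantities behind the criteria above: selection rate (demographic parity), true positive rate (equal opportunity), false positive rate (together with TPR, equalized odds), and positive predictive value (predictive parity). The counts are chosen so the two groups have different base rates, which previews the tension discussed next: even with identical TPR and FPR, the PPVs come out different.

```python
# Invented confusion-matrix counts for two hypothetical groups.
# Group A has a higher base rate (50% positives) than group B (25%).
groups = {
    "A": {"TP": 80, "FP": 20, "FN": 20, "TN": 80},
    "B": {"TP": 40, "FP": 30, "FN": 10, "TN": 120},
}

def metrics(c):
    total = c["TP"] + c["FP"] + c["FN"] + c["TN"]
    return {
        "base rate":      (c["TP"] + c["FN"]) / total,    # share of actual positives
        "selection rate": (c["TP"] + c["FP"]) / total,    # demographic parity compares this
        "TPR":            c["TP"] / (c["TP"] + c["FN"]),  # equal opportunity compares this
        "FPR":            c["FP"] / (c["FP"] + c["TN"]),  # equalized odds adds this
        "PPV":            c["TP"] / (c["TP"] + c["FP"]),  # predictive parity compares this
    }

for name, counts in groups.items():
    line = ", ".join(f"{k}={v:.2f}" for k, v in metrics(counts).items())
    print(f"Group {name}: {line}")

# Output (rounded):
#   Group A: base rate=0.50, selection rate=0.50, TPR=0.80, FPR=0.20, PPV=0.80
#   Group B: base rate=0.25, selection rate=0.35, TPR=0.80, FPR=0.20, PPV=0.57
# TPR and FPR are identical (equalized odds holds), yet selection rates and
# PPV differ, because the base rates differ -- a concrete preview of the
# impossibility results discussed below.
```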

The sobering realization from formal studies is that you generally cannot satisfy all fairness criteria at once (except in trivial cases, such as if groups truly have identical distributions or outcomes). This was demonstrated by multiple researchers (e.g., an “impossibility theorem” by Kleinberg, Mullainathan, and Raghavan, and by Chouldechova, in 2016-2017): for example, if base rates (prevalence of the outcome) differ between groups, you cannot have both predictive parity (calibration) and equal error rates across groups unless the classifier is perfect. Thus, deciding on a fairness goal often involves value judgments and context. Different domains might prioritize different metrics:

  • In hiring or college admissions, some stakeholders argue for a form of parity in selection rates (to ensure opportunities are distributed).
  • In medical diagnostics, one might prioritize equal opportunity or equalized odds (so that no group is less likely to get a correct diagnosis or treatment if needed), even if that means some parity in false alarms is compromised.

Beyond statistical measures of fairness in outcomes, there’s also fairness in process. For instance, procedural fairness might mean giving people the opportunity to contest a decision or to have a human review. Even if outcomes are balanced, an AI decision process might feel unfair if people have no say or understanding in how it was made. This is why many AI ethics guidelines include transparency and a “human-in-the-loop” for high-stakes decisions – to ensure there is recourse and that people don’t feel subject to a faceless algorithm with no appeal.

It’s also worthwhile to consider what some call the “bias mirror” problem – AI often holds a mirror to society. When evaluating an AI system’s impact, we might ask: “Is the AI making things worse or better compared to human decision-makers in the same task?” Sometimes algorithms can reduce human bias (if designed carefully). For example, a well-tuned algorithm for screening job candidates might ignore demographic cues that human interviewers (even unconsciously) use, potentially improving diversity in hiring. There have been cases reported where algorithmic selection improved gender balance in certain internship or grant selections relative to historical human decisions (because the AI, unlike some humans, did not systematically undervalue the female candidates). On the other hand, a poorly designed AI can amplify bias – doing systematically worse than humans would, and at a larger scale – all while appearing objective (“the computer says so, it must be fair”). This veneer of objectivity can make algorithmic bias harder to detect or challenge.

Another concern is model blind spots. These are regions or scenarios where the model is confidently wrong. For example, an autonomous car’s vision system might misclassify an unusual object or condition (there’s a famous anecdote of a self-driving car not recognizing a kangaroo properly because the way it jumps confused the detection system). For fairness, a blind spot might mean the model works poorly for a subgroup it didn’t “see” much during training – say, a voice assistant that struggles with certain accents or dialects not present in training data. Humans might catch these odd cases or approach them with caution, whereas a pure AI system might barrel ahead. This underscores why human oversight and rigorous testing are important, especially in safety-critical systems.

Researchers have been developing fairness-aware ML algorithms to address bias. Techniques include:

  • Pre-processing: Modify the training data to reduce bias, e.g. re-sampling or re-weighting data to better represent minority groups, or “debiasing” features (removing components correlated with sensitive attributes).
  • In-processing: Change the learning algorithm itself, e.g. adding a fairness constraint or regularizer to the objective. For instance, train the model with an added penalty if the predictions deviate too much from parity or equal opportunity between groups. This way the model directly learns to balance accuracy with fairness goals.
  • Post-processing: Adjust the model’s outputs after training. One example is the method by Hardt et al. (2016), which takes any classifier’s scores and finds group-specific decision thresholds to satisfy equalized odds or opportunity. In practice, this could mean using a lower score cutoff for a historically disadvantaged group to equalize true positive rates; a minimal sketch of this thresholding idea follows the list.
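
As a rough illustration of the post-processing idea (not the exact Hardt et al. algorithm, which derives thresholds via a small optimization over group-specific ROC curves), the sketch below picks, for each group, the highest score threshold whose true positive rate on validation data still reaches a common target. All scores, labels, and group assignments are invented for the example.

```python
# Hypothetical validation data: model scores, true labels, and group labels.
scores = [0.9, 0.8, 0.75, 0.6, 0.55, 0.4, 0.85, 0.7, 0.5, 0.45, 0.35, 0.2]
labels = [1,   1,   0,    1,   0,    0,   1,    1,   1,   0,    0,   0  ]
groups = ["A", "A", "A",  "A", "A",  "A", "B",  "B", "B", "B",  "B", "B"]

def tpr_at_threshold(thr, grp):
    """True positive rate for one group when approving scores >= thr."""
    pos = [(s, y) for s, y, g in zip(scores, labels, groups) if g == grp and y == 1]
    approved = sum(1 for s, _ in pos if s >= thr)
    return approved / len(pos) if pos else 0.0

def pick_threshold(grp, target_tpr, candidates):
    """Highest candidate threshold whose TPR still meets the target (a greedy sketch)."""
    for thr in sorted(candidates, reverse=True):
        if tpr_at_threshold(thr, grp) >= target_tpr:
            return thr
    return min(candidates)

target = 0.66  # desired true positive rate for every group
thresholds = {g: pick_threshold(g, target, set(scores)) for g in ("A", "B")}
print(thresholds)   # e.g. {'A': 0.8, 'B': 0.7}

# The groups end up with different cutoffs, which is exactly the point:
# a single global threshold would give them different true positive rates.
```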

A key takeaway is that context matters – what fairness means and which techniques are appropriate will depend on the application and the values of the stakeholders involved. Engaging those affected (the stakeholders or protected groups) in defining fairness objectives is increasingly seen as important. Fairness in a criminal justice risk assessment might be defined and handled differently than fairness in a credit scoring tool or an online advertising algorithm.

To wrap up this section: achieving fairness in AI is not a one-time fix but an ongoing process. It requires:

  • Good data practices (collecting diverse, high-quality data and being mindful of historical bias),
  • Thoughtful modeling (choosing objectives and model forms that align with fairness goals),
  • Rigorous evaluation (measuring performance separately for different groups and looking at multiple fairness metrics, not just overall accuracy),
  • Possibly incorporating technical fairness interventions (pre/in/post-processing as needed),
  • And considering non-technical steps: transparency, user consent, and avenues for redress.

The encouraging news is that awareness of these issues is higher than ever in the AI community. Conferences now often require an ethics impact statement; companies are creating “Responsible AI” teams to audit products; and practical toolkits have emerged (e.g., IBM’s AI Fairness 360 toolkit for bias detection/mitigation, and Google’s What-If Tool which allows developers to visually probe model decisions and check fairness metrics). These developments are helping translate fairness principles into practice.

Principles and Frameworks for Ethical AI

In response to both the concerns and the promise of AI, various organizations – from research institutes to governments – have proposed high-level principles to guide AI development. Remarkably, many of these principles converge on similar themes, echoing classic ethical and human-rights values. We will review some of the prominent frameworks:

The Montréal Declaration (Canada, 2017)

The Montréal Declaration for Responsible AI (2017) was one of the early comprehensive sets of AI ethics principles, developed through a multi-stakeholder, crowdsourced process in Quebec. It outlines 10 principles grounded in fundamental values:

  1. Well-being: AI should serve to enhance the well-being of all sentient beings. For example, AI in healthcare should improve health outcomes; AI in environmental management should help the planet. (The Declaration emphasizes that AI systems “must permit the growth of the well-being of all sentient beings” and not become a source of ill-being.)
  2. Respect for Autonomy: AI should respect people’s autonomy and freedom of choice. This means AI shouldn’t coerce, deceive, or manipulate people against their will. For instance, users should know when they are interacting with an AI (no covert bots impersonating humans), and AI systems should be designed to give users control (e.g. the ability to opt out or override automated decisions).
  3. Privacy and Intimacy: AI must protect privacy. Data acquisition and use should not unjustifiably intrude into people’s private lives. Individuals should have control over their personal data used by AI. The Declaration calls out the need to safeguard the “intimacy of thoughts and emotions” from AI analysis without consent. Misuses of personal data or excessive surveillance are to be avoided.
  4. Solidarity: AI should promote solidarity and inclusion, helping to reduce inequalities. The benefits of AI should be shared broadly, not only by a privileged few. For example, if AI increases productivity, it should ideally enable better services or support for the disadvantaged, not just increase corporate profits. AI should also be used to support social safety nets and community values.
  5. Democratic Participation: There should be democratic debate and oversight regarding AI. The public should have a say in how AI is deployed, especially in government and civic contexts. Decisions made by AI that affect people’s rights or opportunities should be intelligible and justifiable to those affected. Transparency to public authorities and the possibility of audits are highlighted.
  6. Equity: AI should be equitable – it should not create or worsen unfair inequalities among individuals or groups. It should be designed to not discriminate or create new forms of domination or exclusion. Equity also implies accessibility: important AI technologies (like health or education AI) shouldn’t be available only to the rich; efforts should be made to make them broadly accessible to reduce social gaps.
  7. Diversity and Inclusion: AI development should involve diverse stakeholders, and AI systems should be mindful of cultural and social diversity. They should not force a homogenization of values or lifestyles. In practice, this means including people of different backgrounds in AI design, and ensuring AI works well for different languages, cultures, and demographic groups. It also cautions against filter bubbles or profile locking – AI should not unduly narrow an individual’s opportunities or exposure to diverse content.
  8. Prudence (Caution): A principle of caution means we should proactively think about the potential risks of AI and mitigate them. Before deploying AI in sensitive areas, it should undergo testing and safeguards to ensure safety. The Declaration even suggests restricting certain AI research or dissemination if misuse could pose grave risks (akin to how one might handle dual-use technologies). In short, move fast and break things is not acceptable for AI when human lives or social stability are at stake – prudence and foresight are required.
  9. Responsibility: Those who design, deploy, or use AI must take responsibility for it. Humans remain accountable for AI-driven outcomes. For example, if an AI recommends an action, the ultimate decision should, in critical cases, be made by a human who can be held responsible (the Declaration explicitly says decisions like the choice to use lethal force must always remain with humans). It also implies that developers should follow professional codes of conduct and that there should be accountability mechanisms (like audit trails or liability frameworks) so that harms can be addressed.
  10. Environmental Sustainability: AI should be developed and used with an eye towards environmental impact. This principle, increasingly noted in recent years, reminds us that training large AI models consumes significant energy and that AI hardware relies on resource-intensive supply chains. The Declaration urges minimizing energy consumption and e-waste, and using AI in service of environmental goals (not to accelerate ecological harm).

These Montréal principles collectively map to widely accepted ethical domains – well-being (beneficence), autonomy, justice (equity), privacy, accountability, etc. The declaration is also noteworthy for its process: it was developed via public workshops and consultations, which gave it a certain democratic legitimacy (citizens’ voices were included in shaping the principles). It was intended as a “living document,” to be updated as AI evolves. While not legally binding, it has influenced policy discussions and ethical charters in Canada and beyond.

The OECD AI Principles (2019)

The OECD (Organisation for Economic Co-operation and Development) AI Principles were adopted in May 2019 by 42 countries (OECD members and others) and later endorsed by the G20, making them a significant international consensus. They comprise five broad value-based principles for AI, which are very much in harmony with the Montreal Declaration and other frameworks:

  1. Inclusive growth, sustainable development and well-being: AI should benefit people and the planet by driving inclusive economic growth, improving societal welfare, and advancing sustainability. This principle envisions AI as a tool for positive social impact – reducing inequality, improving quality of life, and helping to achieve the UN Sustainable Development Goals. For example, AI applications in education could help broaden access for underserved communities, or AI in environmental science could help combat climate change. It warns against AI that only benefits a small segment or exacerbates disparities.
  2. Human-centered values and fairness: AI should respect the rule of law, human rights, and democratic values, throughout its lifecycle. This includes principles of non-discrimination and equality, liberty, privacy, and social justice. In practice, this means AI systems should be designed in a way that upholds these values and does not unfairly undermine them. The OECD explicitly includes fairness here – meaning AI should be designed to avoid unjust bias and to treat people equitably. It also calls for mechanisms like human oversight when necessary to ensure these values are respected. (Notably, the OECD uses the term “human-centered” – similar to EU’s later term “trustworthy AI” – emphasizing that AI should ultimately serve humanity, not the other way around.)
  3. Transparency and explainability: There should be transparency and responsible disclosure regarding AI systems. Stakeholders should be able to understand when AI is being used and obtain appropriate information about how it works (to the extent possible). For example, users have a right to be notified when they are interacting with an AI (a chatbot, a decision system) rather than a human. Moreover, those adversely affected by an AI decision should be able to get an explanation sufficient to challenge or seek redress. Transparency can also mean documenting the design and training of AI so that regulators or auditors can inspect it. The OECD acknowledges that full explainability might not always be achievable (some AI like deep neural nets are complex), but the spirit is to strive for meaningful transparency.
  4. Robustness, security and safety: AI systems must be robust, secure, and safe throughout their lifecycle. This principle demands that AI should be tested and assured against a range of conditions – it should reliably do what it’s intended to, withstand cyberattacks or manipulation, and fail safely if something goes wrong. For instance, an AI in a self-driving car should handle not just ordinary scenarios but also edge cases (a sudden obstacle, a sensor glitch) in a way that minimizes harm. Robustness includes resilience to both unintentional failures and intentional misuse. This also implies continuous monitoring and evaluation of AI systems once deployed, to ensure safety is maintained over time.
  5. Accountability: AI actors (whether organizations or individuals) should be accountable for the proper functioning of AI systems and compliance with the above principles. There needs to be accountability mechanisms – such as audits, risk assessments, or the ability to appeal decisions. This principle makes it clear that saying “the AI did it” is not an excuse; those who develop or deploy AI must be held responsible for its impacts, and governance structures should reflect that. For example, a company using an AI recruiting tool should be accountable for biases or errors in that tool as if a human were making the decisions, and they should have processes to regularly check and mitigate such issues.

In addition to these five principles, the OECD document provides recommendations to governments on how to foster a trustworthy AI ecosystem (invest in R&D, support workforce training, enable data sharing, etc.). The OECD principles have been highly influential; they provided a blueprint for the G20 AI Principles and informed the European approach. They also led to the OECD launching an AI Policy Observatory to track national AI policies and progress on implementing these principles.

The European Union: “Trustworthy AI” Guidelines and the AI Act

The European Union has been very active in the AI ethics and governance space. In 2019, the EU’s High-Level Expert Group on AI released Ethics Guidelines for Trustworthy AI. They articulated that trustworthy AI rests on three pillars: it should be lawful (complying with all laws), ethical (adhering to ethical principles), and robust (technically and socially robust). They then listed 7 key requirements that operationalize these principles:

  1. Human Agency and Oversight: AI systems should empower people and respect human autonomy. Humans should be able to intervene or oversee AI when appropriate (“human-in-the-loop” or “human-in-command”). Important decisions should not be left entirely to automated systems without possibility of human review. For example, an AI medical diagnosis system might assist a doctor, but the doctor remains the final authority and can override the AI’s suggestion. This also means AI shouldn’t nudge or manipulate people in ways that undermine their agency – e.g., always giving users the ability to opt out of automated decisions if feasible.
  2. Technical Robustness and Safety: AI needs to be resilient and secure. It should be reliable, with a fallback plan if something goes wrong. This includes consideration of adversarial attacks, potential misuse, and general reliability. For instance, an AI in a power grid should be tested for how it handles unusual spikes or faults; if it fails, it should fail gracefully (not catastrophically). Robustness also covers accuracy and reproducibility – the system should perform as intended consistently and not be overly fragile.
  3. Privacy and Data Governance: AI must respect privacy and ensure adequate data protection. This not only means complying with privacy laws (like GDPR in Europe) but also ensuring proper data governance – quality of data, integrity, and access controls. For example, data used to train AI should be obtained lawfully and with consent where required, stored securely, and there should be measures to prevent unnecessary or prolonged retention of personal data. Privacy by design is emphasized (i.e., building systems that minimize use of personal data or use privacy-preserving techniques).
  4. Transparency: AI systems should be transparent about their operations, limitations, and outputs. This includes traceability of the AI’s processes (developers should document how the model was trained, what data, what algorithms, etc., so that its development can be traced if needed). It also includes explainability – individuals have the right to know the reasons behind AI decisions affecting them, in a way they can understand. Moreover, people should know when they are interacting with an AI and not assume it’s human. For instance, if a chatbot is handling customer service, it should disclose that it’s an AI. If an AI rejects your loan, there should be a way to get an explanation like “your debt-to-income ratio exceeded the acceptable threshold.”
  5. Diversity, Non-discrimination, and Fairness: AI should be inclusive and avoid unfair bias. It should be usable by and deliver benefits to a wide range of people – regardless of age, gender, abilities, or other characteristics. In design terms, that means involving diverse teams in development and testing across different user groups. The system’s decisions should be periodically checked for bias or disparate impact, and steps taken to mitigate any inequities found. Accessibility is also part of this – e.g., ensuring AI tools have interfaces that people with disabilities can use (for example, screen reader compatibility, or if it’s a voice assistant, that it works across accents and speech impairments as much as possible).
  6. Societal and Environmental Well-being: AI should benefit all of society and the environment, not just individuals or specific narrow interests. This principle encourages assessing the broader impact of AI deployments – for example, will an AI-driven automation cause large job losses in a community, and if so, are there retraining programs or other mitigations? AI should be aligned with sustainable development: using AI to improve environmental outcomes (smart grids, climate modeling) rather than worsen them, and being mindful of AI’s own carbon footprint. It also means considering the social impact: for instance, an AI-driven content platform should consider its effect on public discourse or mental health.
  7. Accountability: There should be mechanisms to ensure responsibility and accountability for AI systems and their outcomes. This could involve auditability – enabling third parties or regulators to audit algorithms and data. It also involves having redress mechanisms: if someone is harmed or wronged by an AI decision, they should have an avenue to appeal or correct it. Organisations should conduct impact assessments for their AI (similar to privacy impact assessments), and there should be oversight bodies or processes to ensure compliance with these principles. In practice, this might mean internal AI ethics boards, external audits, or even regulatory supervision for high-stakes AI.

These EU guidelines were non-binding, but they heavily influenced subsequent policy. Building on the ethics guidelines, the EU moved to legislate: the EU AI Act (proposed in 2021 and formally adopted in 2024). The AI Act is a landmark attempt to regulate AI by categorizing applications by risk. It includes:

  • Unacceptable Risk AI: Certain AI applications are outright banned for being contrary to fundamental rights or safety. Examples include AI for social scoring of citizens (as practiced in some form by China) and AI for real-time biometric identification in public (e.g., live facial recognition by police in crowds), with very narrow exceptions. Also banned is AI that involves subliminal techniques or manipulative methods that can cause physical or psychological harm, and systems that exploit vulnerabilities of specific groups (like children, persons with disabilities) to materially distort their behavior. These are uses of AI that the EU deems have no place in a human-rights-respecting society.
  • High Risk AI: These are AI systems that aren’t banned but are seen as having a high potential for harm if not properly managed. They are allowed only if they comply with strict requirements and undergo assessment. The Act’s annex defines which applications fall here, including AI used in critical infrastructure, education (e.g. scoring exams), employment (screening or evaluating candidates), credit lending, law enforcement (certain analytic tools), border control (e.g., lie detectors at borders), judicial decision-making, and medical devices. For instance, an AI CV-scanning tool used in hiring or an AI system used by judges to inform sentencing would be high-risk. These systems will have to meet requirements such as: having a risk management system, high-quality training data to minimize bias, transparency to users, human oversight, robustness, and accuracy. Providers must also maintain documentation (technical documentation, logs, etc.) for audit purposes. They will likely have to go through a conformity assessment before deployment (similar to how electronics get a CE mark for safety in Europe).
  • Limited Risk AI: These include systems that interact with humans but are not high-risk—like chatbots or AI that generates deepfakes. They are not subject to strict requirements, but do have some transparency obligations. For example, a chatbot must disclose to users that it is AI and not human. Likewise, if an image or video is AI-generated or manipulated (a “deepfake”), it should be disclosed (unless it’s for certain authorized purposes like satire or security research). This category essentially covers AI where the main risk is that users might be misled or not realize they are dealing with AI. A notice is deemed sufficient mitigation.
  • Minimal Risk AI: All other AI systems, like most recommender systems, video game AIs, spam filters, etc., which pose minimal risk to rights or safety. These face no new obligations under the Act. The vast majority of AI applications today likely fall in this category. The EU did not want to stifle innovation for low-risk use cases, so it largely leaves them unregulated (aside from existing laws). The Act does encourage voluntary codes of conduct for such AI, but nothing mandatory.

The AI Act also has provisions for oversight and enforcement, including fines (in the final text, up to 7% of global annual turnover for the most severe violations, somewhat analogous to GDPR fines). The Act was formally adopted in 2024, and its obligations phase in over the following years; as of 2025, debate continues over issues such as how to treat general-purpose AI (like GPT-type models) and how to make the requirements practical in implementation.

The EU approach, combining broad ethical principles with a binding regulation (the AI Act), is being closely watched globally. It is the first comprehensive AI regulation in the world, and it may set a de facto standard (any company selling AI products into the EU will have to comply, which could influence what they do elsewhere).

It’s notable that the EU AI Act’s philosophy is risk-based: regulate more where potential harm is greater. This is similar to how we regulate, say, drugs or airplanes more strictly than household appliances. The high-risk categories map closely to areas where ethical concerns are strongest (hiring – bias; law enforcement – rights; healthcare – safety; etc.), essentially putting into law many of the ethical principles we discussed.

United States Initiatives

In the United States, the approach to AI ethics has so far been more piecemeal. There isn’t (yet) a single federal law like the EU’s AI Act. However, various initiatives and sector-specific steps capture a growing focus on AI ethics:

  • White House “Blueprint for an AI Bill of Rights” (2022): The U.S. Office of Science and Technology Policy (OSTP) released a Blueprint for an AI Bill of Rights in October 2022, which is a non-binding set of principles for the design and deployment of AI systems. It outlines five core protections:

    1. Safe and Effective Systems: You should be protected from unsafe or ineffective AI systems. This means AI should be tested pre-deployment for safety and potential risks, with input from diverse communities and domain experts. For instance, an AI system used in hospitals should undergo rigorous trials similar to a medical device. The principle also implies continuous monitoring of AI performance and the ability to shut down or adjust systems that are not working as intended.
    2. Algorithmic Discrimination Protections: You should not face discrimination from AI; algorithms should be designed and used in an equitable way. Developers should proactively assess and mitigate bias (for example, performing bias audits and making the results public when possible). This principle explicitly connects to civil rights – existing laws against discrimination (in credit, employment, housing, etc.) still apply if AI is making the decisions. The Blueprint suggests techniques like algorithmic impact assessments, similar to environmental impact reports but for algorithms, to evaluate potential disparate impacts before deployment.
    3. Data Privacy: You should be protected from abusive data practices and have agency over how data about you is used. This aligns with privacy rights: AI systems should employ data minimization (only using what’s needed), security, and obtain consent for data usage in most cases. It also encourages methods like privacy-by-design and not using data in ways people would object to (like monitoring employees’ private conversations or behavior without oversight).
    4. Notice and Explanation: You should be informed when an automated system is in use and understand its outputs. In practice, this means if AI is involved in a decision that impacts you, you should know that AI played a role. Moreover, you should be able to get an explanation that is understandable about what the AI is doing and why it arrived at its decision. For example, if an AI denies you a job interview, you might get a notice: “An algorithm screened your application,” and an explanation like “The system identified a mismatch in required skills (Python programming) compared to your resume.” Transparency is key for trust and for enabling recourse.
    5. Human Alternatives, Consideration, and Fallback: You should be able to opt out of AI decisions in favor of a human review in many cases, and there should be a backup plan if the AI fails. For instance, if an AI customer service chatbot isn’t helping you, you should have the option to talk to a human agent. Or if an AI system flags an error (say, a bank’s fraud detection), you should have a path to have a human investigate and correct any mistake. This principle recognizes that AI will sometimes be wrong or inappropriate, and human judgment is the final safety net.

    The Blueprint for an AI Bill of Rights is not law, but it serves as guidance to federal agencies and a signal to industry about best practices. It also has an accompanying technical document with concrete steps for implementers. It explicitly tied these principles to American values, like how freedom from discrimination and rights to privacy should extend into the AI era. We might expect that future regulations or procurement requirements (for federal agencies buying AI) will draw on these principles.

  • Federal Trade Commission (FTC): The FTC has asserted it can regulate unfair or deceptive AI practices under its existing authority. In April 2021, the FTC warned businesses that selling or using racially biased algorithms could be considered an “unfair or deceptive practice” and lead to enforcement. Basically, even without new laws, the FTC can potentially fine companies if their AI discriminates or if they lie about what their AI does (e.g., claiming it’s bias-free when it isn’t). This puts a degree of pressure on companies to be truthful and careful with AI in consumer contexts.

  • NIST AI Risk Management Framework (2023): The National Institute of Standards and Technology (NIST) released a voluntary AI Risk Management Framework (version 1.0) in January 2023. It’s a guidance document to help organizations identify and mitigate risks of AI systems in terms of reliability, bias, security, etc. The framework uses a cross-industry, collaborative approach (similar to NIST’s famous Cybersecurity Framework) and includes considerations of trustworthiness and fairness. While voluntary, many companies may adopt it to demonstrate responsible AI practices, and it might shape standards or future regulation.

  • Sectoral regulations: In certain domains, AI is being addressed by updating existing laws:

    • The Equal Employment Opportunity Commission (EEOC) has started examining AI in hiring for compliance with anti-discrimination laws. In 2023, the EEOC released guidance on employers using AI for employment decisions, clarifying that the use of AI doesn’t excuse discrimination – employers are liable if their AI tools have disparate impact. There’s also an initiative to draft guidelines for auditing hiring AI tools.
    • The Food and Drug Administration (FDA) has been working on how to regulate AI/ML-based medical devices, especially those that continuously learn from new data (adaptive algorithms). They’ve proposed that companies submit plans for how the AI will be updated and monitored in the field.
    • Transportation (FAA/NHTSA): For autonomous vehicles and drones, regulators are figuring out safety standards that incorporate AI reliability.
  • State and local laws: A number of U.S. states and cities have started passing their own AI-related laws, especially on AI in hiring and surveillance:

    • Illinois was a pioneer with the Artificial Intelligence Video Interview Act (effective Jan 2020). It requires employers to notify candidates if AI is being used to analyze their video interviews, to explain how the AI works, and to obtain consent. It also has provisions on limiting data sharing and requiring destruction of interview videos upon request. This law was about transparency and privacy more than bias, but it set a precedent for regulating AI in hiring.
    • New York City passed Local Law 144 (effective July 2023), which requires that companies using “automated employment decision tools” for hiring or promotion must subject those tools to an annual bias audit by an independent auditor. They also must publicly post a summary of the audit results and notify candidates about the use of such tools. The bias audit checks for disparate impact (e.g., difference in selection rates by gender or race) and the results (like accuracy and bias metrics) have to be made available; a simplified sketch of this kind of selection-rate comparison follows this list. NYC’s law is one of the first to mandate an actual bias-testing regime for AI.
    • California and other states have proposals around AI decisions in finance and employment; e.g., California’s employment regulations now explicitly say that if AI or automated decision tools are used, the outcomes are subject to anti-discrimination law and need validation to ensure they’re job-related and fair.
    • Several U.S. cities (San Francisco, Boston, Portland, etc.) banned government use of facial recognition technology starting in 2019, due to concerns over accuracy and civil liberties. Some of these bans have exceptions (e.g., for accessing phone face unlock or if mandated by federal law), but they represent a pushback on unregulated surveillance.
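
As an illustration of the disparate-impact arithmetic behind such audits, the sketch below computes selection rates by group and each group’s “impact ratio” relative to the most-selected group, using hypothetical counts. The exact metrics and reporting categories required under Local Law 144 are defined in the city’s implementing rules, and the 0.8 cutoff used here is only the traditional “four-fifths” rule of thumb from employment law, not a legal threshold in the NYC statute.

```python
# Hypothetical hiring-funnel counts: applicants advanced by an automated
# screening tool, broken down by a demographic category.
counts = {
    # group: (selected, total applicants)
    "group_1": (120, 400),
    "group_2": (45, 250),
    "group_3": (30, 150),
}

selection_rates = {g: sel / total for g, (sel, total) in counts.items()}
best = max(selection_rates.values())

for g, rate in selection_rates.items():
    impact_ratio = rate / best          # ratio to the most-selected group
    flag = "REVIEW" if impact_ratio < 0.8 else "ok"   # four-fifths rule of thumb
    print(f"{g}: selection rate {rate:.2f}, impact ratio {impact_ratio:.2f} [{flag}]")
```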

Overall, in the U.S. the trend is towards more guidance and some targeted rules, but not a comprehensive national law yet. There are ongoing discussions in Congress (e.g., past bills like the Algorithmic Accountability Act were introduced to require companies to audit high-risk AI for bias, but haven’t passed as of 2025). It’s likely that we will see more sector-specific regulations first (for example, rules for AI in financial services or in healthcare, via those sectors’ regulators), as well as continued use of existing laws (civil rights law, consumer protection law) to litigate harmful AI outcomes.

Other Notable International Efforts

  • UNESCO’s Recommendation on the Ethics of AI (2021): In late 2021, all 193 member states of UNESCO adopted a comprehensive framework on AI ethics. It covers principles similar to OECD’s (inclusive benefit, harm avoidance, human oversight, fairness, privacy, transparency, accountability, sustainability, etc.) and also provides detailed policy guidance – for example, calling for ethics impact assessments, bans on social scoring for surveillance, and strong data governance. While not binding, it’s a globally negotiated document, so it carries moral weight and can inspire national strategies.
  • IEEE “Ethically Aligned Design”: The IEEE, a major engineering organization, published extensive guidance (a multi-edition document) on how to align AI systems with ethical values (first edition in 2019). It delves into issues like embedding human rights into AI, methodology for ethical design, and specific issues (e.g., autonomous weapons, or how to treat AI agents). This has influenced industry thinking and offers a toolbox of considerations for practitioners.
  • Industry Self-regulation: Many tech companies have their own set of AI principles publicly announced (Google’s AI Principles, Microsoft’s Responsible AI principles, etc.), often echoing the themes above (fairness, transparency, reliability, privacy, accountability). Some have internal review boards for sensitive AI projects. However, the effectiveness of self-regulation has been questioned – e.g., Google’s handling of internal AI ethics researchers in 2020–2021 led to criticism that the company wasn’t fully living up to its principles when there was a conflict with business interests (the high-profile departures of Timnit Gebru and Margaret Mitchell from Google’s AI ethics team after raising concerns exemplified this tension).
  • Academic and Civil Society: There’s an active role of NGOs and research institutes in AI ethics. Organizations like the Algorithmic Justice League (founded by Joy Buolamwini) advocate against bias in AI. The Partnership on AI (a consortium of companies and nonprofits) works on best practices. Universities have introduced ethics curricula into computer science programs to train the next generation of engineers in these considerations.

A significant emerging trend is the idea of algorithmic audits and certifications. Just as financial audits provide assurance of accuracy in financial statements, independent audits of AI systems for bias, privacy, and security are starting to occur. For instance, some startups and consulting firms now specialize in auditing AI systems. In some domains this might even become required (NYC’s law essentially mandates an audit for hiring tools). We may see the rise of compliance frameworks (e.g., “AI ethics certification” for products that meet certain standards).

Another development: AI research conferences and journals are instituting ethics review for submissions. Researchers need to disclose if their work has potential societal impacts or involves sensitive data, etc. This is akin to research involving human subjects requiring institutional review board (IRB) approval. AI is increasingly viewed through that lens, especially research on things like deepfakes, surveillance tech, or large language models that could be misused for misinformation – researchers are expected to discuss the ethical implications.

Synthesis – Towards Responsible AI

Across all these efforts – whether principles, laws, or self-governance – we see a common vision of Responsible AI: AI that is fair, transparent, accountable, safe, and aligned with societal values. The challenge now is turning these high-level principles into consistent practice.

What are some concrete steps organizations and society can take to ensure AI is developed and used responsibly?

  • Ethical Impact Assessments: Before deploying AI in high-stakes situations, perform an assessment of potential impacts on people. Similar to environmental impact assessments for big construction projects, an AI impact assessment would analyze possible biases, privacy issues, safety risks, and impacts on stakeholders. The assessment should involve input from diverse stakeholders, and it can be made public for accountability. In fact, some jurisdictions (like Canada and parts of the EU) have started requiring Algorithmic Impact Assessments for government use of AI.
  • By-design Approaches: Incorporate ethics from the start of design. “Privacy by design” and “security by design” are established ideas; now we talk about “fairness by design” or “ethics by design.” This means thinking about who could be harmed by the system, how to mitigate that, and what governance to embed (e.g., logs for accountability, user consent flows, bias mitigation algorithms) at the design phase. It’s much harder to bolt on ethics at the end.
  • Diverse and Interdisciplinary Teams: Ensure that teams working on AI include not just technical experts, but also people versed in ethics, law, and domain-specific social issues. Additionally, demographic diversity within teams can bring perspectives that spot problems others might miss (for example, a team with no women might not immediately see a gender bias issue in a hiring algorithm). Interdisciplinary collaboration between computer scientists and social scientists (psychologists, sociologists) can improve understanding of AI’s context.
  • User Engagement and Education: For AI systems used by the public, it’s important to educate users about what the system does and its limitations. In some cases, involving end-users in testing can surface issues (like how a medical AI tool might be confusing to doctors or patients, indicating a need for better explanation or training). Public engagement is also key in governmental AI deployments – e.g., having community forums about police use of AI surveillance, to gauge public sentiment and set boundaries.
  • Continuous Monitoring and Auditing: AI systems should be continuously monitored in operation, since their performance can change over time (data drift, or users interacting in new ways). Key metrics like error rates and bias indicators should be tracked. Periodic audits (internal or external) can check if the AI is still meeting fairness and accuracy goals. For example, a bank might audit its credit AI annually to see if any particular group is being disproportionately rejected and why. Auditing can also include checking for concept drift or for new types of errors that weren’t present initially. A small sketch of this kind of ongoing disparity check appears after this list.
  • Transparency and Documentation: There’s a saying, “sunlight is the best disinfectant.” Having transparency can deter irresponsible practices. This could mean publishing information about an AI system (model card or datasheet) describing its intended use, performance, and limitations. It could also mean open-sourcing certain models or sharing datasets for public scrutiny, when appropriate. For high-impact systems, regulators could require companies to submit documentation (similar to how drug companies publish clinical trial results).
  • Regulatory and Oversight Mechanisms: Encourage policymakers to develop smart regulations that protect people without unduly stifling innovation. The EU’s risk-based approach is one model; others might include updating consumer protection laws to explicitly cover automated decision-making. Oversight bodies might be established – for instance, some have proposed an FDA-like agency for algorithms, or expanding the mandate of existing agencies to specifically cover AI. The key is that there needs to be an enforcement backstop, not just voluntary adherence.
  • International Cooperation: AI is global. Models and data cross borders, and ethical issues (like deepfake misinformation or autonomous weapons) are international in scope. Cooperation through venues like the UN or OECD can help set common norms and prevent “race to the bottom” scenarios. It also helps smaller nations voice their concerns (e.g., about not having their cultures steamrolled by AI products made elsewhere). International agreements, even if non-binding, create peer pressure and shared expectations that can raise the overall standard.
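
For the monitoring point above, here is a sketch of the simplest possible ongoing check: compare each group’s current approval rate to a baseline window and raise a flag when the drift or the between-group gap exceeds a chosen tolerance. The rates, group names, and 10-percentage-point tolerances are invented placeholders; real monitoring would use the metrics that matter for the application and proper statistical tests rather than fixed cutoffs.

```python
# Hypothetical approval rates by group, measured at deployment ("baseline")
# and in the most recent monitoring window ("current").
baseline = {"group_A": 0.62, "group_B": 0.58}
current  = {"group_A": 0.63, "group_B": 0.41}

DRIFT_TOLERANCE = 0.10      # placeholder: max acceptable change per group
GAP_TOLERANCE   = 0.10      # placeholder: max acceptable gap between groups

alerts = []
for g in baseline:
    drift = abs(current[g] - baseline[g])
    if drift > DRIFT_TOLERANCE:
        alerts.append(f"{g}: approval rate drifted by {drift:.2f}")

gap = max(current.values()) - min(current.values())
if gap > GAP_TOLERANCE:
    alerts.append(f"gap between groups is {gap:.2f}")

print("\n".join(alerts) if alerts else "no fairness alerts this window")
```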

Importantly, ethical AI is an ongoing journey, not a one-time certification. Just as human society’s values and expectations evolve, AI systems will need to be continually evaluated and improved. What matters is building a culture of responsibility. This is similar to what happened in fields like medicine or engineering over the past century: early on, disasters and scandals (building collapses, harmful medical experiments) led to the development of professional ethics and regulatory standards. AI is going through a similar maturation process. For instance, one could envision that in a decade it will be standard for AI developers to be licensed or certified in some way, or to have taken a “Hippocratic Oath”-like pledge for AI (there are already discussions of this in some professional bodies).

From a pragmatic perspective, organizations are finding that responsible AI is also good for business in the long run. Biased or unsafe AI can lead to lawsuits, regulatory fines, or reputational damage. On the flip side, being able to demonstrate that your AI is fair and robust becomes a competitive advantage as users and clients increasingly demand trustworthy systems. For example, a recruiting software company that can show audited fairness might win contracts over one that cannot.

As a concluding thought, the rapid advancement of AI makes ethical considerations ever more critical. The recent rise of powerful AI systems (GPT-4-style language models that can generate human-like text, or deepfake tools that can create ultra-realistic fake images and video) has demonstrated both exciting capabilities and new risks, such as mass-produced disinformation, impersonation scams, and the propagation of biases present in training data. These developments have spurred calls for even more robust ethics and governance – including from AI researchers themselves, who note that as systems become more general and powerful, the unknown risks grow. Some experts advocate a degree of humility and restraint: deploying AI gradually and with safeguards, rather than rushing to “move fast and break things.” Concepts like “AI alignment” (ensuring AI goals align with human values) and “existential risk from AI” (the idea that superintelligent AI could pose a risk to humanity if not properly controlled) were once purely theoretical but are increasingly part of mainstream discourse.

To end on a positive note: if we manage AI well, it can be an incredible force for good. Imagine AI systems that help discover cures for diseases, personalize education for every child, optimize energy use to fight climate change, or take over dangerous jobs so humans don’t have to. These are all on the horizon or already happening. The key question is often not whether to use AI at all, but how to use it responsibly. As a recurring theme of this text puts it: technology doesn’t automatically equate to progress – it’s progress only if it leads to better outcomes for people. AI ethics and governance are about ensuring that link: guiding this powerful technology such that it truly improves societies and lives, while respecting the rights and dignity of all.

In summary, the future of AI will be shaped by the choices we make today. Upholding principles of ethics and human-centric design in AI is not a one-off task but a continuous commitment. It involves technical innovation (to design fair, explainable, and safe AI), legal and policy innovation (to create adaptive regulations), and social innovation (to involve communities in decisions about AI). By striving for Responsible AI, we increase the chances that AI will be a boon to humanity – helping to create a more just, prosperous, and sustainable world.

Fun Projects and Further Exploration: Ethics doesn’t make AI any less fascinating to play with! If you are intrigued by AI, here are some ideas to explore its capabilities (responsibly) and get a personal sense of the issues discussed:

  • Try out a large language model (LLM) or coding assistant (many are available via web demos or open-source releases). See how well it can write a poem or answer questions. As you do, notice where it goes wrong – does it ever give biased or incorrect answers? This hands-on experience highlights why transparency and human oversight are needed; these models are powerful but not infallible.
  • Experiment with image generation models (like DALL-E or the open-source Stable Diffusion). You can create astounding artwork from text prompts. But also try prompts involving people – do you notice any biases in how the AI portrays gender or ethnicity in certain occupations? (Researchers found some image AIs had bias, e.g., prompting for “CEO” often yielded images of men.) This can deepen your understanding of bias and the importance of the data that goes into these systems.
  • If you like working with data, consider entering an AI fairness hackathon or competition. For example, there have been competitions to reduce bias in mortgage lending data or to improve fairness in healthcare AI. These let you apply techniques from this chapter – and see firsthand the trade-offs between accuracy and fairness.
  • Check out open-source toolkits like IBM’s AI Fairness 360 or Google’s What-If Tool (available as a TensorBoard plugin) on a dataset of your choice. They provide an environment to test bias metrics and to see the effect of changing decision thresholds – a practical way to learn how data and model choices affect outcomes (see the AI Fairness 360 sketch after this list).
  • Follow some AI ethics thought leaders or organizations on social media (such as Arvind Narayanan, Timnit Gebru, Kate Crawford, or the ACM FAccT conference, formerly known as FAT*). They often share the latest examples of AI issues, both good and bad. Keeping up to date will show you that this field is evolving quickly – with new cases, scandals, and breakthroughs happening all the time.
  • Finally, if you’re interested in policy, you could read the full text of some frameworks we mentioned (they’re usually readable). For instance, skim the OECD AI Principles or the EU Ethics Guidelines (and the draft AI Act). Think about how these abstract principles might apply to a specific AI system you use or are building.
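
If you want a concrete starting point for the AI Fairness 360 suggestion above, the following sketch computes two standard group-fairness metrics on the toolkit’s bundled UCI Adult (income) benchmark. It assumes the aif360 package is installed (pip install aif360) and that the raw Adult data files have been downloaded into the directory the library expects (it prints instructions if they are missing); treating “sex” as the protected attribute follows the dataset loader’s default encoding.

    # Minimal group-fairness check with IBM's AI Fairness 360 (aif360).
    from aif360.datasets import AdultDataset
    from aif360.metrics import BinaryLabelDatasetMetric

    dataset = AdultDataset()    # UCI Adult income benchmark (raw data files required)

    # In this loader's default encoding, sex == 1 is the privileged group.
    metric = BinaryLabelDatasetMetric(
        dataset,
        unprivileged_groups=[{"sex": 0}],
        privileged_groups=[{"sex": 1}],
    )

    # Disparate impact: ratio of favorable-outcome rates across groups (1.0 = parity).
    print("Disparate impact:", metric.disparate_impact())
    # Statistical parity difference: difference of those rates (0.0 = parity).
    print("Statistical parity difference:", metric.statistical_parity_difference())

From there, the toolkit’s mitigation algorithms (such as reweighing the training data) can be applied and the same metrics recomputed – a hands-on way to see the accuracy–fairness trade-offs discussed in this chapter.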

Each of these explorations can make the concepts from this chapter more concrete. AI is an amazingly rich field, and engaging with it hands-on will reinforce why ethics is not just an add-on, but an integral part of creating technology that truly benefits everyone.

As we conclude this chapter, remember that the future of AI is not predetermined. It will be shaped by the collective actions of researchers, developers, policymakers, and users. By infusing ethical consideration at every stage – from design to deployment – we can guide AI toward outcomes that augment human capabilities and uphold human values, rather than undermine them. The coming years will test our wisdom in wielding this double-edged sword of technology. Let’s ensure we meet that challenge with both ingenuity and integrity, so that AI fulfills its promise as a tool for building a better society.

References

Buolamwini, J., & Gebru, T. (2018). Gender shades: Intersectional accuracy disparities in commercial gender classification. In Proceedings of Machine Learning Research, 81, 1–15.

Chouldechova, A. (2017). Fair prediction with disparate impact: A study of bias in recidivism prediction instruments. Big Data, 5(2), 153–163.

European Commission. (2021). Proposal for a regulation laying down harmonised rules on Artificial Intelligence (Artificial Intelligence Act).

European Commission High-Level Expert Group on AI. (2019). Ethics Guidelines for Trustworthy AI. European Commission.

Hardt, M., Price, E., & Srebro, N. (2016). Equality of opportunity in supervised learning. In Advances in Neural Information Processing Systems (NeurIPS 2016).

IBM Research. (2018). AI Fairness 360 Open Source Toolkit. IBM Corporation.

Illinois General Assembly. (2019). Public Act 101-0260: Artificial Intelligence Video Interview Act. State of Illinois.

Montréal Declaration. (2017). Montréal Declaration for the Responsible Development of Artificial Intelligence. Université de Montréal.

New York City Council. (2021). Local Law No. 144: A local law to regulate the use of automated employment decision tools. City of New York.

OECD. (2019). OECD Principles on Artificial Intelligence. Organisation for Economic Co-operation and Development.

Office of Science and Technology Policy. (2022). Blueprint for an AI Bill of Rights: Making automated systems work for the American people. Executive Office of the President of the United States.

Suresh, H., & Guttag, J. V. (2021). A framework for understanding sources of harm throughout the machine learning life cycle. In Proceedings of the 2021 ACM Conference on Equity and Access in Algorithms, Mechanisms, and Optimization (EAAMO ’21).

Wexler, J., Pushkarna, M., Bolukbasi, T., Wattenberg, M., Viégas, F., & Wilson, J. (2019). The What-If Tool: Interactive probing of machine learning models. Google PAIR.