Data Governance – Mitigating Data Biases
Part 3
Credit unions have many data sources beyond their core operating system. When a credit union conducts a data source inventory, it is not uncommon to find 50 or more data sources. These third-party data sets and reports commonly arrive in various formats, with different labels and definitions and on different upload and reporting schedules. Inconsistent data extracts undermine data quality, make it hard to balance one report against another, and erode trust in the data. For example, when the data is Extracted from the source, one vendor calls a member a member, another uses a unique identifier, another calls it an account, and so on. During the Transform process, all of these data inconsistencies must be normalized, cleansed, and verified before Loading into, hopefully, a single data repository (data warehouse or data lake).
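As a minimal sketch of the normalization step described above, the Python below maps three differently labeled vendor extracts onto one canonical record. The vendor names, field names, and date formats are hypothetical and chosen only for illustration; a real credit union would derive them from its own data source inventory.

```python
from datetime import datetime

# Hypothetical per-vendor mappings from source field names to one canonical schema.
FIELD_MAPS = {
    "core": {"member": "member_id", "balance": "balance", "as_of": "as_of_date"},
    "card_vendor": {"unique_identifier": "member_id", "bal": "balance", "report_date": "as_of_date"},
    "loan_vendor": {"account": "member_id", "current_balance": "balance", "date": "as_of_date"},
}

# Hypothetical date formats each vendor uses in its extracts.
DATE_FORMATS = {"core": "%Y-%m-%d", "card_vendor": "%m/%d/%Y", "loan_vendor": "%Y%m%d"}


def normalize(source: str, record: dict) -> dict:
    """Rename fields to the canonical schema, then cleanse and verify the values."""
    mapping = FIELD_MAPS[source]
    missing = [name for name in mapping if name not in record]
    if missing:
        # Verification: reject incomplete extracts instead of loading partial rows.
        raise ValueError(f"{source}: missing fields {missing}")
    row = {canonical: record[name] for name, canonical in mapping.items()}
    # Cleansing: one identifier style, one numeric format, one date format.
    row["member_id"] = str(row["member_id"]).strip().upper()
    row["balance"] = round(float(str(row["balance"]).replace(",", "")), 2)
    row["as_of_date"] = datetime.strptime(row["as_of_date"], DATE_FORMATS[source]).date().isoformat()
    row["source"] = source  # keep lineage so reports can be balanced against each other
    return row
```

Calling normalize("card_vendor", {...}) and normalize("core", {...}) on the same member yields rows with identical keys and formats, which is what makes one report reconcilable against another.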
In my data strategy engagements, I have witnessed credit unions with multiple data sources, siloed data sources, and data stored on individual desktop drives. This lack of central oversight and data monitoring poses significant risks for the credit union in all reporting, the resulting decisions, and the deployment of AI. To mitigate this risk, the credit union must implement a Data Governance Discipline. As discussed in Posts 1 and 2, normalizing these data sets is vital to a successful transition to Open Banking and Distributed Ledger Technology (DLT), and it must be in place before attempting to leverage AI and machine learning algorithms.
Implementing a Data Governance Discipline:
Data Governance refers to managing the availability, usability, integrity, and security of an organization’s data. Effective data governance practices significantly enhance the value of adopting Open Banking services and applications and will help avoid the challenges faced by the Navy Federal Credit Union (NFCU). The challenge NFCU faces demonstrates how poor data governance can lead to significant risks and diminish potential benefits.
- Reliable and Accurate Data:
- High-quality data governance ensures that data is accurate, complete, and reliable. This governance is crucial in open banking, where data-driven insights power personalized financial products and services. Accurate data minimizes data and algorithmic bias, enhances decision-making and forecasting, reduces errors, and builds internal and member trust.
- Consistency Across Platforms:
- Consistent data standards and definitions across the credit union and third-party providers ensure seamless integration and interoperability. This consistency is vital for providing a cohesive member experience and leveraging the full potential of Open Banking services. It can only be achieved and maintained when all data is centralized in a single data warehouse or lake. Organizationally, data and data governance belong in an independent business unit (Business Intelligence) that serves the entire organization, similar to how HR, Marketing, or IT operates.
- Compliance with Regulations:
- Robust data governance practices help credit unions comply with regulatory requirements such as the General Data Protection Regulation (GDPR), the California Consumer Privacy Act (CCPA), and other data protection laws. Compliance reduces the risk of legal penalties, minimizes reputation risk, and fosters trust among members, who are more likely to consent to data sharing when they feel their information is secure. In addition, Open Banking helps ensure regulatory compliance by incorporating data protection regulations, adhering to financial directives like the Payment Services Directive (PSD2), utilizing standardized Application Programming Interfaces (APIs), maintaining transparency and accountability, protecting consumers, implementing robust security measures, continuously monitoring regulatory changes, and collaborating with regulatory bodies. These comprehensive measures ensure that Open Banking systems are secure, transparent, and compliant with legal and regulatory standards.
- Strong Data Privacy and Security Measures:
- Data Governance strengthens data security and privacy by establishing clear policies and procedures for using, managing, and sharing data internally and externally. These policies and procedures ensure data quality and integrity, maintain regulatory compliance, implement robust security measures, promote accountability and transparency, facilitate effective data management, strengthen risk management, and educate employees. Employee and member education is vital to a data-driven culture and to informing members and staff about the credit union’s data privacy practices. This education is crucial to building the trust necessary to obtain explicit consent from members for the use of their data.
Addressing Algorithmic Bias: Addressing algorithmic bias is essential for credit unions to ensure fair and equitable data usage and service delivery. Algorithmic bias can lead to unfair treatment of certain groups of members and undermine trust in the credit union. Here are steps a credit union can take to address bias in its data and algorithms:
- Understanding and Identifying Bias
- The data team must conduct Algorithm Bias Audits regularly to identify potential biases. These audits involve analyzing algorithm outcomes to see if certain groups are disproportionately affected.
- Diverse Testing Data: Identify and use diverse data sets to test algorithms and ensure equitable and fair performance across different demographic groups.
- Bias Detection Tools:
- Use specialized tools to detect and analyze biases in AI and machine learning models. These tools can highlight where and how bias is occurring. Examples include Amazon SageMaker Clarify, Aequitas, Themis-ML, Microsoft Fairlearn, Google What-If Tool (WIT), and IBM AI Fairness 360 (AIF360).
- Inclusive Data Sets:
- Representative Data: Ensure that the data used to train algorithms represents the entire population the credit union serves. Representative data includes demographic diversity in age, gender, race, income, and geographic location. Ensuring data is representative of all demographics requires a comprehensive approach involving diverse data collection, rigorous data quality management, bias detection and mitigation, inclusive policies and practices, advanced technology and tools, and a commitment to continuous improvement. By adopting these steps, credit unions can provide fair and equitable services to all members.
- Data Augmentation: Where the data lacks diversity, consider data augmentation techniques to create a more balanced training set. Mitigating data biases requires a multifaceted approach that includes diversifying data sources, employing advanced tools, regularly updating data, using synthetic data, and adhering to ethical AI practices. By taking these steps, credit unions can ensure more equitable and accurate decision-making processes, ultimately leading to better member experiences and outcomes.
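The audit described under Understanding and Identifying Bias can be sketched in a few lines of Python. This is a minimal illustration, not a compliance test: it compares approval rates across groups and flags the result when the lowest rate falls below a chosen fraction of the highest. The function names are hypothetical, and the 0.8 threshold is an assumption borrowed from the commonly cited "four-fifths" heuristic; the right threshold and group definitions are for the credit union and its compliance staff to decide.

```python
from collections import defaultdict


def selection_rates(decisions):
    """Return the approval rate per group from (group, approved) pairs."""
    totals = defaultdict(int)
    approved = defaultdict(int)
    for group, was_approved in decisions:
        totals[group] += 1
        approved[group] += int(was_approved)
    return {group: approved[group] / totals[group] for group in totals}


def disparate_impact(rates):
    """Ratio of the lowest group approval rate to the highest one."""
    highest = max(rates.values())
    return min(rates.values()) / highest if highest else 1.0


def audit(decisions, threshold=0.8):
    """Summarize one audit run; `threshold` is an assumed policy choice."""
    rates = selection_rates(decisions)
    ratio = disparate_impact(rates)
    return {"rates": rates, "ratio": ratio, "flagged": ratio < threshold}
```

Running audit() on each model's decisions at a regular cadence, and on diverse test data before deployment, gives the data team a simple first signal that a specialized tool can then investigate in depth.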
Data Preprocessing:
- Remove Bias: Implement data preprocessing techniques to identify and remove biased data. Removing data biases in a credit union requires a multifaceted approach that includes diversifying data sources, implementing robust data preprocessing techniques, utilizing advanced bias detection and mitigation tools, ensuring continual learning, involving human oversight, and establishing strong governance and ethical guidelines. By adopting these strategies, credit unions can create fairer, more equitable services for all members.
- Anonymization: Ensure sensitive information that could introduce bias is anonymized while maintaining the data’s utility. Anonymizing data sets in a credit union involves several steps to ensure that personally identifiable information (PII) is effectively removed or obscured, protecting member privacy while maintaining the data’s utility for analysis.
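One way to picture the anonymization step is the sketch below, which drops direct identifiers, replaces the member ID with a keyed hash, and generalizes age and ZIP code. The field names, the 10-year age bands, and the three-digit ZIP prefix are assumptions made for illustration. Strictly speaking, keyed hashing is pseudonymization rather than full anonymization, and generalized fields can still allow re-identification in small populations, so treat this as a starting point only.

```python
import hashlib
import hmac

# Hypothetical names of fields that identify a member directly.
DIRECT_IDENTIFIERS = {"name", "ssn", "email", "street_address"}


def pseudonymize_id(member_id: str, secret_key: bytes) -> str:
    """Replace the member ID with a keyed hash so records can still be joined."""
    return hmac.new(secret_key, member_id.encode("utf-8"), hashlib.sha256).hexdigest()


def age_band(age: int, width: int = 10) -> str:
    """Generalize an exact age into a band such as '40-49'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def anonymize(record: dict, secret_key: bytes) -> dict:
    """Return a copy of the record with PII removed or obscured."""
    out = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    out["member_id"] = pseudonymize_id(str(record["member_id"]), secret_key)
    if "age" in out:
        out["age_band"] = age_band(out.pop("age"))
    if "zip_code" in out:
        out["zip3"] = str(out.pop("zip_code"))[:3]  # keep only a coarse region
    return out
```

The secret key should live outside the analytics environment; anyone holding both the key and the original IDs can rebuild the mapping.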
Fair Algorithm Design and Development
Bias-Reduction Algorithms: Use algorithms designed to minimize bias. Various machine learning techniques, such as fairness constraints, adversarial debiasing, and fairness-aware hyperparameter tuning, can help reduce bias. Steps to Implement Adversarial Debiasing:
- Data Preparation:
- Collect and prepare a dataset that includes features, labels, and sensitive attributes.
- Define Models:
- Predictor Model: A neural network or another suitable machine learning model for the primary task.
- Adversary Model: Another neural network designed to predict the sensitive attribute.
- Training Loop:
- Train the predictor model on the primary task.
- Train the adversary model to predict the sensitive attribute from the predictor’s output.
- Adjust the predictor model to minimize the adversary’s ability to predict the sensitive attribute.
- Iterate until the adversary’s accuracy is at random chance levels, indicating the sensitive attribute cannot be inferred from the predictor’s output.
- Practical Tools and Libraries
- AI Fairness 360 (AIF360): An open-source library from IBM that includes implementations of adversarial debiasing algorithms.
- Fairlearn: A toolkit by Microsoft that provides various algorithms and tools for assessing and improving the fairness of machine learning models.
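The training loop above can be illustrated with a deliberately simplified sketch. It is not the AIF360 or Fairlearn implementation: both models here are plain logistic regressions rather than neural networks, the data is synthetic, and every name and parameter (make_data, train, alpha, the learning rate) is hypothetical. The predictor is updated to lower its own loss while raising the adversary's loss, weighted by alpha.

```python
import math
import random


def sigmoid(t):
    """Numerically stable logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def make_data(n=200, seed=0):
    """Synthetic rows: features (x1, x2), label y, sensitive attribute z."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = rng.randint(0, 1)
        x1 = rng.gauss(0, 1)            # legitimate signal
        x2 = z + rng.gauss(0, 0.3)      # proxy feature that leaks z
        y = 1 if x1 + 1.5 * z + rng.gauss(0, 0.5) > 0.75 else 0  # biased label
        rows.append(((x1, x2), y, z))
    return rows


def train(rows, alpha=1.0, lr=0.1, epochs=300):
    """Alternate adversary and predictor updates; alpha=0 is a plain predictor."""
    w, b = [0.0, 0.0], 0.0   # predictor: p = sigmoid(w.x + b)
    u, c = 0.0, 0.0          # adversary: q = sigmoid(u*s + c), s = predictor logit
    n = len(rows)
    for _ in range(epochs):
        logits = [w[0] * x[0] + w[1] * x[1] + b for x, _, _ in rows]
        # Step 1: the adversary learns to predict z from the predictor's logit.
        q = [sigmoid(u * s + c) for s in logits]
        u -= lr * sum((qi - z) * s for qi, s, (_, _, z) in zip(q, logits, rows)) / n
        c -= lr * sum(qi - z for qi, (_, _, z) in zip(q, rows)) / n
        # Step 2: the predictor lowers its loss while raising the adversary's loss.
        q = [sigmoid(u * s + c) for s in logits]
        gw, gb = [0.0, 0.0], 0.0
        for (x, y, z), s, qi in zip(rows, logits, q):
            # Gradient of (predictor loss - alpha * adversary loss) w.r.t. the logit.
            d = (sigmoid(s) - y) - alpha * (qi - z) * u
            gw[0] += d * x[0]
            gw[1] += d * x[1]
            gb += d
        w = [w[0] - lr * gw[0] / n, w[1] - lr * gw[1] / n]
        b -= lr * gb / n
    return w, b, u, c


def accuracy(rows, w, b):
    """Share of rows where the predictor's 0.5 cutoff matches the label."""
    hits = sum((sigmoid(w[0] * x[0] + w[1] * x[1] + b) >= 0.5) == (y == 1)
               for x, y, _ in rows)
    return hits / len(rows)
```

In practice the loop would stop when the adversary's accuracy on held-out data approaches chance, and the trade-off between alpha and predictive accuracy would be reviewed by the data team. Adversarial training can be unstable, so the library implementations listed above are the better choice for production work.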
Transparency and Explainability:
- Explainable AI: Implement Explainable AI (XAI) techniques to make algorithms’ decision-making processes transparent. XAI helps people understand how decisions are made and identify potential sources of bias.
- Documentation: Maintain thorough documentation of how algorithms are developed, including the data sources, feature selection processes, and validation methods.
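For a simple scoring model, explainability can be as direct as showing how much each input moved the result. The sketch below assumes a logistic model that has already been trained; the weights, intercept, baseline values, and feature names are all invented for illustration. It reports each feature's contribution to the log-odds relative to a baseline applicant and lists the factors that pulled the score down the most. More complex models need dedicated XAI techniques, which this sketch does not cover.

```python
import math

# Hypothetical weights of an already-trained logistic scoring model.
WEIGHTS = {"credit_utilization": -2.0, "years_as_member": 0.15, "late_payments_12m": -0.8}
INTERCEPT = 1.0
# Hypothetical portfolio averages used as the comparison baseline.
BASELINE = {"credit_utilization": 0.30, "years_as_member": 6.0, "late_payments_12m": 0.5}


def explain(applicant: dict) -> dict:
    """Break the model's log-odds into per-feature contributions vs. the baseline."""
    contributions = {
        name: WEIGHTS[name] * (applicant[name] - BASELINE[name]) for name in WEIGHTS
    }
    base_logit = INTERCEPT + sum(WEIGHTS[n] * BASELINE[n] for n in WEIGHTS)
    logit = base_logit + sum(contributions.values())
    ranked = sorted(contributions.items(), key=lambda kv: kv[1])  # most negative first
    return {
        "probability": 1.0 / (1.0 + math.exp(-logit)),
        "contributions": contributions,
        "top_negative_factors": [name for name, value in ranked if value < 0][:2],
    }
```

For an applicant with high utilization and several recent late payments, top_negative_factors names those two features first, which gives staff a plain-language starting point when a member asks why a decision was made.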
Regulatory and Ethical Compliance
- Adopt Ethical AI Principles: Follow ethical AI guidelines and principles that emphasize fairness, transparency, accountability, and privacy. Creating ethical AI guidelines for a credit union involves establishing principles and practices that ensure the responsible and fair use of AI technologies. These principles include ensuring that AI models do not perpetuate or amplify biases, making AI systems understandable and accountable, safeguarding member data and ensuring privacy, establishing clear accountability for AI decisions and actions (for example, through an AI ethics committee), ensuring AI systems are accessible and beneficial to all members, and promoting sustainable and socially responsible AI use.
- Compliance with Regulations: Ensure the algorithms comply with relevant regulations and guidelines, such as GDPR, which requires fairness and transparency in data processing and places limits on solely automated decision-making.
Bias Mitigation Policies:
- Formal Policies: Develop and implement formal policies for mitigating algorithmic bias. The policies should include data collection, model training, and bias testing guidelines.
- Regular Reviews: Regularly review these policies to ensure they are up to date with the latest best practices and regulatory requirements.
Stakeholder Engagement
- Diverse Development Teams: Ensure the teams developing and managing algorithms are diverse. Varied perspectives help identify and mitigate biases that a more homogeneous group might overlook.
- Stakeholder Feedback: Engage with stakeholders, including members, to get feedback on the fairness of the credit union’s services and products. This feedback can provide valuable insights into potential biases and areas for improvement.
Member Transparency, Education, and Empowerment
- Inform Members: Communicate to members how their data is used and how decisions are made. This transparency helps build trust and allows members to understand and question decisions. Informing members must include a listening loop that seeks actionable insights.
- Right to Appeal: Provide members with mechanisms to appeal algorithmic decisions. Ensure they can quickly request a human review of an automated decision and receive a follow-up on what the review found and what action was taken.
Educational Initiatives:
- Raise Awareness: Educate members about the importance of data quality and how they can help by providing accurate and complete information. Encourage them to report any biases they perceive in services.
Addressing algorithmic bias while incorporating Open Banking and Distributed Ledger Technology and deploying AI requires a multifaceted approach involving technical, organizational, and educational measures. By conducting regular audits, ensuring data quality and representation, designing fair algorithms, complying with ethical standards, engaging stakeholders, and empowering members, credit unions can mitigate bias and ensure their services are fair and equitable for all members. This proactive approach enhances fairness, builds trust, and strengthens the credit union’s reputation.