Data masking

Introduction

NexusGPT has analyzed its data handling practices and determined that data masking is not currently required, given the types of data it processes, stores, and displays. Data masking is typically applied when an organization handles highly sensitive personal, financial, or confidential information that must be obfuscated for security or privacy reasons. The data NexusGPT handles does not fall into that category, so masking is not part of the platform's current architecture.

Data Types Handled by NexusGPT

NexusGPT does not process or store sensitive personal, financial, or highly confidential information. Our analysis covered the following data categories:

Chat History:
- Chat histories between users and AI agents are logged to provide context for subsequent interactions. These logs do not include sensitive financial data, personally identifiable information (PII), or any highly confidential content.
- Justification: Chat history consists only of user inputs and AI outputs, which are displayed back to the user who produced them. It contains nothing sensitive or confidential beyond what the user voluntarily enters, so the need for masking is minimal.
Agent Configurations:
- AI agent configurations include task descriptions, workflows, and skill assignments, which are technical in nature and do not contain sensitive personal information.
- Justification: Agent configurations relate to system functions rather than personal or sensitive data, making data masking irrelevant for this context.
Files Stored in AWS S3:
- User-uploaded files are securely stored in AWS S3. Each file is identified using a randomly generated UUID, ensuring that filenames and paths do not reveal any sensitive information.
- Justification: Files stored in AWS S3 are encrypted at rest using AES-256. Since the UUID system obfuscates file identifiers and the files themselves are encrypted, data masking is not necessary.
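
The UUID naming and AES-256 encryption described for file storage can be illustrated with a minimal sketch. It only builds the request parameters and makes no network call. The helper name `build_upload_params`, the `uploads/` key prefix, and the bucket name are assumptions made for illustration, not NexusGPT's actual implementation. `ServerSideEncryption="AES256"` is the standard S3 PutObject setting that requests S3-managed AES-256 encryption at rest.

```python
import uuid


def build_upload_params(bucket: str, body: bytes) -> dict:
    """Build keyword arguments for an S3 PutObject request (hypothetical helper)."""
    # A random UUID is used as the object name, so the key reveals nothing
    # about the file's original name or contents.
    object_key = f"uploads/{uuid.uuid4()}"  # "uploads/" prefix is an assumption
    return {
        "Bucket": bucket,
        "Key": object_key,
        "Body": body,
        # Requests S3-managed server-side encryption (AES-256 at rest).
        "ServerSideEncryption": "AES256",
    }


params = build_upload_params("example-bucket", b"file contents")
print(params["Key"])
```

With boto3 installed and credentials configured, these parameters could be passed to an S3 client as `put_object(**params)`. How the original file name is mapped back to the UUID (for example, in an application database) is outside this sketch.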

Data Security Measures in Place

Although data masking is not applied, NexusGPT protects user data through a combination of encryption and access control mechanisms. These measures mitigate risk without the need for masking:

Password and API Key Encryption:
- All passwords, API keys, and other sensitive credentials are stored in an encrypted format using industry-standard encryption algorithms. This ensures that sensitive credentials are never stored or transmitted in plain text.
- Justification: Since encryption is applied to all sensitive credentials, masking is unnecessary. Credentials at rest cannot be read in their original form without the appropriate decryption keys.
Stripe Integration for Financial Data:
- NexusGPT does not store or process sensitive financial information, such as credit card numbers, on its own systems. All financial transactions and payment data are securely handled by Stripe, which is PCI-DSS compliant.
- Justification: Since NexusGPT does not process or store financial data, data masking for this type of information is irrelevant. Stripe’s own security measures and PCI-DSS compliance ensure that financial data is handled securely outside the scope of our platform.
AWS S3 Encryption:
- All files stored in AWS S3 are encrypted using AES-256 encryption. This ensures that even if a file is accessed without authorization, it cannot be viewed without the appropriate decryption keys.
- Justification: Encryption at rest protects a file’s entire contents, which for files stored in S3 is a stronger safeguard than masking. Because files are named with randomly generated UUIDs, file names reveal nothing about their contents, and the files themselves cannot be accessed without the necessary permissions.
Access Controls:
- NexusGPT enforces strict role-based access controls (RBAC). Only authorized personnel have access to specific datasets, which keeps the exposure of sensitive information to as few people as possible.
- Justification: Proper access control mitigates the risk of unauthorized data exposure, reducing the need for additional masking techniques.
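
The role-based access control item above can be sketched as a simple permission lookup. The role names, dataset names, and the `can_access` helper are hypothetical and chosen only for illustration. The source does not describe how NexusGPT's RBAC is actually implemented.

```python
from enum import Enum


class Role(Enum):
    # Hypothetical roles; the real role model is not described in this document.
    ADMIN = "admin"
    ENGINEER = "engineer"
    SUPPORT = "support"


# Hypothetical mapping from each role to the datasets it may read.
ROLE_PERMISSIONS = {
    Role.ADMIN: {"chat_history", "agent_configs", "uploaded_files"},
    Role.ENGINEER: {"agent_configs"},
    Role.SUPPORT: {"chat_history"},
}


def can_access(role: Role, dataset: str) -> bool:
    """Return True only if the role has been explicitly granted the dataset."""
    # Deny by default: a role with no entry gets an empty permission set.
    return dataset in ROLE_PERMISSIONS.get(role, set())


print(can_access(Role.SUPPORT, "chat_history"))
print(can_access(Role.SUPPORT, "uploaded_files"))
```

The deny-by-default lookup means a new dataset is inaccessible until someone grants it to a role, which matches the goal of keeping exposure limited to authorized personnel.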

Data Masking and Its Irrelevance in NexusGPT's Context

Data masking is typically implemented where sensitive information, such as credit card numbers, Social Security numbers, or other PII, must be displayed in a way that protects it from unauthorized viewing. The types of data handled by NexusGPT do not fall into this category for the following reasons:

No Sensitive Financial Data:
- NexusGPT does not store or handle any payment or financial information directly. All financial transactions are securely processed by Stripe. Since we do not process this type of data, there is no need to apply data masking techniques.
No Personally Identifiable Information (PII):
- NexusGPT does not handle or store sensitive PII, such as Social Security numbers, passport numbers, or personal addresses. The data we handle is limited to agent interactions and user inputs, none of which meet the threshold for requiring data masking.
Encryption vs. Masking:
- Sensitive information such as passwords and API keys is encrypted, which provides a much stronger layer of security than masking. The primary goal of masking is to prevent sensitive data from being exposed in a readable format; encryption already achieves this by making the raw data unreadable without the decryption keys.
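
For contrast, the kind of display masking discussed in this section usually looks like the sketch below, where most of a value is replaced with a placeholder character and only a short suffix stays visible. The `mask_value` helper and the sample value (a well-known test card number) are illustrative assumptions. NexusGPT does not apply masking, so this is not platform code.

```python
def mask_value(value: str, visible: int = 4, mask_char: str = "*") -> str:
    """Hide all but the last `visible` characters of a string (illustrative helper)."""
    # If nothing should stay visible, or the value is too short to partially
    # reveal, mask every character.
    if visible <= 0 or visible >= len(value):
        return mask_char * len(value)
    return mask_char * (len(value) - visible) + value[-visible:]


# Only the last four characters remain readable.
print(mask_value("4111111111111111"))
```

A masked value remains partly readable by design, whereas an encrypted value is unreadable in full without the key. That difference is why the document treats encryption as the stronger control for credentials.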

Conclusion: Why Data Masking Is Not Applied

Based on the nature of the data NexusGPT processes, data masking is unnecessary. Instead, we rely on encryption, access control, and secure third-party integrations (such as Stripe) to protect data. Our platform does not handle the sensitive financial information or PII that typically warrants masking.