Macie
Macie is a data security and privacy service that uses machine learning and pattern matching. It provides an inventory of your S3 general purpose buckets and automatically evaluates and monitors them for security and access control. It also automates the discovery and reporting of sensitive data, giving you a better understanding of the data that your organization stores in Amazon S3.
You can use built-in criteria and techniques that Macie provides, custom criteria that you define, or a combination of the two.
Macie can use a multi-account architecture with central management, either:
- by integrating Macie with AWS Organizations.
- by sending and accepting membership invitations in Macie.
You can use it to discover:
- Sensitive Data:
  - Personally Identifiable Information (PII)
    - Insurance numbers
    - Birth dates
    - Driving license numbers
    - Passport numbers
    - Addresses
  - Personal Health Information (PHI)
  - Financial data
    - Bank account numbers
    - Credit card numbers or expiry dates
  - Credentials
    - AWS Access Keys
    - SSH credentials
    - PGP keys
  - Data belonging to multiple categories
  - Custom Identifiers
- S3 Policies
  - Policy:IAMUser/S3BlockPublicAccessDisabled
  - Policy:IAMUser/S3BucketEncryptionDisabled
  - Policy:IAMUser/S3BucketPublic
  - Policy:IAMUser/S3BucketReplicatedExternally
  - Policy:IAMUser/S3BucketSharedExternally
  - Policy:IAMUser/S3BucketSharedWithCloudFront
Data Identifiers
They’re rules that objects and their contents are assessed against:
- Managed data identifiers: built-in identifiers that use machine learning and pattern matching and can identify sensitive information for many countries and regions.
- Custom data identifiers: criteria that you define, based on a regular expression (regex), for cases the managed identifiers don't cover. Custom identifiers can use refiners:
  - Keywords: character sequences that must be in proximity to a regex match.
  - Maximum Match Distance: the maximum distance between a keyword and the regex match.
  - Ignore Words: character sequences that invalidate a regex match when they occur within it.
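To make the interplay of the refiners concrete, here is a minimal sketch in plain Python. It is a simplified model, not Macie's implementation: the function names are hypothetical, and it assumes that keywords are matched case-insensitively, that ignore words are matched case-sensitively, and that distance is counted in characters from the end of a keyword to the end of the regex match. Macie's actual rules may differ in such details.

```python
import re


def _keyword_nearby(lowered_text, keywords, match_end, max_distance):
    """True if some keyword ends at most max_distance characters before match_end."""
    for keyword in keywords:
        keyword = keyword.lower()
        start = lowered_text.find(keyword)
        while start != -1:
            keyword_end = start + len(keyword)
            if 0 <= match_end - keyword_end <= max_distance:
                return True
            start = lowered_text.find(keyword, start + 1)
    return False


def find_matches(text, regex, keywords=(), max_distance=50, ignore_words=()):
    """Return the regex matches in text that survive the refiners (simplified model)."""
    results = []
    lowered = text.lower()
    for match in re.finditer(regex, text):
        matched = match.group(0)
        # Ignore Words: a match containing any of them is discarded.
        if any(word in matched for word in ignore_words):
            continue
        # Keywords + Maximum Match Distance: when keywords are given, at least
        # one must end close enough before the end of the match.
        if keywords and not _keyword_nearby(lowered, keywords, match.end(), max_distance):
            continue
        results.append(matched)
    return results
```

For instance, with the regex `EMP-\d{6}`, the keyword `employee id`, a maximum distance of 20, and the ignore word `000000`, a text such as "Employee ID: EMP-123456" yields a match, while an `EMP-999999` far from any keyword or an `EMP-000000` placeholder does not.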
Discovery Jobs
Discovery jobs use data identifiers to scan objects and generate findings.
You specify which buckets to analyze and which identifiers to use. For the managed identifiers, you can use all of them, include a subset, exclude a subset, or use none.
A job can run once or on a recurring schedule (daily, weekly, or monthly).
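The choices above map onto the request for the Macie `CreateClassificationJob` API. The sketch below only builds the request as a plain dictionary, so it runs without AWS access; the helper name `build_job_request`, the account ID, the bucket name, and the identifier IDs in the usage note are illustrative assumptions. The field names follow the API as I understand it, where `jobType` is either `ONE_TIME` or `SCHEDULED`.

```python
import uuid

# Selector values for managed data identifiers accepted by the API.
VALID_SELECTORS = {"ALL", "INCLUDE", "EXCLUDE", "NONE", "RECOMMENDED"}


def build_job_request(name, account_id, buckets, selector="ALL",
                      managed_ids=(), custom_ids=(), daily=False):
    """Build a CreateClassificationJob request body (hypothetical helper)."""
    if selector not in VALID_SELECTORS:
        raise ValueError(f"unknown selector: {selector}")
    if selector in {"INCLUDE", "EXCLUDE"} and not managed_ids:
        raise ValueError(f"{selector} needs at least one managed identifier ID")

    request = {
        "name": name,
        "clientToken": str(uuid.uuid4()),  # idempotency token
        "jobType": "SCHEDULED" if daily else "ONE_TIME",
        "s3JobDefinition": {
            "bucketDefinitions": [
                {"accountId": account_id, "buckets": list(buckets)}
            ]
        },
        "managedDataIdentifierSelector": selector,
    }
    # The ID list is only meaningful when including or excluding a subset.
    if selector in {"INCLUDE", "EXCLUDE"}:
        request["managedDataIdentifierIds"] = list(managed_ids)
    if custom_ids:
        request["customDataIdentifierIds"] = list(custom_ids)
    if daily:
        request["scheduleFrequency"] = {"dailySchedule": {}}
    return request
```

With boto3 installed and credentials configured, the result could be passed on as `boto3.client("macie2").create_classification_job(**request)`. For example, `build_job_request("daily-pii-scan", "111122223333", ["example-bucket"], selector="INCLUDE", managed_ids=["CREDIT_CARD_NUMBER"], daily=True)` describes a daily job that uses a single managed identifier.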
Findings
Findings are reports of what Macie detects. They are either Sensitive Data Findings or Policy Findings.
They can be viewed in the console or consumed by other services (such as Security Hub and EventBridge).
A common pattern is Job ⇒ Finding ⇒ Finding event on EventBridge ⇒ Lambda (remediation), SNS, or other targets.
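The Lambda end of that pattern can be sketched as follows. This is a minimal illustration rather than a reference implementation: the event field names (`detail.type`, `detail.severity.description`, `detail.resourcesAffected.s3Bucket.name`) reflect my understanding of the Macie finding event and should be checked against a real event, and the action names are hypothetical placeholders for whatever remediation you implement.

```python
# EventBridge rule pattern that selects Macie findings.
EVENT_PATTERN = {"source": ["aws.macie"], "detail-type": ["Macie Finding"]}


def route_finding(event):
    """Decide what to do with a Macie finding event (hypothetical policy)."""
    detail = event.get("detail", {})
    finding_type = detail.get("type", "")
    severity = detail.get("severity", {}).get("description", "Unknown")
    bucket = detail.get("resourcesAffected", {}).get("s3Bucket", {}).get("name")

    if finding_type.startswith("Policy:"):
        # Policy findings concern bucket settings, not object contents.
        action = "review_bucket_settings"
    elif finding_type.startswith("SensitiveData:"):
        # Only high-severity sensitive data triggers remediation here.
        action = "quarantine_object" if severity == "High" else "notify"
    else:
        action = "ignore"

    return {"bucket": bucket, "type": finding_type,
            "severity": severity, "action": action}


def lambda_handler(event, context):
    # Entry point invoked by Lambda; actual remediation calls would go here.
    return route_finding(event)
```

Keeping the routing decision in a pure function separate from the handler makes it testable with sample events before any remediation call is wired in.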
Policy Findings
Policy findings flag policies or settings that reduce the security of buckets and objects, but only if the change happens after Macie is enabled.
An insecure policy already present when Macie is enabled won’t be caught. For example, if block public access settings are disabled for an S3 bucket after you enable Macie, Macie generates a Policy:IAMUser/S3BlockPublicAccessDisabled finding for the bucket. If block public access settings were disabled for a bucket when you enabled Macie and they continue to be disabled, Macie doesn’t generate a Policy:IAMUser/S3BlockPublicAccessDisabled finding for the bucket.