Configuring AWS and Qrvey for Postgres Connections Utilizing the Export to S3 Method
This method requires additional configuration. The changes allow Postgres to write to a Qrvey S3 bucket.
Steps
Many of the steps below are necessary only when the Postgres RDS and Qrvey deployment are on different AWS accounts.
- Configure Postgres for the export to S3 extension.
- a. Using a Postgres query tool, run the following SQL command.
```sql
CREATE EXTENSION aws_s3 CASCADE;
```
- b. Grant access to the view and function used for monitoring and canceling other queries, replacing `myUser` with the Postgres username that will be used for Qrvey Postgres Connections.
```sql
GRANT SELECT ON pg_stat_activity TO "myUser";
GRANT EXECUTE ON FUNCTION pg_terminate_backend TO "myUser";
```
- c. Grant `myUser` access to the aws_s3 schema and the query_export_to_s3 functions.
```sql
GRANT USAGE ON SCHEMA aws_s3 TO "myUser";
GRANT EXECUTE ON FUNCTION aws_s3.query_export_to_s3(query text, s3_info aws_commons._s3_uri_1, options text, OUT rows_uploaded bigint, OUT files_uploaded bigint, OUT bytes_uploaded bigint) TO "myUser";
GRANT EXECUTE ON FUNCTION aws_s3.query_export_to_s3(query text, bucket text, file_path text, region text, options text, OUT rows_uploaded bigint, OUT files_uploaded bigint, OUT bytes_uploaded bigint) TO "myUser";
```
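To confirm the extension and grants before moving on to the AWS-side steps, you can query the catalog as an administrative user. The following is a minimal sketch, assuming psycopg2 and placeholder connection details.

```python
# Hypothetical check, run as an admin user, that the aws_s3 extension is
# installed and that myUser received the expected privileges.
# Host, database, user, and password are placeholders.
import psycopg2

conn = psycopg2.connect(
    host="POSTGRES_HOST",
    dbname="postgres",
    user="ADMIN_USER",
    password="ADMIN_PASSWORD",
)
with conn, conn.cursor() as cur:
    # CREATE EXTENSION aws_s3 CASCADE also installs aws_commons; both should be listed.
    cur.execute(
        "SELECT extname FROM pg_extension WHERE extname IN ('aws_s3', 'aws_commons')"
    )
    print([row[0] for row in cur.fetchall()])

    # Confirm the USAGE and EXECUTE grants made above.
    cur.execute(
        """
        SELECT has_schema_privilege('myUser', 'aws_s3', 'USAGE'),
               has_function_privilege(
                   'myUser',
                   'aws_s3.query_export_to_s3(text, text, text, text, text)',
                   'EXECUTE'
               )
        """
    )
    print(cur.fetchone())  # expected: (True, True)
conn.close()
```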
- Create a new Policy.
- a. Open the AWS console.
- b. Switch to the Postgres RDS’s AWS account.
- c. Open AWS Identity and Access Management (IAM).
- d. Click Policies.
- e. Click Create policy.
- f. Select the JSON tab.
- g. Paste the policy shown below, replacing `QRVEY_DEPLOYMENT` with the Qrvey deployment name.
```json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": [
                "s3:PutObject",
                "s3:GetObject",
                "s3:AbortMultipartUpload",
                "s3:ListBucket",
                "s3:PutObjectAcl"
            ],
            "Resource": [
                "arn:aws:s3:::QRVEY_DEPLOYMENT-dataload-drdatacommons/*",
                "arn:aws:s3:::QRVEY_DEPLOYMENT-dataload-drdatacommons"
            ]
        }
    ]
}
```
- h. Click Next: Tags.
- i. Click Next: Review.
- j. Name the new Policy. For example: `PostgresDataAccessPolicy`.
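The same policy can also be created programmatically. Below is a minimal boto3 sketch, assuming credentials for the Postgres RDS AWS account and a placeholder deployment name; the policy name mirrors the example above.

```python
# Sketch: create the S3 access policy in the Postgres RDS AWS account.
# QRVEY_DEPLOYMENT is a placeholder; replace it with the deployment name.
import json
import boto3

QRVEY_DEPLOYMENT = "mydeployment"
bucket_arn = f"arn:aws:s3:::{QRVEY_DEPLOYMENT}-dataload-drdatacommons"

policy_document = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": [
                "s3:PutObject",
                "s3:GetObject",
                "s3:AbortMultipartUpload",
                "s3:ListBucket",
                "s3:PutObjectAcl",
            ],
            "Resource": [f"{bucket_arn}/*", bucket_arn],
        }
    ],
}

iam = boto3.client("iam")
created = iam.create_policy(
    PolicyName="PostgresDataAccessPolicy",
    PolicyDocument=json.dumps(policy_document),
)
print(created["Policy"]["Arn"])  # used when attaching the policy to the Role
```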
- Create a new Role.
- a. Continue in the Postgres AWS account, IAM.
- b. Click Roles.
- c. Click Create Role.
- d. Select AWS account.
- e. Use the filter to find the policy created above.
- f. Check its box.
- g. Click Next.
- h. Enter the Role name. For example: `PostgresDataAccessRole`.
- i. Click Create Role.
- j. Note this Role’s ARN string for later use.
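A scripted equivalent is sketched below. Note that boto3's create_role requires the trust policy up front, so this sketch folds in the trust relationship from the next step; the policy ARN and Qrvey account number are placeholders.

```python
# Sketch: create the Role and attach the policy created above.
# POLICY_ARN and AWS_ACCOUNT_QRVEY are placeholders.
import json
import boto3

POLICY_ARN = "arn:aws:iam::111111111111:policy/PostgresDataAccessPolicy"
AWS_ACCOUNT_QRVEY = "222222222222"  # AWS account number of the Qrvey deployment

# create_role requires a trust policy; this is the one described in the
# "Add a trust relationship" step below.
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"AWS": f"arn:aws:iam::{AWS_ACCOUNT_QRVEY}:root"},
            "Action": "sts:AssumeRole",
        }
    ],
}

iam = boto3.client("iam")
role = iam.create_role(
    RoleName="PostgresDataAccessRole",
    AssumeRolePolicyDocument=json.dumps(trust_policy),
)
iam.attach_role_policy(RoleName="PostgresDataAccessRole", PolicyArn=POLICY_ARN)
print(role["Role"]["Arn"])  # note this ARN for later steps
```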
- Add a trust relationship.
- a. Click Roles.
- b. Locate the newly created Role.
- c. Click the Trust relationships tab.
- d. Click Edit trust relationship.
- e. Paste the trust relationship shown below, replacing `AWS_ACCOUNT_QRVEY` with the AWS account number for the Qrvey deployment.
```json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::AWS_ACCOUNT_QRVEY:root"
            },
            "Action": "sts:AssumeRole"
        }
    ]
}
```
- f. Click Update Trust Policy.
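If the Role already exists (for example, it was created in the console with a different trusted entity), the same trust relationship can be applied afterwards. A sketch, assuming boto3 credentials for the Postgres RDS account and a placeholder Qrvey account number:

```python
# Sketch: overwrite the Role's trust policy so the Qrvey account can assume it.
import json
import boto3

AWS_ACCOUNT_QRVEY = "222222222222"  # placeholder Qrvey account number

trust_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"AWS": f"arn:aws:iam::{AWS_ACCOUNT_QRVEY}:root"},
            "Action": "sts:AssumeRole",
        }
    ],
}

iam = boto3.client("iam")
iam.update_assume_role_policy(
    RoleName="PostgresDataAccessRole",
    PolicyDocument=json.dumps(trust_policy),
)
```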
- Set inline policies for the Qrvey lambdas.
- a. Use the Qrvey AWS account.
- b. Open AWS Identity and Access Management (IAM).
- c. Click Roles.
- d. Search for the DB lambda role. It contains this string: `DBDatasourcePumpFunction`. There is one Role for each Qrvey deployment. Select the role applying to the current Qrvey deployment.
- e. Click Add inline policy.
- f. Click the JSON tab.
- g. Paste the policy, replacing `AWS_ACCOUNT_POSTGRES` with the AWS account number for the Postgres cluster.
```json
{
    "Version": "2012-10-17",
    "Statement": {
        "Effect": "Allow",
        "Action": "sts:AssumeRole",
        "Resource": [
            "arn:aws:iam::AWS_ACCOUNT_POSTGRES:role/PostgresDataAccessRole"
        ]
    }
}
```
- h. Click Review policy.
- i. Name the policy. For example: `PostgresDataAccessAssumeRole`.
- j. Click Create Policy. See that the new policy has been added.
- k. Repeat these steps for the lambda role `DatasourcePumpFunction`.
- l. Repeat these steps for the lambda role `FilesourcePrimerFunction`.
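The three inline policies can also be added in one pass with boto3. The role names below are placeholders; look up the exact names in IAM for your deployment (they contain the strings mentioned above). The Postgres account number and inline policy name mirror the example values in this step.

```python
# Sketch: add the sts:AssumeRole inline policy to each Qrvey lambda role.
# Run with credentials for the Qrvey AWS account; role names are placeholders.
import json
import boto3

AWS_ACCOUNT_POSTGRES = "111111111111"  # AWS account number of the Postgres cluster
LAMBDA_ROLE_NAMES = [
    "mydeployment-DBDatasourcePumpFunctionRole",  # placeholder role names;
    "mydeployment-DatasourcePumpFunctionRole",    # search IAM for the real ones
    "mydeployment-FilesourcePrimerFunctionRole",
]

inline_policy = {
    "Version": "2012-10-17",
    "Statement": {
        "Effect": "Allow",
        "Action": "sts:AssumeRole",
        "Resource": [
            f"arn:aws:iam::{AWS_ACCOUNT_POSTGRES}:role/PostgresDataAccessRole"
        ],
    },
}

iam = boto3.client("iam")
for role_name in LAMBDA_ROLE_NAMES:
    iam.put_role_policy(
        RoleName=role_name,
        PolicyName="PostgresDataAccessAssumeRole",
        PolicyDocument=json.dumps(inline_policy),
    )
```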
- Set the policy for the commons bucket.
- a. Use the Qrvey AWS account.
- b. Open AWS S3 service and open the bucket called `QRVEY_DEPLOYMENT-dataload-drdatacommons`.
- c. Click the Permissions tab, scroll down to the Bucket policy section, and add the following policy. Replace `POSTGRES_DATA_ACCESS_ROLE_ARN` with the ARN of the Role created above and replace `QRVEY_DEPLOYMENT` with the Qrvey deployment name.
```json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "Statement1",
            "Effect": "Allow",
            "Principal": {
                "AWS": "POSTGRES_DATA_ACCESS_ROLE_ARN"
            },
            "Action": [
                "s3:PutObject",
                "s3:GetBucketLocation"
            ],
            "Resource": [
                "arn:aws:s3:::QRVEY_DEPLOYMENT-dataload-drdatacommons/*",
                "arn:aws:s3:::QRVEY_DEPLOYMENT-dataload-drdatacommons"
            ]
        }
    ]
}
```
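Equivalently, the bucket policy can be applied with boto3 from the Qrvey AWS account. The role ARN and deployment name below are placeholders.

```python
# Sketch: apply the bucket policy to the commons bucket (Qrvey AWS account).
import json
import boto3

QRVEY_DEPLOYMENT = "mydeployment"  # placeholder deployment name
POSTGRES_DATA_ACCESS_ROLE_ARN = (
    "arn:aws:iam::111111111111:role/PostgresDataAccessRole"  # placeholder ARN
)
bucket = f"{QRVEY_DEPLOYMENT}-dataload-drdatacommons"

bucket_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "Statement1",
            "Effect": "Allow",
            "Principal": {"AWS": POSTGRES_DATA_ACCESS_ROLE_ARN},
            "Action": ["s3:PutObject", "s3:GetBucketLocation"],
            "Resource": [f"arn:aws:s3:::{bucket}/*", f"arn:aws:s3:::{bucket}"],
        }
    ],
}

s3 = boto3.client("s3")
s3.put_bucket_policy(Bucket=bucket, Policy=json.dumps(bucket_policy))
```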
- Associate the Postgres cluster with the new Role.
- a. Go to AWS RDS.
- b. Click the desired Postgres cluster.
- c. Click the Connectivity and security tab.
- d. Go to the Manage IAM roles section.
- e. Select the role in the Add IAM roles to this instance combo box.
- f. Select s3Export in the Feature combo box.
- g. Click Add Role.
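The association can also be made with boto3 from the Postgres RDS account. The cluster identifier and role ARN below are placeholders; use add_role_to_db_instance for a non-Aurora RDS instance.

```python
# Sketch: associate the Role with the Postgres cluster for the s3Export feature.
import boto3

rds = boto3.client("rds")  # credentials for the Postgres RDS AWS account
rds.add_role_to_db_cluster(
    DBClusterIdentifier="my-postgres-cluster",  # placeholder identifier
    RoleArn="arn:aws:iam::111111111111:role/PostgresDataAccessRole",
    FeatureName="s3Export",
)
```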
- Configure a Qrvey Connection.
- a. Open the Qrvey Composer application.
- b. Click Data.
- c. Click Connections.
- d. Create a new Connection for Postgres.
- e. Enter the Host URL, Username, and Password.
- f. Enter the Role ARN created above. (Required only when the Postgres RDS and Qrvey deployment are on different AWS accounts.)
- g. Test the new Connection.
- h. Create a Dataset using the Connection.
- i. Load the Dataset data from Postgres.
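If the connection test or the Dataset load fails, it can help to verify the Postgres-to-S3 export path independently of Qrvey. The sketch below, with placeholder connection details, bucket key, and region, calls aws_s3.query_export_to_s3 as the granted user; it should report one row and one file uploaded.

```python
# Hypothetical end-to-end check: export a one-row result set to the commons
# bucket as the granted user. Host, password, bucket, key, and region are
# placeholders.
import psycopg2

conn = psycopg2.connect(
    host="POSTGRES_HOST",
    dbname="postgres",
    user="myUser",
    password="POSTGRES_PASSWORD",
)
with conn, conn.cursor() as cur:
    cur.execute(
        """
        SELECT * FROM aws_s3.query_export_to_s3(
            'SELECT 1',                                 -- query to export
            'QRVEY_DEPLOYMENT-dataload-drdatacommons',  -- bucket
            'smoke-test/export.csv',                    -- file path (key)
            'us-east-1',                                -- bucket region (assumed)
            options := 'format csv'
        )
        """
    )
    print(cur.fetchone())  # (rows_uploaded, files_uploaded, bytes_uploaded)
conn.close()
```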
For More Information
Importing Amazon S3 data into an RDS for PostgreSQL DB instance