23 changes: 22 additions & 1 deletion score-server/README.md
@@ -44,7 +44,28 @@ bucket:
download:
partsize: 250000000 # 250 MB
```
-At this time, only a single Azure Blob Storage account (and container) is used. However, since Azure Storage can only store 500 TB per account, the Storage Server will need to manage multiple account/key credentials in the near future. It may also make sense to have multiple containers per account as well. There are suggestions that having many objects in a single container can impose a performance penalty on some operations.
### S3 Configuration
To configure Score to use an S3-compatible object storage service, activate the `s3` Spring profile. The profile is designed to work with AWS S3, MinIO, and other S3-compatible services. The configuration properties available under the `s3` section of `application.yml` are described below.

Profile Name: ``s3``

#### Configuration Properties

| Property | Description |
| ------------- | ------------- |
| `s3.secured` | Whether the connection to the S3 service uses HTTPS (`true`) or HTTP (`false`). Set to `true` to secure the connection. |
| `s3.endpoint` | The URL of the S3-compatible service, as provided by the service you are using. |
| `s3.accessKey` | The access key used to authenticate with the S3 service. Keep it confidential. |
| `s3.secretKey` | The secret key paired with the access key. Keep it confidential. |
| `s3.masterEncryptionKeyId` | The ID of the key used for server-side encryption of files stored in S3. If provided, data at rest is encrypted with this key. |
| `s3.customMd5Property` | The name of a custom metadata property that stores the MD5 checksum of an uploaded file, used to validate file integrity. |
| `s3.connectionTimeout` | The maximum time, in milliseconds, that the client waits to establish a connection to the S3 service before timing out. |
| `s3.retryLimit` | The number of times the client retries a failed operation (e.g., an upload), which helps with transient failures. |
| `s3.sigV4Enabled` | Enables AWS Signature Version 4 request signing. This is required for certain regions and for features such as server-side encryption with AWS KMS. |
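
As an illustration of how `s3.secured` and `s3.endpoint` might interact, here is a minimal Java sketch. It is not the repository's `S3Config`: the class `S3Settings`, its methods, and the host `s3.example.org` are hypothetical, and the rule that an endpoint already carrying a scheme is used as-is is an assumption made for this example.

```java
import java.net.URI;

/** Hypothetical holder for a few of the s3.* properties; not the repository's S3Config. */
final class S3Settings {
  private final boolean secured;
  private final String endpoint;
  private final int connectionTimeoutMs;
  private final int retryLimit;

  S3Settings(boolean secured, String endpoint, int connectionTimeoutMs, int retryLimit) {
    if (endpoint == null || endpoint.isBlank()) {
      throw new IllegalArgumentException("s3.endpoint must be set");
    }
    if (connectionTimeoutMs < 0 || retryLimit < 0) {
      throw new IllegalArgumentException("timeout and retry limit must not be negative");
    }
    this.secured = secured;
    this.endpoint = endpoint;
    this.connectionTimeoutMs = connectionTimeoutMs;
    this.retryLimit = retryLimit;
  }

  /**
   * Assumption for this sketch: an endpoint that already names a scheme is used as-is;
   * otherwise s3.secured chooses between https and http.
   */
  URI endpointUri() {
    if (endpoint.contains("://")) {
      return URI.create(endpoint);
    }
    return URI.create((secured ? "https" : "http") + "://" + endpoint);
  }

  int connectionTimeoutMs() {
    return connectionTimeoutMs;
  }

  int retryLimit() {
    return retryLimit;
  }

  public static void main(String[] args) {
    // Timeout and retry values mirror the defaults shown in the s3 profile of application.yml.
    S3Settings settings = new S3Settings(true, "s3.example.org", 15000, 5);
    System.out.println(settings.endpointUri());
  }
}
```

A real deployment would more likely bind these values through Spring's `@ConfigurationProperties(prefix = "s3")`, as the existing `S3Config` class does; the plain constructor here only keeps the sketch free of dependencies.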
Comment on lines +54 to +64
Contributor

👍 looks great!



> [!NOTE]
> At this time, only a single Azure Blob Storage account (and container) is used. However, since Azure Storage can only store 500 TB per account, the Storage Server will need to manage multiple account/key credentials in the near future. It may also make sense to have multiple containers per account as well. There are suggestions that having many objects in a single container can impose a performance penalty on some operations.

The Storage Server no longer uses ``.meta`` files to track state in the repository. Object Specifications are generated on the fly for use on the client (to allow downloads to be resumed). Also, the block upload implementation supplied by Microsoft in the Azure Java SDK supersedes the use of this file.
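
To make the idea of an on-the-fly specification concrete, the following Java sketch derives a list of byte ranges from an object's size and a part size such as the `download.partsize` value of 250000000 shown above. The names `PartPlanner` and `Part` are hypothetical and do not come from the repository; the actual Object Specification may carry more fields than this.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that splits an object into fixed-size parts; not a repository class. */
final class PartPlanner {

  /** One contiguous byte range of an object. */
  static final class Part {
    final int number;
    final long offset;
    final long length;

    Part(int number, long offset, long length) {
      this.number = number;
      this.offset = offset;
      this.length = length;
    }
  }

  /** Returns the parts covering [0, objectSize); the last part may be shorter than partSize. */
  static List<Part> plan(long objectSize, long partSize) {
    if (objectSize < 0 || partSize <= 0) {
      throw new IllegalArgumentException("objectSize must be >= 0 and partSize must be > 0");
    }
    List<Part> parts = new ArrayList<>();
    long offset = 0;
    while (offset < objectSize) {
      long length = Math.min(partSize, objectSize - offset);
      parts.add(new Part(parts.size() + 1, offset, length));
      offset += length;
    }
    return parts;
  }

  public static void main(String[] args) {
    // A 600 MB object with 250 MB parts yields two full parts and one shorter part.
    for (Part p : plan(600_000_000L, 250_000_000L)) {
      System.out.printf("part %d: offset=%d length=%d%n", p.number, p.offset, p.length);
    }
  }
}
```

Because the ranges depend only on the object size and the part size, a client that records which part numbers have completed could request just the remaining ranges when a download is resumed, without any state file on the server.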

@@ -19,13 +19,11 @@
 import org.springframework.boot.context.properties.ConfigurationProperties;
 import org.springframework.context.annotation.Bean;
 import org.springframework.context.annotation.Configuration;
-import org.springframework.context.annotation.Profile;

 /** S3/Ceph Object Gateway configuration. */
 @Data
 @Slf4j
 @Configuration
-@Profile({"aws", "collaboratory", "default"})
 @ConfigurationProperties(prefix = "s3")
 public class S3Config {

@@ -29,11 +29,9 @@
 import org.springframework.beans.factory.annotation.Value;
 import org.springframework.context.annotation.Bean;
 import org.springframework.context.annotation.Configuration;
-import org.springframework.context.annotation.Profile;

 /** Server level configuration */
 @Configuration
-@Profile({"aws", "collaboratory", "default"})
 public class ServerConfig {

   @Value("${upload.partsize}")
@@ -22,12 +22,10 @@
 import org.springframework.beans.factory.annotation.Value;
 import org.springframework.boot.actuate.health.Health;
 import org.springframework.boot.actuate.health.HealthIndicator;
-import org.springframework.context.annotation.Profile;
 import org.springframework.stereotype.Component;

 /** check health for object upload service */
 @Component
-@Profile({"aws", "collaboratory", "default"})
 public class BackendHealth implements HealthIndicator {

   @Autowired AmazonS3 s3;
@@ -55,15 +55,13 @@
 import lombok.val;
 import org.springframework.beans.factory.annotation.Autowired;
 import org.springframework.beans.factory.annotation.Value;
-import org.springframework.context.annotation.Profile;
 import org.springframework.http.HttpStatus;
 import org.springframework.stereotype.Service;

 /** service responsible for object download (full or partial) */
 @Slf4j
 @Setter
 @Service
-@Profile({"aws", "collaboratory", "default"})
 public class S3DownloadService implements DownloadService {

   /** Constants. */
@@ -36,14 +36,12 @@
 import org.springframework.beans.factory.annotation.Autowired;
 import org.springframework.beans.factory.annotation.Value;
 import org.springframework.cache.annotation.Cacheable;
-import org.springframework.context.annotation.Profile;
 import org.springframework.http.HttpStatus;
 import org.springframework.stereotype.Service;

 @Slf4j
 @Setter
 @Service
-@Profile({"aws", "collaboratory", "default"})
 public class S3ListingService implements ListingService {

   /** Configuration. */
@@ -109,8 +107,10 @@ private List<ObjectInfo> listBucketContents(String bucket) {
   }

   private void readBucket(String bucketName, String prefix, Consumer<S3ObjectSummary> callback) {
-    val request = prefix.isBlank() ? new ListObjectsRequest().withBucketName(bucketName) :
-      new ListObjectsRequest().withBucketName(bucketName).withPrefix(prefix);
+    val request =
+        prefix.isBlank()
+            ? new ListObjectsRequest().withBucketName(bucketName)
+            : new ListObjectsRequest().withBucketName(bucketName).withPrefix(prefix);
     log.debug("Reading summaries from '{}/{}'...", bucketName, prefix);

     ObjectListing listing;
@@ -46,7 +46,6 @@
 import lombok.val;
 import org.springframework.beans.factory.annotation.Autowired;
 import org.springframework.beans.factory.annotation.Value;
-import org.springframework.context.annotation.Profile;
 import org.springframework.http.HttpStatus;
 import org.springframework.stereotype.Service;
 import org.springframework.web.client.RestClientException;
@@ -55,7 +54,6 @@
 @Slf4j
 @Setter
 @Service
-@Profile({"aws", "collaboratory", "default"})
 public class S3UploadService implements UploadService {

   /** Constants. */
68 changes: 32 additions & 36 deletions score-server/src/main/resources/application.yml
@@ -130,68 +130,64 @@ server:
 ---

 ###############################################################################
-# Profile - "amazon"
+# Profile - "azure"
 ###############################################################################

 spring:
   config:
     activate:
-      on-profile: amazon
+      on-profile: azure

-s3:
-  endpoint: s3-external-1.amazonaws.com
-  masterEncryptionKeyId: af628f04-ac12-4b11-bf83-6545fd44ad18
+azure:
+  endpointProtocol: https
+  accountName: oicricgc
+  accountKey:

 bucket:
-  name.object: oicr.icgc
-  name.state: oicr.icgc
+  name.object: data
+  policy.upload: UploadPolicy
+  policy.download: DownloadPolicy

-metadata:
-  url: https://virginia.song.icgc.org
-  useLegacyMode: false
+download:
+  partsize: 250000000

 ---

 ###############################################################################
-# Profile - "collaboratory"
+# Profile - "s3"
 ###############################################################################

 spring:
   config:
     activate:
-      on-profile: collaboratory
+      on-profile: s3

 s3:
-  endpoint: https://object.cancercollaboratory.org:9080
-  masterEncryptionKeyId:
+  # Whether the connection should use HTTPS (true) or HTTP (false)
+  secured: true

-metadata:
-  url: https://song.cancercollaboratory.org
-  useLegacyMode: false
+  # Endpoint URL for the S3-compatible service
+  # endpoint: your-s3-endpoint

----
+  # Access key for authentication
+  # accessKey: your-access-key

-###############################################################################
-# Profile - "azure"
-###############################################################################
+  # Secret key for authentication
+  # secretKey: your-secret-key

-spring:
-  config:
-    activate:
-      on-profile: azure
+  # Master encryption key ID for server-side encryption
+  # masterEncryptionKeyId: your-encryption-key-id

-azure:
-  endpointProtocol: https
-  accountName: oicricgc
-  accountKey:
+  # Custom MD5 checksum property (if needed)
+  customMd5Property: md5chksum

-bucket:
-  name.object: data
-  policy.upload: UploadPolicy
-  policy.download: DownloadPolicy
+  # Connection timeout in milliseconds
+  connectionTimeout: 15000

-download:
-  partsize: 250000000
+  # Retry limit for failed operations
+  retryLimit: 5

+  # Whether to use Signature Version 4
+  sigV4Enabled: true

 ---