Download a Direct Data File
As an alternative to using the Vault Postman Collection, you can use the example shell script below to download Direct Data files from your Vault. The script uses your provided credentials and the specified filters to download all file parts of the latest available Direct Data file.
The script requires the following variables:
- vault_dns: The DNS of your Vault.
- session_id: An active session ID for the Vault.
- extract_type: The type of extract you wish to download: full_directdata, incremental_directdata, or log_directdata.
- start_time: The start time for which to capture data that will be in the Direct Data file, in the format YYYY-MM-DDTHH:MMZ. Always use 2000-01-01T00:00Z if extract_type=full_directdata.
- stop_time: The stop time for which to capture data that will be in the Direct Data file, in the format YYYY-MM-DDTHH:MMZ.
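Because the API expects the timestamp pattern exactly, it can help to sanity-check the values before calling it. The `is_valid_time` helper below is a hypothetical addition, not part of the script:

```shell
#!/bin/bash
# Hypothetical helper: check that a timestamp matches YYYY-MM-DDTHH:MMZ
# using an anchored extended regex.
is_valid_time() {
    echo "$1" | grep -Eq '^[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}Z$'
}

is_valid_time "2000-01-01T00:00Z" && echo "valid"      # matches the pattern
is_valid_time "2024-06-01 15:15"  || echo "invalid"    # missing T and Z
```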
This script runs natively on macOS and UNIX systems. On Windows, the script requires Bash; if you have Git installed, you can use Git Bash.
Run this script from the directory where you would like to download the Direct Data file. If there are multiple file parts, the script combines them into a single .tar.gz file.
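The combine step is plain byte concatenation: the .tar.gz archive is split at byte boundaries, so concatenating the parts in order reproduces the original archive exactly. The sketch below demonstrates this round trip with synthetic data; the part names and sizes are illustrative, not real Vault output:

```shell
#!/bin/bash
# Demonstration (with synthetic data) that concatenating numbered file
# parts in order reproduces the original .tar.gz byte-for-byte.
set -e
workdir=$(mktemp -d)
cd "$workdir"

# Build a small archive to stand in for a Direct Data file.
head -c 5000 /dev/urandom > payload.bin
tar -czf archive.tar.gz payload.bin

# Split it into 2 KB parts (part.000, part.001, ...), mirroring
# Vault's numbered file parts.
split -b 2048 -d -a 3 archive.tar.gz part.

# Concatenate the parts in order and extract, as the script does.
cat part.* > combined.tar.gz
mkdir extracted
tar -xzf combined.tar.gz -C extracted

cmp archive.tar.gz combined.tar.gz && echo "archives identical"
cmp payload.bin extracted/payload.bin && echo "payload intact"
```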
#!/bin/bash

# Add the vault_dns of your Vault
vault_dns="your-vault.veevavault.com"
# Add in your session ID
session_id="YOUR_SESSION_ID"
# Add "full_directdata", "incremental_directdata", or "log_directdata"
extract_type="full_directdata"
# For "full_directdata" always use 2000-01-01T00:00Z as the start_time
start_time="2000-01-01T00:00Z"
# Add the stop_time
stop_time="2024-06-01T15:15Z"
# Will place the files in the current folder where the script runs
target_directory="$(pwd)"
# Perform the API call to retrieve the list of Direct Data files
direct_data_file_list_response=$(curl -s -X GET -H "Authorization: $session_id" \
-H "Accept: application/json" \
"https://$vault_dns/api/v24.1/services/directdata/files?extract_type=$extract_type&start_time=$start_time&stop_time=$stop_time")
# Extract the response status from the API response
response_status=$(echo "$direct_data_file_list_response" | grep -o '"responseStatus":"[^"]*' | sed 's/"responseStatus":"//')
# Check if the API call was successful
if [ "$response_status" != "SUCCESS" ]; then
    error_message=$(echo "$direct_data_file_list_response" | grep -o '"message":"[^"]*' | sed 's/"message":"//' | tr -d '"')
    if [ -z "$error_message" ]; then
        printf "Retrieve Available Direct Data Files call failed. Exiting script.\n"
    else
        printf "Retrieve Available Direct Data Files call failed with error: %s\n" "$error_message"
    fi
    exit 1
else
    printf "Retrieve Available Direct Data Files call succeeded.\n"

    # Extract data array
    data=$(echo "$direct_data_file_list_response" | grep -o '"data":\[[^]]*\]' | sed 's/"data":\[//' | tr -d ']')

    # Count file parts
    fileparts=$(echo "$data" | grep -o '"fileparts":[0-9]*' | sed 's/"fileparts"://')

    # Check if fileparts is null or empty
    if [ -z "$fileparts" ]; then
        printf "No Direct Data Extract Files found for '%s' with start_time = '%s' and stop_time = '%s'.\n" "$extract_type" "$start_time" "$stop_time"
        exit 0
    fi
    if [ "$fileparts" -gt 1 ]; then
        printf "Multiple file parts.\n"

        # Handling multiple file parts
        filepart_details=$(echo "$data" | grep -o '"filepart_details":\[{"[^]]*' | sed 's/"filepart_details":\[//' | tr -d ']')
        filepart_details=$(echo "$filepart_details" | sed 's/},{/}\n{/g')
        filename=$(echo "$data" | grep -o '"filename":"[^"]*' | sed 's/"filename":"//' | tr -d '"' | head -n 1)

        # Download each file part
        while IFS= read -r filepart_detail; do
            filepart_url=$(echo "$filepart_detail" | grep -o '"url":"[^"]*' | sed 's/"url":"//' | tr -d '"')
            output_filepart_name=$(echo "$filepart_detail" | grep -o '"filename":"[^"]*' | sed 's/"filename":"//' | tr -d '"')
            curl -o "$output_filepart_name" -X GET -H "Authorization: $session_id" \
                -H "Accept: application/json" \
                "$filepart_url"
        done <<< "$filepart_details"

        # Combine file parts
        name=$(echo "$data" | grep -o '"name":"[^"]*' | sed 's/"name":"//' | tr -d '"' | head -n 1)
        cat "$name."* > "$filename"

        full_path="$target_directory/$name"
        if [ ! -d "$full_path" ]; then
            # Directory does not exist, create it
            mkdir -p "$full_path"
            printf "Directory '%s' created.\n" "$full_path"
        else
            printf "Directory '%s' already exists.\n" "$full_path"
        fi
        tar -xzvf "$filename" -C "$full_path"
    else
        printf "Only one file part.\n"

        # Handling single file part
        filepart_detail=$(echo "$data" | grep -o '"filepart_details":\[{"[^]]*' | sed 's/"filepart_details":\[//' | tr -d '{}')
        filepart_url=$(echo "$filepart_detail" | grep -o '"url":"[^"]*' | sed 's/"url":"//' | tr -d '"')
        filename=$(echo "$data" | grep -o '"filename":"[^"]*' | sed 's/"filename":"//' | tr -d '"' | head -n 1)
        curl -o "$filename" -X GET -H "Authorization: $session_id" \
            -H "Accept: application/json" "$filepart_url"

        name=$(echo "$data" | grep -o '"name":"[^"]*' | sed 's/"name":"//' | tr -d '"' | head -n 1)
        full_path="$target_directory/$name"
        if [ ! -d "$full_path" ]; then
            # Directory does not exist, create it
            mkdir -p "$full_path"
            printf "Directory '%s' created.\n" "$full_path"
        else
            printf "Directory '%s' already exists.\n" "$full_path"
        fi
        tar -xzvf "$filename" -C "$full_path"
    fi
fi
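The script deliberately parses JSON with grep and sed so it has no dependency on a tool like jq. The snippet below applies the same extraction pipelines to a sample response body; the field values are illustrative stand-ins, not real Vault output:

```shell
#!/bin/bash
# Sample response body with illustrative values, shaped like the
# file-list response the script consumes.
sample_response='{"responseStatus":"SUCCESS","data":[{"name":"146478-20240601-0000-F","filename":"146478-20240601-0000-F.tar.gz","fileparts":2}]}'

# The same grep -o / sed pipelines the script uses: grep -o emits only the
# matching fragment, and sed strips the leading key text.
response_status=$(echo "$sample_response" | grep -o '"responseStatus":"[^"]*' | sed 's/"responseStatus":"//')
fileparts=$(echo "$sample_response" | grep -o '"fileparts":[0-9]*' | sed 's/"fileparts"://')
filename=$(echo "$sample_response" | grep -o '"filename":"[^"]*' | sed 's/"filename":"//' | tr -d '"' | head -n 1)

echo "$response_status"   # SUCCESS
echo "$fileparts"         # 2
echo "$filename"          # 146478-20240601-0000-F.tar.gz
```

Note that this style of parsing assumes the fields never contain embedded quotes or brackets, which holds for these response fields but would not generalize to arbitrary JSON.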