Creating & Downloading Documents in Bulk

In this guide we’ll discover two ways of creating documents in bulk:

Using the Create Multiple Documents API
Using the Vault Loader Command Line Interface (CLI).

We’ll also look at how to download documents in bulk.

Creating multiple documents is a two-step process.

First, upload the source files to your Vault’s File Staging with Vault File Manager, Loader CLI, or using the Vault API’s file staging endpoints.
Then, use the Create Multiple Documents API or Load Data Objects endpoint to set the data attributes (field values) of the documents. These API calls also copy files from File Staging to your Vault.

Vault Loader CLI

You can download Vault Loader CLI from the Vault UI. Learn more about Vault Loader CLI in Vault Help.

Create Documents

In the API call, you must provide values for all required attributes, for each document that you’re creating. Learn more about retrieving the required attributes in Understanding Metadata.

The request below creates three documents from source files doc1.txt, doc2.txt, and doc3.txt. The required metadata attributes for these documents are name__v, type__v and lifecycle__v.

The file column and suppressRendition are not required:

The file attribute tells Vault where to look for the source file in File Staging. Omitting this column or leaving the value blank creates document placeholders.
The suppressRendition attribute tells Vault to delay generating a viewable rendition for the document until a user views the document from the UI. This setting reduces the load on your Vault by preventing unnecessary processing.

CSV Input

Both of the requests below use the create_documents.csv input file.

file,name__v,type__v,lifecycle__v,suppressRendition
doc1.txt,doc1,type,eTMF Lifecycle,true
doc2.txt,doc2,type,eTMF Lifecycle,true
doc3.txt,doc3,type,eTMF Lifecycle,true

Request

Using curl:

curl -X POST -H "Authorization: {SESSION_ID}" \
-H "Content-Type: text/csv" \
-H "Accept: text/csv" \
--data-binary @"C:\Vault\Documents\create_documents.csv" \
https://myvault.veevavault.com/api/v15.0/objects/documents/batch

Using the command line interface:

java -jar VaultDataLoader.jar -createdocuments -csv create_documents.csv

Response

You can see from the response that all documents were successfully created. The id column returns the system ID generated for each document and row_id maps each row back to the original input.

responseStatus,id,name__v,type__v,lifecycle__v,errors,row_id
SUCCESS,1634,,,,,1
SUCCESS,1635,,,,,2
SUCCESS,1636,,,,,3

CLI Async Mode

You can run the CLI asynchronously and free up the command prompt. The CLI gives you a job ID that you can use to track the job status. When the job is complete, you can use -jobresults to download the results, which are similar to the CSV above. In async mode, these don’t download automatically.

Example: CLI Async

These examples show the same document creation process as above, but using the CLI in asynchronous mode:

>> java -jar VaultDataLoader.jar -createdocuments -csv createdoc.csv -async
Job 10479 submitted.

>> java -jar VaultDataLoader.jar -jobstatus 10479
ID         Type                      Status          Progress        Start Time      End Time        Expiration Date
10479      create-documents          Success         3 / 0           2016-11-28      2016-11-28      2016-12-14
                                                                     02:23:49AM GMT  02:23:53AM GMT  02:23:53AM GMT

>> java -jar VaultDataLoader.jar -jobresults 10479

Downloading Documents

Vault Loader (UI or CLI) provides the most efficient way to download source files and renditions. This section describes how to accomplish this from the CLI.

Download Location

Vault Loader downloads source files and renditions to File Staging, not your local machine. Once the process is complete, you can download the files from File Staging with Vault File Manager, Loader CLI, or the Vault API. The results indicate where the files reside on the server and which versions were included in the extract.

Vault uses the following directory structure for downloaded files: {job_id}/{doc_id}/{major_version_number__v}/{minor_version_number__v}/

Download Limits

Vault File Staging automatically removes downloaded files after the job’s Expiration Date.

Vault Loader cannot extract more than 2,000 files (source files and renditions) in a single job.

Example: Downloading with CLI

>> java -jar VaultDataLoader.jar -exportdocument -source -where "id contains (1634, 1635, 1636)" -async

>> java -jar VaultDataLoader.jar -jobstatus 10381
ID         Type                      Status          Progress        Start Time      End Time        Expiration Date
10381      export-documents          Success         3 / 0           2016-11-28      2016-11-28      2016-12-14
                                                                     04:24:25AM GMT  04:24:25AM GMT  04:24:25AM GMT
>> java -jar VaultDataLoader.jar -jobresults 10381
file,id,external_id__v,rendition_type__v,major_version_number__v,minor_version_number__v
/10381/1636/0_1/doc3.txt,1636,,,0,1
/10381/1634/0_1/doc1.txt,1634,,,0,1
/10381/1635/0_1/doc2.txt,1635,,,0,1

Next Steps

Now that you know how to create and download documents, you can learn how to query document metadata attributes using VQL.